Datopian Tech 👩💻
We are experts in data management and engineering
This is an overview of our technology
# Data Management Systems
A Data Management System (DMS) is a framework that can be used to create a variety of solutions such as Data Portals, Data Catalogs and Data Lakes. We have developed two DMS stacks that share a set of underlying core components:
- CKAN: the open source data management system we created in 2007 and continue to develop and maintain. The main information on CKAN is at https://ckan.org/. Here we have some specific notes on how we develop and deploy CKAN, as well as our thoughts on the next generation of CKAN (v3).
- DataHub: our version of a next generation of CKAN, which powers DataHub.io. DataHub and CKAN v3 share many of the same core components. We are actively working on DataHub v2; an outline can be found at https://github.com/datopian/datahub-next
You can use a DMS to build many kinds of specific solutions:
- Data Portals are gateways to data. That gateway can be big or small, open or restricted. For example, data.gov is open to everyone, whilst an enterprise “intra” data portal is restricted to its personnel.
- Data Catalog: see https://ckan.org/
- Metadata manager: see Publishing
- Data Lake: you can use a DMS to rapidly create a data lake using existing infrastructure. For example, using the DMS’ catalog and storage gateway with existing cloud storage and data processing capabilities.
- Data Engineering: you can use components of the DMS to rapidly create, orchestrate and supply data pipelines.
A DMS has a variety of features. This section provides an overview and links to specific feature pages that include details of how they work in CKAN and CKAN v3 / DataHub.
There are many ways to break down features and this is just one framing. We are thinking about others and if you have thoughts please get in touch.
- Discovering and showcasing data (catalog and presenting)
- Views on data, including visualizing and previewing data as well as Data Explorers and Dashboards
- Publishing data
- Data API (DataStore)
- Permissions and Authentication
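
To make the catalog and Data API features above a little more concrete, here is a minimal sketch of how a client might address them in CKAN through its Action API (`/api/3/action/...`). It only builds request URLs for the `package_search` and `datastore_search` actions and makes no network calls. The portal address `https://example.org`, the `<resource-id>` placeholder and the helper name `action_url` are assumptions for illustration, not part of any Datopian codebase.

```python
from urllib.parse import urlencode


def action_url(portal, action, **params):
    """Build a URL for a CKAN Action API (version 3) call.

    `portal` is the base URL of a CKAN site; `params` become the query string.
    """
    base = f"{portal.rstrip('/')}/api/3/action/{action}"
    return f"{base}?{urlencode(params)}" if params else base


# Hypothetical portal and resource id: substitute real values before use.
PORTAL = "https://example.org"

# Discovering data: full-text search over the catalog.
search_url = action_url(PORTAL, "package_search", q="climate", rows=5)

# Data API: fetch rows of a tabular resource held in the DataStore.
rows_url = action_url(PORTAL, "datastore_search", resource_id="<resource-id>", limit=10)

print(search_url)
print(rows_url)
```

Either URL can then be fetched with any HTTP client; the response is a JSON object whose `result` key holds the matching datasets or rows.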
A DMS has the following key components:
- Data Flows and Factory
The Frictionless approach to data. See https://frictionlessdata.io/
Our team created this whilst at Open Knowledge Foundation and continues to co-steward it.
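
As a rough illustration of the Frictionless approach, the sketch below assembles a minimal Data Package descriptor: a small JSON document that names a dataset, points at its data files and declares a Table Schema for each one. It uses only the Python standard library rather than the Frictionless tooling itself. The dataset name, file path and field list are invented for the example, and `make_data_package` is a hypothetical helper.

```python
import json


def make_data_package(name, resource_path, fields):
    """Return a minimal Data Package descriptor as a plain dict.

    `fields` is a list of (field_name, field_type) pairs used to build
    the Table Schema of the single resource.
    """
    return {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": resource_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Invented example dataset: one CSV file with three typed columns.
descriptor = make_data_package(
    "gdp",
    "data/gdp.csv",
    [("country", "string"), ("year", "integer"), ("gdp", "number")],
)

print(json.dumps(descriptor, indent=2))
```

By convention the descriptor is saved as `datapackage.json` alongside the data, so that tools can read the metadata and schema without inspecting the files themselves.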
# Developer Experience
Service Reliability Engineering (SRE) and Developer Experience (DX) for our CKAN cluster technology.