Talk at @OpenDataWeek in Marseille focused on how technology can power discoverability and interoperability and why they are important. Showcases CKAN's search and discovery functionality, harvesting abilities and data catalog interoperability protocol.
5. The point is re-use, democratization and enabling individuals to drive:
● innovation & the economy
● government transparency & accountability
Context:
● Explosion of digital information
● Ever-improving information technology
Open data is not a hobby, it's a movement
6. What problems are we facing?
Valuable core datasets not open
Data often buried in government web pages
Not machine-readable, poor-quality data
Hard to find, connect and link data
7. Open Solution
Step 1: get the data openly licensed
Step 2: make it accessible - metadata, formats, portal (ckan.org)
Step 3: start building, linking and turning data into something more - information
8. CKAN at a glance
Metadata catalog
Data management platform
Open Source - everything developed on GitHub
Free community support and paid support available
Extensible, flexible, componentized architecture
Developer friendly
Rich JSON API
Interoperable
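To make the "Rich JSON API" point concrete: CKAN's Action API returns a JSON envelope with a `success` flag, a `result` on success and an `error` object on failure. The sketch below unwraps such an envelope. The helper name `unwrap_action_response`, the exception class and the sample payload are illustrative assumptions, and no network call is made.

```python
import json


class CkanApiError(Exception):
    """Raised when an Action API envelope reports success = false."""


def unwrap_action_response(body):
    """Return the `result` part of a CKAN Action API response body.

    Assumes the usual envelope: {"success": bool, "result": ..., "error": ...}.
    """
    payload = json.loads(body)
    if not payload.get("success"):
        raise CkanApiError(payload.get("error", {"message": "unknown error"}))
    return payload["result"]


# Hand-written sample shaped like a package_list response (not real data).
sample = '{"success": true, "result": ["air-quality", "bus-stops"]}'
dataset_names = unwrap_action_response(sample)
```

Raising on `success: false` keeps calling code from treating an error envelope as data.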
9. CKAN serves two main use cases
1. search and discoverability for re-users of data
2. data management tools for publishers
11. Search and discovery
Online home for data
Central keyword search
Facet by tags, location, format, licence...
Browse by groups, keywords, publishers
Link to datasets or data directly
Previews and data exploration
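The keyword search and facets listed above are also reachable through the Action API's `package_search` call. The following is a minimal sketch that only builds the request URL. The helper name `build_search_url`, the site URL and the example query are assumptions for illustration; `q`, `fq`, `rows` and `facet.field` are parameters of `package_search`.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, filters=None, facets=(), rows=10):
    """Build a package_search URL for a CKAN site (no request is sent)."""
    params = {"q": query, "rows": rows}
    if filters:
        # fq takes Solr-style field:"value" filters, separated by spaces.
        params["fq"] = " ".join(
            f'{field}:"{value}"' for field, value in filters.items()
        )
    if facets:
        # facet.field is passed as a JSON-encoded list of field names.
        params["facet.field"] = json.dumps(list(facets))
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Placeholder site and query, chosen only for illustration.
url = build_search_url(
    "https://demo.ckan.org",
    "transport",
    filters={"res_format": "CSV"},
    facets=["tags", "license_id"],
)
```

Fetching this URL with any HTTP client should return matching datasets under `result.results` and facet counts under `result.search_facets`.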
16. Harvesting and normalization
Get metadata from external catalogs and endpoints
CKAN will parse, validate and normalise to create metadata records that look the same to end users no matter where they came from
We can currently harvest: other CKAN catalogs, CSW endpoints and WAFs (web accessible folders) serving ISO 19139 documents + others
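The idea of normalisation can be sketched as mapping each source format onto one common record shape. The example below is a simplified illustration, not the actual CKAN harvester code: the target fields (`id`, `title`, `description`, `source`), the function names and the sample records are all assumptions, and only a tiny subset of ISO 19139 is read.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata documents.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def _text(root, path):
    """Return the stripped text at `path`, or an empty string if missing."""
    node = root.find(path, NS)
    return node.text.strip() if node is not None and node.text else ""


def from_iso19139(xml_text, source):
    """Map a small subset of an ISO 19139 document to the common shape."""
    root = ET.fromstring(xml_text)
    ident = "gmd:identificationInfo/gmd:MD_DataIdentification/"
    return {
        "id": _text(root, "gmd:fileIdentifier/gco:CharacterString"),
        "title": _text(
            root,
            ident + "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        ),
        "description": _text(root, ident + "gmd:abstract/gco:CharacterString"),
        "source": source,
    }


def from_ckan(dataset, source):
    """Map a CKAN dataset dict (package_show style) to the common shape."""
    return {
        "id": dataset["id"],
        "title": dataset.get("title") or dataset.get("name", ""),
        "description": dataset.get("notes", ""),
        "source": source,
    }


# Hand-written samples, not real catalog records.
ISO_SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>abc-123</gco:CharacterString></gmd:fileIdentifier>
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation>
      <gmd:title><gco:CharacterString>Coastline</gco:CharacterString></gmd:title>
    </gmd:CI_Citation></gmd:citation>
    <gmd:abstract><gco:CharacterString>Example abstract.</gco:CharacterString></gmd:abstract>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""

CKAN_SAMPLE = {"id": "42", "name": "bus-stops", "title": "Bus stops", "notes": "Stops."}

records = [from_iso19139(ISO_SAMPLE, "csw"), from_ckan(CKAN_SAMPLE, "ckan")]
```

Both records end up with the same keys, which is what lets a portal present them identically regardless of origin.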
23. More generally: federation
Search across numerous different catalogs in aggregator sites (such as publicdata.eu)
Data Catalog Interoperability Protocol: http://spec.datacatalogs.org/
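A federated search can be thought of as a fan-out to several catalogs followed by a merge. The sketch below illustrates that idea with in-memory stand-ins for catalogs. It is not an implementation of the Data Catalog Interoperability Protocol, and the function names, the record fields and the naive de-duplication by lower-cased title are assumptions made for illustration.

```python
def federated_search(catalogs, query):
    """Query every catalog and merge the hits into one list.

    `catalogs` maps a catalog name to a callable that takes a query string
    and returns a list of dicts with at least `id` and `title`.
    """
    merged = {}
    for name, search in catalogs.items():
        for hit in search(query):
            key = hit["title"].strip().lower()
            # Keep the first hit per title; note every catalog that lists it.
            entry = merged.setdefault(key, {**hit, "catalogs": []})
            entry["catalogs"].append(name)
    return sorted(merged.values(), key=lambda e: e["title"].lower())


def make_stub(records):
    """Return a fake catalog search function over a fixed record list."""
    return lambda query: [r for r in records if query.lower() in r["title"].lower()]


# Invented catalogs and records, standing in for remote search endpoints.
catalogs = {
    "city": make_stub([{"id": "1", "title": "Bus stops"}, {"id": "2", "title": "Parks"}]),
    "region": make_stub([{"id": "9", "title": "bus stops"}, {"id": "10", "title": "Bus routes"}]),
}
results = federated_search(catalogs, "bus")
```

A real aggregator would more likely harvest remote metadata in advance and search a local index, and would need a sturdier notion of identity than a matching title.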