This document summarizes the Rapid Assembly of Geo-Centred Linked Data Applications (RAGLD) project. RAGLD is building tools to enable developers to make greater use of linked data by integrating and transforming geospatial and statistical data. The project involves developing components for data integration, visualization, spatial and statistical querying, and workflow management. Feedback is being gathered to refine the design of these components and services.
Introduction to RAGLD
1. Rapid Assembly of Geo-Centred Linked Data Applications
Lucy Diamond, Research Scientist, Ordnance Survey
18/04/2012
2. About RAGLD
• A collaborative project between Ordnance Survey, the University of
Southampton and Seme4
• Part-funded by the Technology Strategy Board's “Harnessing Large
and Diverse Sources of Data” programme
• An 18-month project: started October 2011, due to complete March
2013
• Building tools to enable developers to make greater use of linked
data
3. RAGLD builds on
See UK (http://apps.seme4.com/see-uk/)
and the sameAs service (www.sameas.org)
4. Feedback wanted!
• The designs of RAGLD components and services are based on the
user requirements gathered from our questionnaire responses.
Summaries of these user requirements can be seen on www.ragld.com
• The purpose of this presentation is to run through the basic
descriptions for the components and services that will be built for
RAGLD, and to get feedback on their potential usefulness or
applicability for linked data projects and activities
5. Project Milestones
(On the original slide, Work Packages completed or well under way as of 08/05/2012 are shown in pink.)
• WP 1 – User requirements survey, development of
design principles, identification of data
sources, high-level architecture designs
• WP 2 – Data integration components and services
• WP 3 – Data-enabled components and services
• WP 4 – Development of a technical demonstrator
based on UK crime data analysis
• WP 5 – Engagement (stakeholder interviews and
design feedback), dissemination and
exploitation
6. RAGLD High Level Component Architecture
[Architecture diagram. Component labels shown: Accessing Data; SPARQL Endpoint; Resolvable URIs; Linked Data API; Normalisation Tools/Services; Identity Management Tools/Services; Infrastructure Services; Data Enhancement Tools/Services; Visualisation Tools/Services; Relationship Tools/Services; Data Sources; Aggregation; Interpolation; Spatial Operations; Orchestration; Mediation; Metrics]
All of the components are able to interface with each other through a common interface
specification, and can therefore be orchestrated by the infrastructure service to create
workflows to fulfil the use cases identified from the user requirements analysis.
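The idea of a common interface that lets an infrastructure service chain components into a workflow can be sketched as follows. This is only an illustration, not the RAGLD interface specification: the `Component` protocol, the `run_workflow` function and the two toy components are hypothetical names invented for this example.

```python
from typing import Any, Dict, List, Protocol

Record = Dict[str, Any]


class Component(Protocol):
    """Hypothetical common interface: every component maps records to records."""

    def process(self, records: List[Record]) -> List[Record]: ...


class NormaliseNames:
    """Toy normalisation component: trims and upper-cases the 'name' field."""

    def process(self, records: List[Record]) -> List[Record]:
        return [{**r, "name": r["name"].strip().upper()} for r in records]


class KeepWithCoordinates:
    """Toy filter component: keeps only records that carry lat/long values."""

    def process(self, records: List[Record]) -> List[Record]:
        return [r for r in records if "lat" in r and "long" in r]


def run_workflow(components: List[Component], records: List[Record]) -> List[Record]:
    """Orchestrate components by feeding each one's output into the next."""
    for component in components:
        records = component.process(records)
    return records
```

Because both toy components expose the same `process` method, `run_workflow([NormaliseNames(), KeepWithCoordinates()], data)` can run them in any order without knowing what each one does.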
8. Reconciliation Service
• Reconciliation service for spreadsheets/Google Refine to recognise
common codes/identifiers and translate them to the appropriate
Linked Data URIs
• In order to get into the Linked Data world, it is necessary to get from
strings that identify things roughly to URIs that identify things
properly.
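Getting from rough strings to proper URIs can be sketched as a normalise-then-look-up step. This is a minimal illustration, not the RAGLD reconciliation service: the lookup table, the `example.org` URIs and the function names are all hypothetical placeholders.

```python
import re
from typing import Dict, Optional

# Hypothetical lookup table from normalised codes to Linked Data URIs.
# The example.org URIs are placeholders, not real identifiers.
CODE_TO_URI: Dict[str, str] = {
    "SO160AS": "http://example.org/id/postcode/SO160AS",
    "HAMPSHIRE": "http://example.org/id/county/hampshire",
}


def normalise(value: str) -> str:
    """Remove whitespace and punctuation, then upper-case the string."""
    return re.sub(r"[^A-Za-z0-9]", "", value).upper()


def reconcile(value: str) -> Optional[str]:
    """Return the URI for a spreadsheet cell value, or None if unrecognised."""
    return CODE_TO_URI.get(normalise(value))
```

With this table, cell values such as `"so16 0as"` and `" Hampshire "` resolve to the same URIs as their tidy forms, while unknown strings return `None` so that they can be reviewed by hand.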
9. 48M URIs, 17M distinct
The Web of Data has many equivalent URIs. This sameAs service helps you
manage co-references between different data sets.
10. Relationship management services
• sameAs - Enter a known URI, get back a list of equivalent URIs
• differentFrom - This is a partner service to sameAs. When
anything is retracted from the sameAs service, it should be asserted
into the differentFrom service. Before asserting into the sameAs
service, the differentFrom service should be consulted, and a warning
or rejection given.
• Co-reference identification services - Link-finding and
co-reference, discovering relationships between datasets
• More generic relationships - Generalisation of the sameAs-type
service to store other kinds of relationship. Examples may include
Contains, Within, Touches, Part Of, Overlaps, Near
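The interplay between sameAs and differentFrom described above can be sketched with a small in-memory store. This is an assumption-laden illustration, not the sameas.org implementation: the `CoReferenceStore` class and its method names are hypothetical, and retraction is simplified so that the retracted URI simply leaves its bundle.

```python
from typing import Dict, FrozenSet, Set


class CoReferenceStore:
    """Toy in-memory store of sameAs bundles and differentFrom pairs."""

    def __init__(self) -> None:
        self._bundle: Dict[str, Set[str]] = {}        # URI -> shared bundle set
        self._different: Set[FrozenSet[str]] = set()  # unordered URI pairs

    def equivalents(self, uri: str) -> Set[str]:
        """Enter a known URI, get back a copy of its set of equivalent URIs."""
        return set(self._bundle.get(uri, {uri}))

    def assert_same(self, a: str, b: str) -> None:
        """Merge the bundles of a and b unless differentFrom forbids it."""
        left, right = self.equivalents(a), self.equivalents(b)
        # Consult differentFrom first and reject conflicting assertions.
        for x in left:
            for y in right:
                if frozenset((x, y)) in self._different:
                    raise ValueError(f"{x} is recorded as differentFrom {y}")
        merged = left | right
        for uri in merged:
            self._bundle[uri] = merged  # every member shares one set object

    def retract_same(self, a: str, b: str) -> None:
        """Retract a sameAs link and assert the pair into differentFrom."""
        bundle = self._bundle.get(a)
        if bundle is not None and b in bundle:
            bundle.discard(b)          # simplification: b leaves a's bundle
            self._bundle[b] = {b}
        self._different.add(frozenset((a, b)))
```

Raising an error is one possible reading of "a warning or rejection"; a service could equally log a warning and ask for confirmation.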
12. Spatial Query Services
• Bounding box containment - An index service which will allow
efficient queries to be made identifying coordinate points that reside
within a given bounding box. A more sophisticated version could
understand types of entities, e.g. “Find me the postcodes in this area” or “the
wards in this area”.
• Geometric queries - E.g. coordinates to wards. Queries relating
one type of geometry (e.g. point) to another (e.g. polygon): nearest
services, coordinate data points that lie within a given polygon,
intersection of two polygons, transect line
“Which areas are contained within / touch / overlap with another given area?”
“Is this point within any other spatial area?”
“What is within this area of interest that I have defined?”
“Can I generalise a larger area from these smaller?”
• Free text search - An auxiliary service that indexes the
store and then enables pure text searches to be done over the
external service.
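Two of the spatial queries above, bounding box containment and "is this point within this area?", can be sketched in a few lines. This is an illustration under stated assumptions rather than the RAGLD service: the function names are hypothetical, the bounding box query is a plain linear scan where a real service would use a spatial index, and the point-in-polygon test uses the standard ray casting technique, which does not treat points lying exactly on a boundary consistently.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y), e.g. easting/northing or long/lat


def in_bounding_box(p: Point, min_corner: Point, max_corner: Point) -> bool:
    """True if p lies inside or on the edge of the axis-aligned box."""
    return (min_corner[0] <= p[0] <= max_corner[0]
            and min_corner[1] <= p[1] <= max_corner[1])


def points_in_box(points: Sequence[Point], min_corner: Point,
                  max_corner: Point) -> List[Point]:
    """Linear scan; an index service would avoid testing every point."""
    return [p for p in points if in_bounding_box(p, min_corner, max_corner)]


def point_in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: toggle 'inside' at each edge crossed by a ray going right."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

A "coordinates to wards" query could then be built by testing a point against each ward polygon, ideally after a cheap bounding box check has ruled out most wards.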
13. Dataset transformation services
• Co-ordinate transformation - E.g. convert latitude/longitude to National Grid
• Statistical transformations - Transformations such as changing
units, and normalisations, e.g. by population or area for regions.
Aggregation and interpolation of region-based statistics
“I have a dataset expressed as X per Y but want it as X per Z”
“I have population for wards but want to find population by school
catchment area”
• Geography to Geography (one set of abstract areas into
another)
- Convert between different levels of geography e.g. “Give me all the
deaths in Hampshire if I know the deaths in all of the settlement
regions”.
- An enactment of statistical transformation operations.
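The statistical and geography-to-geography transformations above can be sketched as three small functions: normalisation by population, aggregation of small regions into a larger one, and interpolation from one set of areas onto another. This is an illustration only. The function names and data shapes are hypothetical, and the interpolation assumes that a statistic is spread evenly over each source region's area, which is a simplifying assumption and not something the project states.

```python
from typing import Dict


def per_capita(counts: Dict[str, float], population: Dict[str, float],
               per: float = 1000.0) -> Dict[str, float]:
    """Normalise raw counts to a rate per `per` people in each region."""
    return {region: per * counts[region] / population[region] for region in counts}


def aggregate(counts: Dict[str, float], parent_of: Dict[str, str]) -> Dict[str, float]:
    """Sum values of small regions into the larger region that contains each."""
    totals: Dict[str, float] = {}
    for region, value in counts.items():
        parent = parent_of[region]
        totals[parent] = totals.get(parent, 0.0) + value
    return totals


def interpolate_by_area(counts: Dict[str, float],
                        overlap: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Area-weighted interpolation onto target regions.

    overlap[source][target] is the fraction of the source region's area
    that lies inside the target region (assumed to be known in advance).
    """
    result: Dict[str, float] = {}
    for source, shares in overlap.items():
        for target, fraction in shares.items():
            result[target] = result.get(target, 0.0) + counts[source] * fraction
    return result
```

For the "population for wards to school catchment area" question, `counts` would hold ward populations and `overlap` the share of each ward falling in each catchment; for "deaths in Hampshire", `parent_of` would map each settlement region to Hampshire.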
14. Visualisation
• Map showing regions of a specific type (e.g. Ward, LSOA)
• Map showing coordinate points from a dataset as pins
• Map showing region-based statistics with appropriate colouring
• Area selection widget (in conjunction with geometric query)
• Graph-based visualisations of dataset features
• Pop-up “Info box” - give generic information about a point/area
• Display custom regions / boundaries on map
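Colouring a map of region-based statistics comes down to assigning each region's value to a colour class. The sketch below uses equal-interval classes, which is one common choice and an assumption here; the slides do not say which classification scheme the project intends. The `colour_classes` function and the palette values are hypothetical.

```python
from typing import Dict, List


def colour_classes(values: Dict[str, float], palette: List[str]) -> Dict[str, str]:
    """Assign each region a palette colour using equal-interval classes."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    n = len(palette)
    result: Dict[str, str] = {}
    for region, value in values.items():
        if span == 0:
            index = 0  # all values equal: use the first colour everywhere
        else:
            # Scale to [0, n] and clamp so the maximum falls in the last class.
            index = min(int((value - lo) / span * n), n - 1)
        result[region] = palette[index]
    return result
```

The resulting region-to-colour mapping can then be handed to whichever mapping library draws the region polygons.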
15. Workflow management
• Enactment engine
Workflow-style activity, using scripts that are manageable, editable and runnable at
will. Provides a means to extract data for the RAGLD services from other services
• Dashboard
(displays the components and services available, and invokes sequences of services)
This component gives the user a web view from which to observe and manage
the RAGLD installation: configuring components, where the data comes from and goes
to, which services should be used, trouble spots...
• File/resource management
(interacts with the Dashboard)
RAGLD will move data between services, stores and files, transforming it as it goes.
Therefore, the user of a RAGLD installation needs the system to keep track of where
the data is, data versions, and data relationships.
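Keeping track of where data is, which version it is, and what it was derived from can be sketched with a small registry. This is a hypothetical illustration, not the RAGLD design: the `Resource` and `ResourceRegistry` names, their fields and the example locations are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    location: str                      # file path, store name or service URL
    version: int = 1
    derived_from: List[str] = field(default_factory=list)


class ResourceRegistry:
    """Toy registry recording location, version and lineage of each dataset."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, name: str, location: str,
                 derived_from: Optional[List[str]] = None) -> Resource:
        """Record a resource; re-registering a name bumps its version."""
        previous = self._resources.get(name)
        version = previous.version + 1 if previous else 1
        resource = Resource(name, location, version, list(derived_from or []))
        self._resources[name] = resource
        return resource

    def lineage(self, name: str) -> List[str]:
        """Names of all resources `name` was derived from, nearest first."""
        seen: List[str] = []
        stack = list(self._resources[name].derived_from)
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.append(parent)
                if parent in self._resources:
                    stack.extend(self._resources[parent].derived_from)
        return seen
```

A dashboard could read such a registry to show where each dataset currently lives and which upstream data it depends on.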
16. RAGLD contact for further information
Mark Pendlington,
Project Leader, RAGLD
mark.pendlington@ordnancesurvey.co.uk
Research
Ordnance Survey
Adanac Drive
SOUTHAMPTON
United Kingdom
SO16 0AS
Phone: +44 (0)2380055771