1. Exploiting the UNSDI Spatial Identifier Reference Framework (SIRF) in Australia: Innovations and Policy Implications
Paul Box, Rob Atkinson, David Lemon & Laura Kostanski
CSIRO
Canberra, 20 - 22 November 2012
CSIRO DIGITAL PRODUCTIVITY AND SERVICES FLAGSHIP
2. Overview
• Why? – project drivers
• What? – the solution
• Where? – Indonesia and Australia
• How? – Innovations
• Where next? – Policy implications
2 | UNSDI Spatial Identifier Reference Framework | Paul Box
3. The problem
• Large scale complex interwoven challenges
• ‘Big data’ - the information tsunami
• ‘Glocalisation’
• Rapid information integration
• Highly spatially & temporally variable phenomena
[Chart: data volume in exabytes (1 EB = 1 billion GB): 130 in 2005, 1,227 in 2010, 7,910 in 2015. Source: EMC]
Over the next decade, the number of "files", or containers for information, will grow by 75x (source: EMC)
4. Social Protection in Indonesia
• Multi-sectoral information
• Reliable
• Up to date
• Timely
• Integrated
• Useable
Social Protection: preventing, managing, and overcoming situations that adversely affect people’s well-being [1]
- policies & programs to reduce poverty / vulnerability
- reducing exposure, enhancing capacity to manage risks
[1] United Nations Research Institute for Social Development
5. Integration realities - Information Silos
[Figure: information silos (System 1, 2, 3, 4, 5, 7 ... n) separated by cost markers ($), around the message "Everything Happens Somewhere". Time and effort are spent on each step: Discover, Access, Extract/Transform/Load, Understand, Use.]
6. Spatial identifiers describe ‘the Somewhere’
[Figure: geospatial information in the Spatial Data Infrastructure (SDI), a map of West Nusa Tenggara, is connected to statistical information (implicitly geospatial) through spatial identifiers, here "Bureau of Stats - 003".]

BPS-ID  Name                 GER ’08  Tpop ’10
003     Nusa Tenggara Barat  111.08   1,318,840
005     Nusa Tenggara Timur  112.09   335,805

• Fundamental component of spatial datasets
• Used to reference data
7. One real world feature - multiple representations
[Figure: one real-world feature, with the Spatial Identifier Reference Framework linking its geospatial representation in the Spatial Data Infrastructure (SDI), "Gazetteer ID - 002234", to its statistical representations (implicitly geospatial), such as "Bureau of Stats - 003".]

UNSTATS  Name  GRP ’08 $
IND03    NTB   8,080
IND05    NTT   4,769

BPS-ID  Name                 GER ’08  Tpop ’10
003     Nusa Tenggara Barat  111.08   1,318,840
005     Nusa Tenggara Timur  112.09   335,805

Multiple - names, identifiers, geometries, versions
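A minimal sketch of what a cross-walk between the two identifier sets on this slide could look like. The figures are copied from the slide tables; the pairing of BPS-ID 003 with UNSTATS IND03 (and 005 with IND05) is an assumption inferred from the matching province names, and the function and key names are hypothetical rather than part of SIRF.

```python
# Statistical tables keyed by two different spatial identifier sets.
# Values are taken from the slide; key names are illustrative.
BPS_STATS = {
    "003": {"name": "Nusa Tenggara Barat", "ger_2008": 111.08, "tpop_2010": 1_318_840},
    "005": {"name": "Nusa Tenggara Timur", "ger_2008": 112.09, "tpop_2010": 335_805},
}
UNSTATS = {
    "IND03": {"name": "NTB", "grp_2008": 8_080},
    "IND05": {"name": "NTT", "grp_2008": 4_769},
}

# Cross-walk from BPS-ID to UNSTATS identifier (assumed pairing, see lead-in).
CROSSWALK = {"003": "IND03", "005": "IND05"}


def integrate(bps, unstats, crosswalk):
    """Join the two tables through the cross-walk, keyed by BPS-ID."""
    merged = {}
    for bps_id, row in bps.items():
        un_id = crosswalk.get(bps_id)
        if un_id is None or un_id not in unstats:
            continue  # no known equivalent identifier, so nothing to join
        merged[bps_id] = {
            **row,
            "unstats_id": un_id,
            "grp_2008": unstats[un_id]["grp_2008"],
        }
    return merged


if __name__ == "__main__":
    for bps_id, row in integrate(BPS_STATS, UNSTATS, CROSSWALK).items():
        print(bps_id, row["name"], row["unstats_id"], row["grp_2008"])
```

Neither source table is modified: the join lives entirely in the cross-walk, which is the point of linking identifier sets rather than replacing them.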
8. Gazetteer – a special case of Spatial Identifier
GAZETTEER: ID, placename(s), feature type, location
• Official list of names
• Related to mapping process (toponymic)
• Used for map lookup
• Names are ambiguous
– One name – many places
– One place – many names
– Australia, Australie
– Wollongong, ‘the gong’
– Sydney, City of Sydney
[Figure: Melbourne – locality – Victorian Gazetteer – official]
[Figure: Melbourne – municipal council boundaries – official]
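The gazetteer record (ID, placename(s), feature type, location) and the ambiguity of names can be sketched as a small data structure. This is an illustration only: the entry IDs are hypothetical, the coordinates are approximate, and the second Melbourne is an assumed example of a same-named place elsewhere.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GazetteerEntry:
    entry_id: str        # hypothetical identifier
    names: tuple         # one place can carry many names
    feature_type: str
    location: tuple      # (latitude, longitude), approximate


ENTRIES = [
    GazetteerEntry("vic-0001", ("Melbourne",), "locality", (-37.81, 144.96)),
    GazetteerEntry("us-0001", ("Melbourne",), "locality", (28.08, -80.61)),
    GazetteerEntry("nsw-0001", ("Wollongong", "the gong"), "locality", (-34.42, 150.89)),
]


def build_name_index(entries):
    """Map each case-folded name to the IDs of every place that uses it."""
    index = defaultdict(list)
    for entry in entries:
        for name in entry.names:
            index[name.casefold()].append(entry.entry_id)
    return index


def lookup(index, name):
    """Return all entry IDs for a name; more than one means the name is ambiguous."""
    return list(index.get(name.casefold(), []))


if __name__ == "__main__":
    idx = build_name_index(ENTRIES)
    print(lookup(idx, "Melbourne"))  # one name, many places
    print(lookup(idx, "the gong"))   # a vernacular name for one place
```

A name lookup returns a list of candidates, never a single answer; only the identifier pins down which place is meant.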
9. Spatial Identifiers – an index for SDI
Foundation spatial data themes
• Addressing
• Administrative Boundaries
• Positioning
• Place Names
• Land parcel & Property
• Imagery
• Transport
• Water
• Elevation and Depth
• Land cover
Spatial identifier sets
• Post codes, locality names
• Admin area codes/names
• Trig points
• Gazetteer
• Plots and Parcels
• Image tile index
• Roads and Bridges
• AHGF (Geofabric) features
• Sensor networks
• Cell towers
10. Spatial identifiers
• Overlap and duplication
• Heterogeneity – everyone does them differently
• Fragile, unreliable
• Limited access mechanisms
• Used out of context
– Disconnected from underlying geospatial data
– Limited metadata provenance/authority
11. The solution
• Infrastructure to register, link, and deliver spatial identifiers
Spatial Identifier Reference Framework: fundamental, systemic improvement in information integration, enabling more effective and cost-efficient service delivery
[Figure: time and effort across the steps Discover, Access, Extract/Transform/Load, Understand, Use, annotated with: provide stable SI; link information resource; link multiple reps.]
• Leverage national SDI efforts
– Governance
– Information
– Technologies
12. SIRF – spatial identifiers for the geo-semantic web
[Figure: SIRF sits between statistical information held by agencies (Agency A Statistics, Agency B Treasury, Agency C Welfare, each exposed through an API) and spatial information held in regional and national Spatial Data Infrastructure, and reaches the user through the Linked Data Web.]
Steps shown in the figure:
• Harvest spatial identifiers from geospatial data sets in the SDI (standard data interfaces)
• Mint identifiers & cross walks
• Deliver as URI and build into the Linked Data Web
• Enable access: ‘Spatial Bookmarks’ downloaded & embedded in statistical information
• Users can connect back to underlying reference information
• Seamless integration of statistical and spatial information in the Linked Data Web
Example link:
http://id.data.gov.au/id/AusGaz2010/NSW56500
Same as
http://linkedgeodata.org/triplify/node26469586
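The "Same as" link on this slide can be treated as an edge between two identifiers, and a set of such edges groups every URI that names the same real-world feature. The sketch below uses a union-find structure to do that grouping; it is an illustration, not the SIRF implementation. The first two URIs come from the slide, and the third (on example.org) is a hypothetical extra identifier added to show transitivity.

```python
def cluster_same_as(links):
    """Group URIs into sets that are connected by 'same as' links."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


LINKS = [
    ("http://id.data.gov.au/id/AusGaz2010/NSW56500",
     "http://linkedgeodata.org/triplify/node26469586"),
    # Hypothetical third identifier for the same place.
    ("http://linkedgeodata.org/triplify/node26469586",
     "http://example.org/stats/place/1"),
]

if __name__ == "__main__":
    for cluster in cluster_same_as(LINKS):
        print(sorted(cluster))
```

Because the links are transitive, information referenced with any URI in a cluster can be brought together without any provider renaming its own identifiers.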
14. Where are we SIRFing?
In Indonesia
• Badan Informasi Geospasial (BIG)
• InaSDI - ESRI geoportal
• Harvesting
• Publishing into InaSDI portal
• OpenStreetMap
In Australia
• OSP, GA, CGNA, ANDS
• MyMaps Australia Gazetteer
Globally
• UNSDI
• UNGEGN
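Harvesting from a Web Feature Service can be sketched as building a paged GetFeature request. The example assumes the WFS 2.0.0 key-value parameters (`typeNames`, `count`, `startIndex`); the endpoint and feature type name are hypothetical, and the slides do not say which WFS version the project used.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getfeature_url(endpoint, type_name, count=100, start_index=0):
    """Build a WFS 2.0.0 GetFeature URL for one page of features."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "count": count,             # page size
        "startIndex": start_index,  # offset of the first feature in the page
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and feature type, for illustration only.
    url = build_getfeature_url("https://example.org/wfs", "admin:provinces", count=50)
    print(url)
    print(parse_qs(urlsplit(url).query)["typeNames"])
```

A harvester would step `start_index` forward by `count` until a page comes back with fewer features than requested.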
15. Innovation – the right information….
• Common reference – unambiguously reference a place using URI
• Granularity - moving from dataset to feature level
– Better discover, explore, understand then download/use the bits you need
16. Innovation – delivered in the right way
• Linked data
• ‘spatial bookmarks’ for the web
• Feature-level metadata interwoven with data - authority, licence
• The role of spatial identifiers to link information
• SI as index to underlying data in SDI
• Linking multiple representations of the same real world feature
• Linking information to locations across systems
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
17. Innovation – handling the social dimension
• Linking - not choosing a winner
• Registering and cross walking
• Evolution not revolution
• Providers – no need to change underlying business process/systems
• Users – continue to use preferred SI sets
• Convergence
• Building a social network graph of information resources
• FOAF graph of info resources and their usage
• The power of the crowd
18. Key enablers – Openness
• Open data
• ‘freemium’ for spatial identifiers
PLUS
• Links to underlying data
• With various price & licence models
• Open standards
• Open source software
• Open Government Indonesia
• Innovating
• Working with the crowd
• A legislative framework
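The ‘freemium’ idea, a free lightweight view of each feature plus a link back to the underlying data, can be sketched as a simple projection. Everything here is an assumption for illustration: the record layout, the field names, the example.org URL, and the use of the vertex average as a stand-in for a proper representative point.

```python
def representative_point(ring):
    """Average the vertices of an unclosed ring of (lon, lat) pairs.

    This is a crude stand-in for a true centroid, good enough for a sketch.
    """
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return (sum(lons) / len(lons), sum(lats) / len(lats))


def freemium_view(feature, source_url):
    """Expose only identity, name and a point, plus a link to the full record."""
    return {
        "id": feature["id"],
        "name": feature["name"],
        "point": representative_point(feature["geometry"]),
        "source": source_url,  # where the full (possibly licensed) data lives
    }


# Hypothetical full record held by a data provider.
FEATURE = {
    "id": "parcel-42",
    "name": "Example Parcel",
    "geometry": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "licence": "restricted",
}

if __name__ == "__main__":
    print(freemium_view(FEATURE, "https://example.org/data/parcels/42"))
```

The full geometry and licence terms stay with the provider; the open view only says that the feature exists, what it is called, roughly where it is, and where to get more.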
19. Implications for spatial policy and practice
• Openness
• Governance – to steer or row
• Policy (legal?) framework
• Community behaviour – tin hugging, information modelling & design for reuse
• Formal and informal (VGI data sources)
• Linked data - Spatial feature identifier governance
• Information custodianship and access
• From supply to demand driven information
• Stability and predictability - improved change management
• Engaging with the broader community
• Identity not geometry
‘Thinking outside the polygon’
20. Thank you
CSIRO Land and Water
Paul Box
USIRF Project Leader
t +61 2 9325 3122
m +61 406 256006
e paul.j.box@csiro.au
w www.csiro.au/gazetteer
DIGITAL PRODUCTIVITY AND SERVICES FLAGSHIP
Editor's Notes
Title of project changed. Project: CSIRO and AusAID working in Indonesia, but applying the work to Australia.
Why? What the project is about. How? - innovation. Enablers. Where? Implications for policy and practice.
Large scale complex interwoven challenges: cross-domain, multi-agency problem solving. There is a need to rapidly access and integrate near-real-time information to enable rapid response. UN Global Pulse, driven by the UN Secretary General, aims to leverage crowd-sourced information and ‘Big data’ to monitor and rapidly respond to the needs of vulnerable populations.
Glocalisation: accurate, up-to-date, locally relevant and locally produced information, scaled globally.
‘Everything happens somewhere’: it is said that 80% of government data has a spatial dimension. Geography is a key mechanism for integrating, analysing and interpreting information from different systems.
Information tsunami: there is an increasing need to work across scientific and other domain boundaries and to rapidly distil meaning from an increasingly overwhelming volume of information. With the emergence of crowd sourcing and outputs from Big data analysis, the challenge is enormous.
However, for any one real-world location there are multiple representations, identifiers and place names in use. One name, many places: place names are ambiguous, and one name may refer to multiple places.
Integration is inefficient: when trying to integrate information from multiple sources using geography, an enormous amount of time and effort is wasted in trying to find, access, extract, transform, load and understand data before it can be integrated with other data and used.
The UNSDI Gazetteer Framework project, funded by CSIRO and AusAID, is an attempt to improve the use of spatial identifiers (gazetteers) that are used to refer to places in information systems. The project focuses on Social Protection in Indonesia and is providing support to UN Global Pulse: it provides the spatial framework and improved approaches to delivery and integration of formal, government data. It is part of a global UN information infrastructure activity and is supporting national SDI efforts in Indonesia and Australia.
Integration is inefficient, time consuming & expensive. Everything happens somewhere; how do we refer to places? With spatial identifiers.
Turn to a special kind of identifier set: the gazetteer. One name, many places: Melbourne the locality, Melbourne the municipal council boundary, Melbournes in the US, Canada and Sri Lanka. One place, many names: endonym / exonym, official and vernacular variants.
Heterogeneity: different structures, location types used, delivery formats and management regimes. Used within system boundaries, not across them: internal IDs, not stable or reliable. Overlapping: same feature, different codes. Limited access mechanisms: download; now 3 web services (Australian context). Used out of context: disconnected from underlying features, aggregated, no provenance information, no usage.
Build mechanisms to register, link, and deliver spatial identifiers. Information infrastructure (II) is about social and technical infrastructure. Leverage national SDI efforts: governance, information, technologies.
The gazetteer framework provides the scalable geographic dimension to the Linked Data Web. It is DNS for ‘where’.
Users access:
- Register data source
- Harvest from WFS
- Model transformation – Solid Ground
- Connect back to underlying data set to access geometry for underlying feature
- Operational provenance – where it came from, but link to underlying geometry
Here’s the portal. But it's really about the services under the hood. Open search service.
Harvesting SI from ESRI WFS. Delivering SI back into the portal.
Common reference: unambiguously reference a place using a URI, so we know we are talking about the same place. Reference individual places, not the whole dataset.
Linked data: the project aims at supporting the geo-semantic web by developing a means to index, integrate, link and deliver spatial identifiers across national and global systems of systems. Essentially DNS for ‘where’: a persistent URI for each spatial identifier, delivered into the Linked Data Web, allows others to reliably reference, cite and link information to places.
Cloud computing: massively scalable storage, access and processing of global datasets.
Locally relevant, global data sets: improving creation and curation by data custodians in standardised ways that can be integrated at scale from local to global.
Crowd-sourcing: leveraging the power of the crowd, feeding back crowd-sourced information to the formal data custodian (government).
Open data and the freemium access model: open data in the context of heterogeneous business models. The aim is to open up closed data holdings using a freemium model. The gazetteer is the freemium view: free and open basic information about the existence and identity of spatial objects (name, ID and point for each feature). Provide links back to the underlying data source for each feature. This enables integration of information across a highly heterogeneous pricing and licensing landscape, drives business back to data providers and advertises their underlying data.
Open standards (ISO, OGC, W3C) for information content and technology: standardisation of delivery enables development of reusable tools that operate on gazetteer information.
Evolutionary approach: users continue using existing gazetteers (for now) and the framework links them together. Importantly, information referenced using different gazetteers can be integrated, as the framework maintains cross-walks between gazetteers. In the longer term, these cross-walks, and information about which gazetteers are being used to reference which statistical data, can be used to consolidate gazetteers.
Providers do not need to change underlying data structures / business systems. A web service on top of the data delivers gazetteer information in a standard way (structure and format). The gazetteer information delivered is a lightweight view of the underlying heterogeneous data, based on an agreed information model (structure and semantics).
Building an institutional infrastructure: we are developing an information infrastructure. This is as much a social as a technical undertaking. Solutions require a deep understanding of the institutional and governance realities of information communities at a variety of scales, and leveraging existing governance mechanisms. At the UN: working with UNSDI (40+ UN agencies that create and use spatial information), led by the UN Chief Information Technology Officer, an Assistant Secretary General. In Indonesia: partnering with BIG, the national mapping agency, which is leading national efforts to build an Indonesian SDI. In Australia: partnering with GA, the national mapping agency, and the Office for Spatial Policy.
Leveraging existing infrastructure, both technical and governance: harvesting from WFS (low entry barrier); register and cross-walk the SI sets in use. Evolution, not revolution.
Freemium = lightweight view. Open Government Indonesia is part of the Open Government Partnership, a multilateral initiative that aims to make governments better. In Indonesia, Open Government Indonesia focuses on pushing transparency, public participation, and innovation across government departments.