1. Why What How
The Why, What, and How of
Geo-Information Observatories
Krzysztof Janowicz
STKO Lab
University of California, Santa Barbara, USA
GeoRich 2014 Keynote, Snowbird, Utah, June 2014
The Why, What, and How of Geo-Information Observatories K. Janowicz
2. Why What How
Why is this interesting?
3. Why What How
Astronomical Observatories
The Griffith Observatory
Griffith donated funds and land to build the observatory to make astronomy accessible to
the public. This was in clear contrast to the prevailing idea of locating observatories on
remote mountaintops and restricting them to scientists. Today, our society is willing to invest
billions to study phenomena that may not even exist anymore (e.g., the Pillars of Creation).
4. Why What How
Astronomical Observatories
Observatories and Their Sensors
Whether on land or in space, observatories and their sensors serve
different purposes and are most useful when they work together.
5. Why What How
Astronomical Observatories
Spectral Signatures, Bands, and Remote Sensing
Spectral signatures are the combination of emitted, reflected or absorbed
electromagnetic radiation at varying wavelengths (bands) that uniquely
identify a feature type.
Spectral libraries, the idea of sharing spectral signatures, have
revolutionized remote sensing.
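To make the idea of a shared spectral library concrete, here is a minimal sketch of matching an observed spectrum against library entries by spectral angle (the angle between two band vectors). The four-band reflectance values and the class names are made up for illustration and are not from the talk; real libraries use many more bands.

```python
import math

def spectral_angle(a, b):
    """Angle (radians) between two band vectors; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp rounding noise
    return math.acos(cos)

def classify(spectrum, library):
    """Return the library entry whose signature has the smallest angle."""
    return min(library, key=lambda name: spectral_angle(spectrum, library[name]))

# Hypothetical library: reflectance in four bands (illustrative values only).
LIBRARY = {
    "vegetation": [0.05, 0.08, 0.04, 0.50],
    "water": [0.08, 0.06, 0.03, 0.01],
    "bare soil": [0.15, 0.20, 0.25, 0.30],
}

if __name__ == "__main__":
    print(classify([0.04, 0.07, 0.05, 0.45], LIBRARY))
```

The point of the sketch is the workflow: once signatures are shared, classifying a new observation reduces to a lookup against the library.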
6. Why What How
Astronomical Observatories
Astronomical Breakthrough: Hubble Deep Field
7. Why What How
Astronomical Observatories
Astronomical Breakthrough: Hubble Deep Field
The universe is (mostly) homogenous and isotropic.
We will do such an experiment in a few minutes.
8. Why What How
Observatories In Other Sciences
What do these observatories have in common? Why are they useful?
Physical proximity to the phenomenon, collaboration between observatories, tangibility.
Observatories beyond Astronomy
Ocean Observatories Initiative
Volcano observatories
Meteorological observatories
Geological observatories
9. Why What How
Towards Information Observatories
Web Science Trust: A web observatory is a system that gives public access to
some specific aspects of the WWW and provides the infrastructure and
visualization techniques to support monitoring, analysis, and experiments.
Web Science Trust wants to establish a network of observatories.
The information universe has entered a phase of exponential growth but its
foundations are still poorly understood.
We need observatories that are tangible (physical) installations; remember
Griffith’s will.
{Web, Data, Information, Knowledge, Virtual Earth} Observatory?
10. Why What How
Towards Information Observatories
How Does This Differ From the Digital Earth and CyberGIS?
The Digital Earth is a data archive to access
and visualize data layers on a digital globe.
CyberGIS is mostly concerned with creating
online workbenches for scientists to ease
storing data and running complex spatial
analyses in the cloud.
Recall Griffith’s vision of making observatories
available to the public, not just scientists.
A way to handle some common sampling bias
and quality arguments.
Most examples will relate the information
universe back to the physical universe.
However, it is important to note that the
information universe can also be studied in
its own right.
11. Why What How
Towards Information Observatories
Essentially, all models are wrong, but
some are useful. (George E. P. Box)
What we know is an artifact of the
technical infrastructure we use (e.g.,
sensors) and the models we develop.
The physical universe is governed by
physical laws, constants, elementary
particles, and so forth.
What about the information universe?
Are there laws of information?
Complex sociotechnical interactions.
Physical-Cyber-Social systems (cf.
Sheth 2013).
12. Why What How
What would we observe?
13. Why What How
Is the Information Universe Homogenous and Isotropic?
Spatial Distribution of Data on the Social Web
14. Why What How
Is the Information Universe Homogenous and Isotropic?
Spatial Distribution of Data on the Social Web
In terms of geospatial distribution the Social (media) Web is neither
homogenous nor isotropic.
15. Why What How
Is the Information Universe Homogenous and Isotropic?
The Idealized Linked Data Cloud
A highly popular visualization of the Linked Data Cloud by Cyganiak and Jentzsch
from Sept 2011. Is the LOD Cloud homogenous, isotropic?
16. Why What How
Is the Information Universe Homogenous and Isotropic?
A Linear Cluster Map Of The LOD Cloud
Credit: Gueret, Schlobach, Wang, Groth, van Harmelen (2011)
In terms of link structure, the Linked Data web is neither homogenous
nor isotropic.
17. Why What How
Are there Laws of the Information Universe?
A Law Of The Information Universe?
Terminological knowledge is orders of magnitude smaller than factual
knowledge. (cf. van Harmelen, ISWC 2011)
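One way to make this claim measurable is to count schema-level (terminological) versus instance-level (factual) statements in a set of RDF-style triples. The sketch below assumes a simple rule that is not from the talk: a triple counts as terminological if its predicate is one of a few RDFS/OWL schema predicates, or if it declares a class or property. The sample triples and the `ex:` names are made up.

```python
# Assumed rule: these predicates mark schema-level (terminological) statements.
SCHEMA_PREDICATES = {
    "rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range",
    "owl:equivalentClass", "owl:disjointWith",
}
# Assumed rule: typing something as one of these is a schema declaration.
SCHEMA_TYPES = {
    "rdfs:Class", "owl:Class", "rdf:Property",
    "owl:ObjectProperty", "owl:DatatypeProperty",
}

def is_terminological(triple):
    """True if the (subject, predicate, object) triple is schema-level."""
    _, predicate, obj = triple
    if predicate in SCHEMA_PREDICATES:
        return True
    return predicate == "rdf:type" and obj in SCHEMA_TYPES

def count_knowledge(triples):
    """Count terminological versus factual triples."""
    terminological = sum(1 for t in triples if is_terminological(t))
    return {"terminological": terminological,
            "factual": len(triples) - terminological}

# Hypothetical sample data for illustration only.
SAMPLE = [
    ("ex:City", "rdf:type", "owl:Class"),
    ("ex:City", "rdfs:subClassOf", "ex:PopulatedPlace"),
    ("ex:SantaBarbara", "rdf:type", "ex:City"),
    ("ex:SantaBarbara", "ex:locatedIn", "ex:California"),
    ("ex:Goleta", "rdf:type", "ex:City"),
    ("ex:Goleta", "ex:near", "ex:SantaBarbara"),
]

if __name__ == "__main__":
    print(count_knowledge(SAMPLE))
```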
18. Why What How
Are there Laws of the Information Universe?
What are the "Elementary Particles", "Constants" and "Laws"
Governing the Information Universe?
Interestingly, the power law applies to terminological and factual knowledge.
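A rough way to check for power-law behavior in observed counts (for example, how often each class or property is used) is to fit a straight line to the rank-frequency data on log-log axes. The sketch below is a plain least-squares fit, which is a crude estimator and an assumption here, not the method behind the slide; the input counts are hypothetical and must be positive.

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    A slope near -a suggests count ~ rank^(-a). This is only a rough
    diagnostic, not a rigorous power-law test.
    """
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

if __name__ == "__main__":
    # Synthetic counts that follow count = 1000 / rank.
    synthetic = [1000.0 / rank for rank in range(1, 51)]
    print(round(loglog_slope(synthetic), 3))
```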
19. Why What How
Early Geo-Information Observatories
The Urban Observatory
‘Urban Observatory – a live museum with a data pulse.’ (urbanobservatory.org)
20. Why What How
Early Geo-Information Observatories
POI Pulse: Point Of Interest Information Observatory
Analyze (zoom, change time, select categories, etc.) the pulse of a city via its
Points of Interest and user behavior on social media (http://poipulse.com/).
21. Why What How
Early Geo-Information Observatories
POI Pulse: Point Of Interest Information Observatory
Theory-driven upper-level categories and default behavior based on semantic signatures.
22. Why What How
Early Geo-Information Observatories
POI Pulse: Point Of Interest Information Observatory
User interaction and fine-grained, data-driven categorization.
23. Why What How
Early Geo-Information Observatories
POI Pulse: Point Of Interest Information Observatory
Burst mode adds real-time data; tweets [red circles] and Foursquare check-ins.
24. Why What How
Early Geo-Information Observatories
Frankenplace
Credit: Adams & McKenzie (2012)
Frankenplace and thematic signatures support the study of the
geo-indicativeness of text and sense of place.
Note how POI Pulse and Frankenplace allow for observational and
experimental research.
25. Why What How
How could we do this?
26. Why What How
Challenges for Information Observatories
Where Are The Information Observatories?
Prototypes Aside, Where Are The Information Observatories?
Well, it’s a difficult task
Data Publishing
Data Retrieval
Data Synthesis
Data Reuse
Sensemaking
Semantic Web technologies and ontologies aim at exactly those
challenges and we are beginning to see their wide-scale adoption.
However, we need to work on approaches that combine data-driven
and theory-driven techniques.
27. Why What How
Challenges for Information Observatories
The Data Retrieval Problem Is Real
Even the major data hubs such as Data.gov still rely on keyword-based search
and suffer from unreliable, incomplete, or missing metadata. For this type of retrieval
problem, even a little semantics goes a long way (Hendler 1997).
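A small sketch of what "a little semantics" can buy over plain keyword matching: the query term is expanded with narrower terms from a tiny taxonomy before searching dataset titles. The taxonomy, the dataset titles, and the function names are all hypothetical; this is not how Data.gov or any other portal works.

```python
# Hypothetical mini-taxonomy: broader term -> narrower terms.
NARROWER = {
    "water body": ["river", "lake", "stream"],
    "river": ["creek"],
}

def expand(term):
    """Return the term plus all transitively narrower terms."""
    seen, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(NARROWER.get(current, []))
    return seen

def keyword_search(query, datasets):
    """Plain substring match of the query against dataset titles."""
    q = query.lower()
    return [d["id"] for d in datasets if q in d["title"].lower()]

def semantic_search(query, datasets):
    """Substring match of the query or any narrower term."""
    terms = expand(query.lower())
    return [d["id"] for d in datasets
            if any(t in d["title"].lower() for t in terms)]

# Made-up catalog entries for illustration.
DATASETS = [
    {"id": 1, "title": "River gauge readings 2013"},
    {"id": 2, "title": "Lake surface temperature"},
    {"id": 3, "title": "Urban tree inventory"},
    {"id": 4, "title": "Creek restoration sites"},
]

if __name__ == "__main__":
    print(keyword_search("water body", DATASETS))
    print(semantic_search("water body", DATASETS))
```

With these made-up titles, the keyword search for "water body" finds nothing, while the expanded search retrieves the river, lake, and creek datasets.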
28. Why What How
Challenges for Information Observatories
Sensemaking is Difficult – Fitness for Purpose is Key
There is no shortage of data, but
finding data that is fit for a certain
purpose is difficult.
Data as statements (think RDF) not
as truth.
Heterogeneity is caused by cultural
differences, progress in science,
viewpoints, granularity, etc.
Alchemist Fallacy [1]; semantics
does not come for free.
Lack of provenance information
Sensemaking requires more
powerful semantic technologies and
ontologies (compared to IR).
[1] You cannot transmute base metals into gold and even if you could, gold would not be precious anymore.
29. Why What How
Challenges for Information Observatories
Meaningful Analysis and Synthesis is Difficult
Ensuring that data is analyzed and
combined in a meaningful way is far
from trivial.
What if the information on how to
use the data came together
with the data?
Focus on smart data instead of
(merely on) smart applications.
The purpose of ontologies is not to
agree on the meaning of terms but to
make the data provider’s intended
meaning explicit.
A little experiment: The statement "all rivers flow into other water bodies"
is not useful because it is "true" [2], but because...?
[2] It is not; rivers can flow into the ground or just dry up entirely before reaching another water body.
30. Why What How
Semantic Signatures
So What Are These Semantic Signatures?
Semantic signatures are an analogy to spectral signatures used
in remote sensing
Combine numerical and statistical models and data with ontologies
to derive local primitives (reifications)
Multiple spectral bands → multiple semantic bands
A shared semantic signatures library will hopefully have the same
impact that spectral signatures had on remote sensing.
31. Why What How
Semantic Signatures
Semantic Signatures In POI Pulse
Semantic Signature
12 geospatial bands
based on geographic location
ANND (1)
Ripley’s K Bins (10)
J Measure (1)
168 temporal bands
based on geo-social check-ins
24 Hours
7 Days
60 thematic bands
based on venue tips and reviews
LDA topics
Makes use of data
heterogeneity, social machines
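The band counts on this slide suggest a simple container: 12 geospatial values (1 ANND + 10 Ripley's K bins + 1 J measure), 168 temporal values (24 hours × 7 days), and 60 thematic values, for 240 bands in total. The sketch below only checks those sizes and concatenates the bands; the class and method names are hypothetical, and how POI Pulse actually stores signatures is not stated in the slides.

```python
from dataclasses import dataclass
from typing import List

# Band sizes taken from the slide.
BAND_SIZES = {"geospatial": 12, "temporal": 168, "thematic": 60}

@dataclass
class SemanticSignature:
    """Hypothetical container for one place type's semantic signature."""
    geospatial: List[float]  # ANND (1) + Ripley's K bins (10) + J measure (1)
    temporal: List[float]    # 24 hours x 7 days of check-in activity
    thematic: List[float]    # LDA topic weights

    def __post_init__(self):
        for name, size in BAND_SIZES.items():
            actual = len(getattr(self, name))
            if actual != size:
                raise ValueError(f"{name} needs {size} bands, got {actual}")

    def as_vector(self) -> List[float]:
        """Concatenate all bands into one 240-dimensional vector."""
        return self.geospatial + self.temporal + self.thematic

if __name__ == "__main__":
    sig = SemanticSignature([0.0] * 12, [0.0] * 168, [0.0] * 60)
    print(len(sig.as_vector()))
```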
32. Why What How
Semantic Signatures
Semantic Signatures Example: Thematic Bands
A thematic band can be
computed out of unstructured
text using latent Dirichlet
allocation (LDA); data source
Wikipedia and travel blogs.
Non-georeferenced plain text is
often still geo-indicative
Different types (taken from
DBpedia) of geographic
features have different,
diagnostic topics associated with
them (out of 500 topics)
33. Why What How
Semantic Signatures
Semantic Signatures Example: Thematic Bands
City topics: 204>450>104>282>267>497>443>484>277>97>...
Town topics: 425>450>419>367>104>429>266>69>204>308>...
Mountain topics: 27>110>5>172>208>459>232>398>453>183>...
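Rankings like the ones above could be produced by averaging the per-document topic weights for each feature type and sorting topics by that average. The sketch below assumes the topic distributions have already been computed (for instance by LDA) and uses made-up weights; it is an illustration of the ranking step only, not the procedure used for the slide.

```python
from collections import defaultdict

def topic_ranking(docs, top_n=3):
    """Rank topics per feature type by mean weight.

    docs: iterable of (feature_type, {topic_id: weight}) pairs.
    Returns {feature_type: [topic_id, ...]} in descending order of mean weight.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for feature_type, distribution in docs:
        counts[feature_type] += 1
        for topic, weight in distribution.items():
            sums[feature_type][topic] += weight
    ranking = {}
    for feature_type, totals in sums.items():
        means = {t: s / counts[feature_type] for t, s in totals.items()}
        # Ties are broken by topic id to keep the output deterministic.
        ranking[feature_type] = sorted(means, key=lambda t: (-means[t], t))[:top_n]
    return ranking

def format_ranking(topics):
    """Render a ranking in the slide's 'a>b>c' notation."""
    return ">".join(str(t) for t in topics)

# Made-up topic weights for illustration only.
DOCS = [
    ("City", {204: 0.5, 450: 0.3, 104: 0.2}),
    ("City", {204: 0.4, 450: 0.4, 104: 0.2}),
    ("Mountain", {27: 0.6, 110: 0.3, 204: 0.1}),
]

if __name__ == "__main__":
    for feature_type, topics in topic_ranking(DOCS).items():
        print(feature_type, "topics:", format_ranking(topics))
```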
34. Why What How
Semantic Signatures
The IARPA Finder Challenge
Finder is like facial recognition for backgrounds ;-)
Estimate the location of pictures and videos without any explicit
geolocation information.
35. Why What How
Semantic Signatures
The IM2GPS System
‘Estimating geographic information from a single image’
‘Purely data-driven scene matching’ (low-level features)
Big Data Check
Volume: 6 million (out of 6 billion) Flickr photos
Velocity: in theory, new pictures every second
Variety: single type of data
36. Why What How
Semantic Signatures
Our DiaLoc System: Exploiting Heterogeneity
Key Idea: Exploit the geo-indicativeness of thematic bands.
‘market food street narrow dense populated asia economy air conditioning smog
fog humid warm building construction skyscrapers skyline shipping export
channel harbor transportation tram city advertisement’
Variety: Plain text, not image features as data source
37. Why What How
Semantic Signatures
Estimation of Location And Type
[Figure: estimated feature-type scores (axis 0 to 0.5) for Cape Norman and Santa Barbara over the types City, Lake, Valley, Mountain, HistoricPlace, Town, WorldHeritageSite, ProtectedArea, Village, Cave, Island, Museum, Stream, Park, Theatre, Lighthouse, Stadium, Hotel, Restaurant, Airport, Hospital]
Volume: > 500,000 Wikipedia articles & travel blog entries.
Velocity: in theory, new travel blog entries every minute
IM2GPS and DiaLoc each exclude 99.9% of the Earth's land surface.
What if we combine them?
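One simple way to combine two location estimators is to multiply their per-cell scores over a shared grid and renormalize, so that only cells both systems consider plausible survive. The sketch below assumes the two estimators err independently, which is a strong assumption and not a claim from the talk; the cell ids and scores are hypothetical.

```python
def combine(scores_a, scores_b):
    """Multiply per-cell scores from two estimators and renormalize.

    Cells missing from either input are treated as excluded.
    Assumes the two estimators are (conditionally) independent.
    """
    shared = scores_a.keys() & scores_b.keys()
    joint = {cell: scores_a[cell] * scores_b[cell] for cell in shared}
    total = sum(joint.values())
    if total == 0:
        return {}
    return {cell: value / total for cell, value in joint.items()}

if __name__ == "__main__":
    # Hypothetical scores: an image-based and a text-based estimator.
    image_based = {"cell_A": 0.5, "cell_B": 0.5}
    text_based = {"cell_B": 0.25, "cell_C": 0.75}
    print(combine(image_based, text_based))
```

Under the same independence assumption, two estimators that each keep 0.1% of the land surface would jointly keep roughly 0.001 × 0.001, or one millionth, of it; in practice the estimators' candidate regions are likely correlated, so the real gain would be smaller.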
38. Why What How
Semantic Signatures
Thematic Semantic Signatures for DBpedia Classes
39. Why What How
Semantic Signatures
Geolocation APIs – Mapping Space to Place
Geolocation APIs map geographic coordinates, e.g., from a user’s
smartphone, to an ordered set of nearby candidate POIs.
These services typically return the n nearest POIs within a certain radius and
use spatial distance to the provided coordinates to determine their order.
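The purely spatial ranking described above can be sketched as follows: compute the great-circle distance from the query coordinates to each POI, keep those within the radius, and return the n nearest. The POI record layout, the default radius, and the function names are assumptions for illustration and do not correspond to any particular service's API.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_pois(lat, lon, pois, radius_m=200.0, n=5):
    """Return names of up to n POIs within radius_m, nearest first."""
    hits = []
    for poi in pois:
        distance = haversine_m(lat, lon, poi["lat"], poi["lon"])
        if distance <= radius_m:
            hits.append((distance, poi["name"]))
    hits.sort()
    return [name for _, name in hits[:n]]

# Made-up POIs for illustration.
POIS = [
    {"name": "Museum", "lat": 34.4300, "lon": -119.7000},
    {"name": "Restaurant", "lat": 34.4205, "lon": -119.7000},
    {"name": "Cafe", "lat": 34.4201, "lon": -119.7000},
]

if __name__ == "__main__":
    print(nearest_pois(34.4200, -119.7000, POIS))
```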
40. Why What How
Semantic Signatures
Temporal Signatures: Combined Day + Hour Band for POI
When you are is what you are
Places can be semantically annotated based on geo-social check-ins.
Primitives: weekday vs. weekend, evening vs. morning, etc.
Sometimes day or hour bands alone are not indicative (e.g., university) but
jointly form a signature.
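A combined day + hour band can be sketched as a 168-bin histogram of check-in times, one bin per hour of the week, normalized to sum to 1. The indexing convention (Monday = 0, index = weekday × 24 + hour) and the function name are assumptions for illustration; the check-in timestamps are made up.

```python
from datetime import datetime

HOURS_PER_WEEK = 24 * 7  # 168 temporal bands

def temporal_signature(checkins):
    """Normalized hour-of-week histogram of check-in timestamps.

    Index = weekday * 24 + hour, with Monday as weekday 0.
    Returns all zeros if there are no check-ins.
    """
    bins = [0.0] * HOURS_PER_WEEK
    for timestamp in checkins:
        bins[timestamp.weekday() * 24 + timestamp.hour] += 1
    total = sum(bins)
    if total == 0:
        return bins
    return [count / total for count in bins]

if __name__ == "__main__":
    # Made-up check-ins: two on a Monday evening, one on a Saturday morning.
    sample = [
        datetime(2014, 6, 2, 19, 5),
        datetime(2014, 6, 2, 19, 40),
        datetime(2014, 6, 7, 9, 0),
    ]
    signature = temporal_signature(sample)
    print(len(signature), max(signature))
```

Keeping day and hour in one joint index is what lets a type such as a university, which is unremarkable in either band alone, still show a distinctive weekday-daytime pattern.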
41. Why What How
Semantic Signatures
Distort the POI Locations Based on Temporal Signatures
The likelihood of visiting a coffee shop, university, bakery, etc. at 7 pm is
rather low, while it is a peak hour for restaurants.
Modify the purely spatial ranking by pulling and pushing places based on
their check-in probability.
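The pulling and pushing can be sketched as rescaling each candidate's distance by how unlikely a check-in is at the current hour. The formula below, effective distance = d × (1 + alpha × (1 − p / p_max)), is a hypothetical choice made for illustration and not the one used in the talk; with alpha = 0 it reduces to the purely spatial ranking.

```python
def rerank(candidates, alpha=1.0):
    """Order candidate places by temporally adjusted distance.

    candidates: list of (name, distance_m, checkin_probability) tuples,
    where the probability refers to the current hour.
    Places with low check-in probability are pushed away by up to a
    factor of (1 + alpha); the most likely place keeps its distance.
    """
    p_max = max((p for _, _, p in candidates), default=0.0)

    def effective_distance(distance, probability):
        if p_max == 0:
            return distance  # no temporal evidence: fall back to space only
        return distance * (1.0 + alpha * (1.0 - probability / p_max))

    ordered = sorted(candidates, key=lambda c: effective_distance(c[1], c[2]))
    return [name for name, _, _ in ordered]

if __name__ == "__main__":
    # Made-up values for 7 pm: the coffee shop is closer but unlikely.
    at_7pm = [("Coffee shop", 20.0, 0.01), ("Restaurant", 30.0, 0.10)]
    print(rerank(at_7pm, alpha=0.0))
    print(rerank(at_7pm, alpha=1.0))
```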
42. Why What How
Semantic Signatures
Spatial-Semantic Bands and Signatures
POIs plotted by similarity to bar and post office in OSM data, London, UK
Local Reifications (Primitives): e.g., Uniform and Clumped
Bars (and similar features) tend to clump together
Post Offices (and similar features) are rather uniformly distributed
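A standard way to tell clumped from uniform point patterns is the average nearest-neighbor distance compared with its expectation under complete spatial randomness (the Clark-Evans ratio): values well below 1 indicate clumping, values above 1 indicate a regular, dispersed pattern. The sketch below assumes planar coordinates and ignores edge effects; whether POI Pulse computes its ANND band this way is not stated in the slides.

```python
import math

def average_nn_distance(points):
    """Mean distance from each point to its nearest neighbor (brute force)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        total += min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
    return total / len(points)

def nearest_neighbor_ratio(points, area):
    """Clark-Evans ratio: observed ANND divided by the expected value
    0.5 / sqrt(density) for a random pattern of the same density."""
    density = len(points) / area
    expected = 0.5 / math.sqrt(density)
    return average_nn_distance(points) / expected

if __name__ == "__main__":
    # Synthetic patterns in a 10 x 10 study area.
    regular = [(x, y) for x in (1, 3, 5, 7, 9) for y in (1, 3, 5, 7, 9)]
    clumped = [(5 + 0.01 * i, 5 + 0.01 * j) for i in range(5) for j in range(5)]
    print(nearest_neighbor_ratio(regular, 100.0))
    print(nearest_neighbor_ratio(clumped, 100.0))
```

In this synthetic setup the regular grid plays the role of the post offices and the tight cluster plays the role of the bars.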
43. Why What How
Semantic Signatures
Spatial-Semantic Bands and Signatures
Where you are is what you are
Dzero measures the likelihood of features of a certain type to co-occur
within a specific semantic and spatial range.
User support: generate recommendations, and clean up data based on
type likelihood. ‘How likely is a post office directly next to an existing one?’
44. Why What How
Backup Slides
When Do You Need Semantics?
45. Why What How
Backup Slides
Observation-Driven Ontology Engineering