SlideShare una empresa de Scribd logo
1 de 52
In search of lost knowledge: joining
the dots with Linked Data
Jon Blower

j.d.blower@reading.ac.uk
Department of Meteorology
Reading e-Science Centre
National Centre for Earth Observation
University of Reading
With thanks to all CHARMe project partners!
instrument
algorithm

platform

dataset
scientists

publication
How do you fill in the blanks?
• Hope the documentation contains pointers
• … to the right versions of things
• … and using consistent terminology
• In restricted communities, we can gather all
information into a central location and
harmonize it
• OR we hope that a Google search throws up the
right results
Problems
1. Crossing communities is very hard
www.fooddeserts.org
Problems
1. Crossing communities is very hard
2. Lots of useful stuff isn’t formally documented
Lots of useful negative results, which aren’t published
Nearly 1/3 of positive results are false

1000 hypotheses

100 are true

900 are false
Test accuracy 95%

95 true
positives

5 false
negatives

45 false
positives

855 true
negatives
Problems
1. Crossing communities is very hard
2. Lots of useful stuff isn’t formally documented
3. Unknown unknowns…
“There is a spurious decline in low-cloud fraction in
the ISCCP cloud database due to the viewing angle.
It’s in the literature, but you might not find it if
you’re not specifically looking for it. For example if
you’re using the data for model validation you may
not spot the problem.”
(Claire Barber, paraphrased)
Consequences
• We use the data that are most easily
available, not necessarily the best
• Constant re-discovery of the same issues
• Very hard to share information outside
communities
The “Web of Data”
instrument
algorithm

platform

dataset
scientists

publication
The Web is the
“Web of Documents”
What is being cited?
Why?
Fingers crossed!
• We hope that everything has a document
written about it, or we have nothing to cite
• … and we hope someone writes a new
document when “it” changes
• We hope that we are citing the right
document (i.e. the most authoritative etc)
“If … the Web made all the
online documents look like
one huge book, [the

Sir Tim Berners-Lee

Semantic Web] will
make all the data in the
world look like one huge
database.”

Photo by Susan Lesch, from Wikipedia
Why is the Semantic Web different?
• Focuses on things, not documents-aboutthings
• Links things together in a meaningful way
• Information is readable by computers
• Here’s how it works…
1. Give things unique identifiers
Must be globally unique and persistent
http://www.reading.ac.uk/people/jon.blower
might represent me

http://nasa.gov/instruments/MODIS
might represent the MODIS instrument
–(these are illustrative, not accurate!)
2. Express relationships between
things
The key concept is the triple
subject predicate object
E.g. “MODIS is carried by the Terra satellite”

all three things need to be unique identifiers
(we don’t use plain English!)
Examples of realistic triples
(identifiers are illustrative, not necessarily correct!)

http://www.reading.ac.uk/people/jon.blower
http://xmlns.com/foaf/0.1/knows
http://www.durham.ac.uk/staff/ed.llewellin
http://dx.doi.org/12345.678910
http://purl.org/spar/cito/citesAsDataSource
http://www.badc.ac.uk/datasets/sst
Aside: how to use the Semantic Web
to ruin jokes
“A hundred kilopascals go into a bar”
“100000”^^unit:Pascal
owl:sameAs
“1”^^unit:Bar
(Using OWL and QUDT ontologies)
3. Publish your triples
• Can be as simple as putting a document in the
right format on your website
– (triples can be expressed in many formats)

• … or as complex as publishing a multi-billiontriple database
– E.g. Ordnance Survey

• Either way, your data become part of the
Web, not just on the web
4. Profit!
• Everyone can publish their own “view of the world”
– Don’t need a single data structure “to rule them all”

• Use common identifiers and common vocabularies to link between
communities
• Different vocabularies can be mapped to each other
– E.g. map CF standard names to GRIB codes

• Gives a means to record provenance of information
– “Direct line of sight from decisions to data” - NASA

• Reasoning engines can traverse the graph of links and discover new
information
Example: SemantEco

Wildlife observations
Ecosystem impacts
Administrative areas
Pollution data and policies

“Find me all the sites where chloride pollution
exceeds the local policy limits”
“Show me the species over time in this region”
So what’s “Linked Data” then?
• The Web is the “web of documents”
• The Semantic Web is the “web of data”
• “Linked Data” is really a set of principles for
using the Semantic Web
– http://www.w3.org/DesignIssues/LinkedData.html

• (Many people use “Linked Data” and
“Semantic Web” somewhat interchangeably)
carries

produced

authored

produced

describes /
uses /
queries …

The “Web of Data”
Linking Open Data cloud diagram, by Richard
Cyganiak and Anja Jentzsch. http://lod-cloud.net/
CHARMe:

Sharing climate knowledge through
Linked Data
(Jan 2013 – Dec 2014)
From observations to decisions
Observations

Satellites

(obs. sometimes
used for model
initialization)

Climate
Record
Creation
and
Curation

Climate Data
Records

Reports
Analysis
and
applications

Decision
making

Decisions,
Policies

Model results
Climate
Modelling

(Adapted from Dowell et al., 2013),
“Strategy Towards an Architecture for
Climate Monitoring from Space”)
Where can users go for help?
• Scientific literature

– Huge, verbose and inaccessible to some communities
– Not well linked to source data

• Technical reports and conference proceedings
– Hard to find, scattered or inaccessible

• Data centres

– increasingly strong at providing some important metadata, but don’t
usually include community feedback
– Not all countries and communities have data centres!

• Websites and blogs

– From CEOS Handbook to a scientist’s blog
– Increasingly useful, but scattered
How can climate data users decide whether a
dataset is fit for their purpose?
(N.B. We consider that “data quality” and “fitness for purpose”
are the same thing)
Not specific to climate data!
“Commentary metadata”

37
Examples of commentary metadata
• Post-fact annotations, e.g. citations, ad-hoc comments and notes;
• Results of assessments, e.g. validation campaigns,
intercomparisons with models or other observations, reanalysis;
• Provenance, e.g. dependencies on other datasets, processing
algorithms and chain, data source;
• Properties of data distribution, e.g. data policy and licensing,
timeliness (is the data delivered in real time?), reliability;
• External events that may affect the data, e.g. volcanic eruptions, ElNino index, satellite or instrument failure, operational changes to
the orbit calculations.
General rule: information originates from users or external entities,
not original data providers
– However, sometimes information is not available from the data
provider!
How will this be done?
• CHARMe will create
connected repositories of
commentary information

Data provider
website

rd
33rdparty
party
system
system

– Stored as triples in
“CHARMe nodes”,

• Information can be read
and entered through
websites or Web Services
• Using principles of Open
Linked Data and the
Semantic Web

CHARMe
node

CHARMe
node

CHARMe
node
Open Annotation
•

We are using Open Annotation for
recording commentary

– World Wide Web Consortium standard

•

Associates a body with a target

•

E.g. a publication could be the body,
an EO dataset could be the target

•

Can record the motivation behind an
annotation
– Bookmarking, classifying, commenting,
describing, editing, highlighting,
questioning, replying… (lots more)
– Covers a lot of CHARMe use cases!

•

An annotation can have multiple
targets
– A key requirement from users
Where does CHARMe fit in?
Observations
Satellites

(obs. sometimes
used for model
initialization)

Climate
Record
Creation
and
Curation

Climate
Modelling
(e.g. CMIP5)

Climate Data
Records

Reports
Analysis
Analysis
and
and
applications
applications

Decision
making

Decisions,
Policies

Model results

Supports analysts and scientists in
production of information for
decision-makers
www.fooddeserts.org
Challenges
• Ensuring community adoption:

– We are “injecting” CHARMe capabilities into existing
websites used in the community
– We will “seed” the CHARMe system with information
to attract users (e.g. links between publications and
datasets)

• Ensuring quality of commentary metadata

– Moderation is a strong user requirement
– We will provide guides to creators of commentary –
what makes a comment helpful?
What CHARMe will enable
(some examples)

Users:
- “Find me all the documents that have been written about this dataset”
- “… in both peer-reviewed journals and the grey literature”
- “… and specifically about precipitation in Africa”
- “… in both NEODC’s and Astrium’s archives”
- “What factors might affect the quality of this dataset?”
e.g. upstream datasets, external events
- “What have other users already discovered about this dataset?”
- “I want to find information related to the dataset I’m looking at”
Data providers:
- “Who is using my dataset and what are they saying about it?”
- “Let me subscribe to new user comments and reply to them”
What CHARMe will not enable
• “Give me the best dataset on sea surface temperature”
– The “best” dataset depends on the application

• CHARMe will not provide a new “quality stamp” for
datasets

– But will be able to link to such things if other people publish
them, e.g. Bates maturity index, QA4EO certification

• CHARMe will not provide access to actual data
– (but will enable discovery of data)

• Not planning to create (another) “one-stop shop” for
information

– We want the information to appear where users are already
looking
Beware of 5-star
ratings…

(words of wisdom from xkcd.com)

TERRIBLE
How will you use CHARMe?

• Search for datasets on existing data provider website
• Use the “CHARMe button” to view or record
commentary on a particular dataset
“Significant events” viewer

•
•
•
•

Google Finance (above) matches stock prices with news events
CHARMe tool will match climate reanalysis data with “significant events”
– Algorithm changes, instrument failures, new data sources

Will allow user annotations on the data and events
Will be available on the Web
Fine-grained commentary

• Will allow creation and discovery about specific parts of
datasets
• E.g. variables, geographic locations, time ranges
• cf http://maphub.github.io
Jon Blower
Debbie Clifford
Tristan Quaife
Sam Williams

•
•

Using Linked Data to combine EO and socioeconomic data in 8 application areas
16-partner EU FP7 project, just started!
– Coordinated from Reading
Summary
•

Linked Data can join up information
from all over the Web

•

Makes disparate data sources
discoverable and processable

•

CHARMe is using Linked Data
techniques to help users of climate
data to connect with all the
experience in the community
– Techniques are not specific to
climate data

•

MELODIES project will combine
environmental data and
socioeconomic data to develop
real-world services using Linked
Data
Thank you!
j.d.blower@reading.ac.uk
@Jon_Blower
http://www.charme.org.uk
@MelodiesProject

Más contenido relacionado

La actualidad más candente

Keystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenanceKeystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenancePaolo Missier
 
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014Microsoft Azure for Research
 
Web analytics webinar
Web analytics webinarWeb analytics webinar
Web analytics webinarJim Jansen
 
Big Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DBig Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DUniversity of Washington
 
Knowledge Graph Semantics/Interoperability
Knowledge Graph Semantics/InteroperabilityKnowledge Graph Semantics/Interoperability
Knowledge Graph Semantics/InteroperabilityJames Hendler
 
The semanticweb may2001_timbernerslee
The semanticweb may2001_timbernersleeThe semanticweb may2001_timbernerslee
The semanticweb may2001_timbernersleegrknsfk
 
Blogs Logs Pods: Smart Labs
Blogs Logs Pods: Smart LabsBlogs Logs Pods: Smart Labs
Blogs Logs Pods: Smart LabsJeremy Frey
 
Data, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceData, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceUniversity of Washington
 
Asteroid Observations - Real Time Operational Intelligence Series
Asteroid Observations - Real Time Operational Intelligence SeriesAsteroid Observations - Real Time Operational Intelligence Series
Asteroid Observations - Real Time Operational Intelligence SeriesStormBourne, LLC
 
Methods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataMethods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataCristina Sarasua
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with SparkGhulam Imaduddin
 
The Future(s) of the World Wide Web
The Future(s) of the World Wide WebThe Future(s) of the World Wide Web
The Future(s) of the World Wide WebJames Hendler
 
Evolution of Twitter Users and Behavior
Evolution of Twitter Users and BehaviorEvolution of Twitter Users and Behavior
Evolution of Twitter Users and BehaviorAli Babaoglan Blog
 
Perception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringPerception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringIRJET Journal
 
Lcwebinar rise of-the_databrarian_73961
Lcwebinar rise of-the_databrarian_73961Lcwebinar rise of-the_databrarian_73961
Lcwebinar rise of-the_databrarian_73961Sigaard
 
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...cscpconf
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataNattiya Kanhabua
 
Machines are people too
Machines are people tooMachines are people too
Machines are people tooPaul Groth
 

La actualidad más candente (20)

Keystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenanceKeystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenance
 
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
The Fourth Paradigm - Deltares Data Science Day, 31 October 2014
 
Web analytics webinar
Web analytics webinarWeb analytics webinar
Web analytics webinar
 
Big Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DBig Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&D
 
Science Data, Responsibly
Science Data, ResponsiblyScience Data, Responsibly
Science Data, Responsibly
 
Knowledge Graph Semantics/Interoperability
Knowledge Graph Semantics/InteroperabilityKnowledge Graph Semantics/Interoperability
Knowledge Graph Semantics/Interoperability
 
The semanticweb may2001_timbernerslee
The semanticweb may2001_timbernersleeThe semanticweb may2001_timbernerslee
The semanticweb may2001_timbernerslee
 
Blogs Logs Pods: Smart Labs
Blogs Logs Pods: Smart LabsBlogs Logs Pods: Smart Labs
Blogs Logs Pods: Smart Labs
 
Data, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data ScienceData, Responsibly: The Next Decade of Data Science
Data, Responsibly: The Next Decade of Data Science
 
Asteroid Observations - Real Time Operational Intelligence Series
Asteroid Observations - Real Time Operational Intelligence SeriesAsteroid Observations - Real Time Operational Intelligence Series
Asteroid Observations - Real Time Operational Intelligence Series
 
Methods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataMethods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of Data
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with Spark
 
The Future(s) of the World Wide Web
The Future(s) of the World Wide WebThe Future(s) of the World Wide Web
The Future(s) of the World Wide Web
 
Evolution of Twitter Users and Behavior
Evolution of Twitter Users and BehaviorEvolution of Twitter Users and Behavior
Evolution of Twitter Users and Behavior
 
Perception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringPerception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document Clustering
 
Lcwebinar rise of-the_databrarian_73961
Lcwebinar rise of-the_databrarian_73961Lcwebinar rise of-the_databrarian_73961
Lcwebinar rise of-the_databrarian_73961
 
Democratizing Data Science in the Cloud
Democratizing Data Science in the CloudDemocratizing Data Science in the Cloud
Democratizing Data Science in the Cloud
 
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...
WEB SEARCH ENGINE BASED SEMANTIC SIMILARITY MEASURE BETWEEN WORDS USING PATTE...
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving Data
 
Machines are people too
Machines are people tooMachines are people too
Machines are people too
 

Similar a In search of lost knowledge: joining the dots with Linked Data

Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13DataDryad
 
Broad Data (India 2015)
Broad Data (India 2015)Broad Data (India 2015)
Broad Data (India 2015)James Hendler
 
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar   Intro to Linked Data and SemanticsINSPIRE Hackathon Webinar   Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar Intro to Linked Data and Semanticsplan4all
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...AKSHAY BHAGAT
 
Accelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceAccelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceIan Foster
 
Web History 101, or How the Future is Unwritten
Web History 101, or How the Future is UnwrittenWeb History 101, or How the Future is Unwritten
Web History 101, or How the Future is UnwrittenBookNet Canada
 
Linked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsLinked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
Open Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | FutureOpen Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | FutureRoss Mounce
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...giuseppe_futia
 
Social Semantic (Sensor) Web
Social Semantic (Sensor) WebSocial Semantic (Sensor) Web
Social Semantic (Sensor) WebDavid Crowley
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodDuncan Hull
 
RDA, Data Citation, and PIDs for DataOne
RDA, Data Citation, and PIDs for DataOneRDA, Data Citation, and PIDs for DataOne
RDA, Data Citation, and PIDs for DataOneResearch Data Alliance
 
Infrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAInfrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAResearch Data Alliance
 
Getting to Know Your Data with R
Getting to Know Your Data with RGetting to Know Your Data with R
Getting to Know Your Data with RStephen Withington
 
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014James Powell
 
How the Web can change social science research (including yours)
How the Web can change social science research (including yours)How the Web can change social science research (including yours)
How the Web can change social science research (including yours)Frank van Harmelen
 

Similar a In search of lost knowledge: joining the dots with Linked Data (20)

Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13
 
Broad Data (India 2015)
Broad Data (India 2015)Broad Data (India 2015)
Broad Data (India 2015)
 
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar   Intro to Linked Data and SemanticsINSPIRE Hackathon Webinar   Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
Accelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceAccelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy Science
 
Sensors1(1)
Sensors1(1)Sensors1(1)
Sensors1(1)
 
Semantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including AstrophysicsSemantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including Astrophysics
 
Web History 101, or How the Future is Unwritten
Web History 101, or How the Future is UnwrittenWeb History 101, or How the Future is Unwritten
Web History 101, or How the Future is Unwritten
 
Linked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsLinked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & Museums
 
Open Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | FutureOpen Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | Future
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 
Social Semantic (Sensor) Web
Social Semantic (Sensor) WebSocial Semantic (Sensor) Web
Social Semantic (Sensor) Web
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific Method
 
Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"
Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"
Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"
 
RDA, Data Citation, and PIDs for DataOne
RDA, Data Citation, and PIDs for DataOneRDA, Data Citation, and PIDs for DataOne
RDA, Data Citation, and PIDs for DataOne
 
Sept 24 NISO Virtual Conference: Library Data in the Cloud
Sept 24 NISO Virtual Conference: Library Data in the CloudSept 24 NISO Virtual Conference: Library Data in the Cloud
Sept 24 NISO Virtual Conference: Library Data in the Cloud
 
Infrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAInfrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDA
 
Getting to Know Your Data with R
Getting to Know Your Data with RGetting to Know Your Data with R
Getting to Know Your Data with R
 
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
 
How the Web can change social science research (including yours)
How the Web can change social science research (including yours)How the Web can change social science research (including yours)
How the Web can change social science research (including yours)
 

Último

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Último (20)

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

In search of lost knowledge: joining the dots with Linked Data

  • 1. In search of lost knowledge: joining the dots with Linked Data Jon Blower j.d.blower@reading.ac.uk Department of Meteorology Reading e-Science Centre National Centre for Earth Observation University of Reading With thanks to all CHARMe project partners!
  • 3.
  • 4.
  • 5. How do you fill in the blanks? • Hope the documentation contains pointers • … to the right versions of things • … and using consistent terminology • In restricted communities, we can gather all information into a central location and harmonize it • OR we hope that a Google search throws up the right results
  • 8. Problems 1. Crossing communities is very hard 2. Lots of useful stuff isn’t formally documented
  • 9. Lots of useful negative results, which aren’t published Nearly 1/3 of positive results are false 1000 hypotheses 100 are true 900 are false Test accuracy 95% 95 true positives 5 false negatives 45 false positives 855 true negatives
  • 10.
  • 11. Problems 1. Crossing communities is very hard 2. Lots of useful stuff isn’t formally documented 3. Unknown unknowns…
  • 12. “There is a spurious decline in low-cloud fraction in the ISCCP cloud database due to the viewing angle. It’s in the literature, but you might not find it if you’re not specifically looking for it. For example if you’re using the data for model validation you may not spot the problem.” (Claire Barber, paraphrased)
  • 13. Consequences • We use the data that are most easily available, not necessarily the best • Constant re-discovery of the same issues • Very hard to share information outside communities
  • 14. The “Web of Data”
  • 16. The Web is the “Web of Documents”
  • 17.
  • 18. What is being cited? Why?
  • 19. Fingers crossed! • We hope that everything has a document written about it, or we have nothing to cite • … and we hope someone writes a new document when “it” changes • We hope that we are citing the right document (i.e. the most authoritative etc)
  • 20. “If … the Web made all the online documents look like one huge book, [the Sir Tim Berners-Lee Semantic Web] will make all the data in the world look like one huge database.” Photo by Susan Lesch, from Wikipedia
  • 21. Why is the Semantic Web different? • Focuses on things, not documents-aboutthings • Links things together in a meaningful way • Information is readable by computers • Here’s how it works…
  • 22. 1. Give things unique identifiers Must be globally unique and persistent http://www.reading.ac.uk/people/jon.blower might represent me http://nasa.gov/instruments/MODIS might represent the MODIS instrument –(these are illustrative, not accurate!)
  • 23. 2. Express relationships between things The key concept is the triple subject predicate object E.g. “MODIS is carried by the Terra satellite” all three things need to be unique identifiers (we don’t use plain English!)
  • 24. Examples of realistic triples (identifiers are illustrative, not necessarily correct!) http://www.reading.ac.uk/people/jon.blower http://xmlns.com/foaf/0.1/knows http://www.durham.ac.uk/staff/ed.llewellin http://dx.doi.org/12345.678910 http://purl.org/spar/cito/citesAsDataSource http://www.badc.ac.uk/datasets/sst
  • 25. Aside: how to use the Semantic Web to ruin jokes “A hundred kilopascals go into a bar” “100000”^^unit:Pascal owl:sameAs “1”^^unit:Bar (Using OWL and QUDT ontologies)
  • 26. 3. Publish your triples • Can be as simple as putting a document in the right format on your website – (triples can be expressed in many formats) • … or as complex as publishing a multi-billiontriple database – E.g. Ordnance Survey • Either way, your data become part of the Web, not just on the web
  • 27. 4. Profit! • Everyone can publish their own “view of the world” – Don’t need a single data structure “to rule them all” • Use common identifiers and common vocabularies to link between communities • Different vocabularies can be mapped to each other – E.g. map CF standard names to GRIB codes • Gives a means to record provenance of information – “Direct line of sight from decisions to data” - NASA • Reasoning engines can traverse the graph of links and discover new information
  • 28. Example: SemantEco Wildlife observations Ecosystem impacts Administrative areas Pollution data and policies “Find me all the sites where chloride pollution exceeds the local policy limits” “Show me the species over time in this region”
  • 29. So what’s “Linked Data” then? • The Web is the “web of documents” • The Semantic Web is the “web of data” • “Linked Data” is really a set of principles for using the Semantic Web – http://www.w3.org/DesignIssues/LinkedData.html • (Many people use “Linked Data” and “Semantic Web” somewhat interchangeably)
  • 31. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 32.
  • 33. CHARMe: Sharing climate knowledge through Linked Data (Jan 2013 – Dec 2014)
  • 34. From observations to decisions Observations Satellites (obs. sometimes used for model initialization) Climate Record Creation and Curation Climate Data Records Reports Analysis and applications Decision making Decisions, Policies Model results Climate Modelling (Adapted from Dowell et al., 2013), “Strategy Towards an Architecture for Climate Monitoring from Space”)
  • 35. Where can users go for help? • Scientific literature – Huge, verbose and inaccessible to some communities – Not well linked to source data • Technical reports and conference proceedings – Hard to find, scattered or inaccessible • Data centres – increasingly strong at providing some important metadata, but don’t usually include community feedback – Not all countries and communities have data centres! • Websites and blogs – From CEOS Handbook to a scientist’s blog – Increasingly useful, but scattered
  • 36. How can climate data users decide whether a dataset is fit for their purpose? (N.B. We consider that “data quality” and “fitness for purpose” are the same thing) Not specific to climate data!
  • 38. Examples of commentary metadata • Post-fact annotations, e.g. citations, ad-hoc comments and notes; • Results of assessments, e.g. validation campaigns, intercomparisons with models or other observations, reanalysis; • Provenance, e.g. dependencies on other datasets, processing algorithms and chain, data source; • Properties of data distribution, e.g. data policy and licensing, timeliness (is the data delivered in real time?), reliability; • External events that may affect the data, e.g. volcanic eruptions, ElNino index, satellite or instrument failure, operational changes to the orbit calculations. General rule: information originates from users or external entities, not original data providers – However, sometimes information is not available from the data provider!
  • 39. How will this be done? • CHARMe will create connected repositories of commentary information Data provider website rd 33rdparty party system system – Stored as triples in “CHARMe nodes”, • Information can be read and entered through websites or Web Services • Using principles of Open Linked Data and the Semantic Web CHARMe node CHARMe node CHARMe node
  • 40. Open Annotation • We are using Open Annotation for recording commentary – World Wide Web Consortium standard • Associates a body with a target • E.g. a publication could be the body, an EO dataset could be the target • Can record the motivation behind an annotation – Bookmarking, classifying, commenting, describing, editing, highlighting, questioning, replying… (lots more) – Covers a lot of CHARMe use cases! • An annotation can have multiple targets – A key requirement from users
  • 41. Where does CHARMe fit in? Observations Satellites (obs. sometimes used for model initialization) Climate Record Creation and Curation Climate Modelling (e.g. CMIP5) Climate Data Records Reports Analysis Analysis and and applications applications Decision making Decisions, Policies Model results Supports analysts and scientists in production of information for decision-makers
  • 43. Challenges • Ensuring community adoption: – We are “injecting” CHARMe capabilities into existing websites used in the community – We will “seed” the CHARMe system with information to attract users (e.g. links between publications and datasets) • Ensuring quality of commentary metadata – Moderation is a strong user requirement – We will provide guides to creators of commentary – what makes a comment helpful?
  • 44. What CHARMe will enable (some examples) Users: - “Find me all the documents that have been written about this dataset” - “… in both peer-reviewed journals and the grey literature” - “… and specifically about precipitation in Africa” - “… in both NEODC’s and Astrium’s archives” - “What factors might affect the quality of this dataset?” e.g. upstream datasets, external events - “What have other users already discovered about this dataset?” - “I want to find information related to the dataset I’m looking at” Data providers: - “Who is using my dataset and what are they saying about it?” - “Let me subscribe to new user comments and reply to them”
  • 45. What CHARMe will not enable • “Give me the best dataset on sea surface temperature” – The “best” dataset depends on the application • CHARMe will not provide a new “quality stamp” for datasets – But will be able to link to such things if other people publish them, e.g. Bates maturity index, QA4EO certification • CHARMe will not provide access to actual data – (but will enable discovery of data) • Not planning to create (another) “one-stop shop” for information – We want the information to appear where users are already looking
  • 46. Beware of 5-star ratings… (words of wisdom from xkcd.com) TERRIBLE
  • 47. How will you use CHARMe? • Search for datasets on existing data provider website • Use the “CHARMe button” to view or record commentary on a particular dataset
  • 48. “Significant events” viewer • • • • Google Finance (above) matches stock prices with news events CHARMe tool will match climate reanalysis data with “significant events” – Algorithm changes, instrument failures, new data sources Will allow user annotations on the data and events Will be available on the Web
  • 49. Fine-grained commentary • Will allow creation and discovery about specific parts of datasets • E.g. variables, geographic locations, time ranges • cf http://maphub.github.io
  • 50. Jon Blower Debbie Clifford Tristan Quaife Sam Williams • • Using Linked Data to combine EO and socioeconomic data in 8 application areas 16-partner EU FP7 project, just started! – Coordinated from Reading
  • 51. Summary • Linked Data can join up information from all over the Web • Makes disparate data sources discoverable and processable • CHARMe is using Linked Data techniques to help users of climate data to connect with all the experience in the community – Techniques are not specific to climate data • MELODIES project will combine environmental data and socioeconomic data to develop real-world services using Linked Data

Notas del editor

  1. Here’s a simplified representation of the things we are interested in in Earth Observation
  2. And here’s a more complex version. The instrument produces “Level 0” data, which are then processed using a series of algorithms to produce something that the user community finds easier to use. Documents may of course be written about any of these steps. All these documents will be written by different people at different times, and possibly about different versions of the same thing
  3. Now let’s say you only know about a few of these things – a few publications, the algorithms you know already and the latest versions of a few datasets. Where do you go from here?
  4. This person is trying to make a good decision based on climate data from satellites and from computer simulations. She also needs to know about other factors such as population growth and changes in urbanization. It’s a pretty hard job to bring together information from all these communities and correlate them. Everyone will use different standards, formats and terminology
  5. If you are in a data-rich field that relies a lot on statistics, you may well be in this situation where the false positives outweigh the true ones. (95% test accuracy does not mean that there is a 95% chance that your hypothesis is true!) Obviously, you can play with the numbers here, but in general this will be significant if you’re in the case where most hypotheses are false. This may contribute to findings that results in the peer-reviewed literature can’t be reproduced – not necessarily due to poor experiments or tests.
  6. Sites such as FigShare allow publication of stuff that would not make it into the peer-reviewed literature, but that are still useful.
  7. Who here uses NASA instead of ESA because the data are easier to get hold of? NCEP instead of Met Office?
  8. Just a reminder of what our simplified situation in EO looks like
  9. This is how the Web represents our case: just a set of documents with simple links. The Web records no information about what the documents represent, or why they are linked.
  10. Lots of links here, which is good! However, you need context (and natural language understanding) to determine what the links mean – are they affiliations, projects Richard works on, links to “institutional” (not personal) stuff, or what? A computer can’t reason based on this information (although link density gives some indication of relevance and popularity).
  11. Here’s part of the citation list from one of my recent papers. Some of these are papers, some are standards, some are simple URLs. There is no information about why I’m citing them, unless you trawl the text and divine my meaning.
  12. These are some of the problems of using documents-about-things to talk about stuff
  13. These identifiers may look like web addresses (Uniform Resource Locators) but in fact are identifiers – just strings. You don’t expect to download me when you put my identifier in your web browser but you might get a representation of me (e.g. my home page)
  14. The bottom one is a paper citing a dataset as a data source. Note that we can say *why* we’re citing the data
  15. Your homework is to apply the same technique to the joke about the Dalai Lama going to a hot-dog stall and saying “make me one with everything”
  16. The point about being part of the web is important, but hard to grasp. Your data can be found by semantically-aware web crawlers, which might be able to do something interesting with it, rather than just finding a document that requires a human to read.
  17. So the Web of Data can more accurately record our situation, and can describe the nature of the links between things.
  18. Lots of data, particularly public sector data, is being released as Open Linked Data Met Office Ordnance Survey Office of National Statistics Data.gov.uk People need to publish not only their data, but the terms that they use and the relationships between them
  19. A close-up of some of the geospatial part. Note the Met Office.
  20. So what are we doing with Linked Data in the climate field? Here is a European project, about half way through,
  21. Simple “value chain” for climate data observed from space.
  22. See next slide for explanation of this
  23. This is basically the same as the previous slide in words instead of pictures! The last paragraph is crucial – we’re looking at stuff the data provider doesn’t already know. We’re not trying to turn the whole of climate data into Linked Data, but focusing on user-provided annotations
  24. Important to note that decision-makers will probably not use CHARMe directly
  25. So this person might be able to use CHARMe to find interesting information from all these communities, and find out what the experts in those communities think about the data. It will be easier to discover the common pitfalls.
  26. CHARMe is often compared with Amazon, Tripadvisor etc, but the user base will be smaller and more specialist, therefore it is important for the information to be accurate. Can’t rely on the wisdom of crowds!
  27. This is the most basic use case for CHARMe. In the next couple of slides we’ll look at more advanced uses. The fact that the annotations (i.e. the commentary information) will be machine-readable means that it’s possible to build all kinds of applications.
  28. Here’s an “advanced application” for using CHARMe information, which will be correlated with climate reanalyses and external events.
  29. Here’s another project that is just starting, which will use Linked Open Data to create new services and products. Many of the participants are small and medium-sized enterprises who will base business ideas on Open Data.