Connecting the Smithsonian
American Art Museum to
the Linked Data Cloud
Pedro Szekely, Craig A. Knoblock, Fengyu Yang, Xuming Zhu,
Eleanor E. Fink, Rachel Allen, and Georgina Goodlander
University of Southern California, Los Angeles, California, USA
Nanchang Hangkong University, Nanchang, China
Smithsonian American Art Museum, Washington, DC, USA
http://www.isi.edu/integration/karma
The Smithsonian American Art
Museum is a museum in Washington,
D.C. which has one of the world's
largest and most inclusive collections
of art, from the colonial period to the
present, made in the United States.
Wikipedia
Big Picture
Pedro Szekely and Craig Knoblock, University of Southern California
Problem
SAAM
Data
What ontology to use?
Structure mismatches
Data consistency
What to link to?
100% precision
How to enable museums to do this themselves?
Steps to Create Linked Data
• Map data to RDF
… select ontologies
… define mappings
• Link to external resources
… identify the links
• Curate the Linked Data
… museums demand 100% correctness
select ontologies
Complicated
Many irrelevant classes and properties
Incomplete
edm:ProvidedCHO
aac:CulturalHeritageObject
dcterms:creator
ore:Aggregation
edm:EuropeanaAggregation
crm:E89_Propositional_Object
edm:WebResource
edm:aggregatedCHO
edm:hasView
edm:Agent/crm:E39_Actor, foaf:Person
aac:Person
rdaGr2:placeOfBirth rdaGr2:placeOfDeath
edm:Place/crm:E53_Place
aac:Place
aac:associatedPlace
schema:PostalAddress
schema:address
edm:ProvidedCHO
aac:CulturalHeritageObject
skos:Concept
skos:Concept
edm:hasType
skos:narrower
skos:prefLabel
skos:prefLabel
saam:objectId
dcterms:date
dcterms:provenance
dcterms:rights
dcterms:subject
dcterms:medium
dcterms:title
dcterms:description
dcterms:creator
ore:Aggregation
edm:EuropeanaAggregation
crm:E89_Propositional_Object
edm:WebResource
edm:aggregatedCHO
edm:hasView
edm:Agent/crm:E39_Actor, foaf:Person
aac:Person
skos:altLabel
rdaGr2:dateOfDeath
rdaGr2:biographicalInformation
rdaGr2:placeOfBirth
rdaGr2:placeOfDeath
rdaGr2:dateAssociatedWithThePerson
edm:Place/crm:E53_Place
aac:Place
aac:associatedPlace
schema:PostalAddress
schema:addressCountry
schema:addressLocality
schema:addressRegion
schema:address
skos:prefLabel
schema:Country
schema:name
dcterms:format
rdaGr2:dateOfBirth
skos:prefLabel
saam:objectNumber
saam:constituentId
dcterms:created
mapping the data to
the ontologies
how to enable museums to do this themselves?
Karma
Interactive tool for rapidly extracting, cleaning, transforming,
integrating, and publishing data
[Diagram labels: Hierarchical Sources, Services, Tabular Sources, Database, …, Karma, Model]
[ Knoblock, Szekely, et al. Semi-automatically mapping
structured sources into the semantic web. ISWC 2012 ]
specifying transformations and
mapping to properties with
Karma
saam:person/2 rdf:type aac-ont:Person
saam:person/2 aac-ont:variantName “George M. Aarons”
saam:person/15 rdf:type aac-ont:Person
saam:person/15 aac-ont:marriedName “Alice Stanley Archeson”
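The two example people above suggest that the property used for a name depends on the kind of name in the source row. A minimal Python sketch of that idea follows. It is not Karma's implementation: the row layout, the column names (`constituent_id`, `name`, `name_type`), and the `NAME_PROPERTY` lookup are assumptions made for illustration.

```python
# Hypothetical rows from a names table; the column names are assumptions.
ROWS = [
    {"constituent_id": "2", "name": "George M. Aarons", "name_type": "variant"},
    {"constituent_id": "15", "name": "Alice Stanley Archeson", "name_type": "married"},
]

# Assumed lookup from a name-type code to the ontology property shown on the slide.
NAME_PROPERTY = {
    "variant": "aac-ont:variantName",
    "married": "aac-ont:marriedName",
}


def row_to_triples(row):
    """Turn one source row into (subject, predicate, object) triples."""
    subject = f"saam:person/{row['constituent_id']}"
    triples = [(subject, "rdf:type", "aac-ont:Person")]
    prop = NAME_PROPERTY.get(row["name_type"])
    if prop is not None:  # skip name types that have no mapping
        triples.append((subject, prop, f'"{row["name"]}"'))
    return triples


def map_rows(rows):
    """Map every row and return the triples as one flat list."""
    return [triple for row in rows for triple in row_to_triples(row)]


if __name__ == "__main__":
    for s, p, o in map_rows(ROWS):
        print(s, p, o)
```

Keeping the name-type lookup in a table rather than in code means a new kind of name needs only a new entry, which is closer in spirit to a declarative mapping.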
mapping to object
properties using
Karma
Evaluation of Data Mapping Using Karma
SAAM database: 8 tables, 29 columns
Ontologies: 407 classes, 105 data properties, 229 object properties

Run                               Top 4 suggestions contain       Object properties      Time
                                  the correct semantic type       correctly assigned     (minutes)
Run 1: no training data           7 out of 29 (24%)               30 out of 35 (85%)     18
Run 2: using Run 1 as training    27 out of 29 (93%)              32 out of 35 (91%)     8
identifying and
curating links
Multiple “John Singer Sargent”
ima:Person_John_Singer_Sargent
a aac-ont:Person ;
dct:date "1856-1925" ;
foaf:name "John Singer Sargent" .
saam:Person_4253
a aac-ont:Person ;
aac-ont:associatedPlace
saam:SaamPlace_1357324439768t1r13950_0,
saam:SaamPlace_1357324439768t1r13951_0 ;
saam:constituentId "4253" ;
rdaGr2:biographicalInformation
"Painter. Sargent traveled …" ;
rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
rdaGr2:dateOfBirth "1856-1-12" ;
rdaGr2:dateOfDeath "1925-4-15" ;
rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
foaf:name "John S. Sargent" ;
skos:altLabel "John S. Sargent" ;
skos:prefLabel "John Singer Sargent" .
cb:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:dateOfBirth "1879", "1885" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
met:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:placeOfResidence
"North and Central America",
"United States" ;
foaf:name "John Singer Sargent" .
dallas:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:dateOfBirth "1856" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
John Singer Sargent
ima:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
dct:date "1856-1925" ;
foaf:name "John Singer Sargent" .
saam:SaamPerson_4253
a saam:SaamPerson ;
saam:associatedPlace
saam:SaamPlace_1357324439768t1r13950_0,
saam:SaamPlace_1357324439768t1r13951_0 ;
saam:constituentId "4253" ;
rdaGr2:biographicalInformation
"Painter. Sargent traveled …" ;
rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
rdaGr2:dateOfBirth "1856-1-12" ;
rdaGr2:dateOfDeath "1925-4-15" ;
rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
skos:altLabel "John S. Sargent" ;
skos:prefLabel "John Singer Sargent" .
cb:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:dateOfBirth "1879", "1885" ;
ont0:dateOfDeath "1925" ;
skos:prefLabel "John Singer Sargent" .
met:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:placeOfResidence
"North and Central America",
"United States" ;
foaf:name "John Singer Sargent" .
dallas:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:dateOfBirth "1856" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
Linking “John Singer Sargent”
saam:Person_4253
owl:sameAs cb:Person_John_Singer_Sargent ;
owl:sameAs dallas:Person_John_Singer_Sargent ;
owl:sameAs ima:Person_John_Singer_Sargent ;
owl:sameAs met:Person_John_Singer_Sargent ;
owl:sameAs dbpedia:John_Singer_Sargent ;
owl:sameAs nytimes:N49129220686803623753 ;
owl:sameAs w-flick:John_Singer_Sargent ;
...
.
Intuition
Estimate discrimination power of properties,
e.g., of name, birth and death dates
birth date    death date    # of people
…             …             …
1800          1820          147
1800          1821          284
1800          1822          213
…             …             …
(one row for every combination of dates)
Similar idea to: Song, D., Heflin, J.: Domain-independent entity coreference for linking ontology instances.
ACM Journal of Data and Information Quality (ACM JDIQ) (2012)
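One way to picture the intuition is to weight a birth/death date combination by how rare it is: the fewer people share it, the more a match on it tells us. The Python sketch below uses only the three counts shown on the slide. The log-ratio weight with add-one smoothing is an assumption chosen for illustration, not the scoring used by the authors or by Song and Heflin.

```python
import math
from collections import Counter

# Counts taken from the slide: (birth year, death year) -> number of people.
DATE_PAIR_COUNTS = Counter({
    (1800, 1820): 147,
    (1800, 1821): 284,
    (1800, 1822): 213,
})


def discrimination_weight(counts, birth, death):
    """Return a larger weight for rarer (birth, death) combinations.

    Assumed formula: log((total + 1) / (n + 1)), where n is the number of
    people sharing the combination. The add-one smoothing keeps the weight
    finite for combinations that were never observed.
    """
    total = sum(counts.values())
    n = counts.get((birth, death), 0)
    return math.log((total + 1) / (n + 1))


if __name__ == "__main__":
    for pair in sorted(DATE_PAIR_COUNTS):
        print(pair, round(discrimination_weight(DATE_PAIR_COUNTS, *pair), 3))
```

Under this assumed weighting, a match on 1800/1820 counts for more than a match on 1800/1821, because fewer people share the first combination.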
Evaluation of Automatic Linking
SAAM names starting with “A” matched by hand
• 535 people
• 176 matches
Results of Automatic Linking
Getty ULAN® 2,110
Rijksmuseum 551
Geonames 3,068
DBpedia 2,194
New York Times 70
estimate ≈ 30 missing
links to DBpedia
Curating Links with Karma
Linking with Karma
results of automated linking and
interactive curation recorded using
PROV
owl:sameAs statements constructed
using SPARQL CONSTRUCT queries
over PROV records
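The slides say the owl:sameAs statements are built with SPARQL CONSTRUCT queries over PROV records. The sketch below swaps in plain Python over hypothetical record dictionaries, only to show the selection logic: a link is published when its most recent recorded decision is an approval. The record fields (`source`, `target`, `decision`, `time`), the decision values, and the `ex:Some_Other_Person` target are assumptions for illustration and do not reflect the actual PROV vocabulary or data used.

```python
# Hypothetical curation history; field names and values are assumptions.
RECORDS = [
    {"source": "saam:Person_4253", "target": "dbpedia:John_Singer_Sargent",
     "decision": "proposed", "time": 1},
    {"source": "saam:Person_4253", "target": "dbpedia:John_Singer_Sargent",
     "decision": "approved", "time": 2},
    {"source": "saam:Person_4253", "target": "ex:Some_Other_Person",
     "decision": "proposed", "time": 1},
    {"source": "saam:Person_4253", "target": "ex:Some_Other_Person",
     "decision": "rejected", "time": 3},
]


def same_as_statements(records):
    """Emit one owl:sameAs statement per link whose latest decision is 'approved'."""
    latest = {}
    # Process records in time order so later decisions overwrite earlier ones.
    for rec in sorted(records, key=lambda r: r["time"]):
        latest[(rec["source"], rec["target"])] = rec["decision"]
    return [
        f"{source} owl:sameAs {target} ."
        for (source, target), decision in sorted(latest.items())
        if decision == "approved"
    ]


if __name__ == "__main__":
    for statement in same_as_statements(RECORDS):
        print(statement)
```

Deriving the published links from the recorded history, rather than editing the links directly, keeps every curation decision auditable and lets the link set be regenerated at any time.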
deployment
Related Work
• Europeana
  • 17 million items, 1,500 institutions
  • Requires exports in “Europeana” format
• Amsterdam Museum, Museum Finland
  • Rich ontology, RDF-to-RDF mapping rules
• LODAC museums in Japan
  • 114 museums, simple ontology
• Research Space, British Museum
  • CIDOC CRM ontologies, complex mappings
We focused significantly on link identification and curation
Next Steps
• Applications leveraging linked data
• Virtual museum
• Tools to create multimedia stories about art
• Tools to find inconsistencies
• Feed data to Wikidata
• American Art Collective: a linked data consortium of museums
Thank you