Tetherless World Constellation, RPI
Digital Archiving, The Semantic
Web, and Modern AI
Jim Hendler
Tetherless World Professor of Computer, Web and Cognitive Sciences
Director, Institute for Data Exploration and Applications
Rensselaer Polytechnic Institute
http://www.cs.rpi.edu/~hendler
@jahendler (twitter)
Major talks at: http://www.slideshare.net/jahendler
Not going to talk today about issues of
AI and society, personal data, unemployment, etc.
Wrote a book about those, happy to discuss w/people…
(or see slideshare for “jahendler”, TEDx, …)
Today I will focus on archiving:
metadata, knowledge graphs, & new directions in AI
The real challenge
• Today would be the 60th birthday of
my best friend growing up, Jack
Pressman (who passed away 20
years ago)
– How could we find a picture/image of
him?
• Not famous enough for wikipedia
• Never made it into a youtube video
• Common name (and not likely to have been
annotated)
Finding Jack
• What would you do?
– (Class exercise)
• We’d learn what we could about him
– We know his age
– Where did he grow up
• Any of those locations have pictures with people
– Where did he go to school
• Any famous classmates he may be in picture with
– Any major accomplishments
• He wrote a well-respected book on the history of medicine (lobotomies)
• Essentially, we look for things that “link” him to
places, events, objects, times …
– This is how finding things in archives happens
• How can machines help?
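One way to picture the "linking" idea is a minimal sketch that ranks archive photos by how many known facts about a person they share. Everything here is an assumption for illustration: the photo identifiers, the annotation tags, and the `candidates` helper are made-up placeholders, not records from any real archive.

```python
# Hypothetical archive: each photo id maps to a set of annotation tags.
# The tags are placeholders for links to places, times, and events.
PHOTOS = {
    "photo-001": {"place:hometown", "year:1975"},
    "photo-002": {"place:school", "year:1978", "event:graduation"},
    "photo-003": {"place:elsewhere", "year:1990"},
}

# Hypothetical facts we have learned about the person we are looking for.
KNOWN = {"place:hometown", "place:school", "event:graduation"}


def candidates(photos, known):
    """Return ids of photos sharing at least one tag with `known`, best match first."""
    scored = [(len(tags & known), photo_id) for photo_id, tags in photos.items()]
    # Sort by descending overlap, then by id so ties are deterministic.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [photo_id for score, photo_id in ranked if score > 0]


if __name__ == "__main__":
    print(candidates(PHOTOS, KNOWN))
```

The sketch only works if the annotations exist and are exposed for matching, which is the gap the rest of the talk addresses.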
So we annotate images/videos
But the information is saved internal to the
system, generally for later search, not exposed
externally…
C) Semantic Web
2001
On the Web -- links are critical!
[Diagram: in HTML, a Web page links to any Web resource via <a href=“http://…”> (generically, <a href= URI>); in RDF, a URI links to a URI, and the link itself is also a URI.]
RDF is like the web!
<mind:Person rdf:id=“Hendler”>
<mind:title jobs:Professor>
<jobs:placeOfWork http://www.cs.rpi.edu>
</mind:Person>
[Diagram: DOC1 describes Hendler, with Mind:title pointing to Jobs:Professor and Jobs:placeOfWork pointing to a Web page (http://www…). Vocabularies shown: Mind:, Jobs:]
Links in the data
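The statements on this slide can be read as subject-predicate-object triples. The following is a minimal sketch in plain Python, not a real RDF library: the prefixed names are placeholders standing in for full URIs, `DOC1#Hendler` is an assumed identifier for the node in DOC1, and the `rdf:type` triple is the usual reading of a typed element such as `<mind:Person …>`.

```python
# Each statement is a (subject, predicate, object) triple.
# All names are illustrative placeholders for full URIs.
TRIPLES = [
    ("DOC1#Hendler", "rdf:type", "mind:Person"),
    ("DOC1#Hendler", "mind:title", "jobs:Professor"),
    ("DOC1#Hendler", "jobs:placeOfWork", "http://www.cs.rpi.edu"),
]


def objects(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def vocabularies(triples):
    """Return the sorted prefixes of the predicates, i.e. the vocabularies the links come from."""
    return sorted({p.split(":", 1)[0] for _, p, _ in triples})


if __name__ == "__main__":
    print(objects(TRIPLES, "DOC1#Hendler", "mind:title"))
    print(vocabularies(TRIPLES))
```

Because predicates are themselves names from shared vocabularies, one document can mix terms from several of them (here `mind:` and `jobs:`).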
<mind:Person rdf:id=“Hendler”>
owl:sameAs
<http://dbpedia.org/page/James_Hendler>
[Diagram: DOC2 describes Hendler, with Mind:title pointing to Jobs:Professor and Jobs:placeOfWork pointing to a Web page (http://www…). Owl:sameAs links Hendler to Dbpedia:Hendler, which has Dbpedia:occupation Dbpedia:ComputerScientist.]
Asserting Links in the data
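A rough sketch of what the owl:sameAs assertion buys: once two identifiers are declared the same, facts stated about either can be gathered together. This is plain Python rather than an OWL reasoner, the identifiers are placeholders for full URIs, and `aliases` and `facts_about` are hypothetical helper names.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

# Illustrative triples; the prefixed names stand in for full URIs.
TRIPLES = [
    ("DOC2#Hendler", "mind:title", "jobs:Professor"),
    ("DOC2#Hendler", "jobs:placeOfWork", "http://www.cs.rpi.edu"),
    ("DOC2#Hendler", SAME_AS, "dbpedia:Hendler"),
    ("dbpedia:Hendler", "dbpedia:occupation", "dbpedia:ComputerScientist"),
]


def aliases(triples, node):
    """Return every identifier reachable from `node` over owl:sameAs, in either direction."""
    neighbours = defaultdict(set)
    for s, p, o in triples:
        if p == SAME_AS:
            neighbours[s].add(o)
            neighbours[o].add(s)
    seen, stack = {node}, [node]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def facts_about(triples, node):
    """Return sorted (predicate, object) pairs stated about `node` or any of its aliases."""
    names = aliases(triples, node)
    return sorted((p, o) for s, p, o in triples if s in names and p != SAME_AS)


if __name__ == "__main__":
    for predicate, obj in facts_about(TRIPLES, "DOC2#Hendler"):
        print(predicate, obj)
```

Asking about the local identifier now also returns the occupation that was only stated about the DBpedia identifier.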
Led to Linked Data experimentation and growth
Billions of links in public cloud – across many sectors
Marking up metadata in images
Slide from 2002
Based on RDF Schema/OWL
PhotoStuff, ca. 2005-2007
And instances
NASA image markup (SemSpace, 2006)
Also used by other govt agencies in DoD
Extended to video markup (segments)
A particular scene from
a movie…
The story that ran on
NHK television from
0847-0903 on
2001-09-11 (GMT + 9)
2008
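A segment annotation like the NHK example needs a source, a start, an end, and an explicit time zone. The sketch below shows one possible shape for such a record using only the Python standard library; the `Segment` class and its field names are assumptions for illustration, not the schema used in the original work.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Segment:
    """A labelled stretch of a broadcast or video, with timezone-aware bounds."""
    source: str
    start: datetime
    end: datetime
    label: str

    def duration(self) -> timedelta:
        return self.end - self.start

    def contains(self, moment: datetime) -> bool:
        """True if `moment` (timezone-aware) falls inside the segment."""
        return self.start <= moment <= self.end


# GMT + 9, as given on the slide.
GMT_PLUS_9 = timezone(timedelta(hours=9))

story = Segment(
    source="NHK television",
    start=datetime(2001, 9, 11, 8, 47, tzinfo=GMT_PLUS_9),
    end=datetime(2001, 9, 11, 9, 3, tzinfo=GMT_PLUS_9),
    label="news story",
)

if __name__ == "__main__":
    print(story.duration())
    print(story.start.astimezone(timezone.utc))
```

Keeping the offset in the record lets the segment be compared against events logged in other time zones without ambiguity.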
Extended to video annotation
2008
Various experiments in museums
Lora Aroyo, 2011
BBC Ontologies
Many demos 2012 Olympics
Commercial takeoff really started ca. 2012
Google 2012
The Knowledge Graph
Facebook 2012
The Open Graph Protocol
Impressive results
Google finds embedded metadata on >30% of its crawl – Guha, 2015
Google “knowledge vault” reported to have over 5 billion “facts” (links)
But, the knowledge graph isn’t all automated
(P. Norvig, WWW 2016, 4/16)
© Peter Mika, 2014.
What about image/video archiving?
• Despite this growth, still mostly
“experimental” in the archiving
community
– Especially image/video
• Two main impediments
– High cost of annotating collections with
enhanced metadata
– How does annotation increase
the “value” of a collection?
• Beyond search
Recent major breakthrough in
automating computer vision
“phase transition” in capabilities of neural networks
w/machine power
“deep learning”
Impressive results
Increasingly powerful techniques have yielded
incredible results in the past few years
Moving to Vision and Text Mix
Context issues are a problem
And still a long way to go
But recent “action” descriptions are doing better
than question answering
A very promising direction for
jumpstarting (semi)-automated
annotation
Moving from search to exploration
(Mei Si, 2017)
Using “narrative” technology to turn our campus
archive into an interactive “story”
At human scales
Cognitive and Immersive Systems Laboratory
http://cisl.rpi.edu
Summary
Semantic Web (Linked Data) has been a small but growing
presence in the archiving world
- increasing use in library and museum communities
- increasing interest in collection management
- increasing interest in collection sharing
Semantic Technologies are being deployed at scale in the
larger Web world
- still primarily for search (ad match) and social
networking (ad match)
New AI technologies have the potential to overcome some of
the key problems
- reducing the cost of metadata generation/annotation
- making archives “alive” and explorable
Questions?

Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Digital Archiving, The Semantic Web, and Modern AI

  • 1. Tetherless World Constellation, RPI Digital Archiving, The Semantic Web, and Modern AI Jim Hendler Tetherless World Professor of Computer, Web and Cognitive Sciences Director, Institute for Data Exploration and Applications Rensselaer Polytechnic Institute http://www.cs.rpi.edu/~hendler @jahendler (twitter) Major talks at: http://www.slideshare.net/jahendler
  • 2. Not going to talk today about issues of AI and society, personal data, unemployment, etc. Wrote a book about those, happy to discuss w/people… (or see slideshare for “jahendler”, TEDx, …). Today I will focus on archiving: metadata, knowledge graphs, & new directions in AI.
  • 3. The real challenge
      • Today would be the 60th birthday of my best friend growing up, Jack Pressman (who passed away 20 years ago)
          – How could we find a picture/image of him?
              • Not famous enough for Wikipedia
              • Never made it into a YouTube video
              • Common name (and not likely to have been annotated)
  • 4. Finding Jack
      • What would you do?
          – (Class exercise)
      • We’d learn what we could about him
          – We know his age
          – Where did he grow up?
              • Any of those locations have pictures with people?
          – Where did he go to school?
              • Any famous classmates he may be in a picture with?
          – Any major accomplishments?
              • He wrote a well-respected book on the history of medicine (lobotomies)
      • Essentially, we look for things that “link” him to places, events, objects, times …
          – This is how finding things in archives happens
      • How can machines help?
  • 5. So we annotate images/videos. But the information is saved internally in the system, generally for later search, and is not exposed externally…
  • 6. C) Semantic Web 2001
  • 7. On the Web, links are critical! (Diagram: in HTML, a web page links to any web resource with <a href="http://…">; in RDF, a URI links to a URI, and the link itself is named by a URI.) RDF is like the web!
  • 8. Links in the data. Markup shown on the slide: <mind:Person rdf:id="Hendler"> <mind:title jobs:Professor> <jobs:placeOfWork http://www.cs.rpi.edu> </mind:Person> (Diagram: in DOC1, the node Hendler is linked by mind:title to jobs:Professor and by jobs:placeOfWork to the web page http://www…)
  • 9. Asserting links in the data. Markup shown on the slide: <mind:Person rdf:id="Hendler"> owl:sameAs <http://dbpedia.org/page/James_Hendler> (Diagram: in DOC2, the node Hendler, with its mind:title and jobs:placeOfWork links, is linked by owl:sameAs to dbpedia:Hendler, which is in turn linked by dbpedia:occupation to dbpedia:ComputerScientist.)
  • 10. Led to Linked Data experimentation and growth: billions of links in the public cloud, across many sectors
  • 11. Marking up metadata in images (slide from 2002)
  • 12. Based on RDF Schema/OWL (PhotoStuff, ca. 2005-2007)
  • 13. And instances
  • 14. NASA image markup (SemSpace, 2006). Also used by other government agencies in DoD
  • 15. Extended to video markup (segments), 2008. Examples: a particular scene from a movie…; the story that ran on NHK television from 0847-0903 on 2001-09-11 (GMT + 9)
  • 16. Extended to video annotation (2008)
  • 17. Various experiments in museums (Lora Aroyo, 2011)
  • 18. BBC Ontologies; many demos; 2012 Olympics
  • 19. Commercial takeoff really started ca. 2012
  • 20. Google, 2012: The Knowledge Graph
  • 21. Facebook, 2012: The Open Graph Protocol
  • 22. Impressive results
      • Google finds embedded metadata on >30% of its crawl (Guha, 2015)
      • Google “knowledge vault” reported to have over 5 billion “facts” (links)
  • 23. But the knowledge graph isn’t all automated (P. Norvig, WWW 2016, 4/16)
  • 24. © Peter Mika, 2014.
  • 25. © Peter Mika, 2014.
  • 26. © Peter Mika, 2014.
  • 27. What about image/video archiving?
      • Despite this growth, still mostly “experimental” in the archiving community
          – Especially image/video
      • Two main impediments
          – High cost of annotating collections with enhanced metadata
          – How does doing the annotation increase the “value” of a collection?
              • Beyond search
  • 28. Recent major breakthrough in automating computer vision: a “phase transition” in capabilities of neural networks w/machine power
  • 29. “Deep learning”: a “phase transition” in capabilities of neural networks w/machine power
  • 30. Impressive results: increasingly powerful techniques have yielded incredible results in the past few years
  • 31. Moving to Vision and Text Mix
  • 32. Context issues a problem
  • 33. And still a long way to go
  • 34. But recent “action” descriptions are doing better than question answering. A very promising direction for jumpstarting (semi)-automated annotation
  • 35. Moving from search to exploration (Mei Si, 2017): using “narrative” technology to turn our campus archive into an interactive “story”
  • 36. At human scales: Cognitive and Immersive Systems Laboratory, http://cisl.rpi.edu
  • 37. Summary
      • Semantic Web (Linked Data) has been a small but growing presence in the archiving world
          - increasing use in library and museum communities
          - increasing interest in collection management
          - increasing interest in collection sharing
      • Semantic Technologies are being deployed at scale in the larger Web world
          - still primarily for search (ad match) and social networking (ad match)
      • New AI technologies have the potential to overcome some of the key problems
          - reducing the cost of metadata generation/annotation
          - making archives “alive” and explorable
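
The linking idea from slides 7-9, and the "Finding Jack" strategy of following links to places and events, can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the talk. The triples loosely mirror the slide diagrams. The `ex:` names, the photo annotation, and the helper functions `canonical_names`, `merge`, and `linked_items` are hypothetical, and prefixed names are treated as plain strings rather than resolved URIs.

```python
from collections import defaultdict

# Triples in the spirit of slides 8 and 9 (subject, predicate, object).
DOC1 = [
    ("mind:Hendler", "mind:title", "jobs:Professor"),
    ("mind:Hendler", "jobs:placeOfWork", "http://www.cs.rpi.edu"),
]
DOC2 = [
    ("mind:Hendler", "owl:sameAs", "dbpedia:Hendler"),
    ("dbpedia:Hendler", "dbpedia:occupation", "dbpedia:ComputerScientist"),
]
# Hypothetical archive annotation (an assumption, not from the slides):
# a photo that is linked to a place by URI.
ARCHIVE = [
    ("ex:photo42", "ex:takenAt", "http://www.cs.rpi.edu"),
]


def canonical_names(triples):
    """Map every node to one representative of its owl:sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        find(s)
        find(o)
        if p == "owl:sameAs":
            parent[find(s)] = find(o)
    return {node: find(node) for node in list(parent)}


def merge(*docs):
    """Combine documents, rewriting sameAs-equivalent nodes to one name."""
    triples = [t for doc in docs for t in doc]
    canon = canonical_names(triples)
    graph = {
        (canon[s], p, canon[o])
        for s, p, o in triples
        if p != "owl:sameAs"
    }
    return graph, canon


def linked_items(graph, start, max_hops=2):
    """Follow links in either direction, breadth-first, up to max_hops."""
    neighbors = defaultdict(set)
    for s, _, o in graph:
        neighbors[s].add(o)
        neighbors[o].add(s)
    seen, frontier = {start}, {start}
    for _ in range(max_hops):
        frontier = {n for node in frontier for n in neighbors[node]} - seen
        seen |= frontier
    return seen - {start}


graph, canon = merge(DOC1, DOC2, ARCHIVE)
found = linked_items(graph, canon["mind:Hendler"], max_hops=2)
photos = sorted(n for n in found if n.startswith("ex:photo"))
print(photos)  # ['ex:photo42']
```

The photo is never annotated with the person's name. It is reached because the person and the photo both link to the same place URI, which is the kind of indirect connection the "Finding Jack" exercise relies on. The `owl:sameAs` merge is what lets facts from separate documents accumulate on a single node.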