1. Exposing Digital Content as Linked Data,
and Linking them using StoryBlink
Ben De Meester
Tom De Nies, Laurens De Vocht,
Ruben Verborgh, Erik Mannens,
and Rik Van de Walle
Ghent University – iMinds – Multimedia Lab
ben.demeester@ugent.be | @Ben__DM
NLPDBpedia2015@ISWC | October 11th 2015 | Bethlehem, PA
2. We live in a fast world
with a lot of content to sift through
http://blog.qmee.com/qmee-online-in-60-seconds/
6. What do we want?
Automatic content-based metadata
to fuel future recommendation engines
7. Content-based metadata
Get the tags…
DBpedia Spotlight
... use them to represent books’ content …
EPUB CFI, NIF, ITS, …
… and link to other books … in a good way.
TPF, EiCE
StoryBlink!
8. Get the tags
Find out what a book is about…
Semantic tags!
Using NER/NED!
Extract all semantic concepts from the book
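The slides name DBpedia Spotlight as the NER/NED tool. A minimal sketch of the tag-extraction step, assuming a Spotlight-style JSON annotation response (the `Resources`, `@URI`, and `@similarityScore` field names follow Spotlight's output format; the sample response itself is invented for illustration):

```python
import json

# Hypothetical sample of a DBpedia Spotlight /annotate JSON response.
SAMPLE_RESPONSE = json.dumps({
    "Resources": [
        {"@URI": "http://dbpedia.org/resource/Moby-Dick",
         "@surfaceForm": "Moby Dick", "@similarityScore": "0.99"},
        {"@URI": "http://dbpedia.org/resource/Whale",
         "@surfaceForm": "whale", "@similarityScore": "0.87"},
    ]
})

def extract_tags(spotlight_json, min_score=0.5):
    """Turn a Spotlight annotation response into a set of DBpedia concept URIs,
    keeping only annotations above a confidence threshold."""
    data = json.loads(spotlight_json)
    return {
        r["@URI"]
        for r in data.get("Resources", [])
        if float(r["@similarityScore"]) >= min_score
    }

print(sorted(extract_tags(SAMPLE_RESPONSE)))
```

The resulting set of concept URIs is the book's semantic-tag representation used in the later steps.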
21. Keeping all concepts…
Not all mentioned concepts are useful.
The path finding becomes really slow.
What happens if we keep the top X%?
22. [Chart: number of found paths and path-finding time (s) versus amount of considered concepts (%)]
Top 50% of found concepts gives similar paths,
but a lot faster
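The top-X% filter above can be sketched as follows. The slides do not spell out the ranking criterion, so this assumes concepts are ranked by how often they are mentioned in the book (the closing notes suggest raw frequency rather than tf-idf); the `dbr:` mentions are invented examples:

```python
from collections import Counter

def top_percent(concept_mentions, percent=50):
    """Keep only the most-mentioned concepts.

    concept_mentions: list of concept URIs, one entry per mention in the book.
    Returns the top `percent`% of distinct concepts, ranked by mention count.
    """
    counts = Counter(concept_mentions)
    ranked = [concept for concept, _ in counts.most_common()]
    keep = max(1, round(len(ranked) * percent / 100))  # always keep at least one
    return ranked[:keep]

mentions = (["dbr:Whale"] * 5 + ["dbr:Ship"] * 3 +
            ["dbr:Ocean"] * 2 + ["dbr:Harpoon"])
print(top_percent(mentions, 50))  # two of the four distinct concepts survive
```

Halving the concept set shrinks the search space for path finding, which is what yields the speed-up reported in the chart.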
23. [Chart: number of found paths and path-finding time (s) versus amount of considered concepts (%), with a time-out annotation at the high end]
Top 50% of found concepts gives similar paths,
but a lot faster
28. StoryBlink
builds a semantic representation
of the important concepts inside books,
and uses those representations
to connect books content-wise
http://uvdt.test.iminds.be/storyblink
Demo 48
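A minimal sketch of the linking step in its simplest form: books as tag sets, linked by shared concepts. (StoryBlink actually finds paths between concepts through DBpedia, via TPF and EiCE; direct overlap plus a Jaccard score is a simplification, and the book tag sets are invented examples.)

```python
def shared_concepts(book_a, book_b):
    """Concepts tagged in both books: candidate link points between the stories."""
    return book_a & book_b

def link_strength(book_a, book_b):
    """Jaccard overlap of the two tag sets, as a rough content-similarity score."""
    union = book_a | book_b
    return len(book_a & book_b) / len(union) if union else 0.0

moby = {"dbr:Whale", "dbr:Ship", "dbr:Ocean"}
island = {"dbr:Ship", "dbr:Ocean", "dbr:Treasure"}
print(shared_concepts(moby, island))
print(link_strength(moby, island))  # 2 shared / 4 total = 0.5
```

Path finding through DBpedia generalizes this: two books can be linked even without a directly shared tag, as long as their concepts are connected in the knowledge graph.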
I would like to talk about automatically reducing large bodies of text to a small set of representative tags, and how we can use those tags to find out how stories are related: in other words, how we automatically exposed digital content as linked data, and linked those semantic representations using an application we dubbed StoryBlink.
We do not rank concepts with tf-idf, because in a book about paper, the concept "paper" is important, even though tf-idf would down-weight it as a common term.
… so please come check out booth 48 to play around with StoryBlink.