Exploring Semantic Relationships in the Web of Data
Laurens De Vocht – 3.7.2017
DEPARTMENT OF ELECTRONICS AND INFORMATION SYSTEMS
IDLAB
I. Searching the web
II. Reveal relationships
III. Visually explore relationships
EXAMPLE: FIND OUT MORE ABOUT EINSTEIN
A SEARCH QUERY
How to search so many pages this fast?
SEARCHING IN PHYSICAL DOCUMENTS
Pile of documents
Organized documents
SEARCHING VIA INDEX CARDS
Indexed documents
CLASSICAL RETRIEVAL MODEL OF SEARCH ENGINES
Document → document representation (index)
Information ‘need’ → query
Search engine: ‘match’ between query and document representation
[Bates, 1989; Robertson, 1977]
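The ‘match’ step of the classical retrieval model can be illustrated with a tiny inverted index. This is a minimal sketch, not how any particular search engine works: the documents, the function names `build_index` and `match`, and the rule that every query term must occur are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def match(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical toy documents, used only for this illustration.
documents = {
    "d1": "Albert Einstein developed special relativity",
    "d2": "Isaac Newton formulated the laws of motion",
    "d3": "Einstein and Newton both studied physics",
}
index = build_index(documents)
print(match(index, "einstein newton"))
```

The index plays the role of the index cards: a query is answered by looking up terms instead of reading every document.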
SEARCHING THE WEB
Impressive state of the art
Millions of results, almost always with relevant results among the first 10
Incredibly fast: < 1 s
Billions of documents, spread across the globe, within a few
>50 billion estimated in index of largest search engines.
[van den Bosch, Bogers & Kunder, 2016]
LIMITATIONS OF CURRENT WEB SEARCH ENGINES
A. Further explore search results?
Exploratory search
B. What if the keywords intended something else?
Semantics
C. Combine different search results?
Find relationships
SEARCHING THE WEB: NEXT STEPS
A. Exploratory search
B. Semantics
C. Find relationships
BERRYPICKING
[Bates, 1989]
DIFFERENT TYPES OF SEARCH ACTIVITIES
Lookup: classic information retrieval model
Learn, Investigate: exploratory search
[Marchionini, 2006]
SEARCHING THE WEB: NEXT STEPS
A. Exploratory search
B. Semantics
C. Find relationships
SEMANTICS BRIEFLY EXPLAINED
I can eat an apple.
But I can’t eat Apple.
SEMANTICS IN WEB DOCUMENTS
E = mc²
General relativity
Theory of special relativity
Albert Einstein
Twin paradox
ANALOGY: HOW MACHINES FIND THINGS IN WEB DOCUMENTS
Unique identification on the web:
Uniform Resource Identifier: URI
Unique identification in a printed atlas: grid reference 7 L
URIS FOR DATA ON THE WEB
E = mc²
http://dbpedia.org/resource/Albert_Einstein
http://dbpedia.org/resource/Special_relativity
http://dbpedia.org/resource/Twin_Paradox
http://dbpedia.org/resource/General_relativity
EVERY CONCEPT IS ASSIGNED PROPERTIES/ATTRIBUTES
E = mc²
http://dbpedia.org/resource/Special_relativity
http://dbpedia.org/page/Special_relativity
(…)
dbr:Special_relativity dbc:subject dbc:Concepts_in_physics .
(…)
dbr:Albert_Einstein dbp:knownFor dbr:Special_relativity .
(…)
TRIPLE
dbr:Albert_Einstein dbp:knownFor dbr:Special_relativity .
subject predicate object
Resource Description Framework (RDF)
Namespace vocabularies
dbr: http://dbpedia.org/resource/
dbp: http://dbpedia.org/property/
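To make the triple structure concrete, the prefixed names on this slide can be expanded into full URIs with a few lines of code. This is a minimal sketch that handles only the simple `subject predicate object .` form shown here; the helper names `expand` and `parse_triple` are assumptions for illustration, and a real application would use an RDF library instead.

```python
# Namespace prefixes taken from the slide.
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbp": "http://dbpedia.org/property/",
}

def expand(prefixed_name):
    """Expand a prefixed name such as 'dbr:Albert_Einstein' into a full URI."""
    prefix, local_name = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local_name

def parse_triple(line):
    """Split a 'subject predicate object .' line into a tuple of three URIs."""
    parts = line.strip().rstrip(".").split()
    if len(parts) != 3:
        raise ValueError("expected exactly subject, predicate and object")
    return tuple(expand(part) for part in parts)

triple = parse_triple("dbr:Albert_Einstein dbp:knownFor dbr:Special_relativity .")
subject, predicate, obj = triple
print(subject)
print(predicate)
print(obj)
```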
SEARCHING THE WEB: NEXT STEPS
A. Exploratory search
B. Semantics
C. Find relationships
EXAMPLE: CONNECTION BETWEEN EINSTEIN AND NEWTON?
FIND RELATIONSHIPS
Non-common properties? More distant relationships?
Not all things are related to each other and described within a single document.
PURPOSE OF THIS PHD RESEARCH
1. Efficiently reveal relationships between data.
2. Allow users to gradually refine their search queries.
3. Map the influence of different search actions on the search precision.
4. Determine the contribution of revealed relationships while searching.
These goals are addressed in part II and part III.
I. Searching the web
II. Reveal relationships
III. Visually explore relationships
REVEALING RELATIONSHIPS: SERENDIPITY
EXAMPLE: REVEALING RELATIONSHIPS
Resources: Einstein, Newton, Physics, Hume
Properties: :influences, :discipline, :birthplace, :residence (…)
MANY, MANY POSSIBILITIES, EVEN FOR ‘SIMPLE’ RELATIONSHIPS
ITERATIVE ALGORITHM TO FIND RELATIONSHIPS
Data source: RDF index
1. Initialisation (first iteration) or expand search space (later iterations)
2. Filtering
3. Find relationships
4. Score relationships
5. Next iteration? (…)
Result: relationships with different path lengths
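The loop on these slides can be sketched as an iterative deepening search over triples: each iteration expands the search space by allowing one more link, finds the paths between the two resources, and scores them. This is a simplified illustration, not the algorithm evaluated in the presentation: the filtering step is omitted, paths are scored by length only, and the toy triples and function names are assumptions.

```python
from collections import defaultdict

# Hypothetical toy triples, loosely based on the Einstein/Newton example.
TRIPLES = [
    ("Einstein", "discipline", "Physics"),
    ("Newton", "discipline", "Physics"),
    ("Hume", "influences", "Einstein"),
    ("Newton", "influences", "Hume"),
]

def build_neighbours(triples):
    """Treat every triple as an undirected link labelled with its predicate."""
    neighbours = defaultdict(list)
    for subject, predicate, obj in triples:
        neighbours[subject].append((predicate, obj))
        neighbours[obj].append((predicate, subject))
    return neighbours

def paths_up_to(neighbours, start, goal, max_links):
    """Yield simple paths [node, predicate, node, ...] with at most max_links links."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            yield path
            continue
        if (len(path) - 1) // 2 >= max_links:
            continue
        for predicate, nxt in neighbours[node]:
            if nxt not in path[::2]:  # path[::2] holds the nodes visited so far
                stack.append(path + [predicate, nxt])

def find_relationships(triples, start, goal, max_path_length=4):
    """Expand the search space one link at a time until some path is found."""
    neighbours = build_neighbours(triples)
    for max_links in range(1, max_path_length + 1):    # next iteration?
        found = list(paths_up_to(neighbours, start, goal, max_links))
        if found:
            return sorted(found, key=len)              # score: shorter paths first
    return []

paths = find_relationships(TRIPLES, "Einstein", "Newton")
for path in paths:
    print(" ".join(path))
```

Stopping at the first iteration that yields a path keeps the search space small, which matters because the number of resources to check grows quickly with path length.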
INCREASING COMPLEXITY
Number of data elements (resources) to check with increasing path length
ARE THE RELATIONSHIPS RANDOM, ARBITRARY?
A PRIORI ESTIMATION
Heuristics: “the art of finding”
Examples:
• Jaccard distance: difference in semantic relationships
• Normalized (DBpedia) distance: based on common references
• Confidence: possibility that a resource does not occur if another already does
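As an illustration of the first heuristic, the Jaccard distance between two resources can be computed from the sets of relationships that describe them: one minus the size of the intersection divided by the size of the union. The property sets below are hypothetical toy data, and how the presentation's implementation collects these sets is not stated on the slide.

```python
def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Hypothetical (predicate, object) pairs describing each resource.
einstein = {
    ("discipline", "Physics"),
    ("birthPlace", "Ulm"),
    ("knownFor", "Special_relativity"),
}
newton = {
    ("discipline", "Physics"),
    ("birthPlace", "Woolsthorpe"),
    ("knownFor", "Calculus"),
}

distance = jaccard_distance(einstein, newton)
print(round(distance, 2))
```

A distance near 0 means the two resources share most of their relationships; a distance near 1 means they share almost none.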
A PRIORI ESTIMATION
Weights: assign a value to a relationship
Examples:
• Jaccard distance: difference in properties
• Combined node degree: rare things
• Jiang & Conrath: relations on the same level of abstraction
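The effect of a degree-based weight can be shown with a minimal-cost path search. In this sketch the cost of a link is assumed to be the logarithm of the product of the degrees of its two endpoints, so links through heavily connected hubs are expensive and rare resources are preferred. That weight formula, the toy graph and the name `cheapest_path` are assumptions for illustration; the slide does not give the exact definition.

```python
import heapq
import math
from collections import defaultdict

# Hypothetical toy graph: Physics is a hub, Hume is a rarer connection.
EDGES = [
    ("Einstein", "Physics"), ("Newton", "Physics"), ("Curie", "Physics"),
    ("Bohr", "Physics"), ("Feynman", "Physics"),
    ("Einstein", "Hume"), ("Hume", "Newton"),
]

def cheapest_path(edges, start, goal):
    """Dijkstra's algorithm with link cost log(degree(u) * degree(v))."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    degree = {node: len(linked) for node, linked in adjacency.items()}

    queue = [(0.0, [start])]
    settled = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt in adjacency[node]:
            if nxt not in settled:
                step = math.log(degree[node] * degree[nxt])  # assumed weight
                heapq.heappush(queue, (cost + step, path + [nxt]))
    return math.inf, []

cost, path = cheapest_path(EDGES, "Einstein", "Newton")
print(path, round(cost, 3))
```

Both candidate paths have two links, but the one through the hub costs more, so the search returns the path through the rarer resource.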
A POSTERIORI SCORING
Semantic ranking
The score includes all relationships and all resources along the entire path
EVALUATION: TRIVIA ABOUT (KNOWN) SCIENTISTS
A priori estimates evaluated according to some semantic ranking mechanisms and a user study.
Different relationships combined in a short ‘story’ about combinations of pairs:
Carl Linnaeus
Charles Darwin
Albert Einstein
Isaac Newton
Dataset
PATH SCORE: SEMANTIC RANKING
Focus on semantic commonalities
Focus on semantic differences
Mixed
USER STUDY RESULTS
% preference relative to the median in pairwise A/B ratings.
EVALUATION: CONFERENCES & DIGITAL LIBRARIES IN COMPUTER SCIENCE
Check the precision of search results during the search.
Comparison between:
own method (minimal cost paths with optimal estimates)
the de facto baseline for many semantic applications, ‘Virtuoso’ (shortest paths)
Datasets
SEARCH PRECISION
[Figure: average precision, own method vs. Virtuoso (baseline)]
Baseline: more stable and on average similar
Own method: notably high scores for Q1, Q4 and Q7
I. Searching the web
II. Reveal relationships
III. Visually explore relationships
WHEN EXPLORATORY SEARCH?
When users
(i) do not know exactly how to formulate the most suitable search query;
(ii) would rather browse or surf information than look up something specific.
FROM SEARCHING IN DOCUMENTS TO SEARCHING IN DATA
Search engine
Search results
Query
(…)
?
EXPLORATORY SEARCH IN DATA
[Tvazorek et al., 2010] [Smith et al., 2005]
By interacting with the underlying data structure:
Network based
Tabular or faceted
PROPOSED DATA INTERACTION FLOW
EXAMPLE
EVALUATION: SCENARIO
Hypothesis
Revealing relationships among indirectly related computer science publications, conferences and researchers facilitates adding new relevant results to already found results.
Testing
A. Added value of revealing relationships among search results
B. Effectiveness and productivity of different search actions
Datasets
A. ADDED VALUE OF REVEALING RELATIONSHIPS AMONG SEARCH RESULTS
Effect tested with a simple and a complex query.
Simple
“Find a publication. Find a number of publications that have common co-authors with the found publications.”
Complex
“Find multiple persons that had a publication in two consecutive years in the same conference series.”
Search details to be filled in by the users.
The users were not aware whether the pathfinding functionality was activated or not.
A. ADDED VALUE OF REVEALING RELATIONSHIPS AMONG SEARCH RESULTS
[Figure: negative (%) vs. positive (%) for the simple query and the complex query]
B. EFFECTIVENESS AND PRODUCTIVITY OF DIFFERENT SEARCH ACTIONS
Tested actions:
1. Keyword-based search query
2. Add a top related resource
3. Expand neighbours
4. Expand neighbour of neighbour
5. Expand further related resource
Example: search query ‘Einstein’; top related: Special Relativity, General Relativity, Twin Paradox
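The two neighbour actions can be pictured as a bounded expansion over the data graph: one hop for ‘expand neighbours’, two hops for ‘expand neighbour of neighbour’. This is a minimal sketch with a hypothetical adjacency list built from the resources in the example; the function name `expand` is an assumption and does not describe the evaluated interface.

```python
# Hypothetical links between the resources named in the example.
GRAPH = {
    "Einstein": ["Special Relativity", "General Relativity"],
    "Special Relativity": ["Einstein", "Twin Paradox"],
    "General Relativity": ["Einstein"],
    "Twin Paradox": ["Special Relativity"],
}

def expand(graph, shown, hops=1):
    """Return the shown resources plus everything within `hops` links of them."""
    result = set(shown)
    frontier = set(shown)
    for _ in range(hops):
        # Collect resources linked to the current frontier that are not shown yet.
        frontier = {n for node in frontier for n in graph.get(node, ())} - result
        result |= frontier
    return result

neighbours = expand(GRAPH, {"Einstein"}, hops=1)
neighbours_of_neighbours = expand(GRAPH, {"Einstein"}, hops=2)
print(sorted(neighbours))
print(sorted(neighbours_of_neighbours))
```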
EFFECTIVENESS OF A SEARCH ACTION
[Figure: ‘all’ data, shown data, and relevant shown data as nested sets]
Effectiveness here equals precision:
E = (amount of relevant shown data) / (amount of shown data)
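The ratio E can be computed directly from two sets: what the interface shows after a search action and what is judged relevant. A minimal sketch with made-up resource identifiers; the function name and the choice to return 0 when nothing is shown are assumptions.

```python
def effectiveness(shown, relevant):
    """Precision: the share of shown resources that are also relevant."""
    shown = set(shown)
    if not shown:
        return 0.0
    return len(shown & set(relevant)) / len(shown)

# Hypothetical identifiers: four resources shown, two of them relevant.
shown = {"r1", "r2", "r3", "r4"}
relevant = {"r1", "r2", "r9"}
print(effectiveness(shown, relevant))
```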
PRODUCTIVITY OF CONSECUTIVE SEARCH ACTIONS
[Figure: consecutive search actions 0, 1, 2, …, k, starting from effectiveness E₀]
P = average increase of effectiveness after k search actions, measured from the second search action on (I)
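One possible reading of this definition is the mean of the differences between consecutive E values, starting with the difference between the first and the second search action. The sketch below implements that reading only; the slide does not spell out the formula, so both the interpretation and the sample values are assumptions.

```python
def productivity(values):
    """Average change in E between consecutive search actions 0..k.

    `values` is the list [E_0, E_1, ..., E_k]. Under the assumed reading
    this equals (E_k - E_0) / k.
    """
    if len(values) < 2:
        return 0.0
    increases = [later - earlier for earlier, later in zip(values, values[1:])]
    return sum(increases) / len(increases)

# Hypothetical E values after three consecutive search actions.
example = [0.2, 0.3, 0.5]
print(round(productivity(example), 2))
```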
B. EFFECTIVENESS AND PRODUCTIVITY OF DIFFERENT SEARCH ACTIONS
[Figure: effectiveness (%) and productivity (%) per action: lookup, add top related, neighbour expand, neighbour of neighbour expand, expand further related resource]
Conclusions
EXPLORING SEMANTIC RELATIONSHIPS ON THE WEB
Compared searching the web vs. searching physical documents; impressive state of the art.
From searching to exploring via ‘berrypicking’, more possibilities than pure ‘lookup’.
Semantics:
the meaning of resources, aside from their expression, description or representation;
documents describe resources and consist of data;
‘linked’ data has a threefold structure, ‘triples’, to express semantics.
Exploring relationships between resources is not trivial for non-common properties.
MOST IMPORTANT TAKEAWAYS
• Alternative to searching in different data sources, each time using another search interface:
→ exploratory search via semantic relationships between data
• The choice of heuristics and weights contributes to and influences the serendipity among results.
• Focus on revealing semantic relationships
→ supporting visual exploratory search in data on the web
• The techniques are mainly tested with data on:
→ encyclopedic facts from Wikipedia (DBpedia)
→ academic digital libraries (DBLP) and conferences (COLINDA)
• Proposed techniques remain close to the structure of the linked data (RDF),
→ methods applicable in other domains that have linked data.
PhD Presentation: Exploring Semantic Relationships in the Web of Data

Más contenido relacionado

La actualidad más candente

Applyng wtd pr cttn nets
Applyng wtd pr cttn netsApplyng wtd pr cttn nets
Applyng wtd pr cttn nets
ivan weinel
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and more
Wael Elrifai
 
Done reread deeperinsidepagerank
Done reread deeperinsidepagerankDone reread deeperinsidepagerank
Done reread deeperinsidepagerank
James Arnold
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
Toronto Metropolitan University
 

La actualidad más candente (19)

NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
 
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
 
The Basics of Social Network Analysis
The Basics of Social Network AnalysisThe Basics of Social Network Analysis
The Basics of Social Network Analysis
 
Ppt
PptPpt
Ppt
 
Applyng wtd pr cttn nets
Applyng wtd pr cttn netsApplyng wtd pr cttn nets
Applyng wtd pr cttn nets
 
Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview. Social Network Analysis Introduction including Data Structure Graph overview.
Social Network Analysis Introduction including Data Structure Graph overview.
 
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
IEEE 2014 JAVA DATA MINING PROJECTS Discovering emerging topics in social str...
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and more
 
Predicting tie strength with ego network structures
Predicting tie strength with ego network structuresPredicting tie strength with ego network structures
Predicting tie strength with ego network structures
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis Workshop
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...
 
Conducting Twitter Reserch
Conducting Twitter ReserchConducting Twitter Reserch
Conducting Twitter Reserch
 
10 Network Experiments
10 Network Experiments10 Network Experiments
10 Network Experiments
 
Done reread deeperinsidepagerank
Done reread deeperinsidepagerankDone reread deeperinsidepagerank
Done reread deeperinsidepagerank
 
10 More than a Pretty Picture: Visual Thinking in Network Studies
10 More than a Pretty Picture: Visual Thinking in Network Studies10 More than a Pretty Picture: Visual Thinking in Network Studies
10 More than a Pretty Picture: Visual Thinking in Network Studies
 
Methods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataMethods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of Data
 
Social Network Analysis (Part 1)
Social Network Analysis (Part 1)Social Network Analysis (Part 1)
Social Network Analysis (Part 1)
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 

Similar a PhD Presentation: Exploring Semantic Relationships in the Web of Data

Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
Bradley Allen
 
Searching for patterns in crowdsourced information
Searching for patterns in crowdsourced informationSearching for patterns in crowdsourced information
Searching for patterns in crowdsourced information
Silvia Puglisi
 
A Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured DataA Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured Data
Andre Freitas
 
Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202
Jungah Park
 

Similar a PhD Presentation: Exploring Semantic Relationships in the Web of Data (20)

WISE2019 presentation
WISE2019 presentationWISE2019 presentation
WISE2019 presentation
 
Sub1579
Sub1579Sub1579
Sub1579
 
Entity linking with a knowledge base issues techniques and solutions
Entity linking with a knowledge base issues techniques and solutionsEntity linking with a knowledge base issues techniques and solutions
Entity linking with a knowledge base issues techniques and solutions
 
Mapping big data science
Mapping big data scienceMapping big data science
Mapping big data science
 
Investigating the effects of popularity data on predictive relevance judgment...
Investigating the effects of popularity data on predictive relevance judgment...Investigating the effects of popularity data on predictive relevance judgment...
Investigating the effects of popularity data on predictive relevance judgment...
 
Data Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data AnalysisData Visualization in Exploratory Data Analysis
Data Visualization in Exploratory Data Analysis
 
An empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systemsAn empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systems
 
02 Network Data Collection
02 Network Data Collection02 Network Data Collection
02 Network Data Collection
 
02 Network Data Collection (2016)
02 Network Data Collection (2016)02 Network Data Collection (2016)
02 Network Data Collection (2016)
 
Visualizing and Making Sense of Information
Visualizing and Making Sense of InformationVisualizing and Making Sense of Information
Visualizing and Making Sense of Information
 
Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
Relational Navigation: A Taxonomy-Based Approach to Information Access and Di...
 
Searching for patterns in crowdsourced information
Searching for patterns in crowdsourced informationSearching for patterns in crowdsourced information
Searching for patterns in crowdsourced information
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
A Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured DataA Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured Data
 
DREaM Event 2: Louise Cooke
DREaM Event 2: Louise CookeDREaM Event 2: Louise Cooke
DREaM Event 2: Louise Cooke
 
Learning from Complex Online Behavior with Andy Edmonds - Big Brains
Learning from Complex Online Behavior with Andy Edmonds - Big BrainsLearning from Complex Online Behavior with Andy Edmonds - Big Brains
Learning from Complex Online Behavior with Andy Edmonds - Big Brains
 
Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202Interactive informationretrieval 토인모_201202
Interactive informationretrieval 토인모_201202
 
G5234552
G5234552G5234552
G5234552
 
Relevance Clues: Developing an experimental research design to investigate a ...
Relevance Clues: Developing an experimental research design to investigate a ...Relevance Clues: Developing an experimental research design to investigate a ...
Relevance Clues: Developing an experimental research design to investigate a ...
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and Summarization
 

Más de Laurens De Vocht

OSLO: Open Standards for Linked Organizations
OSLO: Open Standards for Linked OrganizationsOSLO: Open Standards for Linked Organizations
OSLO: Open Standards for Linked Organizations
Laurens De Vocht
 
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
Laurens De Vocht
 
Researcher Profiling based on Semantic Analysis in Social Networks
Researcher Profiling based on Semantic Analysis in Social NetworksResearcher Profiling based on Semantic Analysis in Social Networks
Researcher Profiling based on Semantic Analysis in Social Networks
Laurens De Vocht
 

Más de Laurens De Vocht (12)

OSLO: Open Standards for Linked Organizations
OSLO: Open Standards for Linked OrganizationsOSLO: Open Standards for Linked Organizations
OSLO: Open Standards for Linked Organizations
 
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked DataEffect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data
 
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity HardwareBig Linked Data ETL Benchmark on Cloud Commodity Hardware
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
 
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...
 
Providing Interchangeable Open Data to Accelerate Development of Sustainable ...
Providing Interchangeable Open Data to Accelerate Development of Sustainable ...Providing Interchangeable Open Data to Accelerate Development of Sustainable ...
Providing Interchangeable Open Data to Accelerate Development of Sustainable ...
 
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
 
The DataTank, RML and Domain Modelling
The DataTank, RML and Domain ModellingThe DataTank, RML and Domain Modelling
The DataTank, RML and Domain Modelling
 
A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open ...
A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open ...A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open ...
A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open ...
 
Aligning Web Collaboration Tools with Research Data for Scholars
Aligning Web Collaboration Tools with Research Data for ScholarsAligning Web Collaboration Tools with Research Data for Scholars
Aligning Web Collaboration Tools with Research Data for Scholars
 
Discovering Meaningful Connections between Resources in the Web of Data
Discovering Meaningful Connections between Resources in the Web of DataDiscovering Meaningful Connections between Resources in the Web of Data
Discovering Meaningful Connections between Resources in the Web of Data
 
A Framework Concept for Profiling Researchers on Twitter using the Web of Data
A Framework Concept for Profiling Researchers on Twitter using the Web of DataA Framework Concept for Profiling Researchers on Twitter using the Web of Data
A Framework Concept for Profiling Researchers on Twitter using the Web of Data
 
Researcher Profiling based on Semantic Analysis in Social Networks
Researcher Profiling based on Semantic Analysis in Social NetworksResearcher Profiling based on Semantic Analysis in Social Networks
Researcher Profiling based on Semantic Analysis in Social Networks
 

Último

audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkkaudience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
lolsDocherty
 
Article writing on excessive use of internet.pptx
Article writing on excessive use of internet.pptxArticle writing on excessive use of internet.pptx
Article writing on excessive use of internet.pptx
abhinandnam9997
 
Production 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptxProduction 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptx
ChloeMeadows1
 

Último (16)

Thank You Luv I’ll Never Walk Alone Again T shirts
Thank You Luv I’ll Never Walk Alone Again T shirtsThank You Luv I’ll Never Walk Alone Again T shirts
Thank You Luv I’ll Never Walk Alone Again T shirts
 
Statistical Analysis of DNS Latencies.pdf
Statistical Analysis of DNS Latencies.pdfStatistical Analysis of DNS Latencies.pdf
Statistical Analysis of DNS Latencies.pdf
 
The Use of AI in Indonesia Election 2024: A Case Study
The Use of AI in Indonesia Election 2024: A Case StudyThe Use of AI in Indonesia Election 2024: A Case Study
The Use of AI in Indonesia Election 2024: A Case Study
 
Cyber Security Services Unveiled: Strategies to Secure Your Digital Presence
Cyber Security Services Unveiled: Strategies to Secure Your Digital PresenceCyber Security Services Unveiled: Strategies to Secure Your Digital Presence
Cyber Security Services Unveiled: Strategies to Secure Your Digital Presence
 
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkkaudience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
 
Reggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirts
 
Article writing on excessive use of internet.pptx
Article writing on excessive use of internet.pptxArticle writing on excessive use of internet.pptx
Article writing on excessive use of internet.pptx
 
Bug Bounty Blueprint : A Beginner's Guide
Bug Bounty Blueprint : A Beginner's GuideBug Bounty Blueprint : A Beginner's Guide
Bug Bounty Blueprint : A Beginner's Guide
 
Pvtaan Social media marketing proposal.pdf
Pvtaan Social media marketing proposal.pdfPvtaan Social media marketing proposal.pdf
Pvtaan Social media marketing proposal.pdf
 
Premier Mobile App Development Agency in USA.pdf
Premier Mobile App Development Agency in USA.pdfPremier Mobile App Development Agency in USA.pdf
Premier Mobile App Development Agency in USA.pdf
 
Development Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of appsDevelopment Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of apps
 
Topology of the Network class 8 .ppt pdf
Topology of the Network class 8 .ppt pdfTopology of the Network class 8 .ppt pdf
Topology of the Network class 8 .ppt pdf
 
How Do I Begin the Linksys Velop Setup Process?
How Do I Begin the Linksys Velop Setup Process?How Do I Begin the Linksys Velop Setup Process?
How Do I Begin the Linksys Velop Setup Process?
 
Production 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptxProduction 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptx
 
Case study on merger of Vodafone and Idea (VI).pptx
Case study on merger of Vodafone and Idea (VI).pptxCase study on merger of Vodafone and Idea (VI).pptx
Case study on merger of Vodafone and Idea (VI).pptx
 
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWebiThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
 

PhD Presentation: Exploring Semantic Relationships in the Web of Data

  • 1.
  • 2. Exploring Semantic Relationships in the Web of Data Laurens De Vocht – 3.7.2017 DEPARTMENT OF ELECTRONICS AND INFORMATION SYSTEMS IDLAB
  • 3. I. Searching the web II. Reveal releationships III. Visually explore relationships 3
  • 4. I. Searching the web II. Reveal releationships III. Visually explore relationships 4
  • 5. 5 EXAMPLE: FIND OUT MORE ABOUT EINSTEIN
  • 7. How to search so many pages this fast?
  • 10.
  • 13.
  • 15. CLASSICAL RETRIEVAL MODEL OF SEARCH ENGINES Document Document- representation Query Information ‘need’ [Bates, 1989; Robertson, 1977] Search engine ‘match’ Index
  • 16. SEARCHING THE WEB Impressive state of the art Millions of results, almost always relevant results among the first 10 Increidbly fast < 1s Billions of documents, spread across the globe, within a few >50 billion estimated in index of largest search engines. [van den Bosch, Bogers & Kunder, 2016]
  • 17. LIMITATIONS OF CURRENT WEB SEARCH ENGINES A. Further explore search results? Exploratory search B. What if the keywords intended something else? Semantics C. Combine different search results? Find relationships 17
  • 18. SEARCHING THE WEB: NEXT STEPS A. Exploratory search B. Semantics C. Find relationships 18
  • 21. Lookup Learn Investigate Exploratory search DIFFERENT TYPES OF SEARCH ACTIVITIES 21 Classic information retrieval model [Marchionini, 2006]
  • 22. SEARCHING THE WEB: NEXT STEPS A. Exploratory search B. Semantics C. Find relationships 22
  • 23. SEMANTIEK KORT UITGELEGD 23 I can eat an apple. But I can’t eat Apple.
  • 24. SEMANTICS IN WEB DOCUMENTS 24 𝒆 = 𝒎𝒄 𝟐 General relativity Theory of special relativity Albert Einstein Twin paradox
  • 25. ANALOGY ON HOW MACHINES FIND THINGS IN WEB DOCUMENTS 25 Unieke identification on the web: Uniform Resource Identifier: URI Unique identification in a printed atlas 7 L
  • 26. URI’S VOOR DATA OP HET WEB 𝒆 = 𝒎𝒄 𝟐 http://dbpedia.org/resource/Albert_Einstein http://dbpedia.org/resource/Special_relativity http://dbpedia.org/resource/Twin_Paradoxhttp://dbpedia.org/resource/General_relativity 26
  • 27. EVERY CONCEPT IS ASSIGNED PROPERTIES/ATTRIBUTES 27 𝒆 = 𝒎𝒄 𝟐 http://dbpedia.org/resource/Special_relativity http://dbpedia.org/page/Special_relativity 𝒆 = 𝒎𝒄 𝟐 (…) dbr:Special_relativity dbc:subject dbc:Concepts_in_physics . (…) dbr:Albert_Einstein dbp:knownFor dbr:Special_relativity . (…)
  • 28. TRIPLE 28 dbr:Albert_Einstein dbp:knownFor dbr:Special_relativity . subject predicate object Resource Description Framework RDF Namespace vocabularia dbr: http://dbpedia.org/resource/ dbp: http://dbpedia.org/property/
  • 29. SEARCHING THE WEB: NEXT STEPS A. Exploratory search B. Semantics C. Find relationships 29
  • 30. ? EXAMPLE: CONNECTION BETWEEN EINSTEIN AND NEWTON
  • 31.
  • 32.
  • 33. FIND RELATIONSHIPS Non common properties? More distant relationships? Not all things are being related to each other and described within a single document.
  • 34. 1 2 Efficient revealing relationships between data. Allow users to gradually refine their search queries. Map the influence of different search actions on the search precision. Determine the contribution of revealed relationships while searching. 34 3 PURPOSE OF THIS PHD RESEARCH 4 part II part III
  • 35. I. Searching the web II. Reveal releationships III. Visually explore relationships 35
  • 37. 37 EXAMPLE: REVEALING RELATIONSHIPS Einstein Newton Physics Hume :influences :discipline :birthplace :residenc e :discipline :influences (…)
  • 38. 38 MANY, MANY POSSIBILITIES, EVEN FOR ‘SIMPLE’ RELATIONSHIPS
  • 39. ITERATIVE ALGORITHM TO FIND RELATIONSHIPS 39 initialisation filtering find relationships score relationships next iteration? (…) index RDF relationships with different path lengths
  • 40. ITERATIVE ALGORITHM TO FIND RELATIONSHIPS 40 expand search space filtering find relationships score relationships next iteration? (…) index RDF relationships with different path lengths
  • 41. INCREASING COMPLEXITY Number of data elements (resources) to check with increasing path length 41
  • 42. ARE THE RELATIONSHIPS RANDOM, ARBITRARY?
  • 43. A PRIORI ESTIMATION Heuristics: “the art of finding”. Examples: - Jaccard distance: difference in semantic relationships - Normalized (DBpedia) distance: based on common references - Confidence: probability that a resource does not occur if another already does
  • 44. A PRIORI ESTIMATION Weights: assign a value to a relationship. Examples: - Jaccard distance: difference in properties - Combined node degree: rare things - Jiang & Conrath: relations on the same level of abstraction
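As an illustration of the Jaccard distance named on both slides, here is a minimal sketch that compares two resources by the sets of properties attached to them. The property sets are made up for the example and are not DBpedia data; how the thesis feeds such a distance into the path search is not shown on the slides.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical sets, 1.0 means disjoint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 1.0 - len(a & b) / len(a | b)


# Illustrative property sets (assumed, not real data).
einstein = {"influences", "discipline", "birthplace"}
newton = {"influences", "discipline", "residence"}

distance = jaccard_distance(einstein, newton)  # 2 shared of 4 distinct
```

A low distance suggests two resources are described in similar ways, which makes it usable both as a heuristic estimate and as an edge weight.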
  • 45. A POSTERIORI SCORING Semantic ranking: the score includes all relationships and all resources along the entire path.
  • 46. EVALUATION: TRIVIA ABOUT (KNOWN) SCIENTISTS A priori estimates evaluated according to some semantic ranking mechanisms and a user study. Different relationships combined in a short ‘story’ about combinations of pairs of: Carl Linnaeus, Charles Darwin, Albert Einstein, Isaac Newton. Dataset: [logo]
  • 47. PATH SCORE: SEMANTIC RANKING [Panels: focus on semantic commonalities; mixed; focus on semantic differences]
  • 48. USER STUDY RESULTS [Chart: % preference relative to the median in pairwise A/B assessments]
  • 49. EVALUATION: CONFERENCES & DIGITAL LIBRARIES IN COMPUTER SCIENCE Check the precision of search results during the search. Comparison between: own method (minimal-cost paths with optimal estimates) and the de facto baseline for many semantic applications, ‘Virtuoso’ (shortest paths). Datasets: [logos]
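The difference between a shortest path (fewest hops) and a minimal-cost path (lowest total weight) can be sketched on a toy graph. Everything here is an assumption for illustration: the graph, the weights, and the use of plain breadth-first search and Dijkstra's algorithm. It does not reproduce Virtuoso's implementation or the estimates used in the thesis.

```python
import heapq
from collections import deque

# Hypothetical weighted graph: node -> {neighbour: cost}.
# The high cost on 'Hub' mimics a very common, uninformative resource.
EDGES = {
    "A": {"Hub": 5.0, "X": 1.0},
    "Hub": {"A": 5.0, "B": 5.0},
    "X": {"A": 1.0, "Y": 1.0},
    "Y": {"X": 1.0, "B": 1.0},
    "B": {"Hub": 5.0, "Y": 1.0},
}


def shortest_path(start, goal):
    """Fewest hops, ignoring weights (breadth-first search)."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def minimal_cost_path(start, goal):
    """Lowest total weight (Dijkstra's algorithm)."""
    heap, done = [(0.0, [start])], set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == goal:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nxt, weight in EDGES[node].items():
            if nxt not in done:
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return None


hops = shortest_path("A", "B")
cost, cheapest = minimal_cost_path("A", "B")
```

In this toy graph the two strategies disagree: the hop-count search goes through the hub, while the weighted search prefers a longer route over cheaper edges.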
  • 50. SEARCH PRECISION [Chart: average precision per query, own method vs. Virtuoso (baseline)] Baseline: more stable and on average similar. Own method: notably high scores for Q1, Q4 and Q7.
  • 51. I. Searching the web II. Reveal relationships III. Visually explore relationships
  • 52. WHEN EXPLORATORY SEARCH? When users (i) do not know exactly how to formulate the most suitable search query; (ii) would rather browse or surf information than look up something specific.
  • 53. FROM SEARCHING IN DOCUMENTS TO SEARCHING IN DATA [Diagram: query → search engine → search results (…)]
  • 54. FROM SEARCHING IN DOCUMENTS TO SEARCHING IN DATA [Diagram: query → search engine → search results (…) → ?]
  • 55. EXPLORATORY SEARCH IN DATA Via interacting with the underlying data structure: network-based; tabular or faceted. [Tvazorek et al., 2010] [Smith et al., 2005]
  • 58. EVALUATION: SCENARIO Hypothesis: revealing relationships among indirectly related computer science publications, conferences and researchers facilitates adding new relevant results to already found results. Testing: A. Added value of revealing relationships among search results B. Effectiveness and productivity of different search actions. Datasets: [logos]
  • 59. A. ADDED VALUE OF REVEALING RELATIONSHIPS AMONG SEARCH RESULTS Effect with a simple and a complex query. Simple: “Find a publication. Find a number of publications that have common co-authors with the found publication.” Complex: “Find multiple persons that had a publication in two consecutive years in the same conference series.” Search details to be filled in by the users. The users were not aware whether the pathfinding functionality was activated or not.
  • 60. A. ADDED VALUE OF REVEALING RELATIONSHIPS AMONG SEARCH RESULTS [Bar chart: Negative (%) and Positive (%) for the simple query and the complex query; axis 0 to 70]
  • 61. B. EFFECTIVENESS AND PRODUCTIVITY OF DIFFERENT SEARCH ACTIONS Tested actions: 1. Keyword-based search query 2. Add a top related resource 3. Expand neighbours 4. Expand neighbour of neighbour 5. Expand further related resource [Example diagram: search query ‘Einstein’; top related: Special Relativity, General Relativity, Twin Paradox]
  • 62. EFFECTIVENESS OF A SEARCH ACTION [Diagram: ‘all’ data ⊇ shown data ⊇ relevant shown data] Effectiveness here equals precision: E = (amount of relevant shown data) / (amount of shown data)
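A worked example of the effectiveness (precision) of a single search action, as a minimal sketch. The result sets are invented for illustration; only the ratio itself comes from the slide.

```python
def effectiveness(shown: set, relevant: set) -> float:
    """Precision: share of the shown items that are relevant."""
    if not shown:
        return 0.0  # convention chosen here when nothing is shown
    return len(shown & relevant) / len(shown)


# Illustrative data (assumed): four items shown, three of them relevant.
shown = {"Special relativity", "General relativity", "Twin paradox", "Apple"}
relevant = {"Special relativity", "General relativity", "Twin paradox"}

e = effectiveness(shown, relevant)
```

Relevant items that were never shown do not affect this value; it only measures how clean the displayed results are.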
  • 63. PRODUCTIVITY OF CONSECUTIVE SEARCH ACTIONS [Plot: effectiveness E over consecutive search actions 0, 1, 2, …, k, starting at E0] P = average increase in effectiveness after k search actions, measured from the second search action on.
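One possible reading of this definition, sketched in code: take the effectiveness after each action (index 0 is the initial query) and average the step-to-step increases over the k follow-up actions. This interpretation is an assumption; the slide does not give the exact formula, and the sample values are invented.

```python
from typing import List


def productivity(e: List[float]) -> float:
    """Average increase in effectiveness per action, for actions 1..k.

    e[0] is the effectiveness of the first action (the initial query);
    e[i] is the effectiveness after the i-th follow-up action.
    """
    k = len(e) - 1
    if k < 1:
        raise ValueError("need at least one follow-up search action")
    increases = [e[i] - e[i - 1] for i in range(1, k + 1)]
    return sum(increases) / k


# Illustrative effectiveness values after actions 0..3 (assumed).
p = productivity([0.25, 0.5, 0.5, 0.75])
```

Under this reading the sum telescopes, so P equals (E_k - E_0) / k; a negative P would mean the follow-up actions diluted the results.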
  • 64. B. EFFECTIVENESS AND PRODUCTIVITY OF DIFFERENT SEARCH ACTIONS [Bar chart: Effectiveness (%) and Productivity (%) for Lookup, Add top related, Neighbour expand, Neighbour of neighbour expand, Expand further related resource; axis 0 to 60]
  • 66. EXPLORING SEMANTIC RELATIONSHIPS ON THE WEB Compared searching the web with searching physical documents; impressive state of the art. From searching to exploring via ‘berrypicking’: more possibilities than pure ‘lookup’. Semantics: the meaning of resources, aside from their expression, description or representation; documents describe resources and consist of data; ‘linked’ data uses a three-part structure, ‘triples’, to express semantics. Exploring relationships between resources is not trivial for non-common properties.
  • 67. MOST IMPORTANT TAKEAWAYS - Alternative to searching in different data sources, each time using another search interface: → exploratory search via semantic relationships between data - The choice of heuristics and weights contributes to and influences the serendipity among results. - Focus on revealing semantic relationships → supporting visual exploratory search in data on the web - The techniques are mainly tested with data on: → encyclopedic facts from Wikipedia (DBpedia) → academic digital libraries (DBLP) and conferences (COLINDA) - Proposed techniques remain close to the structure of the linked data (RDF) → methods applicable in other domains that have linked data.