1/48
Visual search for supporting content
exploration in large document collections
Drahomira Herrmannova and Petr Knoth
2/48
Contents
• What we do
• Information Visualisations and Visual Search Interfaces
• Our approach
• Conclusion
4/48
What we do
• Improve search in (large) document collections
• Examples of collections:
– News articles
– Cultural heritage collection
– Collection of scientific papers
• Current search engines:
– Support for lookup
– Much less support for exploration
5/48
Search tasks (Rose and Levinson, 2004)
• Undirected (or exploratory) queries make up a significant
portion of all searches
6/48
Exploratory search (Marchionini, 2006)
7/48
How to support exploratory search
• One possible solution – information
visualisation
• Why?
– Easier to communicate structure, organisation and
relations in content
– Visually appealing
9/48
Information Visualisation (1/2)
• Division according to granularity of
information
– Collection level
– Document level
– Intra-document level
10/48
Collection level visualisations
• Visualise attributes of the collection
• Typically aim at providing a general overview
of the collection content
• Examples
11/48
Tag clouds (Hassan-Montero and Herrero-Solana, 2006)
12/48
TIARA (Wei et al., 2010)
13/48
GRIDL (Shneiderman et al., 2000)
14/48
Document level visualisations
• Visualise attributes of the collection items
• Mutual links and relations of collection items
• Examples
15/48
Hopara (Milne and Witten, 2011)
16/48
Wivi (Lehmann et al., 2010)
17/48
Apolo (Chau et al., 2011)
18/48
Intra-document level visualisations
• Visualise the internal structure of a document
• Example
19/48
TileBars (Hearst, 1995)
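
To make the intra-document idea concrete, the sketch below counts how often each query term occurs in consecutive segments of one document, which is the kind of term-distribution data that TileBars (Hearst, 1995) displays. It is an illustration only: the fixed-size word tiles, the function names `term_distribution` and `render`, and the text shading are assumptions made here, not the method of the original system.

```python
SHADES = " .:#"  # assumed text shading: a darker character means more hits


def term_distribution(text, terms, tile_size=50):
    """Count occurrences of each query term in consecutive word tiles.

    Assumption: tiles are fixed-size runs of words; a real system may
    segment the document differently.
    """
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    tiles = [words[i:i + tile_size] for i in range(0, len(words), tile_size)]
    return {term: [tile.count(term.lower()) for tile in tiles] for term in terms}


def render(rows):
    """Turn per-tile counts into one shaded text row per query term."""
    return [
        f"{term:>10} " + "".join(SHADES[min(c, len(SHADES) - 1)] for c in counts)
        for term, counts in rows.items()
    ]


if __name__ == "__main__":
    sample = "cat dog cat bird dog dog"
    for line in render(term_distribution(sample, ["cat", "dog"], tile_size=3)):
        print(line)
```

Each row corresponds to one query term and each column to one tile, so a reader can see where in the document the terms occur and where they overlap.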
20/48
Information Visualisation (2/2)
• Division according to the “starting point” of
the visualisation
– Browsing focused
– Query focused
21/48
Browsing focused
• Exploration starts at a specific point in the
collection from which the user navigates
through the collection
• Usually the same starting point is used every
time
22/48
InfoSky (Granitzer et al., 2004)
23/48
Query focused
• Starts with a query
• The query determines the entry point from
which the exploration starts
24/48
ThinkPedia (Hirsch et al., 2009)
25/48
Our approach
• Document level information
• Query focused browsing
26/48
Design principles (1/2)
• For visual search interfaces
• Should be considered when designing the
interface
• Related studies:
– Chen and Yu, 2000
– Sebrechts et al., 1999
27/48
Design principles (2/2)
1. Added value
2. Simplicity
3. Visual legibility
4. Use of colours
5. Dimension
6. Fixed spatial location
29/48
Considered types of collections
• Every document in a collection defined
according to a set of dimensions
• Dimensions typically of different types
• Document = set of properties expressing
values of dimensions
• Dimensions always present
• Examples
30/48
News articles collection
• Dimensions:
– Time
– Themes
– Locations
– Relations to other articles
31/48
Cultural heritage artifacts
• Dimensions:
– Artifact type
– Historical period
– Style
– Material
32/48
Scientific papers
• Dimensions:
– Citations
– Authors
– Concepts
– Similarities with other articles
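
The collection model described on the preceding slides (a document is a set of properties expressing values of dimensions) could be represented as in the minimal sketch below. The class name `Document`, the helper `missing_dimensions`, and the sample values are hypothetical and chosen only for illustration; the slides do not specify a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A collection item described by values along named dimensions."""
    doc_id: str
    title: str
    # Maps a dimension name (e.g. "authors") to the set of its values.
    dimensions: dict[str, frozenset[str]] = field(default_factory=dict)

    def values(self, dimension):
        """Return the values for a dimension, or an empty set if absent."""
        return self.dimensions.get(dimension, frozenset())


def missing_dimensions(doc, required):
    """List required dimensions that the document does not define."""
    return sorted(set(required) - set(doc.dimensions))


if __name__ == "__main__":
    # Hypothetical scientific paper with two of the four dimensions filled in.
    paper = Document(
        doc_id="p1",
        title="Example paper",
        dimensions={
            "authors": frozenset({"Author A", "Author B"}),
            "concepts": frozenset({"visual search"}),
        },
    )
    required = ["citations", "authors", "concepts", "similarities"]
    print(missing_dimensions(paper, required))
```

Because the dimension names are plain strings, the same structure covers news articles, cultural heritage artifacts, and scientific papers; only the set of required dimensions changes per collection.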
33/48
The visualisation
34/48
The visualisation
35/48
The visualisation
36/48
The visualisation
37/48
Discovering connections
38/48
Comparing and contrasting documents
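
One way to think about discovering connections and comparing documents is as set operations over dimension values: values two documents share suggest a connection, and values only one of them has show the contrast. The sketch below assumes each document is a plain dict from dimension name to a set of values; the function names and sample data are hypothetical and do not describe the presented interface.

```python
def compare(a, b):
    """Compare two documents dimension by dimension.

    a, b: dicts mapping a dimension name to a set of values.
    Returns, per dimension, the shared values and those unique to each side.
    """
    result = {}
    for dim in sorted(set(a) | set(b)):
        va, vb = set(a.get(dim, ())), set(b.get(dim, ()))
        result[dim] = {"shared": va & vb, "only_a": va - vb, "only_b": vb - va}
    return result


def connected_dimensions(comparison):
    """Dimensions along which the two documents share at least one value."""
    return [dim for dim, parts in comparison.items() if parts["shared"]]


if __name__ == "__main__":
    paper_a = {"authors": {"Author A", "Author B"}, "concepts": {"visual search"}}
    paper_b = {"authors": {"Author A"}, "concepts": {"exploratory search"}}
    comparison = compare(paper_a, paper_b)
    print(connected_dimensions(comparison))
```

Working per dimension keeps the comparison independent of the collection type, in line with the goal of supporting exploration across dimensions.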
39/48
Limitations
• In theory not restricted; in practice the limitations
might be:
– the size and resolution of the screen
– the limitations of human perception
41/48
Conclusion (1/2)
• Motivation:
1. Provide better support for exploratory search
than current textual interfaces
2. Interface that is conceptually applicable in any
document collection regardless of its type
3. Provide an added value by assisting in the
discovery of interesting connections that would
otherwise remain hidden
42/48
Conclusion (2/2)
• Results:
1. Support for comparing and contrasting content.
2. Support for exploration across dimensions.
3. Universal approach to the visualised dimensions.
43/48
Future plans
• Planned release end of June
• Integration with CORE system
• Evaluation
44/48
References (1/4)
• G. Marchionini. Exploratory search: from finding to understanding.
Communications of the ACM - Supporting exploratory search. 2006.
• D. Rose & D. Levinson. Understanding user goals in web search.
Proceedings of the 13th conference on World Wide Web. 2004.
• Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as
visual information retrieval interfaces. In InSciT2006 Conference,
Mérida. 2006.
• Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong
Qian, Lei Shi, Li Tan, and Qiang Zhang. TIARA: a visual exploratory text
analytic system. In Proceedings of the 16th ACM SIGKDD international
conference on Knowledge discovery and data mining. 2010.
45/48
References (2/4)
• Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau.
Visualizing digital library search results with categorical and hierarchical
axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000.
• Marti A. Hearst. TileBars: Visualization of Term Distribution Information in
Full Text Information Access. In Proceedings of the ACM SIGCHI
Conference on Human Factors in Computing Systems. 1995.
• David Milne and Ian Witten. A link-based visual search engine for Wikipedia.
In Proceedings of the 11th annual international ACM/IEEE joint conference on
Digital libraries. 2011.
• Simon Lehmann, Ulrich Schwanecke, and Rolf Dorner. Interactive
visualization for opportunistic exploration of large document collections.
Information Systems. 2010.
46/48
References (3/4)
• Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos.
Apolo: making sense of large network data by combining rich user
interaction and machine learning. In Proceedings of the 2011 annual
conference on Human factors in computing systems. 2011.
• Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and
Werner Klieber. Evaluating a system for interactive exploration of large,
hierarchically structured document repositories. In Proceedings of the IEEE
Symposium on Information Visualization. 2004.
• Christian Hirsch, John Hosking, and John Grundy. Interactive visualization
tools for exploring the semantic graph of large knowledge spaces.
Interfaces. 2009.
47/48
References (4/4)
• Chaomei Chen and Yue Yu. Empirical studies of information visualization: a
meta-analysis. Int. J. Hum.-Comput. Stud. 2000.
• Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis,
and Michael S. Miller. Visualization of search results: a comparative
evaluation of text, 2D, and 3D interfaces. In Proceedings of the 22nd
annual international ACM SIGIR conference on Research and development
in information retrieval. 1999.
48/48
Thanks for listening!
Questions?

08448380779 Call Girls In Civil Lines Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Visual Search for Supporting Content Exploration in Large Document Collections

  • 1. 1/48 Visual search for supporting content exploration in large document collections Drahomira Herrmannova and Petr Knoth
  • 2. 2/48 Contents • What do we do • Information Visualisations and Visual Search Interfaces • Our approach • Conclusion
  • 4. 4/48 What do we do • Improve search in (large) document collections • Examples of collections: – News articles – Cultural heritage collection – Collection of scientific papers • Current search engines: – Support for lookup – Much less support for exploration
  • 5. 5/48 Search tasks (Rose and Levinson, 2004) • Undirected (or exploratory) queries – significant portion of all searches (Rose and Levinson, 2004)
  • 7. 7/48 How to support exploratory search • One possible solution – information visualisation • Why? – Easier to communicate structure, organisation and relations in content – Visually appealing
  • 9. 9/48 Information Visualisation (1/2) • Division according to granularity of information – Collection level – Document level – Intra-document level
  • 10. 10/48 Collection level visualisations • Visualise attributes of the collection • Typically aim at providing a general overview of the collection content • Examples
  • 11. 11/48 Tag clouds (Montero and Solana, 2006)
  • 12. 12/48 TIARA (Wei et al., 2010)
  • 14. 14/48 Document level visualisations • Visualise attributes of the collection items • Mutual links and relations of collection items • Examples
  • 15. 15/48 Hopara (Milne and Witten, 2011)
  • 17. 17/48 Apolo (Chau et al., 2011)
  • 18. 18/48 Intra-document level visualisations • Visualise the internal structure of a document • Example
  • 20. 20/48 Information Visualisation (2/2) • Division according to the “starting point” of the visualisation – Browsing focused – Query focused
  • 21. 21/48 Browsing focused • Exploration starts at a specific point in the collection from which the user navigates through the collection • Usually the same starting point is used every time
  • 23. 23/48 Query focused • Starts with a query • The query determines the entry point from which the exploration starts
  • 25. 25/48 Our approach • Document level information • Query focused browsing
  • 26. 26/48 Design principles (1/2) • For visual search interfaces • Should be considered when designing the interface • Related studies: – Chen and Yu, 2000 – Sebrechts et al., 1999
  • 27. 27/48 Design principles (2/2) 1. Added value 2. Simplicity 3. Visual legibility 4. Use of colours 5. Dimension 6. Fixed spatial location
  • 29. 29/48 Considered types of collections • Every document in a collection defined according to a set of dimensions • Dimensions typically of different types • Document = set of properties expressing values of dimensions • Dimensions always present • Examples
  • 30. 30/48 News articles collection • Dimensions: – Time – Themes – Locations – Relations to other articles
  • 31. 31/48 Cultural heritage artifacts • Dimensions: – Artifact type – Historical period – Style – Material
  • 32. 32/48 Scientific papers • Dimensions: – Citations – Authors – Concepts – Similarities with other articles
  • 39. 39/48 Limitations • Not restricted in theory; in practice the limitations might be: – the size and resolution of the screen – the limitations of human perception
  • 41. 41/48 Conclusion (1/2) • Motivation: 1. Provide better support for exploratory search than current textual interfaces 2. Provide an interface that is conceptually applicable to any document collection regardless of its type 3. Provide added value by assisting in the discovery of interesting connections that would otherwise remain hidden
  • 42. 42/48 Conclusion (2/2) • Results: 1. Support for comparing and contrasting content. 2. Support for exploration across dimensions. 3. Universal approach to the visualised dimensions.
  • 43. 43/48 Future plans • Planned release end of June • Integration with CORE system • Evaluation
  • 44. 44/48 References (1/4) • G. Marchionini. Exploratory search: from finding to understanding. Communications of the ACM - Supporting exploratory search. 2006. • D. Rose & D. Levinson. Understanding user goals in web search. Proceedings of the 13th conference on World Wide Web. 2004. • Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as visual information retrieval interfaces. In MERIDA, INSCIT2006 CONFERENCE. 2006. • Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong Qian, Lei Shi, Li Tan, and Qiang Zhang. Tiara: a visual exploratory text analytic system. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. 2010.
  • 45. 45/48 References (2/4) • Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau. Visualizing digital library search results with categorical and hierarchical axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000. • Marti A. Hearst. TileBars: Visualization of Term Distribution Information in Full Text Information Access. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1995. • David Milne and Ian Witten. A link-based visual search engine for Wikipedia. In Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries. 2011. • Simon Lehmann, Ulrich Schwanecke, and Rolf Dorner. Interactive visualization for opportunistic exploration of large document collections. Information Systems. 2010.
  • 46. 46/48 References (3/4) • Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos. Apolo: making sense of large network data by combining rich user interaction and machine learning. In Proceedings of the 2011 annual conference on Human factors in computing systems. 2011. • Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and Werner Klieber. Evaluating a system for interactive exploration of large, hierarchically structured document repositories. In Proceedings of the IEEE Symposium on Information Visualization. 2004. • Christian Hirsch, John Hosking, and John Grundy. Interactive visualization tools for exploring the semantic graph of large knowledge spaces. Interfaces. 2009.
  • 47. 47/48 References (4/4) • Chaomei Chen and Yue Yu. Empirical studies of information visualization: a meta-analysis. Int. J. Hum.-Comput. Stud. 2000. • Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis, and Michael S. Miller. Visualization of search results: a comparative evaluation of text, 2d, and 3d interfaces. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. 1999.
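
Slide 29 describes a document as a set of properties expressing values of dimensions, and slide 32 lists citations, authors, concepts and similarities as the dimensions for scientific papers. The following is a minimal sketch of that document model, not the authors' implementation. The names `Document` and `shared_values`, the dimension keys and the sample values are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    """A collection item described by named dimensions (assumed model).

    Each dimension maps to the set of values the document takes in it,
    e.g. "authors" -> {"Author 2"}.
    """
    title: str
    dimensions: dict[str, set[str]]


def shared_values(a: Document, b: Document) -> dict[str, set[str]]:
    """Return, per dimension, the values both documents have in common.

    Dimensions missing from either document, or with no overlap,
    are left out of the result.
    """
    result = {}
    for name in a.dimensions.keys() & b.dimensions.keys():
        common = a.dimensions[name] & b.dimensions[name]
        if common:
            result[name] = common
    return result


# Hypothetical sample data: two scientific papers.
paper_a = Document(
    "Paper A",
    {
        "authors": {"Author 1", "Author 2"},
        "concepts": {"visual search", "exploratory search"},
    },
)
paper_b = Document(
    "Paper B",
    {
        "authors": {"Author 2"},
        "concepts": {"information visualisation", "exploratory search"},
        "citations": {"Paper A"},
    },
)

overlap = shared_values(paper_a, paper_b)
```

Here `overlap` holds the shared author and the shared concept, which is one simple way to compare and contrast two documents across dimensions. A news or cultural heritage collection would reuse the same structure with different dimension names.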