SlideShare una empresa de Scribd logo
1 de 24
Descargar para leer sin conexión
ECIR 2014 Industry Day
Content Discovery Through Entity Driven Search
Alessandro Benedetti
http://uk.linkedin.com/in/alexbenedetti
Antonio David Perez Morales
http://es.linkedin.com/in/adperezmorales
16th
April 2014
• Experienced at building and delivering a wide range of enterprise
solutions across the whole information life cycle
• Alfresco & Ephesoft certified Platinum Partner
• Red Hat Enterprise Linux Ready Partner
• Crafter & Varnish Gold Partners
• Search Solutions Consultant
Alfresco Partner of the Year 2012 and
2013
Working effectively together
Who We Are
3
Antonio David Pérez Morales
- R&D Senior Engineer
- Master in Engineering and Technology
Software
- Digital Identity and Security expert
- Enterprise Search Background
- Semantic, NLP, ML Technologies and
Information Retrieval lover
- Apache Stanbol Committer
- Apache contributor
@adperezmorales
http://es.linkedin.com/in/adperezmorales/
Alessandro Benedetti
- R&D Senior Engineer
- Master in Computer Science
- Information Retrieval background
-- Enterprise Search specialist
- Semantic, NLP, ML Technologies
and Information Retrieval lover
@AlexBenedetti
http://uk.linkedin.com/in/alexbenedetti
Working effectively together
Agenda
4
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Agenda
5
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Zaizi R&D Department
6
•Giving sense to the content
• Enriching it semantically
•Adding value to ECM/CMS
• More structured content, easy to manage, link and search,
•Improving search
• Across different domains, data sources, User Experience
• Machine Learning applied research
• Content Organization – Recommendation Systems
Working effectively together
Agenda
7
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Enterprise Search Problems
8
Challenge :
Search within Big and Heterogeneus Repositories
• Heterogeneus Data Sources
• Filesystem, DB, ECM/CMS, Email, …
• Unstructured Content
• PDFs, text plain, Word, …
• Documents not linked between each other
• Federated Search needed
• Search across data sources
• Different permissions
• Centralized endpoint
Working effectively together
Current Enterprise Search Weaknesses
9
• Keyword based
• Low precision
• Ambiguous terms not in context
• Not accurate weighting when keywords are combined
in a query
Working effectively together
Agenda
10
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Entity Driven Search
11
• Moves from keywords to Entities
•More understandable to a Human
• Process the unstructured text
• Enrich it
• Build specific indexes
• Use entities and concepts in searches
Working effectively together
Sensefy
12
• Semantic Enterprise Search Engine
• Federated Search
• Evolved User Experience
• Based on cutting-edge Open Source Frameworks
Working effectively together
Architecture
13
Working effectively together
RedLink
14
• Semantic Cloud platform
• Providing Software as a Service
• Manage unstructured data
• Extract knowledge and intelligence
• Make sense of information
• Feed into business processes
• Open-Source based components
• Entity Linking using Knowledge Bases
Working effectively together
NLP & Semantic Enrichment
15
• From unstructured to structured
• NLP Analysis. POS Tagging
• Named Entities Recognition
• Linked Data
• Entity Linking using Knowledge Bases
• Disambiguation
• Indexing in Solr
Working effectively together
Smart Autocomplete
16
• Multi Phase suggestions
• Closer to natural language query formulation
• Named Entities infix
• Entity types infix
• Multi Language entity type support
• Properties driven query approach
Working effectively together
Smart Autocomplete
Configuration
17
• Entity type properties
•Interesting to our use case and scenario
• Properties inheritance through type hierarchy
• Enhance type information from external resource
•Freebase, DbPedia , Custom Data Set
Working effectively together
Semantic Search
18
• Search by Named Entity
• Search by Entity Type
• Search by Entity Type properties
• Grouping Results by Sense
• Contextualize Results Using Semantic Information
Working effectively together
Semantic More Like This
19
• Search for Similar Documents based on Entities and Entities’
categories
• Similarity Function based on Documents’ Sense
• Not based on text tokens
• Entity Frequency /
Inverted Document Frequency
• Entity Type Frequency /
Inverted Document Frequency
Working effectively together
Agenda
20
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Agenda
21
• Context
• Problem
• Solution
• Demo
• Future Works
Working effectively together
Future Work
22
• Semantic More Like This new approach (Graph
relations)
• Machine Learning components: Classification, Topic
annotation, Clustering
• Semantic facets
• Secured Entity Search
• Image and Media searches
Working effectively together
Conclusions
23
• Better user experience
• More precision in search results
• Closer to human language
Zaizi Headquarters
Brook House
4th Floor, North Wing
229-243 Shepherd’s Bush Road
London W6 7AN
United Kingdom
T: (+44) 20 3582 8330
Zaizi Iberia
Calle Gremios 13-15, Edificio Diseño
Planta 1, Oficina 5
41927 Mairena del Aljarafe
Sevilla
Spain
T: (+34) 666 42 43 64
Zaizi Asia
50 Flower Road
Colombo 07
Sri Lanka
T: (+94) 112 301 461
Zaizi Singapore
14 Robinson Road #13-00
Far East Finance Building
Singapore 048545
T: (+65) 3158 5886
F: (+65) 6323 1839
VAT Registration No GB 932 8855 89
Registered in England and Wales with registration number 6440931
www.zaizi.com
Thanks!

Más contenido relacionado

La actualidad más candente

Semantic Search on the Rise
Semantic Search on the RiseSemantic Search on the Rise
Semantic Search on the RisePeter Mika
 
Managing Annotations (OR2016)
Managing Annotations (OR2016)Managing Annotations (OR2016)
Managing Annotations (OR2016)Robert Sanderson
 
Semantic Search at Yahoo
Semantic Search at YahooSemantic Search at Yahoo
Semantic Search at YahooPeter Mika
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataOntotext
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
Making things findable
Making things findableMaking things findable
Making things findablePeter Mika
 
The Art Of Searching
The Art Of SearchingThe Art Of Searching
The Art Of SearchingPaul Neal
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Dr. Haxel Consult
 
Linked Data for Czech Legislation
Linked Data for Czech LegislationLinked Data for Czech Legislation
Linked Data for Czech LegislationMartin Necasky
 
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...semanticsconference
 
An Introduction to Entities in Semantic Search
An Introduction to Entities in Semantic SearchAn Introduction to Entities in Semantic Search
An Introduction to Entities in Semantic SearchDavid Amerland
 
IR: Open source state
IR: Open source stateIR: Open source state
IR: Open source stateDmitry Kan
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Peter Mika
 
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...semanticsconference
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrTrey Grainger
 
IIIF Foundational Specifications
IIIF Foundational SpecificationsIIIF Foundational Specifications
IIIF Foundational SpecificationsRobert Sanderson
 

La actualidad más candente (17)

Semantic Search on the Rise
Semantic Search on the RiseSemantic Search on the Rise
Semantic Search on the Rise
 
Managing Annotations (OR2016)
Managing Annotations (OR2016)Managing Annotations (OR2016)
Managing Annotations (OR2016)
 
Semantic Search at Yahoo
Semantic Search at YahooSemantic Search at Yahoo
Semantic Search at Yahoo
 
The Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open DataThe Power of Semantic Technologies to Explore Linked Open Data
The Power of Semantic Technologies to Explore Linked Open Data
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
IR
IRIR
IR
 
Making things findable
Making things findableMaking things findable
Making things findable
 
The Art Of Searching
The Art Of SearchingThe Art Of Searching
The Art Of Searching
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents
 
Linked Data for Czech Legislation
Linked Data for Czech LegislationLinked Data for Czech Legislation
Linked Data for Czech Legislation
 
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
Stephen Buxton | Data Integration - a Multi-Model Approach - Documents and Tr...
 
An Introduction to Entities in Semantic Search
An Introduction to Entities in Semantic SearchAn Introduction to Entities in Semantic Search
An Introduction to Entities in Semantic Search
 
IR: Open source state
IR: Open source stateIR: Open source state
IR: Open source state
 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012
 
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
Ben Gardner | Delivering a Linked Data warehouse and integrating across the w...
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache Solr
 
IIIF Foundational Specifications
IIIF Foundational SpecificationsIIIF Foundational Specifications
IIIF Foundational Specifications
 

Similar a Content Discovery Through Entity Driven Search

Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprisesemanticsconference
 
cross media concept and entity driven search for enterprise
cross media concept and entity driven search for enterprisecross media concept and entity driven search for enterprise
cross media concept and entity driven search for enterpriseDileepa Jayakody
 
How search engines work
How search engines workHow search engines work
How search engines workChinna Botla
 
The evolution of Search spscinci
The evolution of Search spscinciThe evolution of Search spscinci
The evolution of Search spscinciJohnny Lopez
 
Enterprise search Information
Enterprise search Information Enterprise search Information
Enterprise search Information Netwoven Inc.
 
Solving Real World Challenges with Enterprise Search
Solving Real World Challenges with Enterprise SearchSolving Real World Challenges with Enterprise Search
Solving Real World Challenges with Enterprise SearchSPC Adriatics
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignMarianne Sweeny
 
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitSearch Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitJoel Oleson
 
Graph databases and the #panamapapers
Graph databases and the #panamapapersGraph databases and the #panamapapers
Graph databases and the #panamapapersdarthvader42
 
Relevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search TechnologiesRelevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search Technologiesenterprisesearchmeetup
 
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-1910 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19rgillis
 
Harnessing search engines for KM
Harnessing search engines for KMHarnessing search engines for KM
Harnessing search engines for KMInvotra
 
Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success Baltimore SharePoint (BSPUG)
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...Agnes Molnar
 
Going Meta – How to Use Metadata in SharePoint and Office 365
Going Meta – How to Use Metadata in SharePoint and Office 365Going Meta – How to Use Metadata in SharePoint and Office 365
Going Meta – How to Use Metadata in SharePoint and Office 365Concept Searching, Inc
 
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...Concept Searching, Inc
 
Crawlable Spatial Data - #Geo4Web research topic #3
Crawlable Spatial Data - #Geo4Web research topic #3Crawlable Spatial Data - #Geo4Web research topic #3
Crawlable Spatial Data - #Geo4Web research topic #3Dimitri van Hees
 
#SEASPC: Information Architecture and Enterprise Search - Better Together
#SEASPC: Information Architecture and Enterprise Search - Better Together#SEASPC: Information Architecture and Enterprise Search - Better Together
#SEASPC: Information Architecture and Enterprise Search - Better TogetherAgnes Molnar
 
European SharePoint Conference Automated Tagging and Metadata Management w...
European SharePoint Conference   Automated Tagging and Metadata  Management w...European SharePoint Conference   Automated Tagging and Metadata  Management w...
European SharePoint Conference Automated Tagging and Metadata Management w...B-S-S Business Software Solutions GmbH
 

Similar a Content Discovery Through Entity Driven Search (20)

Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
 
cross media concept and entity driven search for enterprise
cross media concept and entity driven search for enterprisecross media concept and entity driven search for enterprise
cross media concept and entity driven search for enterprise
 
How search engines work
How search engines workHow search engines work
How search engines work
 
The evolution of Search spscinci
The evolution of Search spscinciThe evolution of Search spscinci
The evolution of Search spscinci
 
Enterprise search Information
Enterprise search Information Enterprise search Information
Enterprise search Information
 
Solving Real World Challenges with Enterprise Search
Solving Real World Challenges with Enterprise SearchSolving Real World Challenges with Enterprise Search
Solving Real World Challenges with Enterprise Search
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By Design
 
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitSearch Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
 
Graph databases and the #panamapapers
Graph databases and the #panamapapersGraph databases and the #panamapapers
Graph databases and the #panamapapers
 
Relevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search TechnologiesRelevancy and Search Quality Analysis - Search Technologies
Relevancy and Search Quality Analysis - Search Technologies
 
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-1910 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19
10 Sourcing Tips with Ryan Gillis - SourceCon DC Webinar 8-29-19
 
Harnessing search engines for KM
Harnessing search engines for KMHarnessing search engines for KM
Harnessing search engines for KM
 
Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success Information Architecture Exposing the Secret Sauce for Success
Information Architecture Exposing the Secret Sauce for Success
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
 
Going Meta – How to Use Metadata in SharePoint and Office 365
Going Meta – How to Use Metadata in SharePoint and Office 365Going Meta – How to Use Metadata in SharePoint and Office 365
Going Meta – How to Use Metadata in SharePoint and Office 365
 
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...
SharePoint Saturday Toronto - Going Meta – How to Use Metadata in SharePoint ...
 
Crawlable Spatial Data - #Geo4Web research topic #3
Crawlable Spatial Data - #Geo4Web research topic #3Crawlable Spatial Data - #Geo4Web research topic #3
Crawlable Spatial Data - #Geo4Web research topic #3
 
#SEASPC: Information Architecture and Enterprise Search - Better Together
#SEASPC: Information Architecture and Enterprise Search - Better Together#SEASPC: Information Architecture and Enterprise Search - Better Together
#SEASPC: Information Architecture and Enterprise Search - Better Together
 
FAST Search-webinar-06-29-2010
FAST Search-webinar-06-29-2010FAST Search-webinar-06-29-2010
FAST Search-webinar-06-29-2010
 
European SharePoint Conference Automated Tagging and Metadata Management w...
European SharePoint Conference   Automated Tagging and Metadata  Management w...European SharePoint Conference   Automated Tagging and Metadata  Management w...
European SharePoint Conference Automated Tagging and Metadata Management w...
 

Más de Alessandro Benedetti

Search Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source ApproachSearch Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source ApproachAlessandro Benedetti
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationAlessandro Benedetti
 
Search Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source ApproachSearch Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source ApproachAlessandro Benedetti
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationAlessandro Benedetti
 
Search Quality Evaluation: Tools and Techniques
Search Quality Evaluation: Tools and TechniquesSearch Quality Evaluation: Tools and Techniques
Search Quality Evaluation: Tools and TechniquesAlessandro Benedetti
 
From Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank StoryFrom Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank StoryAlessandro Benedetti
 
Advanced Document Similarity With Apache Lucene
Advanced Document Similarity With Apache LuceneAdvanced Document Similarity With Apache Lucene
Advanced Document Similarity With Apache LuceneAlessandro Benedetti
 
Lucene And Solr Document Classification
Lucene And Solr Document ClassificationLucene And Solr Document Classification
Lucene And Solr Document ClassificationAlessandro Benedetti
 
Multi-language Content Discovery Through Entity Driven Search
Multi-language Content Discovery Through Entity Driven SearchMulti-language Content Discovery Through Entity Driven Search
Multi-language Content Discovery Through Entity Driven SearchAlessandro Benedetti
 

Más de Alessandro Benedetti (9)

Search Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source ApproachSearch Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source Approach
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
 
Search Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source ApproachSearch Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source Approach
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
 
Search Quality Evaluation: Tools and Techniques
Search Quality Evaluation: Tools and TechniquesSearch Quality Evaluation: Tools and Techniques
Search Quality Evaluation: Tools and Techniques
 
From Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank StoryFrom Academic Papers To Production : A Learning To Rank Story
From Academic Papers To Production : A Learning To Rank Story
 
Advanced Document Similarity With Apache Lucene
Advanced Document Similarity With Apache LuceneAdvanced Document Similarity With Apache Lucene
Advanced Document Similarity With Apache Lucene
 
Lucene And Solr Document Classification
Lucene And Solr Document ClassificationLucene And Solr Document Classification
Lucene And Solr Document Classification
 
Multi-language Content Discovery Through Entity Driven Search
Multi-language Content Discovery Through Entity Driven SearchMulti-language Content Discovery Through Entity Driven Search
Multi-language Content Discovery Through Entity Driven Search
 

Último

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

Último (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Content Discovery Through Entity Driven Search

  • 1. ECIR 2014 Industry Day Content Discovery Through Entity Driven Search Alessandro Benedetti http://uk.linkedin.com/in/alexbenedetti Antonio David Perez Morales http://es.linkedin.com/in/adperezmorales 16th April 2014
  • 2. • Experienced at building and delivering a wide range of enterprise solutions across the whole information life cycle • Alfresco & Ephesoft certified Platinum Partner • Red Hat Enterprise Linux Ready Partner • Crafter & Varnish Gold Partners • Search Solutions Consultant Alfresco Partner of the Year 2012 and 2013
  • 3. Working effectively together Who We Are 3 Antonio David Pérez Morales - R&D Senior Engineer - Master in Engineering and Technology Software - Digital Identity and Security expert - Enterprise Search Background - Semantic, NLP, ML Technologies and Information Retrieval lover - Apache Stanbol Committer - Apache contributor @adperezmorales http://es.linkedin.com/in/adperezmorales/ Alessandro Benedetti - R&D Senior Engineer - Master in Computer Science - Information Retrieval background -- Enterprise Search specialist - Semantic, NLP, ML Technologies and Information Retrieval lover @AlexBenedetti http://uk.linkedin.com/in/alexbenedetti
  • 4. Working effectively together Agenda 4 • Context • Problem • Solution • Demo • Future Works
  • 5. Working effectively together Agenda 5 • Context • Problem • Solution • Demo • Future Works
  • 6. Working effectively together Zaizi R&D Department 6 •Giving sense to the content • Enriching it semantically •Adding value to ECM/CMS • More structured content, easy to manage, link and search, •Improving search • Across different domains, data sources, User Experience • Machine Learning applied research • Content Organization – Recommendation Systems
  • 7. Working effectively together Agenda 7 • Context • Problem • Solution • Demo • Future Works
  • 8. Working effectively together Enterprise Search Problems 8 Challenge : Search within Big and Heterogeneus Repositories • Heterogeneus Data Sources • Filesystem, DB, ECM/CMS, Email, … • Unstructured Content • PDFs, text plain, Word, … • Documents not linked between each other • Federated Search needed • Search across data sources • Different permissions • Centralized endpoint
  • 9. Working effectively together Current Enterprise Search Weaknesses 9 • Keyword based • Low precision • Ambiguous terms not in context • Not accurate weighting when keywords are combined in a query
  • 10. Working effectively together Agenda 10 • Context • Problem • Solution • Demo • Future Works
  • 11. Working effectively together Entity Driven Search 11 • Moves from keywords to Entities •More understandable to a Human • Process the unstructured text • Enrich it • Build specific indexes • Use entities and concepts in searches
  • 12. Working effectively together Sensefy 12 • Semantic Enterprise Search Engine • Federated Search • Evolved User Experience • Based on cutting-edge Open Source Frameworks
  • 14. Working effectively together RedLink 14 • Semantic Cloud platform • Providing Software as a Service • Manage unstructured data • Extract knowledge and intelligence • Make sense of information • Feed into business processes • Open-Source based components • Entity Linking using Knowledge Bases
  • 15. Working effectively together NLP & Semantic Enrichment 15 • From unstructured to structured • NLP Analysis. POS Tagging • Named Entities Recognition • Linked Data • Entity Linking using Knowledge Bases • Disambiguation • Indexing in Solr
  • 16. Working effectively together Smart Autocomplete 16 • Multi Phase suggestions • Closer to natural language query formulation • Named Entities infix • Entity types infix • Multi Language entity type support • Properties driven query approach
  • 17. Working effectively together Smart Autocomplete Configuration 17 • Entity type properties •Interesting to our use case and scenario • Properties inheritance through type hierarchy • Enhance type information from external resource •Freebase, DbPedia , Custom Data Set
  • 18. Working effectively together Semantic Search 18 • Search by Named Entity • Search by Entity Type • Search by Entity Type properties • Grouping Results by Sense • Contextualize Results Using Semantic Information
  • 19. Working effectively together Semantic More Like This 19 • Search for Similar Documents based on Entities and Entities’ categories • Similarity Function based on Documents’ Sense • Not based on text tokens • Entity Frequency / Inverted Document Frequency • Entity Type Frequency / Inverted Document Frequency
  • 20. Working effectively together Agenda 20 • Context • Problem • Solution • Demo • Future Works
  • 21. Working effectively together Agenda 21 • Context • Problem • Solution • Demo • Future Works
  • 22. Working effectively together Future Work 22 • Semantic More Like This new approach (Graph relations) • Machine Learning components: Classification, Topic annotation, Clustering • Semantic facets • Secured Entity Search • Image and Media searches
  • 23. Working effectively together Conclusions 23 • Better user experience • More precision in search results • Closer to human language
  • 24. Zaizi Headquarters Brook House 4th Floor, North Wing 229-243 Shepherd’s Bush Road London W6 7AN United Kingdom T: (+44) 20 3582 8330 Zaizi Iberia Calle Gremios 13-15, Edificio Diseño Planta 1, Oficina 5 41927 Mairena del Aljarafe Sevilla Spain T: (+34) 666 42 43 64 Zaizi Asia 50 Flower Road Colombo 07 Sri Lanka T: (+94) 112 301 461 Zaizi Singapore 14 Robinson Road #13-00 Far East Finance Building Singapore 048545 T: (+65) 3158 5886 F: (+65) 6323 1839 VAT Registration No GB 932 8855 89 Registered in England and Wales with registration number 6440931 www.zaizi.com Thanks!