Future Challenges in
(automated) Patent Search
Alexander G. Klenner-Bajaja, PhD.
aklenner@epo.org
Why Search – European Patent Convention
Information Management's
Task: Support Search
Introduction – What do we want, where are we?
The current Search System
• A Boolean search system; documents are returned as sets
• Search is dominated by metadata search as well as keywords
[Figure: a Boolean query against the search space]
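A minimal sketch of what "documents are returned as sets" means in practice. The toy documents, the whitespace tokenizer, and the helper names `build_index` and `boolean_and` are assumptions for illustration and do not describe the production system.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the unranked set of documents that contain every term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical three-document corpus.
docs = {
    "D1": "aspirin for migraine pain",
    "D2": "aspirin for fever",
    "D3": "caffeine and migraine pain",
}
index = build_index(docs)
hits = boolean_and(index, "migraine", "pain", "aspirin")
print(hits)
```

A document is either in the result set or not; nothing in the result says which hit is the better one.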
The current Search System
• A Lucene (Elasticsearch) based system; documents are returned as ranked lists (pilot: fully available, but no extensive training)
• Moving away from a metadata-dominated search...?
[Figure: a Lucene query against the search space, returning a ranked list from 1 to k]
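For contrast with set-based retrieval, the sketch below returns a ranked top-k list. It uses a plain TF-IDF sum as a stand-in; this is not Lucene's scoring function, and the corpus and the function name `rank` are assumptions for illustration.

```python
import math
from collections import Counter


def rank(docs, query, k=2):
    """Score documents by summed TF-IDF of the query terms; return the top k ids."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for toks in tokenized.values() for tok in set(toks))
    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = sum(
            tf[term] * math.log(1 + n_docs / df[term])
            for term in query.lower().split()
            if term in tf
        )
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


# Hypothetical three-document corpus.
docs = {
    "D1": "aspirin for migraine pain",
    "D2": "aspirin for fever",
    "D3": "caffeine and migraine pain",
}
top = rank(docs, "aspirin migraine pain", k=2)
print(top)
```

Unlike the Boolean case, a document matching only part of the query can still appear, just lower in the list.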
Patent Gold Standards
• We have “manually” curated search reports for about 40 million simple patent families
• The relevant documents are cited in the search report by category: X (I, N), A, Y, ...
[Figure: citations per search report; median: 5 citations]
Citation temporal distribution
• 50% of all citations are younger than 10 years (2005 to now); 80% of all citations are younger than 20 years; only 5% of citations date from before 1974.
Setting up a benchmarking environment
• We need to move away from anecdotal evidence to statistically meaningful facts
• TAPAS
[Figure: benchmarking workflow with numbered steps 1 to 4; labels: Patent Corpus, Search Index, Applications*, Method 1 (MAP: 0.4), Method 2 (MAP: 0.2)]
* Exploiting real queries
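The MAP figures in the benchmark refer to mean average precision. A minimal sketch of the standard definition follows, assuming that the documents cited in a search report serve as the relevant set for each query; the runs below are made-up toy data, not EPO results.

```python
def average_precision(ranked, relevant):
    """Average of precision@i taken at each rank i where a relevant document appears."""
    hits, total = 0, 0.0
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precision over (ranked list, relevant set) pairs."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical output of one search method for two queries.
runs = [
    (["A", "B", "C"], {"A", "C"}),  # AP = (1/1 + 2/3) / 2
    (["D", "E"], {"E"}),            # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

Running two methods over the same queries and gold standard gives directly comparable scores, which is the point of the benchmarking environment.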
Setting up a prototyping environment - KNIME
Evaluating the results
Graph Databases are valid tools - if we have a good
starting document (seed)
Again Meta-Data based!
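A sketch of seed-based expansion over citation metadata, assuming the graph is available as a plain mapping from a citing document to the documents it cites. A graph database would express the same walk as a query; the document ids and the function name `expand_from_seed` are hypothetical.

```python
from collections import deque


def expand_from_seed(citations, seed, max_hops=2):
    """Breadth-first walk over the citation graph, ignoring edge direction.

    Returns {document id: number of hops from the seed} for every document
    reachable within max_hops, excluding the seed itself.
    """
    neighbours = {}
    for citing, cited_list in citations.items():
        for cited in cited_list:
            neighbours.setdefault(citing, set()).add(cited)
            neighbours.setdefault(cited, set()).add(citing)
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[seed]
    return distance


# Hypothetical citation metadata: citing document -> cited documents.
citations = {"P1": ["P2", "P3"], "P4": ["P2"], "P5": ["P4"], "P6": ["P5"]}
candidates = expand_from_seed(citations, "P1", max_hops=2)
print(candidates)
```

The walk only works once a seed is known, and it never looks at the text of any document.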
But where do we start with an incoming patent
application?
[Figure: an incoming patent application, marked with a question mark]
This has been implemented during the last 1-3
years, but
• The literature suggests that we have hit a ceiling with our parameter optimization strategies for classic IR methods
• We ignore the huge non-patent literature (NPL) part of the citations
• The problem becomes worse every day (~3000 applications per week)
A searcher tries to work around “meaning” by:
• Proximity queries simulate or approximate “meaning”
• Assumption: certain distances transport more meaning than others (e.g. 3w or p).
• We want to ask “Give me all documents that are relevant with regard to treatment of migraine pain with Aspirin”
• But we actually ask “Migraine AND Pain AND Aspirin” or many variants of that.
• Classification is a very strong aid, representing a meaningful relation <belongs to>
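A simplified reading of a word-distance operator such as 3w, sketched below: two terms match if they occur at most a given number of token positions apart, in either order. The exact semantics of the real operators (ordering, counting, paragraph scope for p) are not given on the slide, so the function `within` and its defaults are assumptions.

```python
def within(text, term_a, term_b, max_distance=3):
    """True if term_a and term_b occur at most max_distance token positions apart."""
    tokens = [tok.strip(".,;:()").lower() for tok in text.split()]
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


# Both sentences satisfy "Aspirin AND Migraine"; only the first passes the proximity test.
near = within("Aspirin relieves migraine pain.", "aspirin", "migraine")
far = within(
    "Aspirin is sold widely; a separate study covers migraine.",
    "aspirin",
    "migraine",
)
print(near, far)
```

The distance limit stands in for a relation between the terms that the query language cannot state directly.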
What does search actually mean?
• Claim 1: A composition comprising a combination of paracetamol and aspirin for use in the treatment of migraine pain in a human subject.
• Claim 2: A composition according to claim 1 where the composition further comprises caffeine.
A Knowledge Map of Claims 1 & 2
• Claim 1: A method for treating migraine pain comprising administering to a human subject a composition comprising a combination of paracetamol and aspirin.
• Claim 2: A method according to claim 1 where the composition further comprises caffeine.
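One possible way to hold such a knowledge map in code is as subject-relation-object triples, with a dependent claim inheriting everything from the claim it refers to. The relation names (`treats`, `administers`, `has_subject`, `comprises`) and the helper `knowledge_map` are assumptions chosen for this sketch, not a schema taken from the slides.

```python
# Hand-written triples for the two method claims above (illustrative only).
claims = {
    1: {
        "depends_on": None,
        "triples": {
            ("method", "treats", "migraine pain"),
            ("method", "administers", "composition"),
            ("method", "has_subject", "human"),
            ("composition", "comprises", "paracetamol"),
            ("composition", "comprises", "aspirin"),
        },
    },
    2: {
        "depends_on": 1,
        "triples": {("composition", "comprises", "caffeine")},
    },
}


def knowledge_map(claims, number):
    """Collect the triples of a claim plus those of every claim it depends on."""
    triples = set()
    while number is not None:
        triples |= claims[number]["triples"]
        number = claims[number]["depends_on"]
    return triples


map_1 = knowledge_map(claims, 1)
map_2 = knowledge_map(claims, 2)
print(sorted(map_2 - map_1))
```

The set difference between the two maps is exactly what claim 2 adds, which is the kind of comparison a keyword query cannot express.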
What is the Δ of Prior Art and the Application?
We use meta-data knowledge maps with simple
relations already
Moving towards real knowledge maps
• Normalized annotations are one step towards semantic search, connecting mentions in patents with normalized entities
• Good coverage for biomedical domains
• Lack of good terminologies for everything else
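A minimal dictionary-based sketch of normalized annotation: surface mentions are mapped to one canonical entity so that synonyms meet at search time. The tiny `TERMINOLOGY` table is hand-made for illustration; a real system would draw on a curated terminology, which is exactly what the slide says is missing outside the biomedical domain.

```python
import re

# Hypothetical mini-terminology: surface form -> normalized entity name.
TERMINOLOGY = {
    "aspirin": "acetylsalicylic acid",
    "acetylsalicylic acid": "acetylsalicylic acid",
    "paracetamol": "paracetamol",
    "acetaminophen": "paracetamol",
}


def annotate(text, terminology=TERMINOLOGY):
    """Return (mention, normalized entity) pairs in order of appearance."""
    lowered = text.lower()
    found = []
    for synonym, entity in terminology.items():
        for match in re.finditer(r"\b" + re.escape(synonym) + r"\b", lowered):
            found.append((match.start(), synonym, entity))
    return [(mention, entity) for _, mention, entity in sorted(found)]


def entities(text):
    """The set of normalized entities mentioned in the text."""
    return {entity for _, entity in annotate(text)}


pairs = annotate("Acetaminophen and aspirin")
same = entities("paracetamol and aspirin") == entities(
    "acetaminophen and acetylsalicylic acid"
)
print(pairs, same)
```

Two texts that share no drug name as a string still resolve to the same entity set.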
Approaches do exist
Patents have multi-modal information content: Images
• Images
– Chemical Formulas
– Flow Diagrams
– Circuits
– Technical Drawings
Google Image Search
Image Search
[Figure: image search; labels: Query, Search Space, State of the art Image processing, Filtering and Visualisation]
Image Search using S&K prototype
Modelling Search – which direction do we go?
[Figure: labels PA and X]
Is modelling the Examiner the best choice?
Technologies that can guide us
• Enrichment and Annotations
• Natural Language Processing
• Topic Modelling
• Information Extraction
• Knowledge Bases
• Visualisation Techniques
• Workflow Management
• Information Retrieval
• Modelling the Search Process
• Knowledge Organisation Systems
Future Search Ecosystem bringing together many
technologies
• Captured domain knowledge makes it possible to merge and retrieve relevant third-party documents/results
• “Machine” understanding of the application allows for “Auto-Query” generation
• IR system retrieves relevant documents from the query
• Enrichment allows “semantic” search
• Examiner is “Search Pilot”
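A deliberately naive sketch of what "Auto-Query" generation could look like: pick out the terms of a claim that appear in a domain vocabulary and join them into a Boolean query. The vocabulary and the function `auto_query` are hypothetical; real machine understanding of an application would need far more than term lookup.

```python
def auto_query(claim_text, vocabulary):
    """Build a Boolean AND query from the vocabulary terms found in a claim."""
    tokens = [tok.strip(".,;:()").lower() for tok in claim_text.split()]
    terms = []
    for tok in tokens:
        if tok in vocabulary and tok not in terms:  # keep first occurrence only
            terms.append(tok)
    return " AND ".join(terms)


# Hypothetical domain vocabulary.
vocabulary = {"migraine", "pain", "paracetamol", "aspirin", "caffeine"}
claim = (
    "A composition comprising a combination of paracetamol and aspirin "
    "for use in the treatment of migraine pain in a human subject."
)
query = auto_query(claim, vocabulary)
print(query)
```

The examiner, as "Search Pilot", would then refine or redirect such a generated query instead of writing it from scratch.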
Thank you for your attention
aklenner@epo.org
II-SDV 2015, 20 - 21 April, in Nice
