Search Basics
Sander Kieft
Search Basics presentation from August 6th, 2012.
Search Basics
1.
Search - Find the rabbit..
2.
Agenda
• Search Basics
• Features
• Search solutions
  » MySQL (Full-Text search and Sphinx)
  » Solr
  » ElasticSearch
• Sanoma Content Library
• Common gotchas
24.4.2015 © Sanoma Media
3.
Basics - A.B.C. of search
4.
High level components
• Filtering
• Indexing
• Querying
• Ranking
5.
High level components - Filtering techniques
• Tokenizing
• Stop Words
• Synonyms
• Stemming
• Term occurrence
• Phonetics
6.
High level components - Filtering techniques
Tokenizing
Example: The quick brown fox jumps over a lazy dog
7.
High level components - Filtering techniques
Tokenizing
• Special characters: +/-,.;!@#$%^& etc.
  » I.B.M.
• Case and numeric changes:
  » PowerShot, TransAM, SD500, iPod
• Decide what you want to happen with:
  » Canon Power-Shot SD500 (Canon Power shot SD-500, Canon Powershot SD 500)
  » O’neill’s
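The tokenizing decisions on this slide can be sketched in a few lines of Python. This is only an illustration under assumed rules (drop periods and apostrophes, split on other special characters, split on case and letter/digit changes); the function name `tokenize` and the rules themselves are not from the deck, and a real analyzer in Solr or ElasticSearch is configured rather than hand-written.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lowercase tokens.

    Illustrative rules (assumptions, not the deck's settings):
    - periods and apostrophes inside a word are dropped ("I.B.M." -> "ibm")
    - any other non-alphanumeric character separates tokens
    - a case change or a letter/digit change starts a new token
    """
    text = re.sub(r"[.'’]", "", text)
    tokens = []
    for chunk in re.split(r"[^A-Za-z0-9]+", text):
        # Runs of digits, ALL-CAPS runs, or (Capitalized) lowercase words.
        parts = re.findall(r"[0-9]+|[A-Z]+(?![a-z])|[A-Z]?[a-z]+", chunk)
        tokens.extend(part.lower() for part in parts)
    return tokens


print(tokenize("Canon Power-Shot SD500"))
print(tokenize("Canon Powershot SD 500"))
```

With these rules "Power-Shot" becomes two tokens while "Powershot" stays one, so the two spellings would not match each other without an extra step (for example also indexing the joined form). That is the kind of decision the slide asks you to make up front.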
8.
High level components - Filtering techniques
Stop Words
• Remove stop words from being indexed
• No value, since they’re too common
Example: The quick brown fox jumps over a lazy dog => quick brown fox jumps lazy dog
Stop words: a, able, about, across, after, all, almost, also, am, among, an, and, any, are, as, at, be, because, been, but, by, can, cannot, could, dear, did, do, does, either, else, ever, every, for, from, got, had, have, h...
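As a minimal sketch of the stop-word step, assuming a tiny hand-picked stop list (the set below is illustrative and far shorter than any real list):

```python
# Tiny illustrative stop list; real lists are much longer.
STOP_WORDS = {"a", "the", "over"}


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


sentence = "The quick brown fox jumps over a lazy dog"
print(remove_stop_words(sentence.split()))
# ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```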
9.
High level components - Filtering techniques
Synonyms
• De-duplicate various words:
  » bicycle, cycle, bike
  » i-pod, ipot => iPod
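Synonym handling can be pictured as a lookup that maps every variant to one canonical term. The mapping below is a hypothetical example built from the words on the slide; choosing "bicycle" as the canonical form and lowercasing everything are assumptions made for the sketch.

```python
# Hypothetical synonym table: variant -> canonical term.
SYNONYMS = {
    "cycle": "bicycle",
    "bike": "bicycle",
    "i-pod": "ipod",
    "ipot": "ipod",
}


def normalize_synonyms(tokens):
    """Lowercase each token and replace known variants by their canonical term."""
    return [SYNONYMS.get(token.lower(), token.lower()) for token in tokens]


print(normalize_synonyms(["Bike", "repair", "i-pod"]))
# ['bicycle', 'repair', 'ipod']
```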
10.
High level components - Filtering techniques
Stemming
• Determine the stem of a word:
  » Dogs => dog
  » Recharging => recharg
  » Rechargeable => recharg
• Language specific:
  » Porter for English (-s, -ed, -ly, -ing, etc.)
  » SnowballPorter or Kraaij-Pohlmann for Dutch (ge-, -en, etc.)
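To make the idea of suffix stripping concrete, here is a toy stemmer. It is not the Porter algorithm, only a handful of assumed rules chosen so that the three examples on the slide come out the same way; a real stemmer has many more rules and conditions.

```python
# Toy suffix list; checked in this order, first match wins.
SUFFIXES = ("able", "ing", "ed", "ly", "s")


def toy_stem(word):
    """Strip one common English suffix, then a trailing 'e'. Not Porter."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters of stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


for example in ("Dogs", "Recharging", "Rechargeable"):
    print(example, "=>", toy_stem(example))
```

Because "Recharging" and "Rechargeable" reduce to the same stem, a query for one can match documents containing the other.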
11.
High level components - Filtering techniques
Term occurrence
• Options for limiting the size of the index:
  » Minimum term frequency
  » Minimum term length
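A rough sketch of both options, assuming they are applied to a flat list of tokens before indexing. The function name `prune_terms` and the default thresholds are made up for illustration.

```python
from collections import Counter


def prune_terms(tokens, min_length=2, min_frequency=2):
    """Keep only terms that are long enough and occur often enough."""
    counts = Counter(tokens)
    return {
        term: count
        for term, count in counts.items()
        if len(term) >= min_length and count >= min_frequency
    }


print(prune_terms(["a", "a", "fox", "fox", "dog"]))
# {'fox': 2}
```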
12.
High level components - Filtering techniques
Phonetics
• Handling "sounds like" queries:
  » Robert => R163 <= Rupert
  » Smith => (SM0,XMT) ∩ (XMT,SMT) <= Schmith
• Various methods available:
  » DoubleMetaphone
  » Metaphone
  » Soundex
  » RefinedSoundex
  » Caverphone
  » BeiderMorse
• Levenshtein can be used during querying
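The R163 code for Robert and Rupert is what the classic Soundex scheme produces. Below is a small sketch of American Soundex written for this example (the function name and structure are mine, not the deck's); the Smith/Schmith codes on the slide come from a Double Metaphone style encoder, which is more involved and not reproduced here.

```python
# Letter groups of American Soundex; vowels, h, w and y carry no code.
CODES = {}
for letters, digit in (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the 4-character Soundex code of a name."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the previous code
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
# R163 R163
```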
13.
High level components
Apply the same filters on indexing and querying
14.
High level components - Indexing
Filters: stopwords, stemming, synonyms, etc.
15.
High level components - Querying
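The indexing and querying slides can be tied together with a minimal inverted index. The sketch assumes a deliberately simple analyzer (lowercase, whitespace split, three stop words) and AND semantics for multi-term queries; all names are hypothetical. The point it illustrates is that the same `analyze` function runs at index time and at query time.

```python
from collections import defaultdict

STOP_WORDS = {"a", "the", "over"}  # illustrative only


def analyze(text):
    """Shared filter chain used for both documents and queries."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "The quick brown fox", 2: "A lazy dog", 3: "The quick dog"}
index = build_index(docs)
print(search(index, "the QUICK dog"))
# {3}
```

If the query were analyzed differently from the documents (for example without lowercasing), "QUICK" would never match the indexed term "quick".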
16.
DEMO - Stemming, Phonetics
17.
High level components - Ranking
TF-IDF: Term Frequency-Inverse Document Frequency
• How often does the search term occur in the text
• How many words are in the entire text
18.
High level components - Ranking - TF-IDF
• 3/12 = 0.25 (more relevant)
• 5/24 = 0.21
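The two ratios on this slide are term frequencies: occurrences of the term divided by the number of words in the text. A short sketch follows. The `inverse_document_frequency` function uses the plain log(N/df) form, which is one common variant and an assumption here, since the slide does not give a formula for the IDF part.

```python
import math


def term_frequency(term_count, total_words):
    """Share of the words in a text that are the search term."""
    return term_count / total_words


def inverse_document_frequency(total_docs, docs_with_term):
    """One common IDF variant (assumed): rarer terms get a higher weight."""
    return math.log(total_docs / docs_with_term)


def tf_idf(term_count, total_words, total_docs, docs_with_term):
    return term_frequency(term_count, total_words) * inverse_document_frequency(
        total_docs, docs_with_term
    )


short_text = term_frequency(3, 12)
long_text = term_frequency(5, 24)
print(round(short_text, 2), round(long_text, 2))
# 0.25 0.21
```

The longer text contains the term more often in absolute numbers, but the shorter text has the higher ratio and therefore ranks as more relevant.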
19.
USER PATTERNS
20.
User patterns
• Features should be adjusted to the user and usage patterns you're seeing
• What are users searching for on your site?
• How are they searching for it?
• Use web analytics to track and improve your search behavior
Image credits: http://www.flickr.com/photos/morville/collections/72157604060564791/
21.
User patterns - Quit
22.
User patterns - Pogosticking
23.
User patterns - Thrashing
24.
User patterns - Narrow
25.
User patterns - Others
• Pearl Growing
• Expand
26.
Search Features
27.
Search Features
• Faceting
• Autocomplete
• More like this..
• Highlighting
• Spellchecking: did you mean
• Geospatial: “bike repair” in area of [long,lat],[long,lat]
• Boosting: when title is more relevant than content
• Elevation: always get a certain result at position n; get the current weather, current traffic at 1st position or ingest ads
28.
Search Features - Faceting
From the user perspective, faceted search (also called faceted navigation, guided navigation, or parametric search) breaks up search results into multiple categories, typically showing counts for each, and allows the user to "drill down" or further restrict their search results based on those facets.
29.
Search Features - Autocomplete
30.
Search Features - More like this..
• Give you the related items based on a document
• Compares the Term Vectors of various documents
• Creates a query with boosting:
  body:pre body:username^.56974 body:column^.57123 body:oracle^.61915 ...

Term | Number of Instances of Term in Document | Number of Documents Matching Term | IDF value | Score
pre | 18 | 26 | 4.609916 | 82.978
username | 10 | 23 | 4.7276993 | 47.276
column | 9 | 13 | 5.266696 | 47.400264
oracle | 9 | 8 | 5.7085285 | 51.376
alter | 7 | 1 | 7.212606 | 50.488
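The Score column in the table is the number of instances multiplied by the IDF value, and the boosts in the generated query appear to be each score divided by the highest score. The second observation is inferred from the numbers on the slide, not stated by it. A quick check in Python:

```python
# (term, instances in document, IDF value) copied from the table.
rows = [
    ("pre", 18, 4.609916),
    ("username", 10, 4.7276993),
    ("column", 9, 5.266696),
    ("oracle", 9, 5.7085285),
    ("alter", 7, 7.212606),
]

# Score = term frequency in the document * IDF.
scores = {term: count * idf for term, count, idf in rows}

# Assumed normalisation: boost relative to the best-scoring term.
top_score = max(scores.values())
boosts = {term: score / top_score for term, score in scores.items()}

for term in sorted(scores, key=scores.get, reverse=True):
    print(f"body:{term}^{boosts[term]:.5f}  (score {scores[term]:.3f})")
```

The printed boosts can differ from the slide in the last digit because of rounding.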
31.
Search Features - Highlighting
• Highlighting the search terms
• Includes stemming and other logic
32.
DEMO - SOLR
33.
SOLUTIONS
34.
Services - Common search options
• MySQL based
  » Native Full-Text search
  » Sphinx Search Plugin
• Lucene based (Java)
  » Apache Lucene/Solr
  » ElasticSearch
35.
Services - Common search options
[Chart: the search options plotted by ease of use against power]
36.
MySQL Based - Native Full-Text vs Sphinx
MySQL Full-Text search
• Only for MyISAM tables, and only on CHAR, VARCHAR and TEXT fields
• Only standard English stop words
• Limited query capabilities
• Slow on large collections (1GB+)
• Building faceting is “hard” and “expensive”
• No stemming, no synonyms, no custom fields, no highlighting
Sphinx
• External plugin
• All storage engines
• Also on numeric field types
• ~3x faster on index and query
• Simple stemming and synonyms
• No custom fields, no highlighting
37.
Querying is easy
• MySQL Full-Text query:
  SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database');
• Getting the score:
  SELECT id, MATCH (title,body) AGAINST ('Tutorial') FROM articles;
• Sphinx query, index is separate table:
  SELECT id, created_time, @weight FROM my_sphinx_index
  WHERE created_time BETWEEN X AND Y AND MATCH ('Android phone')
  ORDER BY @weight DESC, created_time DESC
38.
Lucene based
ElasticSearch
• Simpler Solr
• No need for a schema
• Easy to cluster
• Focus on scaling and realtime
• Go with the defaults
• Configuration = 3 lines
• Percolation!
• Versions and TTLs
Solr
• Exposing all of the Lucene power
• Clustering possible, but harder
• Focus on complete and customizable
• Defaults?
• Configuration = 3,000 lines
39.
Solr vs ElasticSearch - Search Fresh Index While Idle
[Bar chart: search time in ms, ElasticSearch vs Solr. Lower is better]
40.
Solr vs ElasticSearch - Search Fresh Index While Indexing 1 doc/3 sec
[Bar chart: search time in ms, ElasticSearch vs Solr. Lower is better]
41.
Solr vs ElasticSearch - Search Full Index While Indexing 1 doc/3 sec
[Bar chart: search time in ms, ElasticSearch vs Solr. Lower is better]
42.
Solr vs ElasticSearch - Search Full Index While Indexing 1 doc/3 sec
[Bar chart: search time in ms, ElasticSearch vs Solr, for three scenarios: Idle, Indexing, Full + Indexing. Lower is better]
43.
Solr vs ElasticSearch
[Chart: SOLR vs ElasticSearch. Lower is better]
44.
Querying with Solr and ElasticSearch
Solr
• Normal query
  http://../solr?q=field:banana
• Faceting
  http://../solr?q=field:banana&facet=on&facet.field=tags
ElasticSearch
• Normal query
  http://../_search?q=field:value
• Advanced queries, via POST:
  POST http://../collection/_search
  {
    "query": { "query_string": { "query": "T*" } },
    "facets": { "tags": { "terms": { "field": "tags" } } }
  }
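As a sketch of how a client might assemble these two requests, the helpers below build the Solr URL and the ElasticSearch request body without sending anything. The function names and the `http://example.invalid/solr` base URL are placeholders invented for the example, since the slide elides the host; the `facets` key mirrors the slide, and later ElasticSearch releases replaced facets with aggregations.

```python
import json
from urllib.parse import urlencode


def solr_query_url(base_url, field, value, facet_field=None):
    """Build a Solr query URL, optionally asking for facet counts on one field."""
    params = [("q", f"{field}:{value}")]
    if facet_field:
        params += [("facet", "on"), ("facet.field", facet_field)]
    return f"{base_url}?{urlencode(params)}"


def es_request_body(query_string, facet_field):
    """Build the JSON body from the slide: a query_string query plus a terms facet."""
    return {
        "query": {"query_string": {"query": query_string}},
        "facets": {facet_field: {"terms": {"field": facet_field}}},
    }


# Placeholder host; the colon in the query is percent-encoded by urlencode.
print(solr_query_url("http://example.invalid/solr", "field", "banana", facet_field="tags"))
print(json.dumps(es_request_body("T*", "tags"), indent=2))
```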
45.
ElasticSearch
46.
SANOMA CONTENT LIBRARY
47.
Sanoma Content Library
• Search: .. in site, .. in cluster, .. in network, Elevation (ads), Faceting
• Related: More like this
• Relevant: ads, Products
• Reuse: Sharing, Variants (simple), DRM, Images
• Analyse: Sentiment, Named Entities, Tagging, Classification, Key phrases
48.
Services: Content Library
[Architecture diagram. Labels: Content Library; Analyse Pipeline (NER, Sentiment, Keyphrase extractor, Classifier); Crawler; Indexer; Search index; Solr; Mongo; Loader; API; Edge; Redirects; Search (nu.nl, wtf); Related (Vrouwen, Kieskeurig); Relevant (Txel); Integration (Vrouwen, Wordpress, SAS); CMS / JCR]
49.
Common gotchas
• Use the right settings for your language: stopwords and stemming
• Indexing too much or too detailed:
  » Timestamps
50.
END
Editor's notes
MySQL stopwords => myisam/ft_static.h. Also no multi-term
Search Percolation