ElasticSearch/Elastica
Nicolas Badey
About me
Yesterday CTO of Yoopies
Tomorrow CTO of Expertissim
SfLive is magic !
What is it ?
● “Distributed, RESTful, Search Engine built on top of Apache Lucene”
● Easy to install : aptitude install elasticsearch
● Easy to use, you will love JSON
● Denormalizing your data
Features
- Scoring : Calculate relevance, boost, Score Scripting
- Analyzers : a Tokenizer with TokenFilters and CharFilters
- GeoLocation
- Facets => Aggregations
- Highlighting
- Scripting
- Percolator : Prospective search
- 3-layer cache
- Plugins (attachment type, River …)
- Suggester : autocompletion and more
Why ElasticSearch
● For the search engine: we had reached the efficiency and functional limits of SQL
● An easy solution for a first approach to Search Engine
● Denormalize our data for search
● Used in: search forms, cron jobs, SEO pages, business metrics...
Elastica / ElasticaBundle
● Automatic persistence providers: Doctrine / Propel / MongoDB
● Pagination: PagerFanta / KnpPaginator
● Persistence listener callbacks (Doctrine only)
● Populate
In the end we no longer use it for these; we keep it only for index config and services:
Index, Type, Finder, Client
Search
curl -XGET 'http://localhost:9200/[INDEX]/[TYPE]/_search' -d '{
  "query": {
    "query_string": {
      "query": "foobar"
    }
  },
  "filter": {
    "numeric_range": {
      "price": {
        "lte": 42
      }
    }
  },
  "sort": {
    "created_at": {
      "order": "desc"
    }
  }
}'
Query:
- Relevance
- Scoring
Filter :
- Discriminate
- Cached
- Fast
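
The split between the scored query and the unscored filter can be sketched in Python as a plain dict that serializes to the same JSON body as the curl call. This is only an illustration: the function name `build_search_body` and its parameters are hypothetical, and the body keeps the slide's own keys (`query_string`, `numeric_range`, `created_at`) without claiming they are valid for every ElasticSearch version.

```python
import json


def build_search_body(text, max_price):
    """Build the search body shown on the slide as a plain dict.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return {
        # Query part: takes part in relevance scoring.
        "query": {"query_string": {"query": text}},
        # Filter part: only keeps or discards documents, no scoring.
        "filter": {"numeric_range": {"price": {"lte": max_price}}},
        # Newest documents first.
        "sort": {"created_at": {"order": "desc"}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("foobar", 42), indent=2))
```

The resulting JSON string can be passed to curl with `-d` or sent by any HTTP client.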
Search
ETL
● Extract all ads from SQL, transform them, then load them into ElasticSearch
● Don’t use “Populate” for large projects
● Still in PHP and Symfony2, so we can reuse our Model layer (or not...)
● DoctrineListener as AMQP publisher for live indexing
● It needs to be fast: with PDO & curl, 10 types and 500 000 ads take 5 min
● Next: decouple it from Symfony using the Console component
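
One common way to make the Load step fast is ElasticSearch's bulk endpoint (`_bulk`), which takes newline-delimited JSON: one action line, then one document line, for each document. The slides do not say whether the talk's ETL used it, so the sketch below is an assumption; `build_bulk_body`, the index name `ads`, the type name `sitter`, and the `id` key on each row are all hypothetical.

```python
import json


def build_bulk_body(index, doc_type, rows):
    """Serialize rows into a newline-delimited bulk body.

    Assumption: each row is a dict with an "id" key (hypothetical schema).
    """
    lines = []
    for row in rows:
        # Action line: tells the bulk endpoint where the next document goes.
        action = {"index": {"_index": index, "_type": doc_type, "_id": row["id"]}}
        lines.append(json.dumps(action))
        # Source line: the denormalized document itself.
        lines.append(json.dumps(row))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    rows = [{"id": 1, "title": "Babysitter in Paris"},
            {"id": 2, "title": "Nanny in Lyon"}]
    print(build_bulk_body("ads", "sitter", rows), end="")
```

Sending a few thousand documents per request instead of one request per document is usually where most of the time is saved.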
Usage
SitterForm
SitterSearch
SitterQuery extends ElasticaQuery
QueryFactory
ResultSet
PagerFanta
ElasticaAdapter
SearchManager
A Good FullText Search
● MultiMatch Query : Search text in multiple fields
● Highlighting : Highlight words in documents
● Suggester : Do autocompletion
● Find a compromise between relevance and number of results
Multi Match Query
Subfields for full-text search: my_field.fr and my_field.en
“Regular” field: “my_field”
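
A field with per-language subfields can be sketched as a multi-field mapping. This is a guess at the shape of the mapping, not the talk's actual configuration: the helper name is hypothetical, the `string` type matches ElasticSearch versions of that era (later versions use `text`), and using the built-in `french` and `english` analyzers is an assumption.

```python
import json


def multilingual_field_mapping(field_name):
    """Return a mapping fragment for one field with .fr and .en subfields.

    Hypothetical helper; analyzer choice is an assumption.
    """
    return {
        field_name: {
            # The "regular" field, addressed as my_field.
            "type": "string",
            "fields": {
                # Addressed as my_field.fr, analyzed as French text.
                "fr": {"type": "string", "analyzer": "french"},
                # Addressed as my_field.en, analyzed as English text.
                "en": {"type": "string", "analyzer": "english"},
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(multilingual_field_mapping("my_field"), indent=2))
```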
Multi Match Query
A boost of 3 on content’s subfields
All of title’s subfields, but not title itself
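
The two annotations above can be sketched as a `multi_match` body. The field names `title` and `content` come from the slide; the exact query used in the talk is not in the transcript, so the wildcard patterns and the helper name `build_multi_match` are assumptions.

```python
import json


def build_multi_match(text):
    """Build a multi_match query over language subfields (illustrative)."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [
                    # All subfields of title (title.fr, title.en, ...),
                    # but not the plain "title" field itself.
                    "title.*",
                    # All subfields of content, boosted by 3.
                    "content.*^3",
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_multi_match("foobar"), indent=2))
```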
Highlight with MultiMatch
Suggester
Percolator
● Index each user’s search query in a “percolator index”
● When an ad is registered, send it to both the regular index and the percolator
● The names of the matching percolator queries are returned
● You can then alert users that an ad matching their alert has just been registered
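
The two steps, registering a saved search and then percolating a new ad, can be sketched as request descriptions. The paths follow the percolator API of ElasticSearch 1.x (`.percolator` type and `_percolate` endpoint), which later versions changed; the index name `ads`, the type `sitter`, the alert id, and both helper names are hypothetical.

```python
def register_alert_request(index, alert_id, query):
    """Describe the request that stores a user's saved search.

    Returns (method, path, body); nothing is sent over the network.
    """
    return ("PUT", "/%s/.percolator/%s" % (index, alert_id), {"query": query})


def percolate_request(index, doc_type, ad):
    """Describe the request that asks which saved searches match a new ad."""
    return ("GET", "/%s/%s/_percolate" % (index, doc_type), {"doc": ad})


if __name__ == "__main__":
    print(register_alert_request("ads", "alert-42",
                                 {"match": {"title": "babysitter"}}))
    print(percolate_request("ads", "sitter",
                            {"title": "Babysitter available in Paris"}))
```

The ids returned by the percolate call identify the saved searches that match, which is enough to look up the users to notify.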
Aggregator
Score Scripting
in /etc/elasticsearch/scripts/grade.groovy :
doc['average_grade'].value > 3.5 ? _score * doc['average_grade'].value : _score
in /etc/elasticsearch/scripts/login.groovy :
doc['lastLogin'].value < minLastLogin ? _score * 0.5 : _score
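
To make the arithmetic of the two Groovy scripts explicit, here is the same logic as plain Python functions. This only mirrors the expressions above for illustration; it is not how ElasticSearch runs them, and the function and parameter names are hypothetical.

```python
def grade_score(score, average_grade):
    """Mirror of grade.groovy: multiply the score by the grade above 3.5."""
    return score * average_grade if average_grade > 3.5 else score


def login_score(score, last_login, min_last_login):
    """Mirror of login.groovy: halve the score for users inactive too long."""
    return score * 0.5 if last_login < min_last_login else score


if __name__ == "__main__":
    print(grade_score(2.0, 4.0))       # grade above 3.5: score is multiplied
    print(grade_score(2.0, 3.0))       # grade at or below 3.5: unchanged
    print(login_score(2.0, 100, 200))  # last login too old: score is halved
```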
Errors: easy to understand :)
● Most of the time they are due to strict typing (a string sent instead of an int)
● Keep an eye on the free disk space when indexing

ElasticSearch & Elastica in Symfony2 - SfLive 2015
