SlideShare una empresa de Scribd logo
1 de 96
Descargar para leer sin conexión
“ Terms of Endearment” The ElasticSearch query language explained Clinton Gormley, YAPC::EU 2011 DRTECH @clintongormley
search for : “ DELETE QUERY ”  We can
search for : “ DELETE QUERY ”  and find : “ deleteByQuery ” We can
but you can only find  what is stored in the database
Normalise values  “ deleteByQuery” 'delete' 'by' 'query' 'deletebyquery'
Normalise values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
Normalise  values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
Analyse  values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
What is stored in ElasticSearch?
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user  => { name  => "Clinton Gormley", email => "drtech@cpan.org", }, tags  => [" perl" ,"opinion"],  posts  => 2, } Document:
{ tweet   => "Perl is GREAT!", posted  => "2011-08-15", user   => { name   => "Clinton Gormley", email  => "drtech@cpan.org", }, tags   => [" perl" ,"opinion"],  posts   => 2, } Fields:
{ tweet  =>  "Perl is GREAT!", posted =>  "2011-08-15", user  =>  { name  =>  "Clinton Gormley", email =>  "drtech@cpan.org", }, tags   =>  [" perl" ,"opinion"],  posts  =>  2, } Values:
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user  => { name  => "Clinton Gormley", email => "drtech@cpan.org" }, tags  => [" perl" ,"opinion"],  posts  => 2, } Field types: # object # string # date # nested object # string # string # array of enums # integer
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user   => { name  => "Clinton Gormley", email  => "drtech@cpan.org", }, tags  => [" perl" ,"opinion"],  posts  => 2, } Nested objects flattened:
{ tweet  => "Perl is GREAT!", posted  => "2011-08-15", user.name  => "Clinton Gormley", user.email  => "drtech@cpan.org", tags  => [" perl" ,"opinion"],  posts  => 2, } Nested objects flattened
{ tweet  =>  "Perl is GREAT!", posted  =>  "2011-08-15", user.name  =>  "Clinton Gormley", user.email =>  "drtech@cpan.org", tags  =>  [" perl" ,"opinion"],  posts  =>  2, } Values analyzed into terms
{ tweet  =>  ['perl','great'], posted  =>  [Date(2011-08-15)], user.name  =>  ['clinton','gormley'], user.email =>  ['drtech','cpan.org'], tags  =>  [' perl' ,'opinion'],  posts  =>  [2], } Values analyzed into terms
database table row ⇒  many tables ⇒  many rows ⇒  one schema ⇒  many columns In MySQL
index type document ⇒  many types ⇒  many documents ⇒  one mapping ⇒  many fields In ElasticSearch
Create index with mappings $es-> create_index ( index  => 'twitter', mappings   => { tweet   => { properties  => { title  => { type => 'string' }, created => { type => 'date'  } }  } } );
Add a mapping $es-> put_mapping (  index => 'twitter', type  => ' user ', mapping   => { properties  => { name  => { type => 'string' }, created => { type => 'date'  }, }  } );
Can add to existing mapping
Can add to existing mapping Cannot change mapping for field
Core field types { type  => 'string', }
Core field types { type  => 'string', # byte|short|integer|long|double|float # date, ip addr, geolocation # boolean # binary (as base 64) }
Core field types { type  => 'string', index  => ' analyzed ', # 'Foo Bar'  ⇒  [ 'foo', 'bar' ] }
Core field types { type  => 'string', index  => ' not_analyzed ', # 'Foo Bar'  ⇒  [ 'Foo Bar' ] }
Core field types { type  => 'string', index  => ' no ', # 'Foo Bar'  ⇒  [ ] }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', }
Core field types { type  => 'string', index  => 'analyzed', index_ analyzer  => 'default', search_ analyzer => 'default', }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', boost  => 2, }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', boost  => 2, include_in_all  => 1 |0 }
[object Object]
Simple
Whitespace
Stop
Keyword Built in analyzers ,[object Object]
Language
Snowball
Custom
The Brown-Cow's Part_No.  #A.BC123-456 joe@bloggs.com keyword: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com whitespace: The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com simple: the, brown, cow, s, part, no, a, bc, joe, bloggs, com standard: brown, cow's, part_no, a.bc123, 456, joe, bloggs.com snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
Token filters ,[object Object]
ASCII Folding
Length
Lowercase
NGram
Edge NGram
Porter Stem
Shingle
Stop
Word Delimiter ,[object Object]
KStem
Snowball
Phonetic
Synonym
Compound Word
Reverse
Elision
Truncate
Unique
Custom Analyzer $c->create_index( index  => 'twitter', settings  => { analysis => { analyzer => { ascii_html => { type  => 'custom', tokenizer  => 'standard', filter  => [ qw( standard lowercase asciifolding stop ) ], char_filter => ['html_strip'] } } }} );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet',  );
Searching $result = $es->search( index  =>  ['twitter','facebook'] , type  =>  ['tweet','post'] ,  );
Searching $result = $es->search( #  all indices #  all types );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet',  query  => { text => { _all => 'foo' }}, );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query b   =>  'foo' , #  b == ElasticSearch::SearchBuilder );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query  => { text => { _all => 'foo' }}, sort  => [{ '_score': 'desc' }] );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query  => { text => { _all => 'foo' }}, sort  => [{ '_score': 'desc' }] from  => 0, size  => 10, );
Query DSL
Queries   vs  Filters
Queries   vs  Filters  ,[object Object],[object Object]
Queries   vs  Filters  ,[object Object]
relevance scoring ,[object Object]
no scoring
Queries   vs  Filters  ,[object Object]
relevance scoring
slower ,[object Object]
no scoring
faster
Queries   vs  Filters  ,[object Object]
relevance scoring
slower
no caching ,[object Object]
no scoring
faster
cacheable
Queries   vs  Filters  ,[object Object]
relevance scoring
slower
no caching ,[object Object]
no scoring
faster
cacheable  Use filters for anything that doesn't affect the relevance score!
Query only Query DSL: $es->search(  query => {  text => { title => 'perl' }  } ); SearchBuilder: $es->search(  query b  => {  title => 'perl'  } );
Filter only Query DSL: $es->search( query => { constant_score => { filter => { term => { tag => 'perl } } } }); SearchBuilder: $es->search( query b  => { -filter => {  tag => 'perl'  } });
Query and filter Query DSL: $es->search( query => { filtered  => { query => {  text => { title => 'perl' } }, filter =>{  term => { tag => 'perl'  } } } }); SearchBuilder: $es->search( query b  => { title  => 'perl', -filter => {  tag => 'perl'  }  });

Más contenido relacionado

La actualidad más candente

Module 2: Adding JavaScript to APEX Apps
Module 2: Adding JavaScript to APEX AppsModule 2: Adding JavaScript to APEX Apps
Module 2: Adding JavaScript to APEX AppsDaniel McGhan
 
엘라스틱 서치 세미나
엘라스틱 서치 세미나엘라스틱 서치 세미나
엘라스틱 서치 세미나종현 김
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation FrameworkMongoDB
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshellFabien Gandon
 
Building Next-Generation Web APIs with JSON-LD and Hydra
Building Next-Generation Web APIs with JSON-LD and HydraBuilding Next-Generation Web APIs with JSON-LD and Hydra
Building Next-Generation Web APIs with JSON-LD and HydraMarkus Lanthaler
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
Boost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresBoost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresNeo4j
 
Typescript Fundamentals
Typescript FundamentalsTypescript Fundamentals
Typescript FundamentalsSunny Sharma
 
Optimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4jOptimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4jNeo4j
 
Reinventing the Transaction Script (NDC London 2020)
Reinventing the Transaction Script (NDC London 2020)Reinventing the Transaction Script (NDC London 2020)
Reinventing the Transaction Script (NDC London 2020)Scott Wlaschin
 
JSON-LD: JSON for the Social Web
JSON-LD: JSON for the Social WebJSON-LD: JSON for the Social Web
JSON-LD: JSON for the Social WebGregg Kellogg
 
Aem sling resolution
Aem sling resolutionAem sling resolution
Aem sling resolutionGaurav Tiwari
 
TypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painTypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painSander Mak (@Sander_Mak)
 

La actualidad más candente (20)

SPARQL Tutorial
SPARQL TutorialSPARQL Tutorial
SPARQL Tutorial
 
Module 2: Adding JavaScript to APEX Apps
Module 2: Adding JavaScript to APEX AppsModule 2: Adding JavaScript to APEX Apps
Module 2: Adding JavaScript to APEX Apps
 
엘라스틱 서치 세미나
엘라스틱 서치 세미나엘라스틱 서치 세미나
엘라스틱 서치 세미나
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
Using Forms in Share
Using Forms in ShareUsing Forms in Share
Using Forms in Share
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshell
 
Building Next-Generation Web APIs with JSON-LD and Hydra
Building Next-Generation Web APIs with JSON-LD and HydraBuilding Next-Generation Web APIs with JSON-LD and Hydra
Building Next-Generation Web APIs with JSON-LD and Hydra
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Boost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined ProceduresBoost Your Neo4j with User-Defined Procedures
Boost Your Neo4j with User-Defined Procedures
 
Typescript Fundamentals
Typescript FundamentalsTypescript Fundamentals
Typescript Fundamentals
 
Full Text Search In PostgreSQL
Full Text Search In PostgreSQLFull Text Search In PostgreSQL
Full Text Search In PostgreSQL
 
World of CSS Grid
World of CSS GridWorld of CSS Grid
World of CSS Grid
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Optimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4jOptimizing Cypher Queries in Neo4j
Optimizing Cypher Queries in Neo4j
 
Reinventing the Transaction Script (NDC London 2020)
Reinventing the Transaction Script (NDC London 2020)Reinventing the Transaction Script (NDC London 2020)
Reinventing the Transaction Script (NDC London 2020)
 
JSON-LD: JSON for the Social Web
JSON-LD: JSON for the Social WebJSON-LD: JSON for the Social Web
JSON-LD: JSON for the Social Web
 
Aem sling resolution
Aem sling resolutionAem sling resolution
Aem sling resolution
 
TypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the painTypeScript: coding JavaScript without the pain
TypeScript: coding JavaScript without the pain
 
Typescript ppt
Typescript pptTypescript ppt
Typescript ppt
 
MYSQL
MYSQLMYSQL
MYSQL
 

Similar a Terms of endearment - the ElasticSearch Query DSL explained

Php Basic Security
Php Basic SecurityPhp Basic Security
Php Basic Securitymussawir20
 
High-level Web Testing
High-level Web TestingHigh-level Web Testing
High-level Web Testingpetersergeant
 
Exploiting Php With Php
Exploiting Php With PhpExploiting Php With Php
Exploiting Php With PhpJeremy Coates
 
PHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodPHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodJeremy Kendall
 
Intro python
Intro pythonIntro python
Intro pythonkamzilla
 
Drupal Lightning FAPI Jumpstart
Drupal Lightning FAPI JumpstartDrupal Lightning FAPI Jumpstart
Drupal Lightning FAPI Jumpstartguestfd47e4c7
 
03 Php Array String Functions
03 Php Array String Functions03 Php Array String Functions
03 Php Array String FunctionsGeshan Manandhar
 
Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)MongoSF
 
Forum Presentation
Forum PresentationForum Presentation
Forum PresentationAngus Pratt
 
Intro to #memtech PHP 2011-12-05
Intro to #memtech PHP   2011-12-05Intro to #memtech PHP   2011-12-05
Intro to #memtech PHP 2011-12-05Jeremy Kendall
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchclintongormley
 
Haml & Sass presentation
Haml & Sass presentationHaml & Sass presentation
Haml & Sass presentationbryanbibat
 
Ods Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A TutorialOds Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A Tutorialsimienc
 
Introduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsIntroduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsJohannes Geppert
 

Similar a Terms of endearment - the ElasticSearch Query DSL explained (20)

Php Basic Security
Php Basic SecurityPhp Basic Security
Php Basic Security
 
High-level Web Testing
High-level Web TestingHigh-level Web Testing
High-level Web Testing
 
Exploiting Php With Php
Exploiting Php With PhpExploiting Php With Php
Exploiting Php With Php
 
PHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodPHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the Good
 
HTML5 Web Forms
HTML5 Web FormsHTML5 Web Forms
HTML5 Web Forms
 
Intro python
Intro pythonIntro python
Intro python
 
JSP Custom Tags
JSP Custom TagsJSP Custom Tags
JSP Custom Tags
 
Sencha Touch Intro
Sencha Touch IntroSencha Touch Intro
Sencha Touch Intro
 
JQuery 101
JQuery 101JQuery 101
JQuery 101
 
Drupal Lightning FAPI Jumpstart
Drupal Lightning FAPI JumpstartDrupal Lightning FAPI Jumpstart
Drupal Lightning FAPI Jumpstart
 
JQuery Basics
JQuery BasicsJQuery Basics
JQuery Basics
 
03 Php Array String Functions
03 Php Array String Functions03 Php Array String Functions
03 Php Array String Functions
 
Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)
 
Forum Presentation
Forum PresentationForum Presentation
Forum Presentation
 
Intro to #memtech PHP 2011-12-05
Intro to #memtech PHP   2011-12-05Intro to #memtech PHP   2011-12-05
Intro to #memtech PHP 2011-12-05
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Haml & Sass presentation
Haml & Sass presentationHaml & Sass presentation
Haml & Sass presentation
 
Ods Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A TutorialOds Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A Tutorial
 
Introduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsIntroduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid Tags
 
Mojolicious on Steroids
Mojolicious on SteroidsMojolicious on Steroids
Mojolicious on Steroids
 

Último

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 

Último (20)

Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 

Terms of endearment - the ElasticSearch Query DSL explained

  • 1. “ Terms of Endearment” The ElasticSearch query language explained Clinton Gormley, YAPC::EU 2011 DRTECH @clintongormley
  • 2. search for : “ DELETE QUERY ” We can
  • 3. search for : “ DELETE QUERY ” and find : “ deleteByQuery ” We can
  • 4. but you can only find what is stored in the database
  • 5. Normalise values “ deleteByQuery” 'delete' 'by' 'query' 'deletebyquery'
  • 6. Normalise values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 7. Normalise values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 8. Analyse values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 9. What is stored in ElasticSearch?
  • 10. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Document:
  • 11. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Fields:
  • 12. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Values:
  • 13. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org" }, tags => [" perl" ,"opinion"], posts => 2, } Field types: # object # string # date # nested object # string # string # array of enums # integer
  • 14. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Nested objects flattened:
  • 15. { tweet => "Perl is GREAT!", posted => "2011-08-15", user.name => "Clinton Gormley", user.email => "drtech@cpan.org", tags => [" perl" ,"opinion"], posts => 2, } Nested objects flattened
  • 16. { tweet => "Perl is GREAT!", posted => "2011-08-15", user.name => "Clinton Gormley", user.email => "drtech@cpan.org", tags => [" perl" ,"opinion"], posts => 2, } Values analyzed into terms
  • 17. { tweet => ['perl','great'], posted => [Date(2011-08-15)], user.name => ['clinton','gormley'], user.email => ['drtech','cpan.org'], tags => [' perl' ,'opinion'], posts => [2], } Values analyzed into terms
  • 18. database table row ⇒ many tables ⇒ many rows ⇒ one schema ⇒ many columns In MySQL
  • 19. index type document ⇒ many types ⇒ many documents ⇒ one mapping ⇒ many fields In ElasticSearch
  • 20. Create index with mappings $es-> create_index ( index => 'twitter', mappings => { tweet => { properties => { title => { type => 'string' }, created => { type => 'date' } } } } );
  • 21. Add a mapping $es-> put_mapping ( index => 'twitter', type => ' user ', mapping => { properties => { name => { type => 'string' }, created => { type => 'date' }, } } );
  • 22. Can add to existing mapping
  • 23. Can add to existing mapping Cannot change mapping for field
  • 24. Core field types { type => 'string', }
  • 25. Core field types { type => 'string', # byte|short|integer|long|double|float # date, ip addr, geolocation # boolean # binary (as base 64) }
  • 26. Core field types { type => 'string', index => ' analyzed ', # 'Foo Bar' ⇒ [ 'foo', 'bar' ] }
  • 27. Core field types { type => 'string', index => ' not_analyzed ', # 'Foo Bar' ⇒ [ 'Foo Bar' ] }
  • 28. Core field types { type => 'string', index => ' no ', # 'Foo Bar' ⇒ [ ] }
  • 29. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', }
  • 30. Core field types { type => 'string', index => 'analyzed', index_ analyzer => 'default', search_ analyzer => 'default', }
  • 31. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', boost => 2, }
  • 32. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', boost => 2, include_in_all => 1 |0 }
  • 33.
  • 36. Stop
  • 37.
  • 41. The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com keyword: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com whitespace: The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com simple: the, brown, cow, s, part, no, a, bc, joe, bloggs, com standard: brown, cow's, part_no, a.bc123, 456, joe, bloggs.com snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
  • 42.
  • 46. NGram
  • 50. Stop
  • 51.
  • 52. KStem
  • 61. Custom Analyzer $c->create_index( index => 'twitter', settings => { analysis => { analyzer => { ascii_html => { type => 'custom', tokenizer => 'standard', filter => [ qw( standard lowercase asciifolding stop ) ], char_filter => ['html_strip'] } } }} );
  • 62. Searching $result = $es->search( index => 'twitter', type => 'tweet', );
  • 63. Searching $result = $es->search( index => ['twitter','facebook'] , type => ['tweet','post'] , );
  • 64. Searching $result = $es->search( # all indices # all types );
  • 65. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, );
  • 66. Searching $result = $es->search( index => 'twitter', type => 'tweet', query b => 'foo' , # b == ElasticSearch::SearchBuilder );
  • 67. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, sort => [{ '_score': 'desc' }] );
  • 68. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, sort => [{ '_score': 'desc' }] from => 0, size => 10, );
  • 70. Queries vs Filters
  • 71.
  • 72.
  • 73.
  • 75.
  • 77.
  • 80.
  • 83.
  • 87.
  • 90.
  • 93. cacheable Use filters for anything that doesn't affect the relevance score!
  • 94. Query only Query DSL: $es->search( query => { text => { title => 'perl' } } ); SearchBuilder: $es->search( query b => { title => 'perl' } );
  • 95. Filter only Query DSL: $es->search( query => { constant_score => { filter => { term => { tag => 'perl } } } }); SearchBuilder: $es->search( query b => { -filter => { tag => 'perl' } });
  • 96. Query and filter Query DSL: $es->search( query => { filtered => { query => { text => { title => 'perl' } }, filter =>{ term => { tag => 'perl' } } } }); SearchBuilder: $es->search( query b => { title => 'perl', -filter => { tag => 'perl' } });
  • 98. Filters : equality Query DSL: { term => { tags => 'perl' }} { terms => { tags => ['perl','ruby'] }} SearchBuilder: { tags => 'perl' } { tags => ['perl','ruby'] }
  • 99. Filters : range Query DSL: { range => { date => { gte => '2010-11-01', lt => '2010-12-01' }} SearchBuilder: { date => { gte => '2010-11-01', lt => '2011-12-01' }}
  • 100. Filters : range (many values) Query DSL: { numeric_range => { date => { gte => '2010-11-01', lt => '2010-12-01 }} SearchBuilder: { date => { ' >= ' => '2010-11-01', ' < ' => '2011-12-01' }}
  • 101. Filters : and | or | not Query DSL: { and => [ {term=>{X=>1}}, {term=>{Y=>2}} ]} { or => [ {term=>{X=>1}}, {term=>{Y=>2}} ]} { not => { or => [ {term=>{X=>1}}, {term=>{Y=>2}} ] }} SearchBuilder: { X => 1, Y => 2 } [ X => 1, Y => 2 ] { -not => { X => 1, Y => 2 } } # and { -not => [ X => 1, Y => 2 ] } # or
  • 102. Filters : exists | missing Query DSL: { exists => { field => 'title' }} { missing => { field => 'title' }} SearchBuilder: { -exists => 'title' } { -missing => 'title' }
  • 103. Filter example SearchBuilder: { -filter => [ featured => 1, { created_at => { gt => '2011-08-01' }, status => { '!=' => 'pending' }, }, ] }
  • 104. Filter example Query DSL: { constant_score => { filter => { or => [ { term => { featured => 1 }}, { and => [ { not => { term => { status => 'pending' }}, { range => { created_at => { gt => '2011-08-01' }}}, ] } ] } } }
  • 105.
  • 106. nested
  • 108. query
  • 110. prefix
  • 111.
  • 112. type
  • 117.
  • 120.
  • 121. range
  • 122. prefix
  • 123. fuzzy
  • 125. ids
  • 126.
  • 128.
  • 129.
  • 131.
  • 134.
  • 137.
  • 138. range
  • 139. prefix
  • 140. fuzzy
  • 142. ids
  • 143.
  • 145.
  • 146.
  • 148.
  • 153. Text/Analyzed Queries analyzed ⇒ text query using search_analyzer
  • 154. Text-Query Family Query DSL: { text => { title => 'great perl' }} Search Builder: { title => 'great perl' }
  • 155. Text-Query Family Query DSL: { text => { title => { query => 'great perl' }}} Search Builder: { title => { '=' => { query => 'great perl' }}}
  • 156. Text-Query Family Query DSL: { text => { title => { query => 'great perl' , operator => 'and' }}} Search Builder: { title => { '=' => { query => 'great perl', operator => 'and' }}}
  • 157. Text-Query Family Query DSL: { text => { title => { query => 'great perl' , fuzziness => 0.5 }}} Search Builder: { title => { '=' => { query => 'great perl', fuzziness => 0.5 }}}
  • 158. Text-Query Family Query DSL: { text => { title => { query => 'great perl', type => 'phrase' }}} Search Builder: { title => { '==' => { query => 'great perl', }}}
  • 159. Text-Query Family Query DSL: { text => { title => { query => ' great perl ', type => 'phrase' }}} Search Builder: { title => { '==' => { query => ' great perl ', }}}
  • 160. Text-Query Family Query DSL: { text => { title => { query => ' perl is great ', type => 'phrase' }}} Search Builder: { title => { '==' => { query => ' perl is great ', }}}
  • 161. Text-Query Family Query DSL: { text => { title => { query => ' perl great ', type => 'phrase', slop => 3 }}} Search Builder: { title => { '==' => { query => ' perl great ', slop => 3 }}}
  • 162. Text-Query Family Query DSL: { text => { title => { query => ' perl is gr ', type => ' phrase_prefix ', }}} Search Builder: { title => { '^' => { query => ' perl is gr ', }}}
  • 163. Query string / Field Lucene Query Syntax aware “ perl is great”~5 AND author:clint* -deleted
  • 164. Query string / Field Syntax errors: AND perl is great ” author : clint* -
  • 165. Query string / Field Syntax errors: AND perl is great ” author : clint* - ElasticSearch::QueryParser
  • 166. Combining: Bool Query DSL: { bool => { must => [ { term => { foo => 1}}, ... ], must_not => [ { term => { bar => 1}}, ... ], should => [ { term => { X => 2}}, { term => { Y => 2}},... ], minimum_number_should_match => 1, }}
  • 167. Combining: Bool SearchBuilder: { foo => 1, bar => { '!=' => 1}, -or => [ X => 2, Y => 2], } { -bool => { must => { foo => 1 }, must_not => { bar => 1 }, should => [{ X => 2}, { Y => 2 }], minimum_number_should_match => 1, }}
  • 168. Combining: DisMax Query DSL: { dis_max => { queries => [ { term => { foo => 1}}, { term => { bar => 1}}, ] }} SearchBuilder: { -dis_max => [ { term => { foo => 1}}, { term => { bar => 1}}, ], }
  • 169. Bool: combines scores DisMax: uses highest score from all matching clauses
  • 172. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string” }, }
  • 173. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string”, boost => 2, }, }, }
  • 174. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string”, boost => 2, }, rank => { type => “integer” }, }, _boost => { name => 'rank', null_value => 1.0 }, }
  • 175. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => 'perl' }}, ] }} SearchBuilder: { content => 'perl', title => 'perl' }
  • 176. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => { query => 'perl', }}, ] }} SearchBuilder: { content => 'perl', title => { '=' => { query => 'perl' }} }
  • 177. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => { query => 'perl', boost => 2 }}, ] }} SearchBuilder: { content => 'perl', title => { '=' => { query => 'perl', boost=> 2 }} }
  • 178. Boosting: custom_score Query DSL: { custom_score => { query => { text => { title => 'perl' }}, script => “_score * foo /doc['rank'].value”, }} SearchBuilder: { -custom_score => { query => { title => 'perl' }, script => “_score * foo /doc['rank'].value”, }}
  • 179. Query example SearchBuilder: { -or => [ title => { '=' => { query => 'custom score', boost => 2 }}, content => 'custom score', ], -filter => { repo => 'elasticsearch/elasticsearch', created_at => { '>=' => '2011-07-01', '<' => '2011-08-01'}, -or => [ creator_id => 123, assignee_id => 123, ], labels => ['bug','breaking'] } }
  • 180. Query example Query DSL: { query => { filtered => { query => { bool => { should => [ { text => { content => &quot;custom score&quot; } }, { text => { title => { boost => 2, query => &quot;custom score&quot; } } }, ], }, }, filter => { and => [ { or => [ { term => { creator_id => 123 } }, { term => { assignee_id => 123 } }, ]}, { terms => { labels => [&quot;bug&quot;, &quot;breaking&quot;] } }, { term => { repo => &quot;elasticsearch/elasticsearch&quot; } }, { numeric_range => { created_at => { gte => &quot;2011-07-01&quot;, lt => &quot;2011-08-01&quot; }}}, ]}, }}
  • 181.