Elasticsearch?
Clustered and sharded document storage with powerful language analysing features and a query language, all wrapped by a REST API.
Getting Started
• install elasticsearch
• needs some JDK
• start it
Getting Started
• https://github.com/elastic/elasticsearch-rails
• gems for Rails:
• elasticsearch-model & elasticsearch-rails
• without Rails / AR:
• elasticsearch-persistence
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Document body that is sent to Elasticsearch for this record
  def as_indexed_json(options={})
    { title: title,
      description: description,
      starts_at: starts_at.iso8601 }
  end
end
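To see what as_indexed_json hands over for indexing, here is a minimal sketch in plain Ruby, without Rails. The Struct-based EventDoc and its sample values are stand-ins assumed for illustration; only the shape of the hash follows the slide.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for the ActiveRecord model (assumption: no Rails here).
EventDoc = Struct.new(:title, :description, :starts_at) do
  # Same shape as the as_indexed_json method on the slide.
  def as_indexed_json(options = {})
    { title: title,
      description: description,
      starts_at: starts_at.iso8601 }
  end
end

event = EventDoc.new(
  "Finding the right stuff, ...",
  "Searching in data sets with ...",
  Time.new(2015, 10, 8, 19, 0, 0, "+09:00")
)

# The hash is serialized to JSON to form the document body.
json = JSON.generate(event.as_indexed_json)
puts json
```

Anything left out of the hash is not part of the document, so it cannot be searched.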
Event.import
PUT /events/event/31710
{
  "title": "Finding the right stuff, ...",
  "description": "Searching in data sets with ...",
  "starts_at": "2015-10-08T19:00:00+09:00"
}
In the request path, "events" is the index, "event" is the type, and "31710" is the ID.
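The path layout can be sketched as a small helper that assembles the request from index, type, and ID. The name index_request is hypothetical and nothing is sent over the network; it only illustrates how the three parts combine.

```ruby
require 'json'

# Hypothetical helper: builds the HTTP verb, path, and JSON body for
# indexing one document. Path layout follows the slide: /<index>/<type>/<id>.
def index_request(index:, type:, id:, document:)
  ["PUT", "/#{index}/#{type}/#{id}", JSON.generate(document)]
end

verb, path, body = index_request(
  index: "events", type: "event", id: 31710,
  document: { title: "Finding the right stuff, ..." }
)
puts "#{verb} #{path}"
puts body
```

This reflects the Elasticsearch version used in the talk; newer releases no longer use mapping types, so the path differs there.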
response = Event.search 'tokyo rubyist'
response.took
# => 28
response.results.total
# => 2075
response.results.first._score
# => 0.921177
response.results.first._source.title
# => "Drop in Ruby"
GET /events/event/_search?q=tokyo%20rubyist
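The accessors above read from the JSON that the search endpoint returns. A minimal sketch of that mapping follows, assuming the response shape of the Elasticsearch version used in the talk; the sample hash reuses the numbers from the slide and contains a single illustrative hit.

```ruby
# Sample raw response (assumption: illustrative values, one hit only).
raw = {
  "took" => 28,
  "hits" => {
    "total" => 2075,
    "hits" => [
      { "_id" => "12409", "_score" => 0.921177,
        "_source" => { "title" => "Drop in Ruby" } }
    ]
  }
}

took  = raw["took"]                # time the search took, in milliseconds
total = raw["hits"]["total"]       # number of matching documents
first = raw["hits"]["hits"].first  # best-scoring hit

puts took
puts total
puts first["_score"]
puts first["_source"]["title"]
```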
response = Event.search 'tokyo rubyist'
response.records.to_a
# => [#<Event id: 12409, ...>, ...]
response.page(2).results
response.page(2).records
supports kaminari / will_paginate
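Paging in Elasticsearch is expressed with from and size, so a page number has to be translated into an offset. A minimal sketch of that arithmetic; page_params is a hypothetical helper and the default of 25 per page is an assumption, not something the slides state.

```ruby
# Hypothetical helper: translates a 1-based page number into the
# from/size parameters used for paging. per_page default is assumed.
def page_params(page, per_page = 25)
  page = [page.to_i, 1].max # treat anything below 1 as the first page
  { from: (page - 1) * per_page, size: per_page }
end

p page_params(1)
p page_params(2)
p page_params(3, 10)
```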
response = Event.search 'tokyo rubyist'
response.records.each_with_hit do |rec, hit|
  puts "* #{rec.title}: #{hit._score}"
end
# * Drop in Ruby: 0.9205564
# * Javascript meets Ruby in Kamakura: 0.8947
# * Meetup at EC Navi: 0.8766844
# * Pair Programming Session #3: 0.8603562
# * Kickoff Party: 0.8265461
# * Tales of a Ruby Committer: 0.74487066
# * One Year Anniversary Party: 0.7298212
Event.search 'tokyo rubyist'
only upcoming events?
sorted by start date?
Event.search query: {
  filtered: {
    # basically same as before
    query: {
      simple_query_string: {
        query: "tokyo rubyist",
        default_operator: "and"
      }
    },
    # filtered by conditions
    filter: {
      and: [
        { range: { starts_at: { gte: Time.now } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: "asc" } } # sorted by start time
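The filtered query and the and filter shown above belong to the Elasticsearch 1.x DSL; both were deprecated in 2.0 and removed in 5.0. On newer versions the same intent is usually expressed with a bool query whose filter clause holds the conditions. A minimal sketch, where upcoming_events_query is a hypothetical helper name and the fixed timestamp is only there to make the output reproducible:

```ruby
require 'time'

# Hypothetical helper: builds the search body as a plain Hash.
def upcoming_events_query(text, now: Time.now)
  {
    query: {
      bool: {
        # scored full-text part
        must: {
          simple_query_string: { query: text, default_operator: "and" }
        },
        # non-scoring conditions
        filter: [
          { range: { starts_at: { gte: now.iso8601 } } },
          { term: { featured: true } }
        ]
      }
    },
    sort: { starts_at: { order: "asc" } }
  }
end

body = upcoming_events_query("tokyo rubyist", now: Time.utc(2015, 10, 8))
puts body[:query][:bool][:filter].length
```

The resulting Hash has the same top-level keys (query, sort) as the call on the slide, so it could be passed to Event.search in the same way.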
Query DSL
• query: { <query_type>: <arguments> }
• valid arguments depend on query type
• "Filtered Query" takes a query and a filter
• "Simple Query String Query" does not allow nested queries
Query DSL
• filter: { <filter_type>: <arguments> }
• valid arguments depend on filter type
• "And filter" takes an array of filters
• "Range filter" takes a property and lt(e), gt(e)
• "Term filter" takes a property and a value
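The { <filter_type>: <arguments> } pattern can be sketched with a few small builder functions that compose into the filter used in the example query. The helper names are hypothetical; they only produce plain Hashes in the 1.x filter syntax described in the list above.

```ruby
# Each helper returns { <filter_type>: <arguments> }.

# "Term filter": a property and a value.
def term_filter(field, value)
  { term: { field => value } }
end

# "Range filter": a property and any of lt, lte, gt, gte.
def range_filter(field, bounds)
  unknown = bounds.keys - %i[lt lte gt gte]
  raise ArgumentError, "unknown bounds: #{unknown}" unless unknown.empty?
  { range: { field => bounds } }
end

# "And filter": an array of filters.
def and_filter(*filters)
  { and: filters }
end

filter = and_filter(
  range_filter(:starts_at, gte: "2015-10-08T00:00:00Z"),
  term_filter(:featured, true)
)
p filter
```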
Match Query
Multi Match Query
Bool Query
Boosting Query
Common Terms Query
Constant Score Query
Dis Max Query
Filtered Query
Fuzzy Like This Query
Fuzzy Like This Field Query
Function Score Query
Fuzzy Query
GeoShape Query
Has Child Query
Has Parent Query
Ids Query
Indices Query
Match All Query
More Like This Query
Nested Query
Prefix Query
Query String Query
Simple Query String Query
Range Query
Regexp Query
Span First Query
Span Multi Term Query
Span Near Query
Span Not Query
Span Or Query
Span Term Query
Term Query
Terms Query
Top Children Query
Wildcard Query
Minimum Should Match
Multi Term Query Rewrite
Template Query
And Filter
Bool Filter
Exists Filter
Geo Bounding Box Filter
Geo Distance Filter
Geo Distance Range Filter
Geo Polygon Filter
GeoShape Filter
Geohash Cell Filter
Has Child Filter
Has Parent Filter
Ids Filter
Indices Filter
Limit Filter
Match All Filter
Missing Filter
Nested Filter
Not Filter
Or Filter
Prefix Filter
Query Filter
Range Filter
Regexp Filter
Script Filter
Term Filter
Terms Filter
Type Filter
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  def as_indexed_json(options={})
    { title: title,
      description: description,
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end

  # Explicit mapping: only the listed fields are indexed
  settings do
    mapping dynamic: 'false' do
      indexes :title, type: 'string'
      indexes :description, type: 'string'
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
Event.import force: true
deletes the existing index, creates a new index with the settings, and imports the documents
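The three steps can be sketched against an in-memory stand-in. FakeCluster and force_import are hypothetical names for illustration only; this is not how the gem implements the import, it just shows why stale documents and old settings do not survive a forced import.

```ruby
# In-memory stand-in for a cluster (assumption: illustration only).
class FakeCluster
  attr_reader :indices

  def initialize
    @indices = {} # index name => { settings: Hash, docs: { id => doc } }
  end

  def delete_index(name)
    @indices.delete(name)
  end

  def create_index(name, settings)
    @indices[name] = { settings: settings, docs: {} }
  end

  def index_document(name, id, doc)
    @indices.fetch(name)[:docs][id] = doc
  end
end

# Hypothetical helper mirroring the three steps listed above.
def force_import(cluster, name, settings, records)
  cluster.delete_index(name)            # 1. delete the existing index
  cluster.create_index(name, settings)  # 2. create a new index with settings
  records.each do |id, doc|             # 3. import the documents
    cluster.index_document(name, id, doc)
  end
  cluster.indices[name][:docs].size
end

cluster = FakeCluster.new
cluster.create_index("events", {})
cluster.index_document("events", 1, { title: "stale" })

count = force_import(cluster, "events", { dynamic: "false" },
                     { 31710 => { title: "Finding the right stuff, ..." } })
puts count
```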
Event.search query: {
  bool: {
    should: [
      {
        simple_query_string: {
          query: "tokyo rubyist",
          default_operator: "and"
        }
      }, {
        function_score: {
          filter: {
            and: [
              { range: { starts_at: { lte: 'now' } } },
              { term: { featured: true } }
            ]
          },
          gauss: {
            starts_at: {
              origin: 'now',
              scale: '10d',
              decay: 0.5
            },
          },
          boost_mode: "sum"
        }
      }
    ],
    minimum_should_match: 2
  }
}
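To get a feel for what the gauss settings in the query above do, the decay curve can be computed by hand. The sketch below follows the Gaussian decay formula from the Elasticsearch function_score documentation; gauss_decay is a hypothetical helper, distances are given in days, and the offset parameter (not used on the slide) defaults to 0.

```ruby
# Gaussian decay: the result is 1.0 at the origin and equals `decay`
# at a distance of `scale` from the origin.
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_sq = -(scale**2) / (2.0 * Math.log(decay))
  d = [0.0, distance.abs - offset].max
  Math.exp(-(d**2) / (2.0 * sigma_sq))
end

[0, 5, 10, 20].each do |days|
  puts "#{days} days from origin: #{gauss_decay(days, scale: 10).round(3)}"
end
```

With scale '10d' and decay 0.5, an event ten days away from the origin contributes half of what an event at the origin does, and the contribution falls off quickly beyond that.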
Event.search '東京rubyist'
Dealing with different languages
built-in analysers for arabic, armenian, basque,
brazilian, bulgarian, catalan, cjk, czech, danish,
dutch, english, finnish, french, galician, german,
greek, hindi, hungarian, indonesian, irish, italian,
latvian, norwegian, persian, portuguese, romanian,
russian, sorani, spanish, swedish, turkish, thai.
Japanese?
• install kuromoji plugin
• https://github.com/elastic/elasticsearch-analysis-kuromoji
• plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.7.0
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # One subfield per language for each text attribute
  def as_indexed_json(options={})
    { title: { en: title_en, ja: title_ja },
      description: { en: description_en, ja: description_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
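With one subfield per language, a search has to name the subfields it should look at. The slides do not show that query, so the following is only a sketch under that assumption: it uses the fields option of simple_query_string, and both helper names are hypothetical.

```ruby
# Hypothetical helper: expands base field names into per-language subfields.
def language_fields(bases, languages = %w[en ja])
  bases.flat_map { |base| languages.map { |lang| "#{base}.#{lang}" } }
end

# Hypothetical helper: a query body that searches all language subfields.
def multi_language_query(text, bases)
  {
    query: {
      simple_query_string: {
        query: text,
        fields: language_fields(bases),
        default_operator: "and"
      }
    }
  }
end

p multi_language_query("東京rubyist", %w[title description])
```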
Event.search 'tokyo rubyist'
with data from other models?
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  def as_indexed_json(options={})
    { title: { en: title_en, ja: title_ja },
      description: { en: description_en, ja: description_ja },
      # data from the associated group is copied into the event document
      group_name: { en: group.name_en, ja: group.name_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :group_name do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :ja, type: 'string', analyzer: 'kuromoji'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
Automated Tests
Index names with environment
class Event < ActiveRecord::Base
  include Elasticsearch::Model
  index_name "drkpr_#{Rails.env}_events"
end
Test Helpers
• https://gist.github.com/mreinsch/094dc9cf63362314cef4
• Helpers: wait_for_elasticsearch, wait_for_elasticsearch_removal, clear_elasticsearch!
• specs: Tag tests which require elasticsearch
Production Ready?
• use elastic.co/found or AWS ES
• use two instances for redundancy
• elasticsearch could go away
• usually only impacts search
• keep impact at a minimum
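One way to keep the impact small is to treat a failed search as "no results" instead of letting the error reach the user. A minimal sketch; search_with_fallback is a hypothetical helper, and rescuing StandardError is a simplification, since a real application would rescue only the error classes its client actually raises.

```ruby
# Runs the given search block and falls back to an empty result
# if the search backend is unavailable.
def search_with_fallback(fallback: [])
  yield
rescue StandardError => e
  warn "search unavailable: #{e.class}: #{e.message}"
  fallback
end

ok   = search_with_fallback { ["Drop in Ruby"] }
down = search_with_fallback { raise IOError, "connection refused" }
p ok
p down
```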
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  after_save do
    IndexerJob.perform_later(
      'update', self.class.name, self.id)
  end

  after_destroy do
    IndexerJob.perform_later(
      'delete', self.class.name, self.id)
  end
  ...
class IndexerJob < ActiveJob::Base
  queue_as :default

  def perform(action, record_type, record_id)
    record_class = record_type.constantize
    record_data = {
      index: record_class.index_name,
      type: record_class.document_type,
      id: record_id
    }
    client = record_class.__elasticsearch__.client

    case action.to_s
    when 'update'
      record = record_class.find(record_id)
      client.index record_data.merge(body: record.as_indexed_json)
    when 'delete'
      client.delete record_data.merge(ignore: 404)
    end
  end
end
https://gist.github.com/mreinsch/acb2f6c58891e5cd4f13
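The branching inside the job can be tried out without ActiveJob, ActiveRecord, or a running cluster by swapping in a client that only records calls. RecordingClient and dispatch are hypothetical names for this sketch; the ArgumentError for unknown actions is an addition that the job on the slide does not have.

```ruby
# Recording stand-in for the Elasticsearch client (assumption: illustration only).
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def index(args)
    @calls << [:index, args]
  end

  def delete(args)
    @calls << [:delete, args]
  end
end

# Same branching as IndexerJob#perform, minus ActiveJob and ActiveRecord.
def dispatch(client, action, record_data, document = nil)
  case action.to_s
  when 'update'
    client.index record_data.merge(body: document)
  when 'delete'
    # ignore: 404 means a document that is already gone is not an error
    client.delete record_data.merge(ignore: 404)
  else
    raise ArgumentError, "unknown action: #{action}"
  end
end

client = RecordingClient.new
data = { index: "events", type: "event", id: 31710 }
dispatch(client, :update, data, { title: "Drop in Ruby" })
dispatch(client, 'delete', data)
client.calls.each { |call| p call }
```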
Questions?
Resources
• Elastic Docs: https://www.elastic.co/guide/index.html
• Ruby Gem Docs: https://github.com/elastic/elasticsearch-rails
or ask me later:
michael@doorkeeper.jp
@mreinsch

Finding the right stuff, an intro to Elasticsearch with Ruby/Rails

  • 1.
  • 3. clustered and sharded document storage with powerful language analysing features and a query language,
 all wrapped by a REST API
  • 4. Getting Started • install elasticsearch • needs some JDK • start it
  • 5. Getting Started • https://github.com/elastic/elasticsearch-rails • gems for Rails: • elasticsearch-model & elasticsearch-rails • without Rails / AR: • elasticsearch-persistence
  • 6. class Event < ActiveRecord::Base include Elasticsearch::Model
  • 7. class Event < ActiveRecord::Base include Elasticsearch::Model def as_indexed_json(options={}) { title: title, description: description, starts_at: starts_at.iso8601 } end
  • 9. Event.import PUT /events/event/31710 { "title": "Finding the right stuff, ...", "description": "Searching in data sets with ...", "starts_at": “2015-10-08T19:00:00+09:00" }
  • 10. Event.import PUT /events/event/31710 { "title": "Finding the right stuff, ...", "description": "Searching in data sets with ...", "starts_at": “2015-10-08T19:00:00+09:00" } index
  • 11. Event.import PUT /events/event/31710 { "title": "Finding the right stuff, ...", "description": "Searching in data sets with ...", "starts_at": “2015-10-08T19:00:00+09:00" } index type
  • 12. Event.import PUT /events/event/31710 { "title": "Finding the right stuff, ...", "description": "Searching in data sets with ...", "starts_at": “2015-10-08T19:00:00+09:00" } index type ID
  • 14. response = Event.search 'tokyo rubyist' response.took # => 28 response.results.total # => 2075 response.results.first._score # => 0.921177 response.results.first._source.title # => "Drop in Ruby"
  • 15. response = Event.search 'tokyo rubyist' response.took # => 28 response.results.total # => 2075 response.results.first._score # => 0.921177 response.results.first._source.title # => "Drop in Ruby" GET /events/event/_search?q=tokyo%20rubyist
  • 16. response = Event.search 'tokyo rubyist' response.records.to_a # => [#<Event id: 12409, ...>, ...] response.page(2).results response.page(2).records
  • 17. response = Event.search 'tokyo rubyist' response.records.to_a # => [#<Event id: 12409, ...>, ...] response.page(2).results response.page(2).records supports kaminari / will_paginate
  • 18. response = Event.search 'tokyo rubyist' response.records.each_with_hit do |rec,hit| puts "* #{rec.title}: #{hit._score}" end # * Drop in Ruby: 0.9205564 # * Javascript meets Ruby in Kamakura: 0.8947 # * Meetup at EC Navi: 0.8766844 # * Pair Programming Session #3: 0.8603562 # * Kickoff Party: 0.8265461 # * Tales of a Ruby Committer: 0.74487066 # * One Year Anniversary Party: 0.7298212
  • 21. Event.search 'tokyo rubyist' only upcoming events? sorted by start date?
  • 22. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } }
  • 23. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } } basically same as before
  • 24. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } } basically same as before filtered by conditions
  • 25. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } } basically same as before filtered by conditions sorted by start time
  • 26. Query DSL • query: { <query_type>: <arguments> } • valid arguments depend on query type • "Filtered Query" takes a query and a filter • "Simple Query String Query" does not allow nested queries
  • 27. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } }
  • 28. Query DSL • filter: { <filter_type>: <arguments> } • valid arguments depend on filter type • "And filter" takes an array of filters • "Range filter" takes a property and lt(e), gt(e) • "Term filter" takes a property and a value
  • 29. Match Query Multi Match Query Bool Query Boosting Query Common Terms Query Constant Score Query Dis Max Query Filtered Query Fuzzy Like This Query Fuzzy Like This Field Query Function Score Query Fuzzy Query GeoShape Query Has Child Query Has Parent Query Ids Query Indices Query Match All Query More Like This Query Nested Query Prefix Query Query String Query Simple Query String Query Range Query Regexp Query Span First Query Span Multi Term Query Span Near Query Span Not Query Span Or Query Span Term Query Term Query Terms Query Top Children Query Wildcard Query Minimum Should Match Multi Term Query Rewrite Template Query
  • 30. And Filter Bool Filter Exists Filter Geo Bounding Box Filter Geo Distance Filter Geo Distance Range Filter Geo Polygon Filter GeoShape Filter Geohash Cell Filter Has Child Filter Has Parent Filter Ids Filter Indices Filter Limit Filter Match All Filter Missing Filter Nested Filter Not Filter Or Filter Prefix Filter Query Filter Range Filter Regexp Filter Script Filter Term Filter Terms Filter Type Filter
  • 31. Event.search query: { filtered: { query: { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } }, filter: { and: [ { range: { starts_at: { gte: Time.now } } }, { term: { featured: true } } ] } } }, sort: { starts_at: { order: "asc" } }
  • 32. class Event < ActiveRecord::Base include Elasticsearch::Model def as_indexed_json(options={}) { title: title, description: description, starts_at: starts_at.iso8601, featured: group.featured? } end settings do mapping dynamic: 'false' do indexes :title, type: 'string' indexes :description, type: 'string' indexes :starts_at, type: 'date' indexes :featured, type: 'boolean' end end
  • 33. Event.import force: true
      • deletes the existing index
      • creates a new index with settings
      • imports documents
  • 35.
      Event.search query: {
        bool: {
          should: [
            { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } },
            { function_score: {
                filter: {
                  and: [
                    { range: { starts_at: { lte: 'now' } } },
                    { term: { featured: true } }
                  ]
                },
                gauss: {
                  starts_at: { origin: 'now', scale: '10d', decay: 0.5 }
                },
                boost_mode: "sum"
            } }
          ],
          minimum_should_match: 2
        }
      }
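To get a feel for what `gauss: { starts_at: { origin: 'now', scale: '10d', decay: 0.5 } }` does to the score, here is a small sketch of the Gaussian decay formula as given in the Elasticsearch function_score documentation. The method name `gauss_decay` is hypothetical, distances are expressed in days for simplicity, and the optional `offset` defaults to zero as it does in the slide's query.

```ruby
# Gaussian decay as documented for function_score:
#   score   = exp(-(max(0, |value - origin| - offset))^2 / (2 * sigma^2))
#   sigma^2 = -scale^2 / (2 * ln(decay))
# `distance` is |value - origin|, here measured in days (an assumption
# made for this sketch; Elasticsearch works on the underlying date values).
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_squared = -(scale**2) / (2.0 * Math.log(decay))
  effective = [0.0, distance.abs - offset].max
  Math.exp(-(effective**2) / (2.0 * sigma_squared))
end

# Show how the decay factor falls off with distance from the origin.
[0, 5, 10, 20].each do |days|
  puts format('%2d days from origin: %.4f', days, gauss_decay(days, scale: 10))
end
```

By construction, an event exactly `scale` (10 days) away from `origin` gets a factor equal to `decay` (0.5), and an event at the origin gets 1.0.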
  • 37. Dealing with different languages
      built-in analysers for arabic, armenian, basque, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai.
  • 38. Japanese?
      • install kuromoji plugin
      • https://github.com/elastic/elasticsearch-analysis-kuromoji
      • plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.7.0
  • 39.
      class Event < ActiveRecord::Base
        include Elasticsearch::Model
        def as_indexed_json(options={})
          { title: { en: title_en, ja: title_ja },
            description: { en: description_en, ja: description_ja },
            starts_at: starts_at.iso8601,
            featured: group.featured? }
        end
        settings do
          mapping dynamic: 'false' do
            indexes :title do
              indexes :en, type: 'string', analyzer: 'english'
              indexes :ja, type: 'string', analyzer: 'kuromoji'
            end
            indexes :description do
              indexes :en, type: 'string', analyzer: 'english'
              indexes :ja, type: 'string', analyzer: 'kuromoji'
            end
            indexes :starts_at, type: 'date'
            indexes :featured, type: 'boolean'
          end
        end
      end
  • 40. Event.search 'tokyo rubyist'
      with data from other models?
  • 41.
      class Event < ActiveRecord::Base
        include Elasticsearch::Model
        def as_indexed_json(options={})
          { title: { en: title_en, ja: title_ja },
            description: { en: description_en, ja: description_ja },
            group_name: { en: group.name_en, ja: group.name_ja },
            starts_at: starts_at.iso8601,
            featured: group.featured? }
        end
        settings do
          mapping dynamic: 'false' do
            indexes :title do
              indexes :en, type: 'string', analyzer: 'english'
              indexes :ja, type: 'string', analyzer: 'kuromoji'
            end
            indexes :description do
              indexes :en, type: 'string', analyzer: 'english'
              indexes :ja, type: 'string', analyzer: 'kuromoji'
            end
            indexes :group_name do
              indexes :en, type: 'string', analyzer: 'english'
              indexes :ja, type: 'string', analyzer: 'kuromoji'
            end
            indexes :starts_at, type: 'date'
            indexes :featured, type: 'boolean'
          end
        end
      end
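The slides do not show how the per-language sub-fields are queried. One possible approach, sketched here as an assumption rather than the talk's actual method, is a `multi_match` query listing every `<field>.<language>` path from the mapping above. The method name `multilingual_query` and the two constants are hypothetical.

```ruby
require 'json'

# Field and language names taken from the mapping on the slide.
LANGUAGES  = %w[en ja].freeze
SEARCHABLE = %w[title description group_name].freeze

# Builds a multi_match query body over every "<field>.<language>" path.
# This is an illustrative helper, not part of elasticsearch-model.
def multilingual_query(text, fields: SEARCHABLE, languages: LANGUAGES)
  paths = fields.product(languages).map { |field, lang| "#{field}.#{lang}" }
  { query: { multi_match: { query: text, fields: paths } } }
end

body = multilingual_query('tokyo rubyist')
puts JSON.pretty_generate(body)

# In the Rails app the hash would be passed to the model, for example:
#   Event.search(body)
```

Each sub-field is analysed with its own analyzer (`english` or `kuromoji`), so the same search text is matched against both language variants.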
  • 43.
      class Event < ActiveRecord::Base
        include Elasticsearch::Model
        index_name "drkpr_#{Rails.env}_events"
      Index names with environment
  • 44. Test Helpers
      • https://gist.github.com/mreinsch/094dc9cf63362314cef4
      • Helpers: wait_for_elasticsearch, wait_for_elasticsearch_removal, clear_elasticsearch!
      • specs: Tag tests which require elasticsearch
  • 45. Production Ready?
      • use elastic.co/found or AWS ES
      • use two instances for redundancy
      • elasticsearch could go away
      • usually only impacts search
      • keep impact at a minimum
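One way to "keep impact at a minimum" on the search side is to treat an unreachable cluster as an empty result instead of an error page. The sketch below is an assumption, not something shown in the talk: `SearchUnavailable` is a hypothetical stand-in for whatever connection error the real client raises, and `search_with_fallback` is an illustrative helper name.

```ruby
# Hypothetical stand-in for the error raised when the cluster is unreachable.
class SearchUnavailable < StandardError; end

# Runs the given search block; if search is down, logs a warning and
# returns the fallback value so the rest of the page can still render.
def search_with_fallback(fallback: [])
  yield
rescue SearchUnavailable => e
  warn "search unavailable: #{e.message}"
  fallback
end

# Normal case: the block's result is returned unchanged.
healthy = search_with_fallback { ['Drop in Ruby'] }

# Outage: the error is swallowed and the fallback is returned.
degraded = search_with_fallback { raise SearchUnavailable, 'connection refused' }

puts healthy.inspect
puts degraded.inspect
```

Only the search feature degrades in this setup; other errors still propagate because the rescue is limited to the one error class.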
  • 46.
      class Event < ActiveRecord::Base
        include Elasticsearch::Model
        after_save do
          IndexerJob.perform_later('update', self.class.name, self.id)
        end
        after_destroy do
          IndexerJob.perform_later('delete', self.class.name, self.id)
        end
        ...
  • 47.
      class IndexerJob < ActiveJob::Base
        queue_as :default
        def perform(action, record_type, record_id)
          record_class = record_type.constantize
          record_data = {
            index: record_class.index_name,
            type: record_class.document_type,
            id: record_id
          }
          client = record_class.__elasticsearch__.client
          case action.to_s
          when 'update'
            record = record_class.find(record_id)
            client.index record_data.merge(body: record.as_indexed_json)
          when 'delete'
            client.delete record_data.merge(ignore: 404)
          end
        end
      end
      https://gist.github.com/mreinsch/acb2f6c58891e5cd4f13
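The branching in `IndexerJob#perform` can be tried out without Rails or a running cluster. The following is a minimal sketch under stated assumptions: `RecordingClient` is a hypothetical stand-in that only records calls, `sync_document` is an illustrative plain-Ruby rewrite of the job body, and the `ArgumentError` for unknown actions is an addition that the original job does not have.

```ruby
# Hypothetical stand-in for the Elasticsearch client; it only records calls.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def index(arguments)
    @calls << [:index, arguments]
  end

  def delete(arguments)
    @calls << [:delete, arguments]
  end
end

# Same branching as the job on the slide. The block supplies the document
# body, standing in for record.as_indexed_json.
def sync_document(client, action, target)
  case action.to_s
  when 'update'
    client.index(target.merge(body: yield))
  when 'delete'
    # ignore: 404 means a document that is already gone is not an error.
    client.delete(target.merge(ignore: 404))
  else
    # Not in the original job; added here so typos surface early.
    raise ArgumentError, "unknown action: #{action}"
  end
end

client = RecordingClient.new
target = { index: 'drkpr_test_events', type: 'event', id: 31710 }

sync_document(client, :update, target) { { title: 'Drop in Ruby' } }
sync_document(client, 'delete', target)

client.calls.each { |name, arguments| puts "#{name}: #{arguments}" }
```

Because the job receives only the class name and ID, the record is loaded fresh when the job runs, so the indexed document reflects the latest saved state.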
  • 48. Questions?
      Resources
      • Elastic Docs: https://www.elastic.co/guide/index.html
      • Ruby Gem Docs: https://github.com/elastic/elasticsearch-rails
      or ask me later: michael@doorkeeper.jp, @mreinsch