Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
Clinton Gormley
@clintongormley
Scaling real-time search and analytics with elasticsearch
elasticsearch.org/guide
elasticsearch
• real-time
• distributed
• search
• analytics
how to use it?
how does it work?
step 1:
making text searchable
where content like
“%darling%buds%”
slow & inflexible
Term        Doc 1   Doc 2   Doc 3
breathe
brings
buds
but
by
can
…
damasked
darling
date
day
deaf
death
declines
delight
sorted list of unique terms (Term column)
where they occur (Doc columns)
inverted index
• term frequencies » relevance
• text length » doc weight
• term positions » word proximity
• char offsets » highlighting
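A minimal sketch of the structure described above, in plain Python rather than Lucene. The three sample documents, the regex tokenizer and every function name are assumptions made for illustration. Each posting keeps the term position and character offsets, so term frequency, text length, proximity and highlighting data can all be read back from the index.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Yield (position, start, end, term) for each lowercased word."""
    for position, match in enumerate(re.finditer(r"[a-z']+", text.lower())):
        yield position, match.start(), match.end(), match.group()

def build_inverted_index(docs):
    """Return (term -> {doc_id: [(position, start, end), ...]}, doc lengths)."""
    postings = defaultdict(dict)
    doc_lengths = {}
    for doc_id, text in docs.items():
        length = 0
        for position, start, end, term in tokenize(text):
            postings[term].setdefault(doc_id, []).append((position, start, end))
            length += 1
        doc_lengths[doc_id] = length  # text length, usable for weighting
    # Terms are stored as a sorted list of unique terms.
    return dict(sorted(postings.items())), doc_lengths

# Hypothetical sample documents.
docs = {
    1: "Rough winds do shake the darling buds of May",
    2: "the darling child",
    3: "buds in spring",
}
index, doc_lengths = build_inverted_index(docs)

def docs_containing(term):
    """A single term lookup instead of scanning every document."""
    return sorted(index.get(term, {}))

def phrase_match(first, second):
    """Doc ids where `second` directly follows `first` (uses term positions)."""
    hits = []
    for doc_id in docs_containing(first):
        first_positions = {p for p, _, _ in index[first][doc_id]}
        second_positions = {p for p, _, _ in index.get(second, {}).get(doc_id, [])}
        if any(p + 1 in second_positions for p in first_positions):
            hits.append(doc_id)
    return hits
```

For example, `docs_containing("darling")` looks up one term, and `phrase_match("darling", "buds")` relies on the stored positions; the stored offsets can be used to slice the original text for highlighting.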
inverted index
not just for text
numbers, dates, bools, enums
geopoints, geoshapes, etc
step 2:
analytics
for search
map values → doc_ids
for analytics
map doc_ids → values
uninvert the index
cache values in memory
called “fielddata”
data access from RAM
very fast
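As a rough illustration of uninverting, the sketch below turns a value → doc_ids mapping into a doc_ids → values mapping and keeps it in memory. The `tag` field, its values and the function names are hypothetical; real fielddata is a more compact structure.

```python
from collections import defaultdict

# Inverted index for a hypothetical "tag" field: value -> doc_ids (good for search).
inverted = {
    "elasticsearch": [1, 2],
    "lucene": [2, 3],
    "search": [1, 3],
}

def uninvert(inverted_index):
    """Build doc_id -> values by walking every posting list once."""
    fielddata = defaultdict(list)
    for value, doc_ids in inverted_index.items():
        for doc_id in doc_ids:
            fielddata[doc_id].append(value)
    return {doc_id: sorted(values) for doc_id, values in fielddata.items()}

# Built once, then kept in memory and reused by later requests.
fielddata = uninvert(inverted)

def count_values(matching_doc_ids):
    """Count field values across the docs that matched a query."""
    counts = defaultdict(int)
    for doc_id in matching_doc_ids:
        for value in fielddata.get(doc_id, []):
            counts[value] += 1
    return dict(counts)
```

Once `fielddata` exists, an analytics request such as `count_values([1, 2])` only needs dictionary lookups by doc id.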
on-the-fly analytics
in the context of
a user’s query
relevant analytics
for each user
calculate metrics
count, min, max, sum, avg,
percentiles, cardinality,
stddev, variance, sum of squares
grouped by
popular terms, significant terms,
ranges, dates, geolocation, etc
groups can
… contain subgroups
… which contain subgroups
etc
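The idea of metrics computed per group, with groups nested inside groups, can be sketched in plain Python. This is not the Elasticsearch aggregations API; the documents, field names and helper functions are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical documents that matched a user's query.
hits = [
    {"country": "uk", "year": 2013, "price": 10.0},
    {"country": "uk", "year": 2014, "price": 30.0},
    {"country": "us", "year": 2014, "price": 20.0},
    {"country": "us", "year": 2014, "price": 40.0},
]

def metrics(docs, field):
    """A few of the metrics named on the slide, for one group of docs."""
    values = [doc[field] for doc in docs]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "avg": mean(values),
    }

def group_by(docs, field):
    """Split docs into groups keyed by the value of `field`."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc)
    return dict(groups)

def nested(docs, fields, metric_field):
    """Group by fields[0], then recursively by the remaining fields."""
    if not fields:
        return metrics(docs, metric_field)
    return {
        key: nested(group, fields[1:], metric_field)
        for key, group in group_by(docs, fields[0]).items()
    }

# Groups (country) that contain subgroups (year), each with its own metrics.
result = nested(hits, ["country", "year"], "price")
```

Because the input is the set of documents that matched the query, the resulting numbers are specific to that query.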
step 3:
building the inverted index
inverted index
immutable
• cache friendly
• reads from RAM
• fielddata never changes
• compressible
• no locking
but, immutable…
step 4:
dynamic inverted index
[diagram: a commit writes the in-memory buffer to a new segment and updates the commit point; the segment then becomes searchable]
lucene commit
• write new segment
• write new commit point
• fsync ← expensive!
• clear buffer
• reopen index
step 5:
near real-time search
[diagram: a flush writes the in-memory buffer to a new segment, which is searchable before any commit; a later commit adds the commit point]
lucene flush
• write new segment
• clear buffer
• reopen index
• no fsync → lightweight
but…
data not safe until fsync’ed!
step 6:
don’t lose data
→ transaction log
[diagram: as in step 5, with changes also recorded in the translog until a commit writes the commit point]
elasticsearch “refresh”
• lucene “flush”
• makes changes searchable
• lightweight
elasticsearch “flush”
• lucene “commit”
• clears transaction log
• persists changes
• heavy
refresh every second
near real-time search!
near real-time analytics!
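The difference between refresh and flush, and the role of the transaction log, can be sketched with a toy model. Everything here is an assumption for illustration: Python lists stand in for Lucene segments and files, an integer stands in for the commit point and the fsync, and the translog is assumed to survive a crash.

```python
class Shard:
    """Toy model of one shard: buffer, segments, commit point, translog."""

    def __init__(self):
        self.buffer = []        # in-memory buffer, not yet searchable
        self.segments = []      # searchable segments (immutable tuples)
        self.commit_point = 0   # number of segments counted as fsync'ed
        self.translog = []      # operations since the last commit

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)  # recorded so the change survives a crash

    def refresh(self):
        """Lucene "flush": write a new segment, clear the buffer, no fsync."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        """Lucene "commit": persist segments, write commit point, clear translog."""
        self.refresh()
        self.commit_point = len(self.segments)  # stands in for the fsync
        self.translog = []

    def search(self, predicate):
        return [doc for segment in self.segments for doc in segment if predicate(doc)]

    def crash_and_recover(self):
        """Keep only committed segments, then replay the translog."""
        self.segments = self.segments[: self.commit_point]
        self.buffer = []
        replay, self.translog = self.translog, []
        for doc in replay:
            self.index(doc)
        self.refresh()
```

A document is invisible to `search` until `refresh` runs, and a document that was refreshed but never flushed is restored from the translog by `crash_and_recover`.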
but…
too many segments
• slow searches
• poor term frequencies
• poor compression
step 7:
reduce segments
merge process
• many small → one big
• removes deleted docs
• runs in background
• throttled
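A minimal sketch of the merge step, assuming each segment is a dict of doc_id → document and deletions are tracked as a set of ids. The size threshold `max_docs` and both function names are hypothetical, and the background scheduling and throttling are not modelled.

```python
def merge_segments(segments, deleted_ids):
    """Merge segments into one new segment, dropping deleted docs."""
    merged = {}
    for segment in segments:
        for doc_id, doc in segment.items():
            if doc_id not in deleted_ids:
                merged[doc_id] = doc
    return merged

def merge_small(segments, deleted_ids, max_docs=2):
    """Merge segments holding at most `max_docs` docs; leave larger ones alone."""
    small = [s for s in segments if len(s) <= max_docs]
    large = [s for s in segments if len(s) > max_docs]
    if len(small) < 2:
        return segments  # nothing worth merging
    # Existing segments are never modified; a new segment replaces the small ones.
    return large + [merge_segments(small, deleted_ids)]
```

Calling `merge_small` on two one-document segments and one three-document segment returns the large segment plus a single merged segment, with any deleted ids left out of the merged one.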
but…
“Any wonder it broke down” by Brian Snelson is licensed under CC BY 2.0
sometimes you
need another truck
step 8:
scale out, not up
shard your data
transparent in elasticsearch
many segments → one shard
many shards → one index
“node”
running instance of elasticsearch
≈ one server
“shard”
bucket of data
lives on one node
physical worker unit
“index”
logical namespace
points to one or more shards
shard = hash(_id) % no_of_shards
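The routing formula can be tried out directly. In this sketch CRC32 stands in for the hash function, which is an assumption and not the function Elasticsearch uses; Python's built-in `hash()` is avoided because it is randomised per process for strings. The three-node layout and the function names are hypothetical.

```python
import zlib

# Hypothetical layout: one shard per node.
NODES_BY_SHARD = {0: "node_A", 1: "node_B", 2: "node_C"}

def shard_for(doc_id, number_of_shards):
    """shard = hash(_id) % no_of_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_shards

def node_for(doc_id, nodes_by_shard=NODES_BY_SHARD):
    """Node holding the shard that a document id routes to."""
    return nodes_by_shard[shard_for(doc_id, len(nodes_by_shard))]
```

The same id always routes to the same shard, so a write and a later read end up in the same place. Changing `number_of_shards` changes the result of the modulo for existing ids, so documents already indexed would no longer be found where the formula points.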
PUT doc _id:1
hash(1) % 3 → shard_2
GET doc _id:2
hash(2) % 3 → shard_0
Search all docs
node_A: shard_0 | node_B: shard_1 | node_C: shard_2
step 9:
scaling elastically
start small
node_A: shard_0, shard_1, shard_2
add more nodes
node_A: shard_0, shard_1, shard_2 | node_B: (empty) | node_C: (empty)
shards migrate
shard_1 → node_B, shard_2 → node_C
rebalanced
node_A: shard_0 | node_B: shard_1 | node_C: shard_2
add new index
node_A: shard_0, shard_1 | node_B: shard_1, shard_2 | node_C: shard_0, shard_2
but…
more hardware?
more hardware failure
at 3am on sunday…
node_A: shard_0, shard_1 | node_B: shard_1, shard_2 | node_C: shard_0, shard_2
boom!
step 10:
add redundancy
for every shard
…make a copy
“primary shard”
main shard
“replica shard(s)”
copy of primary shard
one node
node_A: P0, P1, P2
add a node
node_A: P0, P1, P2 | node_B: R0, R1, R2
redundancy
add a node
[diagram: node_C joins and shard copies migrate to it]
rebalanced
lose a node
replica → primary
[diagram: a replica of each lost primary is promoted]
allocate replicas
[diagram: replacement replicas are allocated on the surviving nodes]
rebalanced
primary shard
• just a role
• receives doc changes first
• forwards new doc to replicas in parallel
• number of primaries fixed
replica shard
• copy of primary shard
• serves read/search requests
• number of replicas can be changed
• more replicas → more read throughput

*if you have more hardware*
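The primary and replica roles can be sketched as a simple allocation table. The round-robin placement, the rule that two copies of a shard never share a node, and both function names are assumptions for illustration, not the actual Elasticsearch allocator. Re-allocating replacement replicas after a failure is left out.

```python
def allocate(nodes, number_of_shards, number_of_replicas):
    """Place shard copies round-robin, never two copies of a shard on one node."""
    if number_of_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to place every copy")
    layout = {node: [] for node in nodes}
    for shard in range(number_of_shards):
        for copy in range(number_of_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "P" if copy == 0 else "R"  # primary is just a role
            layout[node].append((role, shard))
    return layout

def lose_node(layout, dead):
    """Drop a node and promote a surviving replica for each primary it held."""
    survivors = {n: list(copies) for n, copies in layout.items() if n != dead}
    for role, shard in layout[dead]:
        if role != "P":
            continue
        for copies in survivors.values():
            if ("R", shard) in copies:
                copies[copies.index(("R", shard))] = ("P", shard)
                break
        else:
            raise RuntimeError(f"shard {shard} lost: no replica survived")
    return survivors
```

With three nodes, three shards and one replica, losing any single node still leaves one primary for every shard, because the replica of each lost primary lives on a different node.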
but…
who controls all this?
step 11:
the master node
“Master Yoda” by Gonzalo Martín is licensed under CC BY-SA 2.0
“node”
running instance of elasticsearch
“cluster”
one or more nodes
with same cluster name
working together
node_A node_B node_C
discover a cluster
with multicast/unicast
request routing
send request to any node
forwards to correct node
how?
every node knows where
every document is
cluster state
cluster level information
indices, shards, nodes
can only be updated by
the master node
master node
elected when cluster forms
just a role
re-elected if master fails
master node
only manages
cluster level changes
not doc-level
get/put/search
the result?
distributed
real-time
search & analytics
which works in the same way
on your laptop…
…as on your
1,000 node cluster
who is using it?
Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
• full text search
• highlighted search snippets
• search-as-you-type
• did-you-mean suggestions

• combine visitor logs with social network data
• real-time feedback to editors

• combines full text search with geolocation
• uses more-like-this to find related questions and answers

• search repositories, users, issues, pull requests
• search 130 billion lines of code
• track all alerts, events, logs

• index and analyse 5TB of log data every day
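As a rough illustration of the first group of features above (full text search, highlighted snippets, did-you-mean suggestions), the sketch below builds a search request body in Python. It only constructs and prints JSON and sends no request. The `build_search_body` helper, the `content` field name, the query text, and the `did_you_mean` label are assumptions chosen for the example. The `match` query, the `highlight` section, and the `term` suggester are standard parts of the Elasticsearch search API, but exact options vary between versions.

```python
import json


def build_search_body(field, text):
    """Build a search body combining a full text query, highlighting,
    and a term suggester. Helper name and structure are illustrative."""
    return {
        # Full text search on one field.
        "query": {"match": {field: text}},
        # Ask for highlighted snippets from the same field.
        "highlight": {"fields": {field: {}}},
        # "did_you_mean" is an arbitrary label chosen for this example.
        "suggest": {
            "did_you_mean": {"text": text, "term": {"field": field}}
        },
    }


# "content" and the query text are hypothetical example values.
body = build_search_body("content", "darling buds")
print(json.dumps(body, indent=2))
```

A body like this would be sent to an index's `_search` endpoint; the response shape and available highlighter options depend on the Elasticsearch version in use.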
thank you
@clintongormley
elasticsearch.org/downloads
elasticsearch.com/support
elasticsearch.com/jobs


Scaling real-time search and analytics with Elasticsearch

  • 1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Clinton Gormley @clintongormley Scaling real time search and analytics with elasticsearch
  • 2. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
  • 3. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch.org/guide
  • 4. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch
  • 5. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch • real-time
  • 6. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch • real-time • distributed
  • 7. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch • real-time • distributed • search
  • 8. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch • real-time • distributed • search • analytics
  • 9. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. how to use it?
  • 10. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. how to use it?
  • 11. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. how does it work?
  • 12. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 1: making text searchable
  • 13. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
  • 14. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. where content like “%darling%buds%”
  • 15. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. slow & inflexible
  • 16. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
  • 17. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Term Doc  1 Doc  2 Doc  3 breathe brings buds but by can … damasked darling date day deaf death declines delight sorted list of unique terms
  • 18. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Term Doc  1 Doc  2 Doc  3 breathe brings buds but by can … damasked darling date day deaf death declines delight where they occur
  • 19. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Term Doc  1 Doc  2 Doc  3 breathe brings buds but by can … damasked darling date day deaf death declines delight
  • 20. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Term Doc  1 Doc  2 Doc  3 breathe brings buds but by can … damasked darling date day deaf death declines delight
  • 21. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index
  • 22. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies
  • 23. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies » relevance
  • 24. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length » relevance
  • 25. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length » relevance » doc weight
  • 26. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length • term positions » relevance » doc weight
  • 27. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length • term positions » relevance » doc weight » word proximity
  • 28. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length • term positions • char offsets » relevance » doc weight » word proximity
  • 29. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index • term frequencies • text length • term positions • char offsets » relevance » doc weight » word proximity » highlighting
  • 30. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index not just for text
  • 31. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index numbers, dates, bools, enums geopoints, geoshapes, etc
  • 32. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 2: analytics
  • 33. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. for search map values → doc_ids
  • 34. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. for search map values → doc_ids for analytics map doc_ids → values
  • 35. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. uninvert the index
  • 36. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. uninvert the index cache values in memory called “fielddata”
  • 37. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. uninvert the index data access from RAM very fast
  • 38. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. on-the-fly analytics in the context of a user’s query
  • 39. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. on-the-fly analytics relevant analytics for each user
  • 40. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. calculate metrics count, min, max, sum, avg, percentiles, cardinality, stddev, variance, sum of squares !
  • 41. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. grouped by popular terms, significant terms, ranges, dates, geolocation, etc
  • 42. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. grouped by groups can … contain subgroups … which contain subgroups etc
  • 43. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 3: building the inverted index
  • 44. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. inverted index
  • 45. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable
  • 46. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable • cache friendly
  • 47. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable • cache friendly • reads from RAM
  • 48. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable • cache friendly • reads from RAM • fielddata never changes
  • 49. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable • cache friendly • reads from RAM • fielddata never changes • compressible
  • 50. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. immutable • cache friendly • reads from RAM • fielddata never changes • compressible • no locking
  • 51. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. but, immutable…
  • 52. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 4: dynamic inverted index
  • 53. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. in-memory buffer commit segment
  • 54. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable
  • 55. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable commit
  • 56. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable
  • 57. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable commit
  • 58. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable
  • 59. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit
  • 60. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment
  • 61. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment • write new commit point
  • 62. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment • write new commit point • fsync
  • 63. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment • write new commit point • fsync • clear buffer
  • 64. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment • write new commit point • fsync • clear buffer • reopen index
  • 65. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene commit • write new segment • write new commit point • fsync ← expensive! • clear buffer • reopen index
  • 66. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 5: near real-time search
  • 67. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. in-memory buffer flush segment
  • 68. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable
  • 69. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable flush
  • 70. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable
  • 71. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable flush
  • 72. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable
  • 73. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable commit
  • 74. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. commit point segment searchable
  • 75. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene flush
  • 76. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene flush • write new segment • clear buffer • reopen index !
  • 77. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene flush • write new segment • clear buffer • reopen index • no fsync
  • 78. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lucene flush • write new segment • clear buffer • reopen index • no fsync → lightweight
  • 79. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. but… data not safe until fsync’ed!
  • 80. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 6: don’t lose data
  • 81. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 6: don’t lose data → transaction log
  • 82. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. in-memory buffer flush segment translog
  • 83. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable translog
  • 84. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable flush translog
  • 85. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable translog
  • 86. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable translog commit
  • 87. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. segment searchable translog commit point
  • 88. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. elasticsearch “refresh” • lucene “flush” • makes changes searchable • lightweight !
  • 89. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. • lucene “commit” • clears transaction log • persists changes • heavy ! elasticsearch “flush”
  • 90. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. refresh every second
  • 91. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. near real-time search!
  • 92. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. near real-time search! near real-time analytics!
  • 93. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. but…
  • 94. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. • slow searches • poor term frequencies • poor compression ! ! too many segments
  • 95. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 7: reduce segments
  • 96. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 97. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 98. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 99. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 100. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 101. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 102. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 103. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 104. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 105. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 106. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 107. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. searchable
  • 108. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. merge process • many small → one big • removes deleted docs • runs in background • throttled
  • 109. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. but…
  • 110. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. “Any wonder it broke down” by Brian Snelson is licensed under CC BY 2.0
  • 111. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. sometimes you need another truck
  • 112. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 8: scale out, not up
  • 113. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. shard your data
  • 114. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. shard your data transparent in elasticsearch
  • 115. many segments
  • 116. many segments → one shard
  • 117. many segments → one shard • many shards
  • 118. many segments → one shard • many shards → one index
  • 119. “node” • running instance of elasticsearch • ≈ one server
  • 120. “shard” • bucket of data • lives on one node • physical worker unit
  • 121. “index” • logical namespace • points to one or more shards
  • 122. “index” • logical namespace • points to one or more shards • shard = hash(_id) % no_of_shards
  • 123. PUT doc _id:1 → hash(1) % 3 → shard_2 (node_A: shard_0 • node_B: shard_1 • node_C: shard_2)
  • 124. GET doc _id:2 → hash(2) % 3 → shard_0 (node_A: shard_0 • node_B: shard_1 • node_C: shard_2)
  • 125. Search all docs • shard = hash(_id) % no_of_shards (node_A: shard_0 • node_B: shard_1 • node_C: shard_2)
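The routing rule `shard = hash(_id) % no_of_shards` can be sketched in a few lines. This is a minimal illustration only: `zlib.crc32` is an assumed stand-in for a deterministic hash, not the hash function elasticsearch uses, so the shard numbers it produces will not match the ones on the slides. `shard_for` and `shards_for_search` are hypothetical helper names.

```python
import zlib


def shard_for(doc_id, no_of_shards):
    """Pick the shard for a single-document PUT or GET from its _id."""
    # crc32 is an assumption: any hash that is stable across nodes would do.
    return zlib.crc32(doc_id.encode("utf-8")) % no_of_shards


def shards_for_search(no_of_shards):
    """A search over all docs cannot use the _id, so it visits every shard."""
    return list(range(no_of_shards))


put_shard = shard_for("1", 3)      # where PUT doc _id:1 is stored
get_shard = shard_for("1", 3)      # GET doc _id:1 looks in the same place
search_shards = shards_for_search(3)
```

Because the shard count is part of the formula, the same `_id` may map to a different shard if `no_of_shards` changes, which is one way to see why the number of primary shards is fixed for an index.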
  • 126. step 9: scaling elastically
  • 127. start small (node_A: shard_0, shard_1, shard_2)
  • 128. add more nodes (node_A: shard_0, shard_1, shard_2 • node_B • node_C)
  • 129. shards migrate (node_A: shard_0, shard_1, shard_2 • node_B: shard_1 • node_C: shard_2)
  • 130. rebalanced (node_A: shard_0 • node_B: shard_1 • node_C: shard_2)
  • 131. add new index (node_A: shard_0, shard_1 • node_B: shard_1, shard_2 • node_C: shard_0, shard_2)
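The "add more nodes → shards migrate → rebalanced" sequence can be approximated with a simple greedy loop. The sketch below is a toy model under the assumption that balance means "equal shard counts per node"; the real allocator weighs more factors, and `rebalance` is a hypothetical function name.

```python
def rebalance(allocation):
    """Move shards one at a time from the fullest node to the emptiest node.

    allocation maps node name -> list of shard names. A new mapping is
    returned; the input is left untouched.
    """
    nodes = {node: list(shards) for node, shards in allocation.items()}
    while True:
        fullest = max(nodes, key=lambda n: len(nodes[n]))
        emptiest = min(nodes, key=lambda n: len(nodes[n]))
        # Stop once no node holds more than one shard above any other node.
        if len(nodes[fullest]) - len(nodes[emptiest]) <= 1:
            return nodes
        nodes[emptiest].append(nodes[fullest].pop())


start_small = {"node_A": ["shard_0", "shard_1", "shard_2"]}
more_nodes = dict(start_small, node_B=[], node_C=[])
balanced = rebalance(more_nodes)
```

Which shard ends up on which node depends on the order of the moves, so the result may differ from the slide's layout while still leaving one shard per node.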
  • 132. but…
  • 133. but… more hardware?
  • 134. but… more hardware? more hardware failure
  • 135. at 3am on sunday… (node_A: shard_0, shard_1 • node_B: shard_1, shard_2 • node_C: shard_0, shard_2)
  • 136. boom! (node_A: shard_0, shard_1 • node_B: shard_1, shard_2 • node_C: shard_0, shard_2)
  • 137. step 10: add redundancy
  • 138. for every shard …make a copy
  • 139. “primary shard” • main shard
  • 140. “replica shard(s)” • copy of primary shard
  • 141. one node node_A P0 P1 P2
  • 142. add a node node_A P0 P1 P2 node_B
  • 143. add a node node_A P0 P1 P2 node_B R0 R1 R2
  • 144. redundancy node_A P0 P1 P2 node_B R0 R1 R2
  • 145. add a node node_A P0 P1 P2 node_B R0 R1 R2 node_C
  • 146. add a node node_A P0 P1 P2 node_B R0 R1 R2 R1 node_C P0
  • 147. rebalanced node_A P0 P1 P2 node_B R0 R1 R2 node_C P0
  • 148. lose a node node_A P0 P1 P2 node_B R0 R2 R1 node_C P0
  • 149. replica → primary node_A P0 P1 P2 node_B R0 R2
  • 150. replica → primary node_A P0 P1 P2 node_B P0 R2
  • 151. allocate replicas node_A P0 P1 P2 node_B P0 R2 R0 R1
  • 152. rebalanced node_A P0 P1 P2 node_B P0 R2 R0 R1
  • 153. primary shard • just a role • receives doc changes first • forwards new doc to replicas in parallel • number of primaries fixed
  • 154. replica shard • copy of primary shard • serves read/search requests • number of replicas can be changed • more replicas → more read throughput *if you have more hardware*
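The "lose a node → replica becomes primary → allocate replicas" sequence can be sketched as two small functions. This is a toy model, not elasticsearch's allocation code: the cluster layout, `lose_node`, and `allocate_replicas` are hypothetical, and one replica per shard is assumed. The one rule carried over is that two copies of the same shard are never placed on the same node.

```python
def lose_node(cluster, dead):
    """Drop a node and promote a surviving replica for each primary it held.

    cluster maps node name -> {shard number: "P" or "R"}.
    """
    survivors = {n: dict(s) for n, s in cluster.items() if n != dead}
    for shard, role in cluster[dead].items():
        if role != "P":
            continue
        for shards in survivors.values():
            if shards.get(shard) == "R":
                shards[shard] = "P"   # primary is just a role
                break
    return survivors


def allocate_replicas(cluster, shard_numbers):
    """Give every shard that lacks a replica a new one on another node."""
    for shard in shard_numbers:
        if any(s.get(shard) == "R" for s in cluster.values()):
            continue
        # Only nodes that hold no copy of this shard are candidates.
        candidates = [n for n, s in cluster.items() if shard not in s]
        if candidates:
            target = min(candidates, key=lambda n: len(cluster[n]))
            cluster[target][shard] = "R"
    return cluster


cluster = {
    "node_A": {0: "P", 1: "R"},
    "node_B": {1: "P", 2: "R"},
    "node_C": {2: "P", 0: "R"},
}
after_failure = lose_node(cluster, "node_A")
recovered = allocate_replicas({n: dict(s) for n, s in after_failure.items()}, [0, 1, 2])
```

With a single node left there are no candidates, so replicas stay unallocated, which matches the "one node" slide showing only P0, P1 and P2.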
  • 155. but…
  • 156. but… who controls all this?
  • 157. step 11: the master node
  • 158. “Master Yoda” by Gonzalo Martín is licensed under CC BY-SA 2.0
  • 159. “node” • running instance of elasticsearch
  • 160. “node” • running instance of elasticsearch • node_A
  • 161. “cluster” • one or more nodes with same cluster name working together
  • 162. “cluster” • node_A node_B
  • 163. node_A node_B node_C • discover a cluster with multicast/unicast
  • 164. node_A node_B node_C • discover a cluster with multicast/unicast
  • 165. node_A node_B node_C • request routing • send request to any node
  • 166. node_A node_B node_C • request routing • forwards to correct node
  • 167. how?
  • 168. how? every node knows where every document is
  • 169. cluster state • every node knows where every document is
  • 170. cluster state • cluster level information
  • 171. cluster state • cluster level information • indices • shards • nodes
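Request routing follows from the cluster state: since every node holds the shard-to-node mapping, any node can work out where a document lives and forward the request. The sketch below is a minimal illustration; `CLUSTER_STATE`, its keys, and `route` are hypothetical, and `zlib.crc32` is an assumed stand-in for the real routing hash.

```python
import zlib

# Toy cluster state: every node is assumed to hold an identical copy.
CLUSTER_STATE = {
    "no_of_shards": 3,
    "shard_to_node": {0: "node_A", 1: "node_B", 2: "node_C"},
}


def route(doc_id, receiving_node, state=CLUSTER_STATE):
    """Decide what the node that received a request should do with it."""
    shard = zlib.crc32(doc_id.encode("utf-8")) % state["no_of_shards"]
    owner = state["shard_to_node"][shard]
    if owner == receiving_node:
        action = "handle locally"
    else:
        action = "forward to " + owner
    return shard, owner, action


# The same document resolves to the same owner whichever node is asked.
answers = [route("1", node) for node in ("node_A", "node_B", "node_C")]
```

Exactly one of the three nodes handles the request locally; the other two forward it to that node.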
  • 172. cluster state • can only be updated by the master node
  • 173. node_A • master node • elected when cluster forms
  • 174. node_A node_B • master node
  • 175. node_A node_B node_C • master node
  • 176. node_A node_B node_C • master node
  • 177. node_A node_B node_C • master node • just a role
  • 178. node_A node_B node_C • master node • re-elected if master fails
  • 179. node_B node_C node_A • master node • re-elected if master fails
  • 180. node_B node_C • master node • re-elected if master fails
  • 181. master node • only manages cluster level changes
  • 182. master node • not doc-level get/put/search
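The master's role can be sketched as a small class: one node is elected, only that node may change the cluster state, and a new master is elected if it fails. This is a toy model under stated assumptions: the `Cluster` class is hypothetical, and "lowest node name wins" is a placeholder election rule, not the discovery and election mechanism elasticsearch uses.

```python
class Cluster:
    """Toy cluster: tracks live nodes, the current master, and cluster state."""

    def __init__(self, node_names):
        self.nodes = set(node_names)
        self.state = {"version": 0, "indices": {}}
        self.master = self.elect()          # elected when the cluster forms

    def elect(self):
        # Placeholder rule (an assumption): the lowest node name wins.
        return min(self.nodes)

    def create_index(self, requesting_node, name, no_of_shards):
        """A cluster-level change: only the master may apply it."""
        if requesting_node != self.master:
            raise PermissionError("only the master updates the cluster state")
        self.state["indices"][name] = no_of_shards
        self.state["version"] += 1

    def fail(self, node):
        """Remove a node; re-elect if it was the master."""
        self.nodes.discard(node)
        if node == self.master and self.nodes:
            self.master = self.elect()


cluster = Cluster(["node_A", "node_B", "node_C"])
cluster.create_index(cluster.master, "logs", 3)
first_master = cluster.master
cluster.fail(first_master)
second_master = cluster.master
```

Document-level get/put/search requests never pass through `create_index` in this model, mirroring the point that the master only manages cluster-level changes.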
  • 183. the result?
  • 184. distributed real-time search & analytics
  • 185. which works in the same way on your laptop…
  • 186. …as on your 1,000 node cluster
  • 187. who is using it?
  • 188. • full text search • highlighted search snippets • search-as-you-type • did-you-mean suggestions
  • 189. • combine visitor logs with social network data • real-time feedback to editors
  • 190. • combines full text search with geolocation • uses more-like-this to find related questions and answers
  • 191. • search repositories, users, issues, pull requests • search 130 billion lines of code • track all alerts, events, logs
  • 192. • index and analyse 5TB of log data every day
  • 193. thank you • @clintongormley • elasticsearch.org/downloads • elasticsearch.com/support • elasticsearch.com/jobs