SlideShare una empresa de Scribd logo
1 de 33
Descargar para leer sin conexión
You know, for search
  February 18th, 2011, RivieraJUG
About me
●   Lukáš Vlček ( @lukasvlcek )
●   Java developer since 2001
●   Joined Red Hat (JBoss division) in 2010
●   Member of JBoss.org team, focusing on search
In the beginning there was...
Zzzzz...




...Shay Banon is not sleeping but dreaming!
Highly-available
                {JSON}



   RESTful
Search Engine                                     Distributed




                                              CLOUD
 Asynchronous




       Open Sourced
                         ... buzzword dreaming!
          (ASL2)
!                              *
                                 :-)

*
        ... and @ElasticSearch
                     was born!
Dreams do come true




https://github.com/elasticsearch/elasticsearch




        http://www.elasticsearch.org
What is ElasticSearch ?
●   Distributed
●   Highly-available
●   RESTful search engine (on top of Lucene)
●   Designed to speak JSON (JSON in, JSON out)
●   and more...


    ... but first, let's check simple examples.
Demo #1




RESTful JSON teaser
RESTful
●   Network interface for data indexing, searching
    and administration.

curl ­XGET 'http://localhost:9200/index1,index2/typeA,typeB/_search' ­d '{
  “query“ : { “match_all“ : {} }
}'




You can query one or more indices.
Indices can have aliases, you can also
use _all for all indices.


Each index have one or more types, something
like columns in DB table.
Highly available
●   For each index you can specify:
    ●   Number of shards
        –   Each index has fixed number of shards
    ●   Number of replicas
        –   Each shard can have 0-many replicas, can be changed
            dynamically
Distributed
●   Check next slides...
ZEN Discovery




      Node 1                    Node 2              Node 3              Node 4



                      A                    B                        C
A: { shards: 3, replicas: 2 }       B: { shards: 2, replicas: 3 }       C: { shards: 1, replicas: 0 }

 A1      A1      A1
                                     B1      B1      B1      B1
 A2      A2      A2                                                      C1
                                     B2      B2      B2      B2
 A3      A3      A3



        Gateway (longterm persistency of cluster data & metadata)
ZEN Discovery




      Node 1                    Node 2              Node 3              Node 4



                      A                    B                        C
A: { shards: 3, replicas: 2 }       B: { shards: 2, replicas: 3 }       C: { shards: 1, replicas: 4 }

 A1      A1      A1
                                     B1      B1      B1      B1
 A2      A2      A2                                                      C1     C1     C1      C1       C1
                                     B2      B2      B2      B2
 A3      A3      A3
                                                                                Can not allocate all replicas!
                                                                                Check Health API


        Gateway (longterm persistency of cluster data & metadata)
Talking to the cluster
●   Native client in Java and Groovy

          Client




                    Node 1   Node 2    Node 3


●   Client type:
    ● Node client
    ● Transport Client
Talking to the cluster
●   REST client

             Client




                       Node 1      Node 2      Node 3


●   Many clients built on top of REST API
    ●   Perl, PHP, Python, Ruby, Erlang, ... etc
Nodes do not have to be equal
●   Can be a master
●   Can be a data node
●   Can allow for REST transport interface
    ●     Http, memcached, thrift
●   Index store (file, memory)               Node 1
●   JMX enabled
●   Thread pool type
●   ...                             Node 2            Node 3
Gateway
●   Long time persistency allows for whole (and
    partial) cluster backup and recovery.

    Types:
    ●   Local (default)
    ●   NFS
    ●   HDFS
    ●   AWS: S3
Distributed queries
●   You can control the type of the search query
    per search request:
    ●   Query and Fetch
    ●   Query then Fetch
    ●   Dfs, Query and Fetch
    ●   Dfs, Query then Fetch
Demo #2




 Dynamic allocation of indices,
shards, replicas and Health API
Admin API
●   Indices
       –   Status
       –   CRUD operation
       –   Mapping, Open/Close, Update settings
       –   Flush, Refresh, Snapshot, Optimize
●   Cluster
       –   Health
       –   State
       –   Node Info and stats
       –   Shutdown
Demo #3




Admin API: getting JVM and OS stats
Rich query API
●   There is rich Query DSL for search, includes:
    ●   Queries
         –   Boolean, Fuzzy, MLT, Prefix, DisMax, ...
    ●   Filters
         –   And/Or/Not, Boolean, Geo, Missing, Exists, ...
    ●   Highlighting
    ●   Sort
    ●   Facets
         –   on a next slide...
Facets
●   Facets allows to provide aggregated data on
    the search request.
    ●   query
    ●   filter
    ●   terms
    ●   range
    ●   (date) histogram
    ●   statistical
    ●   geo distance
Scripting support
●   There is a support for using scripting languages
    in many places (for example for custom scoring,
    script fields, script key in facets ...)
    ●   mvel (default)
    ●   JS
    ●   Groovy
    ●   Python
Demo #4




      Java API: Indexing data
REST API: Faceted search, Highlighting
Parent / Child
●   The parent/child support allows to define a
    parent relationship from a child to a parent type.
    ●   has_child (query, filter)
    ●   top_children (filter)
River
●   Let's listen on stream of changes and index the
    data...
    ●   CouchDB
    ●   RabbitMQ
    ●   Twitter
    ●   Wikipedia
Versioning (new in 0.15)
●   “update if current” functionality
●   ie: I can get a document, change it and then put
    it back in (referencing the version ID I fetched)
    and it will either index or fail (if the document
    has been modified in the interim)
●   Completely real-time
Percolator (new in 0.15)
●   The percolator API allows to register queries
    against an index, and then send a percolate
    request which includes a document, and getting
    back the queries that match on that document
    out of set of registered queries.
Q&A
Thank you!

Más contenido relacionado

La actualidad más candente

Simple search with elastic search
Simple search with elastic searchSimple search with elastic search
Simple search with elastic searchmarkstory
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with ElasticsearchSamantha Quiñones
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiRobert Calcavecchia
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseRobert Lujo
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in actionCodemotion
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Karel Minarik
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Roy Russo
 
Query DSL In Elasticsearch
Query DSL In ElasticsearchQuery DSL In Elasticsearch
Query DSL In ElasticsearchKnoldus Inc.
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 
Elasticsearch - under the hood
Elasticsearch - under the hoodElasticsearch - under the hood
Elasticsearch - under the hoodSmartCat
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchclintongormley
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solrmacrochen
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningPetar Djekic
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search WalkthroughSuhel Meman
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Prajal Kulkarni
 
Introduction to ELK
Introduction to ELKIntroduction to ELK
Introduction to ELKYuHsuan Chen
 

La actualidad más candente (19)

Simple search with elastic search
Simple search with elastic searchSimple search with elastic search
Simple search with elastic search
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015Elasticsearch - DevNexus 2015
Elasticsearch - DevNexus 2015
 
Query DSL In Elasticsearch
Query DSL In ElasticsearchQuery DSL In Elasticsearch
Query DSL In Elasticsearch
 
Elastic search
Elastic searchElastic search
Elastic search
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 
Elasticsearch - under the hood
Elasticsearch - under the hoodElasticsearch - under the hood
Elasticsearch - under the hood
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solr
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuning
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search Walkthrough
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.
 
Introduction to ELK
Introduction to ELKIntroduction to ELK
Introduction to ELK
 

Destacado

Elastic search & patent information @ mtc
Elastic search & patent information @ mtcElastic search & patent information @ mtc
Elastic search & patent information @ mtcArne Krueger
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic IntroductionMayur Rathod
 
Power of Elastic Search - nLocate
Power of Elastic Search - nLocatePower of Elastic Search - nLocate
Power of Elastic Search - nLocateAayush Shrestha
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...Rahul K Chauhan
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overviewABC Talks
 
First-passage percolation on random planar maps
First-passage percolation on random planar mapsFirst-passage percolation on random planar maps
First-passage percolation on random planar mapsTimothy Budd
 
mtc All Hands 8/15 Werte
mtc All Hands 8/15 Wertemtc All Hands 8/15 Werte
mtc All Hands 8/15 WerteArne Krueger
 
20131011 - Los Gatos - Netflix - Big Data Design Patterns
20131011 - Los Gatos - Netflix - Big Data Design Patterns20131011 - Los Gatos - Netflix - Big Data Design Patterns
20131011 - Los Gatos - Netflix - Big Data Design PatternsAllen Day, PhD
 
Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...Kai-Wen Zhao
 
Artificial intelligence 2015: Quo Vadis?
Artificial intelligence 2015: Quo Vadis?Artificial intelligence 2015: Quo Vadis?
Artificial intelligence 2015: Quo Vadis?Sergey Shelpuk
 
Machine Learning and Logging for Monitoring Microservices
Machine Learning and Logging for Monitoring Microservices Machine Learning and Logging for Monitoring Microservices
Machine Learning and Logging for Monitoring Microservices Daniel Berman
 
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...Shu Tanaka
 

Destacado (20)

(Elastic)search in big data
(Elastic)search in big data(Elastic)search in big data
(Elastic)search in big data
 
Elastic search
Elastic searchElastic search
Elastic search
 
Elastic search & patent information @ mtc
Elastic search & patent information @ mtcElastic search & patent information @ mtc
Elastic search & patent information @ mtc
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
Power of Elastic Search - nLocate
Power of Elastic Search - nLocatePower of Elastic Search - nLocate
Power of Elastic Search - nLocate
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
 
Elastic pivorak
Elastic pivorakElastic pivorak
Elastic pivorak
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
Logging in moodle
Logging in moodleLogging in moodle
Logging in moodle
 
Percolation Model and Controllability
Percolation Model and ControllabilityPercolation Model and Controllability
Percolation Model and Controllability
 
Machine Learning at Scale
Machine Learning at ScaleMachine Learning at Scale
Machine Learning at Scale
 
First-passage percolation on random planar maps
First-passage percolation on random planar mapsFirst-passage percolation on random planar maps
First-passage percolation on random planar maps
 
mtc All Hands 8/15 Werte
mtc All Hands 8/15 Wertemtc All Hands 8/15 Werte
mtc All Hands 8/15 Werte
 
20131011 - Los Gatos - Netflix - Big Data Design Patterns
20131011 - Los Gatos - Netflix - Big Data Design Patterns20131011 - Los Gatos - Netflix - Big Data Design Patterns
20131011 - Los Gatos - Netflix - Big Data Design Patterns
 
Percolation
PercolationPercolation
Percolation
 
Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...
 
Artificial intelligence 2015: Quo Vadis?
Artificial intelligence 2015: Quo Vadis?Artificial intelligence 2015: Quo Vadis?
Artificial intelligence 2015: Quo Vadis?
 
Machine Learning and Logging for Monitoring Microservices
Machine Learning and Logging for Monitoring Microservices Machine Learning and Logging for Monitoring Microservices
Machine Learning and Logging for Monitoring Microservices
 
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
 

Similar a Elastic Search

Graph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBGraph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBAndrei KUCHARAVY
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for ElasticsearchJodok Batlogg
 
Stripe CTF3 wrap-up
Stripe CTF3 wrap-upStripe CTF3 wrap-up
Stripe CTF3 wrap-upStripe
 
I know Java, why should I consider Clojure?
I know Java, why should I consider Clojure?I know Java, why should I consider Clojure?
I know Java, why should I consider Clojure?sbjug
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseC4Media
 
Chapter 8. Partial updates and retrievals.pdf
Chapter 8. Partial updates and retrievals.pdfChapter 8. Partial updates and retrievals.pdf
Chapter 8. Partial updates and retrievals.pdfRick Hwang
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextRafał Kuć
 
Grand Central Dispatch
Grand Central DispatchGrand Central Dispatch
Grand Central DispatchRobert Brown
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch BasicsShifa Khan
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...Dataconomy Media
 
NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?Eric Evans
 
From Lisp to Clojure/Incanter and RAn Introduction
From Lisp to Clojure/Incanter and RAn IntroductionFrom Lisp to Clojure/Incanter and RAn Introduction
From Lisp to Clojure/Incanter and RAn Introductionelliando dias
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMfnothaft
 
Doug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemDoug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemCloudera, Inc.
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLEric Evans
 
Introduction to ArangoDB (nosql matters Barcelona 2012)
Introduction to ArangoDB (nosql matters Barcelona 2012)Introduction to ArangoDB (nosql matters Barcelona 2012)
Introduction to ArangoDB (nosql matters Barcelona 2012)ArangoDB Database
 
Concurrecy in Ruby
Concurrecy in RubyConcurrecy in Ruby
Concurrecy in RubyVesna Doknic
 

Similar a Elastic Search (20)

Graph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBGraph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDB
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for Elasticsearch
 
Stripe CTF3 wrap-up
Stripe CTF3 wrap-upStripe CTF3 wrap-up
Stripe CTF3 wrap-up
 
I know Java, why should I consider Clojure?
I know Java, why should I consider Clojure?I know Java, why should I consider Clojure?
I know Java, why should I consider Clojure?
 
DEVIEW 2013
DEVIEW 2013DEVIEW 2013
DEVIEW 2013
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
Chapter 8. Partial updates and retrievals.pdf
Chapter 8. Partial updates and retrievals.pdfChapter 8. Partial updates and retrievals.pdf
Chapter 8. Partial updates and retrievals.pdf
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
Grand Central Dispatch
Grand Central DispatchGrand Central Dispatch
Grand Central Dispatch
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
 
NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?
 
Work items
Work itemsWork items
Work items
 
Work items
Work itemsWork items
Work items
 
From Lisp to Clojure/Incanter and RAn Introduction
From Lisp to Clojure/Incanter and RAn IntroductionFrom Lisp to Clojure/Incanter and RAn Introduction
From Lisp to Clojure/Incanter and RAn Introduction
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAM
 
Doug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemDoug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop Ecosystem
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQL
 
Introduction to ArangoDB (nosql matters Barcelona 2012)
Introduction to ArangoDB (nosql matters Barcelona 2012)Introduction to ArangoDB (nosql matters Barcelona 2012)
Introduction to ArangoDB (nosql matters Barcelona 2012)
 
Concurrecy in Ruby
Concurrecy in RubyConcurrecy in Ruby
Concurrecy in Ruby
 

Más de Lukas Vlcek

Elasticsearch Monitoring in Openshift
Elasticsearch Monitoring in OpenshiftElasticsearch Monitoring in Openshift
Elasticsearch Monitoring in OpenshiftLukas Vlcek
 
JBug_React_and_Flux_2015
JBug_React_and_Flux_2015JBug_React_and_Flux_2015
JBug_React_and_Flux_2015Lukas Vlcek
 
Elasticsearch @JBoss.org, 2014
Elasticsearch @JBoss.org, 2014Elasticsearch @JBoss.org, 2014
Elasticsearch @JBoss.org, 2014Lukas Vlcek
 
An Introduction to Apache Hadoop, Mahout and HBase
An Introduction to Apache Hadoop, Mahout and HBaseAn Introduction to Apache Hadoop, Mahout and HBase
An Introduction to Apache Hadoop, Mahout and HBaseLukas Vlcek
 
Building search app with ElasticSearch
Building search app with ElasticSearchBuilding search app with ElasticSearch
Building search app with ElasticSearchLukas Vlcek
 
Compass Framework
Compass FrameworkCompass Framework
Compass FrameworkLukas Vlcek
 

Más de Lukas Vlcek (7)

Elasticsearch Monitoring in Openshift
Elasticsearch Monitoring in OpenshiftElasticsearch Monitoring in Openshift
Elasticsearch Monitoring in Openshift
 
JBug_React_and_Flux_2015
JBug_React_and_Flux_2015JBug_React_and_Flux_2015
JBug_React_and_Flux_2015
 
Elasticsearch @JBoss.org, 2014
Elasticsearch @JBoss.org, 2014Elasticsearch @JBoss.org, 2014
Elasticsearch @JBoss.org, 2014
 
An Introduction to Apache Hadoop, Mahout and HBase
An Introduction to Apache Hadoop, Mahout and HBaseAn Introduction to Apache Hadoop, Mahout and HBase
An Introduction to Apache Hadoop, Mahout and HBase
 
Building search app with ElasticSearch
Building search app with ElasticSearchBuilding search app with ElasticSearch
Building search app with ElasticSearch
 
JBoss Snowdrop
JBoss SnowdropJBoss Snowdrop
JBoss Snowdrop
 
Compass Framework
Compass FrameworkCompass Framework
Compass Framework
 

Elastic Search

  • 1. You know, for search February 18th, 2011, RivieraJUG
  • 2. About me ● Lukáš Vlček ( @lukasvlcek ) ● Java developer since 2001 ● Joined Red Hat (JBoss division) in 2010 ● Member of JBoss.org team, focusing on search
  • 3. In the beginning there was...
  • 4.
  • 5. Zzzzz... ...Shay Banon is not sleeping but dreaming!
  • 6. Highly-available {JSON} RESTful Search Engine Distributed CLOUD Asynchronous Open Sourced ... buzzword dreaming! (ASL2)
  • 7. ! * :-) * ... and @ElasticSearch was born!
  • 8. Dreams do come true https://github.com/elasticsearch/elasticsearch http://www.elasticsearch.org
  • 9. What is ElasticSearch ? ● Distributed ● Highly-available ● RESTful search engine (on top of Lucene) ● Designed to speak JSON (JSON in, JSON out) ● and more... ... but first, let's check simple examples.
  • 11. RESTful ● Network interface for data indexing, searching and administration. curl ­XGET 'http://localhost:9200/index1,index2/typeA,typeB/_search' ­d '{   “query“ : { “match_all“ : {} } }' You can query one or more indices. Indices can have aliases, you can also use _all for all indices. Each index have one or more types, something like columns in DB table.
  • 12. Highly available ● For each index you can specify: ● Number of shards – Each index has fixed number of shards ● Number of replicas – Each shard can have 0-many replicas, can be changed dynamically
  • 13. Distributed ● Check next slides...
  • 14. ZEN Discovery Node 1 Node 2 Node 3 Node 4 A B C A: { shards: 3, replicas: 2 } B: { shards: 2, replicas: 3 } C: { shards: 1, replicas: 0 } A1 A1 A1 B1 B1 B1 B1 A2 A2 A2 C1 B2 B2 B2 B2 A3 A3 A3 Gateway (longterm persistency of cluster data & metadata)
  • 15. ZEN Discovery Node 1 Node 2 Node 3 Node 4 A B C A: { shards: 3, replicas: 2 } B: { shards: 2, replicas: 3 } C: { shards: 1, replicas: 4 } A1 A1 A1 B1 B1 B1 B1 A2 A2 A2 C1 C1 C1 C1 C1 B2 B2 B2 B2 A3 A3 A3 Can not allocate all replicas! Check Health API Gateway (longterm persistency of cluster data & metadata)
  • 16. Talking to the cluster ● Native client in Java and Groovy Client Node 1 Node 2 Node 3 ● Client type: ● Node client ● Transport Client
  • 17. Talking to the cluster ● REST client Client Node 1 Node 2 Node 3 ● Many clients built on top of REST API ● Perl, PHP, Python, Ruby, Erlang, ... etc
  • 18. Nodes do not have to be equal ● Can be a master ● Can be a data node ● Can allow for REST transport interface ● Http, memcached, thrift ● Index store (file, memory) Node 1 ● JMX enabled ● Thread pool type ● ... Node 2 Node 3
  • 19. Gateway ● Long time persistency allows for whole (and partial) cluster backup and recovery. Types: ● Local (default) ● NFS ● HDFS ● AWS: S3
  • 20. Distributed queries ● You can control the type of the search query per search request: ● Query and Fetch ● Query then Fetch ● Dfs, Query and Fetch ● Dfs, Query then Fetch
  • 21. Demo #2 Dynamic allocation of indices, shards, replicas and Health API
  • 22. Admin API ● Indices – Status – CRUD operation – Mapping, Open/Close, Update settings – Flush, Refresh, Snapshot, Optimize ● Cluster – Health – State – Node Info and stats – Shutdown
  • 23. Demo #3 Admin API: getting JVM and OS stats
  • 24. Rich query API ● There is rich Query DSL for search, includes: ● Queries – Boolean, Fuzzy, MLT, Prefix, DisMax, ... ● Filters – And/Or/Not, Boolean, Geo, Missing, Exists, ... ● Highlighting ● Sort ● Facets – on a next slide...
  • 25. Facets ● Facets allows to provide aggregated data on the search request. ● query ● filter ● terms ● range ● (date) histogram ● statistical ● geo distance
  • 26. Scripting support ● There is a support for using scripting languages in many places (for example for custom scoring, script fields, script key in facets ...) ● mvel (default) ● JS ● Groovy ● Python
  • 27. Demo #4 Java API: Indexing data REST API: Faceted search, Highlighting
  • 28. Parent / Child ● The parent/child support allows to define a parent relationship from a child to a parent type. ● has_child (query, filter) ● top_children (filter)
  • 29. River ● Let's listen on stream of changes and index the data... ● CouchDB ● RabbitMQ ● Twitter ● Wikipedia
  • 30. Versioning (new in 0.15) ● “update if current” functionality ● ie: I can get a document, change it and then put it back in (referencing the version ID I fetched) and it will either index or fail (if the document has been modified in the interim) ● Completely real-time
  • 31. Percolator (new in 0.15) ● The percolator API allows to register queries against an index, and then send a percolate request which includes a document, and getting back the queries that match on that document out of set of registered queries.
  • 32. Q&A