SlideShare una empresa de Scribd logo
1 de 17
Descargar para leer sin conexión
Monitoring tools for
  ElasticSearch
     SF Meetup
     2013.03.06

                  Sushant Shankar
                  Shyam Kuttikkad
• Why and how we use ElasticSearch
• Monitoring
  – Tools
  – Index Building
  – Query Performance
Who is asdfas
• Social Sharing and Content Discovery platform
   – We help >600,000 publishers with content distribution, user
     engagement, and advertising monetization
   – 450 Fortune 1000 brand marketers leverage our unique social signals
     to deliver impactful advertising
• We develop Machine Learning algorithms operating on Big
  Data to:
   – Provide content sharing insights to Publishers
   – Build customized audience segments for advertising campaigns
   – Extract actionable insights out of social and interest data




www.33Across.com
www.tynt.com
Data firehose of 30B monthly
   events, 1.25B cookies
                     - Interaction with web
                     content
                     - Shares – images,
                     copies
                     - Searches

                           Build, understand,
                           analyze
                           Real-time view
                                    ElasticSearch!
                      Social Audiences
                      Behavior
                      Context
                      Knowledge
Production ElasticSearch cluster

Hardware
6 nodes, 24GB RAM
16GB for ES service
4 cores
3x 1.5TB drive

Index                  Build index
>1TB/index             using MR job
(replicated)           and Bulk API
~300M documents
~5KB / document
~3 hours
System monitoring using Zabbix

               Index Build
ElasticSearch specific monitoring
                     using SPM




Scalable Performance Monitoring (http://sematext.com/spm/index.html)
•   Index stats – Total/Refreshed/Merged documents
•   Shards – Total/Active/Relocating/Initializing
•   Search - Request rate and latency
•   Cache – {Filter, field} cache {count, evictions, size}
•   Machine – CPU, Memory, JVM, GC, Network, Disk
Index Building Optimization using
             Zabbix and SPM
Amount bulk indexed




                      Time taken
                       CPU util.
                       Mem util.
                        Disk I/O
                       Network



                                   # Shards
in practice…
Debugging and Validating using SPM
Index Building: Learnings
• 2 shards / CPU
• 10,000 documents (users) per indexing
  request

• Bulk API for our use case
• No replicas
• Refresh off (index.refresh_interval = -1)
Query Performance: Learnings
•   1-2 Replicas (and for reliability)
•   Turn refresh on again (5s default)
•   Warm up effect (Index Warm up API 0.20+)
•   Optimize API
•   Simulate multiple users
QUERIES?
Sushant Shankar
sushant.shankar@33across.com


     Shyam Kuttikkad
shyam.kuttikkad@33across.com
Why we really need a search engine
         Batch! Good for complicated tasks
         (Machine Learning, Graph Algorithms, etc.)




                          …                           …
Warm Up: load into memory and cache
Other cool features
• Custom Scoring functions
• Scripts – MVEL, Python
• Facets

•   Exploring:
•   Real-time indexing
•   Indexing images, files, etc.
•   Parent-child relationships

Más contenido relacionado

Destacado

The Automation Factory
The Automation FactoryThe Automation Factory
The Automation FactoryNathan Milford
 
Applying machine learning to product categorization
Applying machine learning to product categorizationApplying machine learning to product categorization
Applying machine learning to product categorizationSushant Shankar
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)foundsearch
 
E-commerce product classification with deep learning
E-commerce product classification with deep learning E-commerce product classification with deep learning
E-commerce product classification with deep learning Christopher Bonnett Ph.D
 
LogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeLogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeJames Turnbull
 
Down and dirty with Elasticsearch
Down and dirty with ElasticsearchDown and dirty with Elasticsearch
Down and dirty with Elasticsearchclintongormley
 
Machine Learning with Applications in Categorization, Popularity and Sequence...
Machine Learning with Applications in Categorization, Popularity and Sequence...Machine Learning with Applications in Categorization, Popularity and Sequence...
Machine Learning with Applications in Categorization, Popularity and Sequence...Nicolas Nicolov
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in NetflixDanny Yuan
 
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)Nederlandstalige Zabbix Gebruikersgroep
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 

Destacado (11)

The Automation Factory
The Automation FactoryThe Automation Factory
The Automation Factory
 
Applying machine learning to product categorization
Applying machine learning to product categorizationApplying machine learning to product categorization
Applying machine learning to product categorization
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
 
E-commerce product classification with deep learning
E-commerce product classification with deep learning E-commerce product classification with deep learning
E-commerce product classification with deep learning
 
LogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeLogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesome
 
Down and dirty with Elasticsearch
Down and dirty with ElasticsearchDown and dirty with Elasticsearch
Down and dirty with Elasticsearch
 
Machine Learning with Applications in Categorization, Popularity and Sequence...
Machine Learning with Applications in Categorization, Popularity and Sequence...Machine Learning with Applications in Categorization, Popularity and Sequence...
Machine Learning with Applications in Categorization, Popularity and Sequence...
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
 
Cassandra+Hadoop
Cassandra+HadoopCassandra+Hadoop
Cassandra+Hadoop
 
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)
Monitoring the ELK stack using Zabbix and Grafana (Dennis Kanbier / 26-11-2015)
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 

Similar a SF ElasticSearch Meetup 2013.04.06 - Monitoring

SF ElasticSearch Meetup 2012.10.03
SF ElasticSearch Meetup 2012.10.03SF ElasticSearch Meetup 2012.10.03
SF ElasticSearch Meetup 2012.10.03Sushant Shankar
 
Share point 2013 enterprise search (public)
Share point 2013 enterprise search (public)Share point 2013 enterprise search (public)
Share point 2013 enterprise search (public)Petter Skodvin-Hvammen
 
Elasticsearch meetup final_2014_04
Elasticsearch meetup final_2014_04Elasticsearch meetup final_2014_04
Elasticsearch meetup final_2014_04marc_harrison
 
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014clairvoyantllc
 
Log analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaLog analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaAvinash Ramineni
 
SharePoint 2013 Search Architecture with Russ Houberg
SharePoint 2013  Search Architecture with Russ HoubergSharePoint 2013  Search Architecture with Russ Houberg
SharePoint 2013 Search Architecture with Russ Houbergknowledgelakemarketing
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018Apache MXNet
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastoreTomas Sirny
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherObjectRocket
 
Log Analytics with Amazon Elasticsearch Service & Kibana
Log Analytics with Amazon Elasticsearch Service & KibanaLog Analytics with Amazon Elasticsearch Service & Kibana
Log Analytics with Amazon Elasticsearch Service & KibanaAmazon Web Services
 
Web search engines and search technology
Web search engines and search technologyWeb search engines and search technology
Web search engines and search technologyStefanos Anastasiadis
 
SharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 PerformanceSharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 PerformanceBrian Culver
 
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)Jeremy Zawodny
 
Real-time Data Exploration and Analytics with Amazon Elasticsearch Service
Real-time Data Exploration and Analytics with Amazon Elasticsearch ServiceReal-time Data Exploration and Analytics with Amazon Elasticsearch Service
Real-time Data Exploration and Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Thing you didn't know you could do in Spark
Thing you didn't know you could do in SparkThing you didn't know you could do in Spark
Thing you didn't know you could do in SparkSnappyData
 

Similar a SF ElasticSearch Meetup 2013.04.06 - Monitoring (20)

SF ElasticSearch Meetup 2012.10.03
SF ElasticSearch Meetup 2012.10.03SF ElasticSearch Meetup 2012.10.03
SF ElasticSearch Meetup 2012.10.03
 
Share point 2013 enterprise search (public)
Share point 2013 enterprise search (public)Share point 2013 enterprise search (public)
Share point 2013 enterprise search (public)
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Elasticsearch meetup final_2014_04
Elasticsearch meetup final_2014_04Elasticsearch meetup final_2014_04
Elasticsearch meetup final_2014_04
 
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014
Log analysis using Logstash,ElasticSearch and Kibana - Desert Code Camp 2014
 
Log analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaLog analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and Kibana
 
SharePoint 2013 Search Architecture with Russ Houberg
SharePoint 2013  Search Architecture with Russ HoubergSharePoint 2013  Search Architecture with Russ Houberg
SharePoint 2013 Search Architecture with Russ Houberg
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastore
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better Together
 
Log Analytics with Amazon Elasticsearch Service & Kibana
Log Analytics with Amazon Elasticsearch Service & KibanaLog Analytics with Amazon Elasticsearch Service & Kibana
Log Analytics with Amazon Elasticsearch Service & Kibana
 
Web search engines and search technology
Web search engines and search technologyWeb search engines and search technology
Web search engines and search technology
 
Traitement d'événements
Traitement d'événementsTraitement d'événements
Traitement d'événements
 
SharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 PerformanceSharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 Performance
 
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
 
AzureSynapse.pptx
AzureSynapse.pptxAzureSynapse.pptx
AzureSynapse.pptx
 
Real-time Data Exploration and Analytics with Amazon Elasticsearch Service
Real-time Data Exploration and Analytics with Amazon Elasticsearch ServiceReal-time Data Exploration and Analytics with Amazon Elasticsearch Service
Real-time Data Exploration and Analytics with Amazon Elasticsearch Service
 
Thing you didn't know you could do in Spark
Thing you didn't know you could do in SparkThing you didn't know you could do in Spark
Thing you didn't know you could do in Spark
 

Último

4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sectoritnewsafrica
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...amber724300
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 

Último (20)

4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 

SF ElasticSearch Meetup 2013.04.06 - Monitoring

  • 1. Monitoring tools for ElasticSearch SF Meetup 2013.03.06 Sushant Shankar Shyam Kuttikkad
  • 2. • Why and how we use ElasticSearch • Monitoring – Tools – Index Building – Query Performance
  • 3. Who is asdfas • Social Sharing and Content Discovery platform – We help >600,000 publishers with content distribution, user engagement, and advertising monetization – 450 Fortune 1000 brand marketers leverage our unique social signals to deliver impactful advertising • We develop Machine Learning algorithms operating on Big Data to: – Provide content sharing insights to Publishers – Build customized audience segments for advertising campaigns – Extract actionable insights out of social and interest data www.33Across.com www.tynt.com
  • 4. Data firehose of 30B monthly events, 1.25B cookies - Interaction with web content - Shares – images, copies - Searches Build, understand, analyze Real-time view ElasticSearch! Social Audiences Behavior Context Knowledge
  • 5. Production ElasticSearch cluster Hardware 6 nodes, 24GB RAM 16GB for ES service 4 cores 3x 1.5TB drive Index Build index >1TB/index using MR job (replicated) and Bulk API ~300M documents ~5KB / document ~3 hours
  • 6. System monitoring using Zabbix Index Build
  • 7. ElasticSearch specific monitoring using SPM Scalable Performance Monitoring (http://sematext.com/spm/index.html) • Index stats – Total/Refreshed/Merged documents • Shards – Total/Active/Relocating/Initializing • Search - Request rate and latency • Cache – {Filter, field} cache {count, evictions, size} • Machine – CPU, Memory, JVM, GC, Network, Disk
  • 8. Index Building Optimization using Zabbix and SPM Amount bulk indexed Time taken CPU util. Mem util. Disk I/O Network # Shards
  • 11. Index Building: Learnings • 2 shards / CPU • 10,000 documents (users) per indexing request • Bulk API for our use case • No replicas • Refresh off (index.refresh_interval = -1)
  • 12. Query Performance: Learnings • 1-2 Replicas (and for reliability) • Turn refresh on again (5s default) • Warm up effect (Index Warm up API 0.20+) • Optimize API • Simulate multiple users
  • 14. Sushant Shankar sushant.shankar@33across.com Shyam Kuttikkad shyam.kuttikkad@33across.com
  • 15. Why we really need a search engine Batch! Good for complicated tasks (Machine Learning, Graph Algorithms, etc.) … …
  • 16. Warm Up: load into memory and cache
  • 17. Other cool features • Custom Scoring functions • Scripts – MVEL, Python • Facets • Exploring: • Real-time indexing • Indexing images, files, etc. • Parent-child relationships

Notas del editor

  1. http://www.zabbix.com/ - ‘’Enterprise class monitoring solution for everyone’
  2. http://www.zabbix.com/ - ‘’Enterprise class monitoring solution for everyone’
  3. Collect information over 1B users internationally – text copied from over 600K publisher sites, images, searches, pages visitedDifferent slices of data – now!