SlideShare a Scribd company logo
1 of 36
Download to read offline
Is your Elasticsearch Cluster
Production Ready?
Itamar Syn-Hershko
http://code972.com | @synhershko
http://BigDataBoutique.co.il
Me?
http://bdbq.co.il
What does it take?
• Cluster deployed using best
practices
• Thorough monitoring
• Inspect. Fix. Repeat.
• Good capacity planning
• Memory management
• Indexing and sharding strategy
• Security
Cluster Topology
Master-eligible
nodes (3)
Data nodes
(sizing by data)
Client nodes, aka
coordinating nodes
(scalable, sizing by
traffic)
Deployments
• Prefer immutable images & scripted deployments
• For AWS see https://github.com/synhershko/elasticsearch-
cloud-deploy/
• GCP coming soon
Backups
• Very efficient
• Very important
• Several storages supported
• To a shared file system
• HDFS
• Azure / GCP / AWS repositories via plugins
What to monitor (on the cluster, per
host)?
• CPU load
• Memory utilization
• Heap utilization
• GC time
• Disk utilization
• Disk IOPs
• Merges
• Deleted docs
• Requests per sec (indexing, search)
• Load average < number of cores
• Network in / out
• Thread pool rejections
• Number of nodes
• Cache sizes
• Cache evictions
• Cluster state / health
• Number of shards per type
X-Pack monitoring (aka Marvel)
Grafana
dashboards
• More fine-grained, cluster-wide view
• Provided with metrics polling script (Python)
https://github.com/synhershko/elasticsearch-grafana-monitoring
Monitoring Destination
• To the same cluster
• To a different cluster (Recommended)
• External systems (e.g. graphite) – only if already in org
• X-Pack subscribers can now send metrics to Elastic Cloud
Typical garbage collection sawtooth
CPU
monitoring
Correlating metrics
• Shards on the same node have issues?
• During merges?
• CPU and GC
• HTTP traffic and indexing or search operations
Threadpools & Throughput
Boosting slow operations
• Search or Indexing heavy?
• Measure operations also from applications side!
• Slow searches
• Queries need optimization
• Scoring (not using filters)
• Numeric ranges pre-5
• Scripts
• Slow indexing
• Sharding strategy
• Use bulk indexing (optimize for 10-15MB of data, regardless of
number of documents / operations)
• Slow analyzers affects both! (e.g. n-grams)
Don’t use NGrams!
• Being used for “contains” search
• You ain’t gonna need it, use WordDelimiter Token Filter instead
• Useful for fuzzy search / auto-correction
• Best used via Elasticsearch’s Suggesters
• Useful for languages without spaces, or with compound
words
• min_gram , max_gram
Caches
• Query cache
• Request cache
• Measure evictions rate & cache usage
Memory Allocation
• ES_HEAP_SIZE
• DocValues used?
• Fielddata usage
• Query cache (for queries in filter context)
• Request cache (for aggregations and count queries)
• Never over 32GB!
• Default cache sizes not always fit usage
• Set appropriate static configs in elasticsearch.yml
• At least 50% of memory to file-system cache
• Usually more
Server Sizing
• Master nodes
• 1-2 cores, 2-4 GB memory, 50% ES_HEAP_SIZE
• Data nodes
• > 4 cores, measure and preserve disk/mem ratio (can start with
1/24)
• ES_HEAP_SIZE as per previous slide
• Client nodes
• CPU and network heavy, 4GB memory should be enough for most
use cases
Index Management Patterns
• A Monolith Index
• Search façade on top of your data
• Record linkage
• Anomaly detection
• Rolling indexes (time based events)
• Centralized logging
• Auditing
• IoT
logs-2016.11.20 logs-2016.11.21 logs-2016.11.22 logs-2016.11.23logs-2016.11.19
Optimal shard size
• Few millions in document size, for search performance
• A bit more if only doing aggregations
• 5-8GB on disk max, for startup times and network
reallocation
• doc_values are enabled by default, turn off for non-aggs fields to
save space
Sharding
• Index Shards
• Resharding / auto-sharding not supported
• Index-level sharding
• Avoid using types (deprecated > 6.x)
• Multi-tenancy
• Rollover API (> 5.x)
• Cluster level
• Cluster per project
• Cross-cluster search capability
Multitenancy
• Silos – Every tenant get their own index
• Index sizes vary
• Potentially wasting resources
• Pool – All tenants are in one big index
• Sharding isn’t dynamic
• Effects on tf/idf, aggregations, throughput
• Hybrid – Big tenants in their own index, pool(s) for small
ones
Use Explicit Mapping
(aka Avoid Schemaless)
• In one of two ways:
• Disable dynamic mapping in settings (index.mapper.dynamic: false). Will
refuse indexing.
• Create catch-all dynamic template with enabled:false mapping
• Why?
• Avoids hundreds of fields by mistake
• Saves effort on indexing and disk space
• Defaults are bad anyhow, don’t rely on them
• Prefer using index templates (especially for rolling indices)
Re-balancing is your enemy
• Lock down shard rebalancing
• cluster.routing.rebalance.enable
• none
• cluster.routing.allocation.enable
• primaries
• new_primaries
• none
More safe configs
• action.disable_delete_all_indices: true
• action.auto_create_index: false
Deep paging (don’t!)
• Don’t from-size
• search_after (> 5.x)
• Scroll and sliced-scroll (> 5.x)
• Not for normal operation
Deletions
• Deletions have an overhead
• Slow searches
• Segmentation
• More work on segment merging
• Non-exact tf/idf
• Every document update is a deletion
• No need to avoid it completely, just design accordingly
Geographic Distribution
• Never with the same cluster!
• Cross-cluster search (formerly Tribe Node)
• For geographic sharding
• Different indexes in different regions
• xDCR for HA / DR
• Can be solved by infra – replicating queues (Kafka), DBs
• Solution coming in X-Pack
Your ingestion architecture?
• Favor external ingestion, relieve Elastic from that responsibility
• Upgrade Logstash to 5.x
• Consider using FileBeat instead of logstash for log-tailing
• Prefer logstash machines over ingest nodes
• Use queues (Kafka, Redis) to protect against surges
Security
Protecting your cluster
• Don’t bind to a public IP
• Use only private IP/DNSs, preferably in subnets (e.g. AWS VPC)
• network.host in elasticsearch.yml
• Proxy all client requests to ES
• Disable HTTP where not needed
• + Don’t use default ports
• Secure publicly available client nodes
• Access via VPN only
• At the very least SSL + authentication if VPN not an option
• Disable dynamic scripting (pre-5.x)
Securing Indexes and Documents
• Heavy Kibana user?
• Authentication and authorization
• Index, Document and Field level security
• Requires X-Pack Security
• Application level authentication and authorization
• Application filtering of content (fields, documents)
• Index level (e.g. index per tenant)
• Document level (using permissions)
• Inter-node comms, encryption at rest (X-Pack only)
Upcoming in ES land
• Elasticsearch 6
• Machine Learning
• Anomaly detection on time series data
• Enterprise Cloud
• Elastic Cloud deployed on-premise
• Any plugin authors in the crowd?
Elasticsearch Training
Elasticsearch for Developers &
Maintaining Elasticsearch in Production
• September (10,11,17/9)
• November (12,13,16/11)
http://bdbq.co.il/courses
Consultancy and Development services
http://bdbq.co.il/services/elasticsearch
Questions?
@synhershko on social (Twitter, github, …)
Blog at http://code972.com
Training and consultancy at
http://BigDataBoutique.co.il

More Related Content

What's hot

Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScyllaDB
 
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive Query
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive QueryInteractive ad-hoc analysis at petabyte scale with HDInsight Interactive Query
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive QueryAshish Thapliyal
 
Five essential new enhancements in azure HDnsight
Five essential new enhancements in azure HDnsightFive essential new enhancements in azure HDnsight
Five essential new enhancements in azure HDnsightAshish Thapliyal
 
Webinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataWebinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataLucidworks
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersNiko Neugebauer
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014Avinash Ramineni
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudCAMMS
 
Persistent Storage for Containerized Applications
Persistent Storage for Containerized ApplicationsPersistent Storage for Containerized Applications
Persistent Storage for Containerized ApplicationsColleen Corrice
 
Selecting the right persistent storage options for apps in containers Open So...
Selecting the right persistent storage options for apps in containers Open So...Selecting the right persistent storage options for apps in containers Open So...
Selecting the right persistent storage options for apps in containers Open So...bipin kunal
 
Elasticsearch in production Boston Meetup October 2014
Elasticsearch in production Boston Meetup October 2014Elasticsearch in production Boston Meetup October 2014
Elasticsearch in production Boston Meetup October 2014beiske
 
Azure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlAzure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlRiccardo Cappello
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Alluxio, Inc.
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
InfluxDB Internals
InfluxDB InternalsInfluxDB Internals
InfluxDB InternalsInfluxData
 
Compare DynamoDB vs. MongoDB
Compare DynamoDB vs. MongoDBCompare DynamoDB vs. MongoDB
Compare DynamoDB vs. MongoDBAmar Das
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOLradiocats
 

What's hot (20)

Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
 
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive Query
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive QueryInteractive ad-hoc analysis at petabyte scale with HDInsight Interactive Query
Interactive ad-hoc analysis at petabyte scale with HDInsight Interactive Query
 
Five essential new enhancements in azure HDnsight
Five essential new enhancements in azure HDnsightFive essential new enhancements in azure HDnsight
Five essential new enhancements in azure HDnsight
 
Webinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big DataWebinar: Solr & Fusion for Big Data
Webinar: Solr & Fusion for Big Data
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & Developers
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in Cloud
 
Drupal performance
Drupal performanceDrupal performance
Drupal performance
 
Persistent Storage for Containerized Applications
Persistent Storage for Containerized ApplicationsPersistent Storage for Containerized Applications
Persistent Storage for Containerized Applications
 
Azure CosmosDB
Azure CosmosDBAzure CosmosDB
Azure CosmosDB
 
Selecting the right persistent storage options for apps in containers Open So...
Selecting the right persistent storage options for apps in containers Open So...Selecting the right persistent storage options for apps in containers Open So...
Selecting the right persistent storage options for apps in containers Open So...
 
Elasticsearch in production Boston Meetup October 2014
Elasticsearch in production Boston Meetup October 2014Elasticsearch in production Boston Meetup October 2014
Elasticsearch in production Boston Meetup October 2014
 
Azure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlAzure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosql
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
InfluxDB Internals
InfluxDB InternalsInfluxDB Internals
InfluxDB Internals
 
NoSQL benchmarking
NoSQL benchmarkingNoSQL benchmarking
NoSQL benchmarking
 
Compare DynamoDB vs. MongoDB
Compare DynamoDB vs. MongoDBCompare DynamoDB vs. MongoDB
Compare DynamoDB vs. MongoDB
 
How and when to use NoSQL
How and when to use NoSQLHow and when to use NoSQL
How and when to use NoSQL
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOL
 

Similar to Is your Elastic Cluster Stable and Production Ready?

Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataacelyc1112009
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Hpc lunch and learn
Hpc lunch and learnHpc lunch and learn
Hpc lunch and learnJohn D Almon
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practicelarsgeorge
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyCeph Community
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldCloudera, Inc.
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectureshypertable
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inRahulBhole12
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Emprovise
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
Casual mass parallel computing
Casual mass parallel computingCasual mass parallel computing
Casual mass parallel computingaragozin
 

Similar to Is your Elastic Cluster Stable and Production Ready? (20)

Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsData
 
Elasticsearch 5.0
Elasticsearch 5.0Elasticsearch 5.0
Elasticsearch 5.0
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Hpc lunch and learn
Hpc lunch and learnHpc lunch and learn
Hpc lunch and learn
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, London
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectures
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Hazelcast 101
Hazelcast 101Hazelcast 101
Hazelcast 101
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
Casual mass parallel computing
Casual mass parallel computingCasual mass parallel computing
Casual mass parallel computing
 

More from DoiT International

Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules RestructuredDoiT International
 
GAN training with Tensorflow and Tensor Cores
GAN training with Tensorflow and Tensor CoresGAN training with Tensorflow and Tensor Cores
GAN training with Tensorflow and Tensor CoresDoiT International
 
Orchestrating Redis & K8s Operators
Orchestrating Redis & K8s OperatorsOrchestrating Redis & K8s Operators
Orchestrating Redis & K8s OperatorsDoiT International
 
K8s best practices from the field!
K8s best practices from the field!K8s best practices from the field!
K8s best practices from the field!DoiT International
 
An Open-Source Platform to Connect, Manage, and Secure Microservices
An Open-Source Platform to Connect, Manage, and Secure MicroservicesAn Open-Source Platform to Connect, Manage, and Secure Microservices
An Open-Source Platform to Connect, Manage, and Secure MicroservicesDoiT International
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
AWS Cyber Security Best Practices
AWS Cyber Security Best PracticesAWS Cyber Security Best Practices
AWS Cyber Security Best PracticesDoiT International
 
Amazon Athena Hands-On Workshop
Amazon Athena Hands-On WorkshopAmazon Athena Hands-On Workshop
Amazon Athena Hands-On WorkshopDoiT International
 
AWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesAWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesDoiT International
 
Google BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewGoogle BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewDoiT International
 
Running Production-Grade Kubernetes on AWS
Running Production-Grade Kubernetes on AWSRunning Production-Grade Kubernetes on AWS
Running Production-Grade Kubernetes on AWSDoiT International
 
Scaling Jenkins with Kubernetes by Ami Mahloof
Scaling Jenkins with Kubernetes by Ami MahloofScaling Jenkins with Kubernetes by Ami Mahloof
Scaling Jenkins with Kubernetes by Ami MahloofDoiT International
 
CI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar DemriCI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar DemriDoiT International
 
Kubernetes @ Nanit by Chen Fisher
Kubernetes @ Nanit by Chen FisherKubernetes @ Nanit by Chen Fisher
Kubernetes @ Nanit by Chen FisherDoiT International
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)DoiT International
 

More from DoiT International (19)

Terraform Modules Restructured
Terraform Modules RestructuredTerraform Modules Restructured
Terraform Modules Restructured
 
GAN training with Tensorflow and Tensor Cores
GAN training with Tensorflow and Tensor CoresGAN training with Tensorflow and Tensor Cores
GAN training with Tensorflow and Tensor Cores
 
Orchestrating Redis & K8s Operators
Orchestrating Redis & K8s OperatorsOrchestrating Redis & K8s Operators
Orchestrating Redis & K8s Operators
 
K8s best practices from the field!
K8s best practices from the field!K8s best practices from the field!
K8s best practices from the field!
 
An Open-Source Platform to Connect, Manage, and Secure Microservices
An Open-Source Platform to Connect, Manage, and Secure MicroservicesAn Open-Source Platform to Connect, Manage, and Secure Microservices
An Open-Source Platform to Connect, Manage, and Secure Microservices
 
Applying ML for Log Analysis
Applying ML for Log AnalysisApplying ML for Log Analysis
Applying ML for Log Analysis
 
GCP for AWS Professionals
GCP for AWS ProfessionalsGCP for AWS Professionals
GCP for AWS Professionals
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
 
AWS Cyber Security Best Practices
AWS Cyber Security Best PracticesAWS Cyber Security Best Practices
AWS Cyber Security Best Practices
 
Google Cloud Spanner Preview
Google Cloud Spanner PreviewGoogle Cloud Spanner Preview
Google Cloud Spanner Preview
 
Amazon Athena Hands-On Workshop
Amazon Athena Hands-On WorkshopAmazon Athena Hands-On Workshop
Amazon Athena Hands-On Workshop
 
AWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesAWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL Queries
 
Google BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewGoogle BigQuery 101 & What’s New
Google BigQuery 101 & What’s New
 
Running Production-Grade Kubernetes on AWS
Running Production-Grade Kubernetes on AWSRunning Production-Grade Kubernetes on AWS
Running Production-Grade Kubernetes on AWS
 
Scaling Jenkins with Kubernetes by Ami Mahloof
Scaling Jenkins with Kubernetes by Ami MahloofScaling Jenkins with Kubernetes by Ami Mahloof
Scaling Jenkins with Kubernetes by Ami Mahloof
 
CI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar DemriCI Implementation with Kubernetes at LivePerson by Saar Demri
CI Implementation with Kubernetes at LivePerson by Saar Demri
 
Kubernetes @ Nanit by Chen Fisher
Kubernetes @ Nanit by Chen FisherKubernetes @ Nanit by Chen Fisher
Kubernetes @ Nanit by Chen Fisher
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)
 

Recently uploaded

Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Servicegwenoracqe6
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...Diya Sharma
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445ruhi
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...tanu pandey
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Sheetaleventcompany
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Delhi Call girls
 
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663Call Girls Mumbai
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...singhpriety023
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.soniya singh
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableSeo
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 

Recently uploaded (20)

Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 

Is your Elastic Cluster Stable and Production Ready?

  • 1. Is your Elasticsearch Cluster Production Ready? Itamar Syn-Hershko http://code972.com | @synhershko http://BigDataBoutique.co.il
  • 3. What does it take? • Cluster deployed using best practices • Thorough monitoring • Inspect. Fix. Repeat. • Good capacity planning • Memory management • Indexing and sharding strategy • Security
  • 4. Cluster Topology Master-eligible nodes (3) Data nodes (sizing by data) Client nodes, aka coordinating nodes (scalable, sizing by traffic)
  • 5. Deployments • Prefer immutable images & scripted deployments • For AWS see https://github.com/synhershko/elasticsearch- cloud-deploy/ • GCP coming soon
  • 6. Backups • Very efficient • Very important • Several storages supported • To a shared file system • HDFS • Azure / GCP / AWS repositories via plugins
  • 7. What to monitor (on the cluster, per host)? • CPU load • Memory utilization • Heap utilization • GC time • Disk utilization • Disk IOPs • Merges • Deleted docs • Requests per sec (indexing, search) • Load average < number of cores • Network in / out • Thread pool rejections • Number of nodes • Cache sizes • Cache evictions • Cluster state / health • Number of shards per type
  • 9. Grafana dashboards • More fine-grained, cluster-wide view • Provided with metrics polling script (Python) https://github.com/synhershko/elasticsearch-grafana-monitoring
  • 10. Monitoring Destination • To the same cluster • To a different cluster (Recommended) • External systems (e.g. graphite) – only if already in org • X-Pack subscribers can now send metrics to Elastic Cloud
  • 13. Correlating metrics • Shards on the same node have issues? • During merges? • CPU and GC • HTTP traffic and indexing or search operations
  • 15. Boosting slow operations • Search or Indexing heavy? • Measure operations also from applications side! • Slow searches • Queries need optimization • Scoring (not using filters) • Numeric ranges pre-5 • Scripts • Slow indexing • Sharding strategy • Use bulk indexing (optimize for 10-15MB of data, regardless of number of documents / operations) • Slow analyzers affects both! (e.g. n-grams)
  • 16. Don’t use NGrams! • Being used for “contains” search • You ain’t gonna need it, use WordDelimiter Token Filter instead • Useful for fuzzy search / auto-correction • Best used via Elasticsearch’s Suggesters • Useful for languages without spaces, or with compound words • min_gram , max_gram
  • 17. Caches • Query cache • Request cache • Measure evictions rate & cache usage
  • 18. Memory Allocation • ES_HEAP_SIZE • DocValues used? • Fielddata usage • Query cache (for queries in filter context) • Request cache (for aggregations and count queries) • Never over 32GB! • Default cache sizes not always fit usage • Set appropriate static configs in elasticsearch.yml • At least 50% of memory to file-system cache • Usually more
  • 19. Server Sizing • Master nodes • 1-2 cores, 2-4 GB memory, 50% ES_HEAP_SIZE • Data nodes • > 4 cores, measure and preserve disk/mem ratio (can start with 1/24) • ES_HEAP_SIZE as per previous slide • Client nodes • CPU and network heavy, 4GB memory should be enough for most use cases
  • 20. Index Management Patterns • A Monolith Index • Search façade on top of your data • Record linkage • Anomaly detection • Rolling indexes (time based events) • Centralized logging • Auditing • IoT logs-2016.11.20 logs-2016.11.21 logs-2016.11.22 logs-2016.11.23logs-2016.11.19
  • 21. Optimal shard size • Few millions in document size, for search performance • A bit more if only doing aggregations • 5-8GB on disk max, for startup times and network reallocation • doc_values are enabled by default, turn off for non-aggs fields to save space
  • 22. Sharding • Index Shards • Resharding / auto-sharding not supported • Index-level sharding • Avoid using types (deprecated > 6.x) • Multi-tenancy • Rollover API (> 5.x) • Cluster level • Cluster per project • Cross-cluster search capability
  • 23. Multitenancy • Silos – Every tenant get their own index • Index sizes vary • Potentially wasting resources • Pool – All tenants are in one big index • Sharding isn’t dynamic • Effects on tf/idf, aggregations, throughput • Hybrid – Big tenants in their own index, pool(s) for small ones
  • 24. Use Explicit Mapping (aka Avoid Schemaless) • In one of two ways: • Disable dynamic mapping in settings (index.mapper.dynamic: false). Will refuse indexing. • Create catch-all dynamic template with enabled:false mapping • Why? • Avoids hundreds of fields by mistake • Saves effort on indexing and disk space • Defaults are bad anyhow, don’t rely on them • Prefer using index templates (especially for rolling indices)
  • 25. Re-balancing is your enemy • Lock down shard rebalancing • cluster.routing.rebalance.enable • none • cluster.routing.allocation.enable • primaries • new_primaries • none
  • 26. More safe configs • action.disable_delete_all_indices: true • action.auto_create_index: false
  • 27. Deep paging (don’t!) • Don’t from-size • search_after (> 5.x) • Scroll and sliced-scroll (> 5.x) • Not for normal operation
  • 28. Deletions • Deletions have an overhead • Slow searches • Segmentation • More work on segment merging • Non-exact tf/idf • Every document update is a deletion • No need to avoid it completely, just design accordingly
  • 29. Geographic Distribution • Never with the same cluster! • Cross-cluster search (formerly Tribe Node) • For geographic sharding • Different indexes in different regions • xDCR for HA / DR • Can be solved by infra – replicating queues (Kafka), DBs • Solution coming in X-Pack
  • 30. Your ingestion architecture? • Favor external ingestion, relieve Elastic from that responsibility • Upgrade Logstash to 5.x • Consider using FileBeat instead of logstash for log-tailing • Prefer logstash machines over ingest nodes • Use queues (Kafka, Redis) to protect against surges
  • 32. Protecting your cluster • Don’t bind to a public IP • Use only private IP/DNSs, preferably in subnets (e.g. AWS VPC) • network.host in elasticsearch.yml • Proxy all client requests to ES • Disable HTTP where not needed • + Don’t use default ports • Secure publicly available client nodes • Access via VPN only • At the very least SSL + authentication if VPN not an option • Disable dynamic scripting (pre-5.x)
  • 33. Securing Indexes and Documents • Heavy Kibana user? • Authentication and authorization • Index, Document and Field level security • Requires X-Pack Security • Application level authentication and authorization • Application filtering of content (fields, documents) • Index level (e.g. index per tenant) • Document level (using permissions) • Inter-node comms, encryption at rest (X-Pack only)
  • 34. Upcoming in ES land • Elasticsearch 6 • Machine Learning • Anomaly detection on time series data • Enterprise Cloud • Elastic Cloud deployed on-premise • Any plugin authors in the crowd?
  • 35. Elasticsearch Training Elasticsearch for Developers & Maintaining Elasticsearch in Production • September (10,11,17/9) • November (12,13,16/11) http://bdbq.co.il/courses Consultancy and Development services http://bdbq.co.il/services/elasticsearch
  • 36. Questions? @synhershko on social (Twitter, github, …) Blog at http://code972.com Training and consultancy at http://BigDataBoutique.co.il