Scaling Elasticsearch on Kubernetes
By Ryan Staatz
Fast multi-cloud logging
What is Elasticsearch (ES) and why would I use it?
In brief:
● Elasticsearch is a distributed full-text search engine that is queryable via a JSON API
● It’s the ‘E’ in the popular ELK stack and allows easy searching of unstructured data
● Native distributed clustering support makes adding Elasticsearch nodes easy
● You’ve been watching the Elasticsearch hype train and want to hop aboard
Presentation by Ryan Staatz
What is Kubernetes (k8s) and why would I run ES on it?
In brief:
● Kubernetes is an open-source container orchestration platform originally developed by Google
● Scheduling & distributing application workloads onto hardware resources is automatic
● Configuration as code & static Docker images enforce consistent pod behaviors
● You’ve been watching the Kubernetes hype ship and want to hop aboard
At LogDNA we run ES on k8s at scale
Both cloud and on-prem!
● We needed a consistent way to deploy our software across varying infrastructures
● We have made a number of custom modifications to Elasticsearch interfaces
● We run in-house versions of the L (Logstash) and K (Kibana) of the ELK stack
● Kubernetes enables easier automation for versioning, CI/CD, and maintenance
So managing ES with Kubernetes should be easy, right?
These are some of the steps involved in running ES on Kubernetes:
● Choose the appropriate Elasticsearch version and select the correct settings (there are hundreds of settings)
● Learn the expansive query language for Elasticsearch and integrate it into your workflows
● Set up a Kubernetes environment with access to appropriately sized hardware
● Configure the Elasticsearch k8s workload to request the appropriate resources, including disks
● Ensure the correct index templates and cluster settings are applied after launching your ES cluster
● Create k8s services such that Elasticsearch pods can find each other
● Troubleshoot all remaining issues as they arise and continue to manage and scale the cluster
Sounds great, let’s get started!
Getting started
Maybe let’s just start with some sane defaults
● ES version 5.5 & Kubernetes cluster v1.11+ (for preemption)
● Hardware resources (k8s nodes) with at least 64 GB of RAM and 16 vCPUs (depends on your volume)
● StatefulSet and Service YAML configurations (we need identity, disks, and networking)
● Basic but important cluster settings & a good starter index template
● Deploy an ES cluster management GUI (Cerebro) to help with troubleshooting
A tale of too (many) yamls
There’s going to be a lot of these, but configuration as code is good!
● Two ConfigMaps:
○ The Elasticsearch configuration file
○ A start script used to configure ulimits, permissions, and JVM heap size
● Three ES role types (StatefulSets)
○ Master - handles lightweight cluster-wide actions (does not require disk)
○ Hot - handles incoming writes to active indices (higher CPU-to-disk ratio)
○ Cold - stores and queries older indices (lower CPU-to-disk ratio)
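The slides do not show the start script itself. As a rough sketch of the heap-sizing step only, the Python below applies the commonly cited Elasticsearch guidance of giving the JVM about half of the available memory while staying under roughly 31 GB. The function names and the exact cap are assumptions for illustration, not LogDNA's script.

```python
# Sketch of the heap-sizing step a start script might perform.
# Assumption: heap = half of the container memory, capped near 31 GB
# (general Elasticsearch guidance; the deck does not state its own rule).

def jvm_heap_mb(container_memory_mb: int, cap_mb: int = 31 * 1024) -> int:
    """Return a JVM heap size in MB for a container with the given memory."""
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    return min(container_memory_mb // 2, cap_mb)


def es_java_opts(container_memory_mb: int) -> str:
    """Build a value for ES_JAVA_OPTS with equal min and max heap."""
    heap = jvm_heap_mb(container_memory_mb)
    return f"-Xms{heap}m -Xmx{heap}m"


if __name__ == "__main__":
    # A 64 GB node, as mentioned in the "Getting started" slide.
    print(es_java_opts(64 * 1024))
```

Setting the minimum and maximum heap to the same value avoids heap resizing at runtime, which is why both flags are derived from one number.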
Important ES configuration notes
Pro tip: this slide contains several pro tips
● Use the Alpine flavor of the ES image to reduce image size: elasticsearch:5.5.2-alpine
● Configure volumeClaimTemplates to dynamically provision disks
● Ensure the correct security context settings are specified in each StatefulSet
● Use k8s preemption to ensure your ES pods get scheduling priority
● Create a startup script to set the correct configuration prior to starting the JVM
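To make these notes concrete, here is a minimal sketch that builds a StatefulSet manifest as a Python dictionary and prints it as JSON (which `kubectl apply -f` also accepts). The field names are standard Kubernetes ones; the specific security context values, label names, mount path, and priority class name are assumptions chosen for illustration, since the deck does not list them.

```python
import json

# Sketch of an ES data-node StatefulSet. Values marked "assumption" are
# illustrative and are not taken from the slides.

def es_statefulset(name: str, replicas: int, storage: str, priority_class: str) -> dict:
    """Return a StatefulSet manifest for one ES role (e.g. hot or cold)."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    # Pods in this class can preempt lower-priority pods.
                    "priorityClassName": priority_class,
                    # Assumption: data volume group ownership for the ES user.
                    "securityContext": {"fsGroup": 1000},
                    "containers": [{
                        "name": "elasticsearch",
                        "image": "elasticsearch:5.5.2-alpine",
                        # Assumption: capabilities needed for memory locking
                        # and raising ulimits from the start script.
                        "securityContext": {
                            "capabilities": {"add": ["IPC_LOCK", "SYS_RESOURCE"]}
                        },
                        "volumeMounts": [{
                            "name": "data",
                            "mountPath": "/usr/share/elasticsearch/data",
                        }],
                    }],
                },
            },
            # One PersistentVolumeClaim is provisioned per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(es_statefulset("es-hot", 3, "500Gi", "es-high-priority"), indent=2))
```

The `priorityClassName` only has an effect if a PriorityClass object with that name already exists in the cluster.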
Service discovery
Leverage Kubernetes’ native services
● ES hot and cold nodes have a single load-balanced ClusterIP service endpoint for insertions
● ES masters have 2 services
○ 1 load-balanced ClusterIP service for transport (9300) and HTTP API requests (9200)
○ 1 headless service (clusterIP: None) used for ES unicast discovery
● 2 important settings for the clusterIP: None service
○ Ensure DNS records are published immediately (do not wait for pod readiness)
○ No sessionAffinity ensures up-to-date addresses
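The two master services can be sketched as manifests built in Python. This is an interpretation of the slide rather than LogDNA's actual YAML: it assumes that publishing DNS immediately maps to the Service field `publishNotReadyAddresses`, and the service and label names are hypothetical.

```python
import json

# Sketch of the two services in front of the ES masters.
# Assumption: "DNS publishable immediately" is implemented with
# publishNotReadyAddresses; names and labels are hypothetical.

def master_api_service(name: str, app_label: str) -> dict:
    """Load-balanced ClusterIP service for HTTP (9200) and transport (9300)."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": app_label},
            "ports": [
                {"name": "http", "port": 9200},
                {"name": "transport", "port": 9300},
            ],
        },
    }


def master_discovery_service(name: str, app_label: str) -> dict:
    """Headless service whose DNS name resolves to the individual master pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",               # headless: no virtual IP
            "publishNotReadyAddresses": True,  # list pods before they are ready
            "sessionAffinity": "None",
            "selector": {"app": app_label},
            "ports": [{"name": "transport", "port": 9300}],
        },
    }


if __name__ == "__main__":
    for svc in (master_api_service("es-master", "es-master"),
                master_discovery_service("es-master-discovery", "es-master")):
        print(json.dumps(svc, indent=2))
```

Publishing not-ready addresses matters during bootstrap: masters need to find each other through DNS before any of them can pass a readiness check.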
ES startup settings
Just the ones we use
● Ensure memory_lock is on
● Adjust the min master nodes based on the total number of masters you have
● The clusterIP: None service from the last slide is referenced by unicast settings
● Set the correct ES role
● Specify the number of cores
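As a hedged illustration of these settings, the sketch below assembles them for ES 5.x using the setting names from the Elasticsearch 5.x documentation. The quorum rule (half the masters plus one) is the standard recommendation for `discovery.zen.minimum_master_nodes`; the `node.attr.box_type` attribute for hot and cold nodes and the host name are assumptions, since the slides do not say how roles are tagged.

```python
# Sketch of per-role ES 5.x startup settings.
# Assumptions: node.attr.box_type is a custom attribute name chosen here,
# and the discovery host is a hypothetical headless service name.

ROLE_FLAGS = {
    "master": {"node.master": True, "node.data": False},
    "hot": {"node.master": False, "node.data": True},
    "cold": {"node.master": False, "node.data": True},
}


def minimum_master_nodes(master_count: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_count < 1:
        raise ValueError("master_count must be at least 1")
    return master_count // 2 + 1


def es_settings(role: str, master_count: int, discovery_host: str, cores: int) -> dict:
    """Return elasticsearch.yml style settings for one node role."""
    if role not in ROLE_FLAGS:
        raise ValueError(f"unknown role: {role}")
    settings = {
        "bootstrap.memory_lock": True,
        "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_count),
        "discovery.zen.ping.unicast.hosts": [discovery_host],
        "processors": cores,
    }
    settings.update(ROLE_FLAGS[role])
    if role in ("hot", "cold"):
        # Custom attribute so indices can be routed to hot or cold nodes.
        settings["node.attr.box_type"] = role
    return settings


if __name__ == "__main__":
    for key, value in es_settings("hot", 3, "es-master-discovery", 16).items():
        print(f"{key}: {value}")
```

With three masters the quorum is two, so the cluster keeps electing a master after losing one node but cannot split into two independent halves.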
Configuring an index template
Index templates can have a huge impact on your cluster performance
● Configure index.routing.allocation.total_shards_per_node based on your expected load
○ Optimizing shards can increase performance and reduce cluster state overhead
● Set a refresh_interval that works for you
○ Higher refresh intervals offer better throughput at the cost of search latency for new data
○ We typically use 15-30 seconds
● Change index.translog.durability to async (allow asynchronous translog writes)
○ We regret not discovering this setting sooner, as it gave us a 5-10x increase in performance
○ The trade-off is that the most recent writes can be lost if a node crashes before the translog is synced
● Note: index templates MUST be applied AFTER the ES masters are already running
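A starter template along these lines might look like the following sketch, written with full ES 5.x setting names. The index pattern, template name, and the shard limit are placeholders rather than values from the talk. In ES 5.x the pattern is given under the `template` key, and the body would be sent with `PUT /_template/<name>`.

```python
import json

# Sketch of an ES 5.x index template body with the three settings above.
# Assumptions: the "logs-*" pattern and the shard limit are placeholders.

def starter_index_template(pattern: str, shards_per_node: int,
                           refresh_interval: str = "30s") -> dict:
    """Return a template body suitable for PUT /_template/<name> on ES 5.x."""
    return {
        "template": pattern,
        "settings": {
            # Cap on how many shards of one index may live on a single node.
            "index.routing.allocation.total_shards_per_node": shards_per_node,
            # How often newly indexed documents become searchable.
            "index.refresh_interval": refresh_interval,
            # Sync the translog in the background instead of on every request.
            "index.translog.durability": "async",
        },
    }


if __name__ == "__main__":
    print(json.dumps(starter_index_template("logs-*", 2), indent=2))
```

Templates only apply to indices created after the template is stored, so existing indices keep their old settings until they are updated or rolled over.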
Manage ES the GUI way: Cerebro
Previously kopf, if you’re using ES v2.x or lower
● Cerebro connects to your ES service endpoint(s)
● Shows a list of ES nodes/pods and their health stats
● View indices and shards across the available data nodes
● Modify index settings, templates, and data
● Move shards around (important)
● Not all options are accessible via Cerebro
Manage ES the API way
A bit more work to start with, but automation is much easier
● We use Insomnia (a REST API GUI that lets us share API calls)
● curl works too!
● API calls we commonly use:
○ /_cluster/health
○ /_cat/pending_tasks?v
○ /_flush?force & /_cluster/reroute?retry_failed=true
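For automation, the same calls can be scripted. The sketch below uses only the Python standard library; the base URL is a hypothetical in-cluster service address, and the HTTP methods are the ones these endpoints normally take (GET for the read-only calls, POST for flush and reroute).

```python
from urllib.request import Request, urlopen

# Sketch of scripting the commonly used API calls.
# Assumption: ES is reachable at a hypothetical base URL such as
# http://es-master:9200 with no authentication.

COMMON_CALLS = [
    ("GET", "/_cluster/health"),
    ("GET", "/_cat/pending_tasks?v"),
    ("POST", "/_flush?force"),
    ("POST", "/_cluster/reroute?retry_failed=true"),
]


def build_request(base_url: str, method: str, path: str) -> Request:
    """Build (but do not send) an HTTP request for one ES API call."""
    return Request(base_url.rstrip("/") + path, method=method)


def call(base_url: str, method: str, path: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urlopen(build_request(base_url, method, path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Print the calls without touching the network; swap in call() to run them.
    for method, path in COMMON_CALLS:
        req = build_request("http://es-master:9200", method, path)
        print(req.get_method(), req.full_url)
```

Retrying failed shard allocations with `retry_failed=true` is useful after fixing the underlying cause, because ES stops retrying an allocation on its own after repeated failures.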
Wrap up
That was a lot of info, but here’s what to walk away with:
● ES requires some coaxing to run properly inside a container
○ Use the correct security context, ulimit, and vm settings
● There are native concepts in Kubernetes that can make running ES easier
○ Service discovery, volumeClaimTemplates, preemption, and more
○ ...or you could just use an operator! (your mileage may vary)
● Index templates have a big impact on how well your ES cluster runs
● GUIs (Cerebro) and ES APIs are extremely useful for tuning performance
Fast Multi-Cloud Logging
Visit Booth #215
ryan@logdna.com

How LogDNA Scaled Elasticsearch on Kubernetes

  • 1. Scaling Elasticsearch on Kubernetes By Ryan Staatz
  • 2. Fast multi-cloud logging What is Elasticsearch (ES) and why would I use it? ● Elasticsearch is a distributed full-text search engine that is queryable via a JSON API ● It’s the ‘E’ in the popular ELK stack and allows easy searching of unstructured data ● Native distributed clustering support makes adding Elasticsearch nodes easy ● You’ve been watching the Elasticsearch hype train and want to hop aboard In brief: Presentation by Ryan Staatz
  • 3. Fast multi-cloud logging What is Kubernetes (k8s) and why would I run ES on it? ● Kubernetes is an open-source container orchestration platform developed by Google ● Scheduling & distributing application workloads onto hardware resources is automatic ● Configuration as code & static docker images enforce consistent pod behaviors ● You’ve been watching the Kubernetes hype train ship and want to hop aboard In brief: Presentation by Ryan Staatz
  • 4. Fast multi-cloud logging At LogDNA we run ES on k8s at scale ● We needed a consistent way to deploy our software across varying infrastructures ● There are a number of custom modifications we have made to Elasticsearch interfaces ● We run in-house versions of the L (Logstash) and K (Kibana) of the ELK stack ● Kubernetes enables easier automation for versioning, CI/CD, and maintenance Both cloud and on-prem! Presentation by Ryan Staatz
  • 5. So managing ES with Kubernetes should be easy, right?
    These are some of the steps involved in running ES on Kubernetes:
    ● Choose the appropriate Elasticsearch version and select the correct settings (there are hundreds of settings)
    ● Learn the expansive query language for Elasticsearch and integrate it into your workflows
    ● Set up a Kubernetes environment with access to appropriately sized hardware
    ● Configure the Elasticsearch k8s workload to request the appropriate resources, including disks
    ● Ensure the correct index templates and cluster settings are applied after launching your ES cluster
    ● Create k8s services such that Elasticsearch pods can find each other
    ● Troubleshoot all remaining issues as they arise and continue to manage and scale the cluster
    Sounds great, let’s get started!
  • 6. Getting started
    Maybe let’s just start with some sane defaults
    ● ES version 5.5 & Kubernetes cluster v1.11+ (for preemption)
    ● Hardware resources (k8s nodes) with at least 64 GB of RAM and 16 vCPUs (depends on your volume)
    ● Statefulsets and Services yaml configurations (we need identity, disks, and networking)
    ● Basic, but important cluster settings & a good starter index template
    ● Deploy an ES cluster management GUI (cerebro) to help with troubleshooting
  • 7. A tale of too (many) yamls
    There’s going to be a lot of these, but configuration as code is good!
    ● Two ConfigMaps:
      ○ The elasticsearch configuration file
      ○ A start script used to configure ulimits, permissions, and JVM heap size
    ● Three ES role types (statefulsets)
      ○ Master - handles lightweight cluster-wide actions (does not require disk)
      ○ Hot - handles incoming writes to active indices (higher cpu to disk ratio)
      ○ Cold - stores and queries older indices (lower cpu to disk ratio)
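
The deck does not show the start script itself. As a rough sketch of the heap-size step only, the helper below derives an `ES_JAVA_OPTS` value from the memory available to the pod. The function name, the 50% fraction, and the 31 GB cap are assumptions for illustration (a common rule of thumb is to give the heap about half of RAM and stay below roughly 32 GB), not values taken from the presentation.

```python
def es_java_opts(container_memory_mb: int,
                 heap_fraction: float = 0.5,
                 max_heap_mb: int = 31 * 1024) -> str:
    """Return an ES_JAVA_OPTS string with equal -Xms and -Xmx values.

    heap_fraction and max_heap_mb are illustrative defaults (assumptions),
    not settings quoted from the slides.
    """
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    # Take a fraction of the available memory, but never exceed the cap.
    heap_mb = min(int(container_memory_mb * heap_fraction), max_heap_mb)
    # Min and max heap are set to the same value so the heap never resizes.
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"


if __name__ == "__main__":
    # A node with 64 GB of RAM, the minimum size mentioned in the deck.
    print(es_java_opts(64 * 1024))
```

A start script could export the returned string as `ES_JAVA_OPTS` before launching Elasticsearch; the ulimit and permission steps are not covered here.
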
  • 8. Important ES configuration notes
    Pro tip: this slide contains several pro tips
    ● Use the alpine flavor of ES to reduce image size: elasticsearch:5.5.2-alpine
    ● Configure volumeClaimTemplates to dynamically provision disks
    ● Ensure the correct security context settings are specified in each statefulset
    ● Use k8s pre-emption to ensure your ES pods get scheduling priority
    ● Create a startup script to set the correct configuration prior to starting the JVM
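
To show how these notes might fit together in one manifest, here is a minimal sketch that builds a StatefulSet as a Python dict and prints it as JSON, which `kubectl apply -f` also accepts. Only the image tag comes from the slides. The resource names, labels, the `es-priority` PriorityClass, the `fsGroup` value, and the `IPC_LOCK` capability are all assumptions for illustration, and the PriorityClass would have to be created separately.

```python
import json
from typing import Optional


def es_statefulset(role: str, replicas: int, disk_gb: Optional[int] = None) -> dict:
    """Build a StatefulSet manifest for one ES role (master, hot, or cold).

    Pass disk_gb=None for roles that do not need a disk (the masters).
    """
    name = f"es-{role}"  # hypothetical naming scheme
    labels = {"app": "elasticsearch", "role": role}
    container = {
        "name": "elasticsearch",
        "image": "elasticsearch:5.5.2-alpine",  # image named in the slides
        # Assumption: IPC_LOCK lets the JVM lock its heap in memory.
        "securityContext": {"capabilities": {"add": ["IPC_LOCK"]}},
        "ports": [
            {"name": "http", "containerPort": 9200},
            {"name": "transport", "containerPort": 9300},
        ],
    }
    spec = {
        "serviceName": name,
        "replicas": replicas,
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                # Assumption: a PriorityClass with this name already exists.
                "priorityClassName": "es-priority",
                # Assumption: group ID that should own the data volume.
                "securityContext": {"fsGroup": 1000},
                "containers": [container],
            },
        },
    }
    if disk_gb is not None:
        # One PersistentVolumeClaim is provisioned per pod from this template.
        container["volumeMounts"] = [
            {"name": "data", "mountPath": "/usr/share/elasticsearch/data"}
        ]
        spec["volumeClaimTemplates"] = [{
            "metadata": {"name": "data"},
            "spec": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": f"{disk_gb}Gi"}},
            },
        }]
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": spec,
    }


if __name__ == "__main__":
    print(json.dumps(es_statefulset("hot", replicas=3, disk_gb=500), indent=2))
```

Generating the three role manifests from one function keeps them consistent while letting replica counts and disk sizes differ per role.
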
  • 9. Service discovery
    Leverage Kubernetes’ native services
    ● ES hot and cold have a single load balanced cluster IP service endpoint for insertions
    ● ES masters have 2 services
      ○ 1 load-balanced cluster IP for transport (9300) and http API requests (9200)
      ○ 1 clusterIP: None used for ES unicast discovery
    ● 2 important settings for clusterIP: None
      ○ Ensure DNS is publishable immediately
      ○ No sessionAffinity ensures up-to-date addresses
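
The two master services could be sketched as follows, again as Python dicts that serialize to JSON manifests. The service names and selector labels are hypothetical. Mapping "DNS is publishable immediately" to the Service field `publishNotReadyAddresses: true` is an interpretation of the slide, not something the deck states.

```python
import json


def es_master_services(selector: dict) -> list:
    """Return the load-balanced and the headless Service for the ES masters."""
    load_balanced = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "es-master"},  # hypothetical name
        "spec": {
            "type": "ClusterIP",
            "selector": selector,
            "ports": [
                {"name": "http", "port": 9200},
                {"name": "transport", "port": 9300},
            ],
        },
    }
    discovery = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "es-master-discovery"},  # hypothetical name
        "spec": {
            # Headless service: DNS returns the individual pod addresses.
            "clusterIP": "None",
            # Assumption: publish pod addresses before readiness checks pass,
            # so masters can find each other while the cluster is forming.
            "publishNotReadyAddresses": True,
            "sessionAffinity": "None",
            "selector": selector,
            "ports": [{"name": "transport", "port": 9300}],
        },
    }
    return [load_balanced, discovery]


if __name__ == "__main__":
    services = es_master_services({"app": "elasticsearch", "role": "master"})
    for service in services:
        print(json.dumps(service, indent=2))
```
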
  • 10. ES startup settings
    Just the ones we use
    ● Ensure memory_lock is on
    ● Adjust the min master nodes based on the total number of masters you have
    ● The clusterIP: None service from the last slide is referenced by unicast settings
    ● Set the correct ES role
    ● Specify the number of cores
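
One plausible mapping of these bullets onto Elasticsearch 5.x setting names is sketched below. The setting keys are standard 5.x names, but the helper, the `box_type` node attribute used to tell hot from cold, and the discovery hostname are assumptions for illustration. The minimum-master value uses the usual quorum rule, `(masters / 2) + 1`.

```python
import json

ROLES = ("master", "hot", "cold")


def es_startup_settings(role: str, master_count: int,
                        discovery_host: str, cores: int) -> dict:
    """Return elasticsearch.yml settings (flat dotted keys) for one node role."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    settings = {
        "bootstrap.memory_lock": True,
        # Quorum of master-eligible nodes, to avoid split brain.
        "discovery.zen.minimum_master_nodes": master_count // 2 + 1,
        # DNS name of the headless (clusterIP: None) master service.
        "discovery.zen.ping.unicast.hosts": [discovery_host],
        "node.master": role == "master",
        "node.data": role != "master",
        "processors": cores,
    }
    if role != "master":
        # Assumption: a custom node attribute distinguishes hot from cold.
        settings["node.attr.box_type"] = role
    return settings


if __name__ == "__main__":
    # "es-master-discovery" is a hypothetical service name.
    print(json.dumps(
        es_startup_settings("hot", master_count=3,
                            discovery_host="es-master-discovery", cores=16),
        indent=2))
```

With three masters the quorum is two, and with five it is three, so the value has to be updated whenever the number of masters changes.
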
  • 11. Configuring an index template
    Index templates can have a huge impact on your cluster performance
    ● Configure index.total_shards_per_node based on your expected load
      ○ Optimizing shards can increase performance and reduce cluster state overhead
    ● Set a refresh_interval that works for you
      ○ Higher refresh intervals offer better throughput performance at the cost of latency
      ○ We typically use 15-30 seconds
    ● Change translog.durability to async (allow asynchronous translog writes)
      ○ We regret not discovering this setting sooner, as it gave us 5-10x increase in performance
    ● Note: index templates MUST be applied AFTER the ES masters are already running
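
A minimal sketch of a template body carrying these three settings, in the Elasticsearch 5.x template format, might look like this. The index pattern and the numeric values are placeholders, not LogDNA's actual configuration. The full name of the shard limit setting in Elasticsearch is `index.routing.allocation.total_shards_per_node`. With `async` durability the translog is fsynced on an interval instead of on every request, so a crash can lose the most recent writes.

```python
import json


def log_index_template(pattern: str, shards_per_node: int,
                       refresh_seconds: int) -> dict:
    """Build an ES 5.x index template body with the settings from the slide."""
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    return {
        # ES 5.x uses "template" for the index name pattern.
        "template": pattern,
        "settings": {
            "index.routing.allocation.total_shards_per_node": shards_per_node,
            "index.refresh_interval": f"{refresh_seconds}s",
            "index.translog.durability": "async",
        },
    }


if __name__ == "__main__":
    # "logs-*" and the values below are placeholders for illustration.
    body = log_index_template("logs-*", shards_per_node=2, refresh_seconds=30)
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent with `PUT /_template/<name>` once the masters are up, which matches the ordering note on the slide.
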
  • 12. Manage ES the GUI way: Cerebro
    Previously kopf if you’re using ES v2.X or lower
    ● Cerebro connects to your ES service endpoint(s)
    ● Contains an ES node/pod list and their health stats
    ● View indices and shards across the available data nodes
    ● Modify index settings, templates, and data
    ● Move shards around (important)
    ● Not all options are accessible via Cerebro
  • 13. Manage ES the API way
    A bit more work to start on, but automation is much easier
    ● We use Insomnia (a REST API GUI to share API calls)
    ● Curl works too!
    ● API calls we commonly use:
      ○ /_cluster/health
      ○ /_cat/pending_tasks?v
      ○ /_flush?force & /_cluster/reroute?retry_failed=true
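
For automation, the same calls can be scripted with nothing beyond the Python standard library. The sketch below is one way to do it: the paths are the ones listed on the slide, while the helper names, the short call names, and the `es-master:9200` endpoint are hypothetical.

```python
import urllib.request

# Paths are taken from the slide; the short names are made up for this sketch.
COMMON_CALLS = {
    "health": ("GET", "/_cluster/health"),
    "pending_tasks": ("GET", "/_cat/pending_tasks?v"),
    "flush": ("POST", "/_flush?force"),
    "retry_failed": ("POST", "/_cluster/reroute?retry_failed=true"),
}


def build_request(base_url: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one of the common calls."""
    method, path = COMMON_CALLS[name]
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def call(base_url: str, name: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(base_url, name),
                                timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical in-cluster endpoint; replace with your ES service address.
    request = build_request("http://es-master:9200", "health")
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the call table easy to check without a running cluster.
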
  • 14. Wrap up
    That was a lot of info, but here’s what to walk away with:
    ● ES requires some coaxing to properly run inside a container
      ○ Use the correct security context, ulimit, and vm settings
    ● There are native concepts in Kubernetes that can make running ES easier
      ○ Service discovery, volumeClaimTemplates, pre-emption, and more
      ○ ...or you could just use an operator! (your mileage may vary)
    ● Index templates have a big impact on how well your ES cluster runs
    ● GUIs (cerebro) and ES APIs are extremely useful for tuning performance
  • 15. Fast Multi-Cloud Logging
    Visit Booth #215
    ryan@logdna.com