Architecture at Scale

Elastic {ON} Tour
Architecture at Scale
Oscar Cabanillas
Solutions Architect

Warning:
We are going to move FAST!
https://ela.st/elasticon-20-architecture

Learning to Play
4
Security!
Turn your clusters into secure environments
It's free*, like the beer we'll have afterwards
At a minimum, set up authentication & TLS
*Security features are included in the Basic license as of 6.8 / 7.1

Learning to Play
5
Schema-on-Write vs. Schema-on-Read
How long will you wait to get the data you are looking for?

Learning to Play
6
Normalize your data - Elastic Common Schema (ECS)

Learning to Play
7
Shards and Replicas
Shards (per index)
• Less is more! (one 50 GB shard > five shards of 10 GB each)
• A good size range for shards is 30-80 GB
Replicas (start with N+1)
• Add more for greater query throughput & fault tolerance
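As a rough illustration of the "less is more" sizing advice, the sketch below estimates how many primary shards an index needs for a chosen target shard size. The function name `primary_shard_count` and the 50 GB default are assumptions for this example (the default is borrowed from the slide's 50 GB figure); it is a back-of-the-envelope helper, not an Elasticsearch API.

```python
import math


def primary_shard_count(index_size_gb: float, target_shard_gb: float = 50.0) -> int:
    """Estimate the primary shards needed so each holds about target_shard_gb.

    Hypothetical helper: the 50 GB default is an assumption taken from the
    slide's example, not a value Elasticsearch enforces.
    """
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so no shard exceeds the target, and never return fewer than 1.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


for size_gb in (40, 50, 400):
    print(size_gb, "GB ->", primary_shard_count(size_gb), "primary shard(s)")
```

A 50 GB index ends up with a single primary rather than five small ones, which matches the slide's preference for fewer, larger shards.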

Keeping the Beat
9
Infrastructure
• Bare metal: any old box; leverage Elastic HA / DR capabilities
• Virtualization: VM / SAN; thin provisioning
• Hyperconverged: resource contention

No Borrowed Instruments
10
Only some resources can be easily shared
[Diagram labels: Machine Learning; three nodes each labeled Data, Master, Ingest, Coord.; APM]

Learning the Scales
11
Index Definition
• Always define your own mappings
- Fields can have one or more data types; think about which ones you need (text, keyword, date, etc.)
- Not every field needs to be indexed
• Use templates to simplify and standardize index creation
• Defining aliases can come in very handy
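To make the mapping, template, and alias advice concrete, here is a minimal sketch of a request body for the legacy `PUT _template/<name>` API in 7.x typeless form. The template pattern `logs-*`, the alias `logs-current`, and every field name are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical body for: PUT _template/logs_template
# All names below (logs-*, logs-current, field names) are assumptions.
template = {
    "index_patterns": ["logs-*"],
    # Every index created from this template also gets this alias.
    "aliases": {"logs-current": {}},
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            # One field, two data types: full-text plus an exact-match sub-field.
            "message": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
            "host": {"type": "keyword"},
            # Kept in _source but not indexed, so it cannot be searched.
            "raw_payload": {"type": "keyword", "index": False},
        }
    },
}

print(json.dumps(template, indent=2))
```

The `message` field shows the "one or more data types" point, and `raw_payload` shows a field that is stored but deliberately left out of the index.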

Knowing When to Improvise
12
Automatic management of the fields in our indices
• "dynamic": true
- New fields are added to the mapping (default)
• "dynamic": false
- New fields are ignored. They cannot be searched, but they are still returned in _source; they are not added to the mapping
• "dynamic": "strict"
- If a document contains new fields, an exception is thrown and the document is rejected. New fields must be added to the mapping explicitly
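The three `dynamic` settings can be compared with a toy model. The sketch below is plain Python that imitates the behavior described above; it does not call Elasticsearch, and the function `index_document` and its simplified "mapping" dictionary are assumptions made for the example.

```python
def index_document(mapping: dict, doc: dict, dynamic) -> dict:
    """Toy model of "dynamic": true / false / "strict".

    Returns the fields of `doc` that would be searchable afterwards.
    `mapping` maps field names to a type label and is updated in place
    only when dynamic is True. This imitates the behavior; it is not
    how Elasticsearch is implemented.
    """
    new_fields = [name for name in doc if name not in mapping]
    if dynamic == "strict" and new_fields:
        # strict: the whole document is rejected.
        raise ValueError(f"strict mapping rejects new fields: {new_fields}")
    if dynamic is True:
        # true: new fields are added to the mapping automatically.
        for name in new_fields:
            mapping[name] = type(doc[name]).__name__  # stand-in for type detection
    # false: new fields stay in the source document but are not searchable.
    return {name: value for name, value in doc.items() if name in mapping}


mapping = {"user": "str"}
print(index_document(mapping, {"user": "ana", "age": 30}, False))
print(mapping)
```

With `False` the `age` field is silently left out of the searchable fields and the mapping is unchanged; passing `"strict"` for the same document raises an error instead.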

Recording the Concert
13
Monitoring
• Cluster
• Nodes
• Indices
• Kibana
• Logstash
• Beats
• APM

Now We Need Help with the Gear…
15
Reduce resource contention for discrete functions
• Data Nodes
• Master Nodes
• Coordinating Nodes
• Ingest Nodes
• Machine Learning Nodes
[Diagram: a cluster with dedicated Ingest, ML, Data, Master, and Coordinating nodes]

Organizing a Growing Band
16
Shard routing - built-in traffic cop for directing your data
• Direct data to specific nodes/hardware (Hot/Warm/Cold)
• Maintain resilience by distributing replicas
• Build custom architectures
[Diagram: a cluster with Ingest, Coordinating, ML, and Master nodes plus Hot, Warm, and Cold data nodes]
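One common way to direct data to specific hardware is shard allocation filtering: each data node is tagged with a custom attribute and the index is told which attribute value it requires. The sketch below builds the index settings body for that. The attribute name `data` and the tier values are assumptions for this example; `index.routing.allocation.require.<attribute>` is the index-level setting being illustrated.

```python
import json

# Assumption: each data node declares a custom attribute in elasticsearch.yml,
# for example:
#   node.attr.data: hot
# The attribute name "data" is hypothetical; any name works if used consistently.


def allocation_settings(tier: str) -> dict:
    """Body for PUT <index>/_settings that pins an index to one tier of nodes."""
    return {"index.routing.allocation.require.data": tier}


# Moving an index from hot to warm nodes is a settings update with this body.
print(json.dumps(allocation_settings("warm")))
```

Updating the setting on an existing index causes its shards to relocate to nodes whose attribute matches, which is the mechanism behind a Hot/Warm/Cold layout.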

Understanding the Voices
17
Map to your organization's security controls

Time to Change the Set List
18
Automate data lifecycle management with policies
• Use age or size to move data through phases:
- Hot/Warm/Cold
- Frozen indices
• Index Lifecycle Management
• Snapshot Lifecycle Management
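A lifecycle policy of this kind can be sketched as the request body for `PUT _ilm/policy/<name>`. The thresholds below (50 GB, 7, 30, and 90 days) and the node attribute `data` are illustrative assumptions, not recommendations from the talk.

```python
import json

# Hypothetical body for: PUT _ilm/policy/logs_policy
# All ages, sizes, and the "data" node attribute are assumed for illustration.
ilm_policy = {
    "policy": {
        "phases": {
            # Hot: keep writing until the index is big enough or old enough.
            "hot": {
                "actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}
            },
            # Warm: move to warm nodes and merge segments on the read-only index.
            "warm": {
                "min_age": "7d",
                "actions": {
                    "allocate": {"require": {"data": "warm"}},
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            # Cold: move to the cheapest hardware.
            "cold": {
                "min_age": "30d",
                "actions": {"allocate": {"require": {"data": "cold"}}},
            },
            # Delete: drop the index once retention has passed.
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```

The rollover action covers the "age or size" trigger, and the `min_age` values decide when an index enters each later phase.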

Organizing the Set List
19
Save space and go faster on time series data
• Save space with meta-metrics
• Faster queries
• Faster aggregations
• min, max, avg, count, sum
• cardinality, percentiles
• flexible bucketing & filtering:
◦ time
◦ histograms
◦ terms
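The idea behind a rollup can be shown without Elasticsearch: raw data points are grouped into time buckets, and only summary metrics per bucket are kept. The sketch below is a plain-Python imitation of that idea under simplified assumptions (epoch-second timestamps, one numeric value per document); `rollup_hourly` is a hypothetical helper, not the Rollup API.

```python
from collections import defaultdict


def rollup_hourly(docs):
    """Group (epoch_seconds, value) pairs into hourly buckets of summary metrics."""
    buckets = defaultdict(list)
    for timestamp, value in docs:
        # Truncate the timestamp to the start of its hour.
        buckets[timestamp - timestamp % 3600].append(value)
    return {
        hour: {
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "count": len(values),
            "avg": sum(values) / len(values),
        }
        for hour, values in sorted(buckets.items())
    }


raw = [(0, 1.0), (60, 3.0), (3599, 2.0), (3600, 10.0)]
summary = rollup_hourly(raw)
print(len(raw), "raw docs ->", len(summary), "rollup docs")
```

Queries and aggregations then run against the much smaller set of summary documents, which is where the space and speed gains come from.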

Rollups for Fast Queries on Metric Data Sets
20
Save space and go faster on time series data
                 Raw         Minute      Hour       Day
Docs             9,041,000   1,448,285   49,554     8,447
Size             2.23 GB     1.25 GB     48.40 MB   9.10 MB
Docs % change    n/a         -83.98%     -99.45%    -99.91%
Size % change    n/a         -43.68%     -97.84%    -99.59%
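The "Docs % Change" row can be reproduced from the document counts on the slide. The helper name `pct_change` is an assumption for this sketch; the counts are the ones shown above.

```python
def pct_change(raw: int, rolled: int) -> float:
    """Percentage change from the raw count to the rolled-up count."""
    return round((rolled - raw) / raw * 100, 2)


raw_docs = 9_041_000
rolled_up = {"Minute": 1_448_285, "Hour": 49_554, "Day": 8_447}

for interval, docs in rolled_up.items():
    print(f"{interval}: {pct_change(raw_docs, docs)}%")
```

The same formula applies to the size row once both sizes are expressed in the same unit.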

Achieving Excellence
21
Advanced Index Operations
• Rollover API
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/indices-rollover-index.html
• Split API
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-split-index.html
• Shrink API
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/indices-shrink-index.html
• Index Sorting
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/index-modules-index-sorting.html

Different Individual Needs
23
Multiple Use Cases and Multiple Clusters
[Diagram labels: Metrics, Security, Operational Analytics, Search, Log Analytics, Custom Apps]

Visibility Across the Whole Band
24
Cross Cluster Search
[Diagram: the Dev, Support, Billing, and Marketing teams each have their own Kibana and ES-CCS cluster, which search across the Logging, Security, Search, Metrics, and Apps Elasticsearch clusters]
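In a cross-cluster search request, remote indices are addressed as `<cluster_alias>:<index>`. The sketch below builds such a target string; the cluster aliases `logging` and `security` are hypothetical names that would have to be registered as remote clusters first, and `ccs_target` is a helper invented for this example.

```python
from typing import Sequence


def ccs_target(index_pattern: str, clusters: Sequence[str]) -> str:
    """Build the index expression for a search spanning several remote clusters."""
    return ",".join(f"{cluster}:{index_pattern}" for cluster in clusters)


# Hypothetical remote cluster aliases configured on the searching cluster.
target = ccs_target("logs-*", ["logging", "security"])
print(f"GET /{target}/_search")
```

A team's Kibana can point an index pattern at the same expression, so each team sees data from every cluster it is allowed to search.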

Every First Chair Needs a Second
25
Cross Cluster Replication
• Disaster Recovery: Pro DC (leader), DR DC (follower)
• Data Locality: Central DC, Canada DC, Singapore DC
• Central Reporting: Canada DC, Singapore DC, Central Reporting DC

More Artists, More Challenges
26
Bigger Management Concerns
• Hardware profiles
• Data lifecycle
• Upgrade policies
• Scaling
• Security integration
"I've got blisters on my fingers!"

Managing Trios, Quartets, & More
27
Orchestration across multiple clusters
[Diagram labels: Metrics, Security, Operational Analytics, Search, Log Analytics, Custom Apps]

Choosing the Right Venue
28
Self-Managed: the best software and support
• Download the software and install it in your environment
• Deploy anywhere
ECE/ECK: full orchestration
• Orchestration of the Elastic Stack. Centralized management of multiple clusters and versions.
• Deploy anywhere
ESS: hosted by Elastic
• The official fully managed solution for Elasticsearch and Kibana clusters
• Available on AWS, GCP, and Azure

Benefits of the Cloud
30
[Table: each task below is marked Do It Yourself or Done for You for Self Managed, ECE/ECK, and ESS]
• Shard Sizing & Mapping
• Hardware Provisioning
• Snapshot Repository Management * (unless you want to)
• Scaling Deployments
• Zero Downtime Upgrades
• Hot/Warm Architecture
• Shard Routing Across AZs
• Secure Nodes Communication

Learning to Play
33
Shards & replicas
Shards
• Start with 1 primary shard per index (default starting 7.0)
• How many per node?
- Max 20 Shards per GB of JVM Heap
- 30 GB Heap = MAXIMUM 600 Shards
• Add more to scale for ingest volume
• Run _forcemerge once the index becomes read only
Replicas
• Keep in mind more replicas = slower writes
• Only add more replicas if your use case is search heavy
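The per-node ceiling above is simple arithmetic: 20 shards per GB of JVM heap. The function name `max_shards_for_heap` is an assumption for this sketch; the ratio and the 30 GB example come from the slide.

```python
def max_shards_for_heap(heap_gb: float, shards_per_gb: int = 20) -> int:
    """Upper bound on shards per node using the 20-shards-per-GB-of-heap rule."""
    if heap_gb <= 0:
        raise ValueError("heap size must be positive")
    return int(heap_gb * shards_per_gb)


for heap in (8, 16, 30):
    print(f"{heap} GB heap -> at most {max_shards_for_heap(heap)} shards")
```

This is a maximum, not a target: a node can be well under this number and still be overloaded if the shards are large or busy.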

Managing Indices with Shard Splitting
34
Add index capacity after the fact
• Fewer up-front concerns about choosing the best number of shards
• Scale up based on need
• Complements the shrink API
[Diagram: an index created with number_of_shards: 1 is split repeatedly with _split, growing up to 16 shards]

Shrinking Indices
35
Consolidating for long-term retention
• Save space on old indices with long-term retention

Index Sorting
36
Faster sorted queries by optimizing on-disk layout
Optimize on-disk format for
some search use cases
Improve query
performance at the cost of
index performance
Queries can return early if
sorted the same as the
index time sort
5.x (documents stored in index order; query for top 3 player scores):
Player 1 Score: 600
Player 2 Score: 0
Player 3 Score: 200
Player 4 Score: 700
Player 5 Score: 300
Player 1907 Score: 800
…
6.x (documents stored sorted by score; query for top 3 player scores):
Player 1907 Score: 800
Player 4 Score: 700
Player 1 Score: 600
Player 5 Score: 300
Player 3 Score: 200
Player 2 Score: 0
…
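For the player-score example, index sorting is configured when the index is created, and the query must sort the same way to be able to stop early. The sketch below shows both request bodies; the index layout and the field names `player` and `score` are assumptions based on the example data.

```python
import json

# Hypothetical body for: PUT /players
# Index sorting can only be set at index creation time.
index_body = {
    "settings": {"index": {"sort.field": "score", "sort.order": "desc"}},
    "mappings": {
        "properties": {
            "player": {"type": "keyword"},
            "score": {"type": "integer"},
        }
    },
}

# Hypothetical body for: GET /players/_search
# The sort matches the index sort; not tracking total hits lets each
# segment stop after collecting its first 3 documents.
search_body = {
    "size": 3,
    "sort": [{"score": "desc"}],
    "track_total_hits": False,
}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The trade-off noted on the slide still applies: documents have to be sorted at index time, so indexing is slower in exchange for faster sorted queries.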

Index Sorting
37
Save space and execute faster on time series data

Add Dynamic Setting
38
Elasticsearch template
"dynamic": true is the default value.
curl -XPUT 'localhost:9200/_template/template_1' -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["windows-*"],
  "order": 0,
  "settings": {...},
  "mappings": {
    "docs": {
      "dynamic": true,
      "properties": {...}
    }
  }
}'

Rally
39
Elastic’s open-source benchmarking framework
https://github.com/elastic/rally

Good Foundation
40
Authentication is a must
Enable Authentication (on all nodes)
• elasticsearch.yml
xpack.security.enabled: true
• Setup Passwords
bin/elasticsearch-setup-passwords interactive
It doesn’t get easier than that!

Good Foundation
41
Encryption in flight
Enable TLS on each node
• Generate a wildcard cert
bin/elasticsearch-certutil cert --ca myCert.p12
• elasticsearch.yml
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/myCert.p12
xpack.security.transport.ssl.truststore.path: certs/myCert.p12

Good Foundation
42
Enable TLS in Kibana
• Generate a Kibana cert
➡ Set the certificate's subjectAltName to the hostname, FQDN, or IP
address, or set the CN to the hostname or FQDN
• Edit kibana.yml on the Kibana node
server.ssl.enabled: true
server.ssl.key: /path/to/your/server.key
server.ssl.certificate: /path/to/your/server.crt

Good Foundation
43
Enable TLS in Beats
• e.g. metricbeat.yml
setup.kibana.host: "http://10.0.0.11"
setup.kibana.username: "elastic"
setup.kibana.password: "Str0ngPassw0rd"
output.elasticsearch.hosts: ["http://10.0.0.10"]
output.elasticsearch.username: "elastic"
output.elasticsearch.password: "Str0ngPassw0rd"
Use these for ESS:
• cloud.id
• cloud.auth

Good Foundation
44
Enable TLS in Logstash
• logstash.conf
output {
  elasticsearch {
    hosts => ["http://10.0.0.10"]
    user => "elastic"
    password => "Str0ngPassw0rd"
  }
}
Use these for ESS:
• cloud.id
• cloud.auth
