© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Introduction to
Amazon Elasticsearch Service
Darin Briskman
AWS Technical Evangelist
briskman@amazon.com
Real-time, distributed search & analytics engine:
• Built on top of Apache Lucene
• Schema-free
• Developer-friendly RESTful API
What is the ELK stack?
Logstash – simple tool for transforming and streaming data into ES
Elasticsearch – distributed search engine based on Lucene
Kibana – easy-to-use tool for visualizing data in Elasticsearch
Elasticsearch for trends and patterns
Aggregations:
• Buckets (like GROUP BY in SQL)
• Metrics (like COUNT, SUM, MAX etc)
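The bucket/metric split above can be sketched as a search request body. This is a minimal illustration, not a definitive query: the field names `status` and `bytes` and the aggregation labels are hypothetical. The `terms` aggregation plays the role of GROUP BY, and each bucket's document count comes back alongside the nested `max` and `sum` metrics.

```python
import json


def build_aggregation_query(bucket_field, metric_field):
    """Build a search body that buckets documents by one field (like GROUP BY)
    and computes metrics inside each bucket (like MAX and SUM)."""
    return {
        "size": 0,  # ask for aggregation results only, not the matching documents
        "aggs": {
            "by_bucket": {
                "terms": {"field": bucket_field},  # one bucket per distinct value
                "aggs": {
                    "max_value": {"max": {"field": metric_field}},
                    "total_value": {"sum": {"field": metric_field}},
                },
            }
        },
    }


if __name__ == "__main__":
    # "status" and "bytes" are assumed field names, used only for illustration.
    print(json.dumps(build_aggregation_query("status", "bytes"), indent=2))
```

The resulting dictionary would be sent as the JSON body of a `_search` request against the index.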
Elasticsearch for full-text search
• Core search is provided through Apache Lucene
• Elasticsearch handles JSON structure, including nesting
• Aggregations to provide faceting++
• Supports core search features
  • Suggestions
  • Highlights
  • Boolean expressions
  • Adjustable ranking
  • Fuzzy search and more
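Several of the features listed above can be combined in one request body. The sketch below assumes a hypothetical index whose documents have `title` and `color` fields; it pairs a fuzzy full-text match with a boolean filter and asks for highlighted snippets.

```python
def build_search_query(text, required_color=None):
    """Combine a fuzzy full-text match, an optional boolean filter, and highlighting.

    The field names "title" and "color" are assumptions for illustration.
    """
    # Fuzzy matching tolerates small typos in the search text.
    must = [{"match": {"title": {"query": text, "fuzziness": "AUTO"}}}]

    # Filters restrict the result set without influencing the relevance score.
    filters = []
    if required_color is not None:
        filters.append({"term": {"color": required_color}})

    return {
        "query": {"bool": {"must": must, "filter": filters}},
        "highlight": {"fields": {"title": {}}},  # return matched fragments of "title"
    }
```

Calling `build_search_query("cellphne", required_color="black")` yields a body that still matches "cellphone" through fuzziness while restricting results to black items.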
Operating Elasticsearch is time-consuming
“Elasticsearch allows us to easily and quickly build bleeding-edge big data and analytics applications using the ELK stack. By offering direct access to the Elasticsearch API while offloading administrative tasks, Amazon Elasticsearch Service gives us the manageability, flexibility and control we need.”
Sean Curtis, SVP Engineering at Major League Baseball Advanced Engineering
Leading enterprises trust Amazon Elasticsearch
Service for their search and analytics applications
[Customer logos grouped by sector: Media & Entertainment, Online Services, Technology, Other]
Adobe Developer Platform (Adobe I/O)
Problem
• Cost-effective monitoring for a very large volume of log data
• Over 200,000 API calls per second at peak: destinations, response times, bandwidth
• Integrate seamlessly with other components of the AWS ecosystem
Solution
• Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana
• Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges
Benefits
• Management and operational simplicity
• Flexibility to try out different cluster configurations during dev and test
[Diagram labels: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service]
McGraw Hill Education
Problem
• Supporting a wide catalog across multiple services in multiple jurisdictions
• Over 100 million learning events each month
• Tests, quizzes, learning modules begun / completed / abandoned
Solution
• Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress
• Analytics of applications and infrastructure are now integrated to understand operations in real time
Benefits
• Confidence to scale throughout the school year: from 0 to 32 TB in 9 months
• Focus on their business, not their infrastructure
Elasticsearch fundamentals
Building blocks - documents
• A document is the core entity of search (file, database, log line, data structure, etc.), represented with JSON
• Documents are referenced by an index
• Documents contain field-value pairs
• Documents are distributed across shards and replicas
• You choose which fields in the document to index for search, which to store, and with each query, which to search
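As a small illustration of a document as field-value pairs, the sketch below builds a hypothetical product document (all field names and values are invented) and lists the field paths a query could target. Nested objects are addressed with dot notation.

```python
import json

# A hypothetical document: plain field-value pairs plus one nested object.
document = {
    "id": "p-1001",
    "make": "ExampleCo",
    "color": "black",
    "specs": {"screen_inches": 5.5, "storage_gb": 64},
}


def field_paths(doc, prefix=""):
    """Return the dotted path of every leaf field in a (possibly nested) document."""
    paths = []
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            paths.extend(field_paths(value, path + "."))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    print(json.dumps(document))   # the JSON that would be sent for indexing
    print(field_paths(document))
```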
Building blocks - shards
• A shard is an instance of Lucene, using compute, storage and other system resources
• Elasticsearch deploys shards elastically and dynamically to the instances in the cluster, providing the means for horizontal scale
• An index always has at least one shard and may have any number
• Each shard has exactly one primary copy and a dynamic number of replica copies
• Elasticsearch routes requests to the applicable shards
Building blocks - replicas
• A replica is a copy of a shard
• You can replicate a single shard multiple times for availability or scale
• Replicas are never allocated on the same instance as the original shard
• Replicas allow you to scale throughput as searches can leverage the replicas in parallel
• The number of replicas can be changed dynamically
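The relationship between primary shards and replicas can be sketched with the index settings involved. This is an illustration under assumptions: the helper names are hypothetical, and the dictionaries are the JSON bodies one would send when creating an index and when later updating its settings. The primary shard count is fixed when the index is created; the replica count is the part that changes dynamically.

```python
def index_settings(primary_shards, replicas):
    """Settings body for creating an index with a given shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,  # set at index creation
                "number_of_replicas": replicas,      # can be changed later
            }
        }
    }


def replica_update(replicas):
    """Settings body for changing the replica count of an existing index."""
    return {"index": {"number_of_replicas": replicas}}


def total_shard_copies(primary_shards, replicas):
    """Every primary shard has `replicas` extra copies spread over other instances."""
    return primary_shards * (1 + replicas)
```

For example, 3 primary shards with 1 replica each give 6 shard copies to place across the cluster; raising replicas to 2 gives 9.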
Building blocks - index
• An index is a collection of documents that has fields that can be searched
• You access an index through any participating instance via a RESTful interface
• The index spans across instances by leveraging shards and replicas
Building blocks - instance
• For Amazon Elasticsearch Service (AES), an instance is an Elasticsearch node installed on EC2
• An instance can work with other AES instances to facilitate distribution of work or it can stand by itself
• In a cluster, multiple instances allow you to scale the compute and storage that holds your indexes
• Instances hold shards and replicas that are referenced by the index
Index and mappings/types
Index: Product
  Type: cellphone
    Documents with fields: ID, make, color, etc.
  Type: review
    Documents with fields: ID, author, date, title, text
Documents are distributed to shards by hash
• Document IDs are hashed to pick a shard, which spreads documents evenly
• You can control this behavior with a custom routing value
[Diagram: the _bulk API writes to an index with primary shards 1-3 and replica shards 1-3]
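The hashing step can be sketched as a routing key hashed modulo the number of primary shards. This is only an illustration of the idea: `zlib.crc32` is a stand-in chosen because it ships with Python, not the hash function Elasticsearch itself uses, and `pick_shard` is a hypothetical helper name.

```python
import zlib


def pick_shard(doc_id, num_primary_shards, routing=None):
    """Map a document to a primary shard number in [0, num_primary_shards).

    By default the document ID is the routing key; passing `routing`
    overrides it, which is how related documents can share a shard.
    """
    key = routing if routing is not None else doc_id
    # crc32 is a stand-in hash for this sketch (an assumption, see above).
    return zlib.crc32(key.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    for doc_id in ["a1", "b2", "c3"]:
        print(doc_id, "->", pick_shard(doc_id, 3))
```

Because the result depends on the number of primary shards, changing that number would move documents, which is one reason the primary shard count is chosen when the index is created.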
Amazon Elasticsearch Service
Easy Elasticsearch cluster creation and scaling
Amazon ES domain overview
[Diagram labels: Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API]
• Instances under management
• Single endpoint, REST API
• IAM integration
• CloudWatch/CloudTrail for monitoring
AWS CLI commands
add-tags
create-elasticsearch-domain
delete-elasticsearch-domain
describe-elasticsearch-domain
describe-elasticsearch-domain-config
describe-elasticsearch-domains
list-domain-names
list-tags
remove-tags
update-elasticsearch-domain-config
Example:
aws es create-elasticsearch-domain --domain-name my-domain \
  --elasticsearch-cluster-config InstanceType=m4.xlarge.elasticsearch,InstanceCount=4 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=512
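The same request can be sketched from Python. The version below assumes the boto3 SDK is installed and credentials are configured; it builds the request as a plain dictionary first so the shape can be inspected without calling AWS. The helper names `domain_request` and `create_domain` are hypothetical.

```python
def domain_request(domain_name, instance_type, instance_count, volume_gb):
    """Build the keyword arguments for creating an Amazon ES domain."""
    return {
        "DomainName": domain_name,
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp2",
            "VolumeSize": volume_gb,  # size in GiB per instance
        },
    }


def create_domain(request):
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # third-party; imported lazily so building the request needs no SDK

    client = boto3.client("es")
    return client.create_elasticsearch_domain(**request)


if __name__ == "__main__":
    # Mirrors the CLI example: 4 x m4.xlarge.elasticsearch with 512 GiB gp2 volumes.
    print(domain_request("my-domain", "m4.xlarge.elasticsearch", 4, 512))
```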
Cluster reconfiguration
• Console or single API call
• You can change any parameter of the domain: instance types and counts, storage options, dedicated master configuration, zone awareness, etc.
• Amazon ES makes the requested changes non-disruptively
  • Expands the cluster with new instances
  • Elasticsearch replicates data
  • Old instances are dropped
Dedicated masters improve stability
Cluster with no dedicated masters
[Diagram: Amazon ES cluster of three instances holding primary and replica copies of shards 1, 2 and 3; Instance 1 also acts as master]
Cluster with dedicated masters
[Diagram: Amazon ES cluster with dedicated master instances and three data instances (queries and updates) holding primary and replica copies of shards 1, 2 and 3]
Adding dedicated masters improves stability by removing the master function from data instances
Amazon ES provides 1-click deployment
Master node recommendations
Number of data nodes    Master node instance type
< 10                    m3.medium+
< 20                    m4.large+
<= 50                   c4.xlarge+
50-100                  c4.2xlarge+
Always use an odd number of masters, >= 3
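The table and the odd-number rule can be expressed as a short sketch. The function names are hypothetical, and the returned instance type is the smallest one listed for each range (the "+" in the table reads as "or larger"). The quorum calculation shows why an even master count adds cost without adding fault tolerance.

```python
def recommended_master_type(data_nodes):
    """Smallest dedicated master instance type from the table for a data-node count."""
    if data_nodes < 1 or data_nodes > 100:
        raise ValueError("the table covers 1 to 100 data nodes")
    if data_nodes < 10:
        return "m3.medium"
    if data_nodes < 20:
        return "m4.large"
    if data_nodes <= 50:
        return "c4.xlarge"
    return "c4.2xlarge"


def quorum(master_count):
    """Majority of master-eligible nodes needed to elect a master."""
    return master_count // 2 + 1


def tolerated_failures(master_count):
    """How many masters can fail while a majority is still reachable."""
    return master_count - quorum(master_count)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "masters: quorum", quorum(n), "tolerates", tolerated_failures(n))
```

Three masters tolerate one failure and so do four, so the fourth master buys nothing; five masters tolerate two.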
Zone awareness for high availability
Cluster with zone awareness
[Diagram: Amazon ES cluster with four instances split across Availability Zone 1 and Availability Zone 2, holding copies of shards 1, 2 and 3]
Cluster instances are split evenly between two zones
Primary and replica shards go to different zones
Scale for your workload
Instance type recommendations
Instance    Workload
T2          Entry point. Dev and test. OK for dedicated masters.
M3, M4      Equal read and write volumes.
R3, R4      Read-heavy or workloads with high memory demands (e.g., aggregations).
C4          High concurrency/indexing workloads.
I2          Up to 1.6 TB of SSD instance storage.
Secure access to your domain
Cluster Security
AWS manages security of the cluster
You manage access to the domain via policies
Security through IAM A&AC
• IP-based policies limit access to CIDR blocks for anonymous requests
• Principal-based policies limit access to particular IAM users or roles, requiring AWS SigV4 signing
• Policies
  • Can control the HTTP methods allowed, to create differential access for e.g. queries and updates
  • Can specify resources at the index level
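The two policy styles can be sketched as small builders. This is an illustration under assumptions: the helper names, the domain ARN, the user ARN and the CIDR block in the usage example are all hypothetical placeholders. The IP-based policy allows unsigned requests from given address ranges; the principal-based policy names an IAM identity and limits it to chosen HTTP methods.

```python
import json


def ip_based_policy(domain_arn, cidr_blocks):
    """Allow anonymous read requests, but only from the given CIDR blocks."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": ["es:ESHttpGet"],
                "Resource": domain_arn + "/*",
                "Condition": {"IpAddress": {"aws:SourceIp": list(cidr_blocks)}},
            }
        ],
    }


def principal_based_policy(domain_arn, principal_arn, read_only=True):
    """Allow one IAM user or role; requests must be SigV4-signed by that identity."""
    actions = ["es:ESHttpGet"]
    if not read_only:
        # Add the write methods to separate query access from update access.
        actions += ["es:ESHttpPut", "es:ESHttpPost"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": actions,
                "Resource": domain_arn + "/*",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN and documentation-range CIDR, for illustration only.
    arn = "arn:aws:es:us-east-1:123456789012:domain/logs-domain"
    print(json.dumps(ip_based_policy(arn, ["192.0.2.0/24"]), indent=2))
```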
Example IAM Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/susan"
      },
      "Action": [
        "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
    }
  ]
}
AWS Security Responsibilities
• Creation of a service VPC that allows limited access to the cluster through a configurable access policy
• Application of security patches on the instances
• DDoS protection for the DNS name associated with the domain via Route 53
• Fronts the transport protocol with an ELB (HTTP:80)
  • Hides ports 9200 and 9300
• Built on top of AWS secure networking
Security Patches
• There is no defined maintenance window as patches are applied using blue/green methodologies
• Critical patches applied immediately
• Routine patching typically applied during customer activities such as cluster resizing or policy changes
Load data
Integration with the AWS ecosystem
[Diagram: ingestion paths into Amazon Elasticsearch Service. Labels: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose]
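For the REST path, batches of documents are usually sent through the `_bulk` API mentioned earlier. The sketch below only builds the request body, which is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline. The helper name, index name and documents are hypothetical; `doc_type` is optional here because whether a type is required depends on the Elasticsearch version in use.

```python
import json


def bulk_body(index, docs, doc_type=None):
    """Build a newline-delimited _bulk body that indexes each (doc_id, source) pair."""
    lines = []
    for doc_id, source in docs:
        action = {"_index": index, "_id": doc_id}
        if doc_type is not None:
            action["_type"] = doc_type  # only for versions that still use types
        lines.append(json.dumps({"index": action}))  # action/metadata line
        lines.append(json.dumps(source))             # document source line
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [("1", {"make": "ExampleCo", "color": "black"}),
            ("2", {"make": "ExampleCo", "color": "white"})]
    print(bulk_body("product", docs), end="")
```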
Monitor and audit
CloudWatch CloudTrail Elasticsearch APIs
Monitoring
Data Durability
• Read Replicas + Zone Awareness
• Automatic Daily snapshots
• Manual Index Snapshots
Built-in Kibana
Application overview
[Diagram labels: Application instances/Logstash forwarders, Logstash indexer, Amazon Elasticsearch Service]
Pay for what you use
Pay for compute and storage you use
With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. The AWS Free Tier is available for qualifying customers.
Amazon Elasticsearch Service Benefits
Easy to Use
• Deploy a production-ready Elasticsearch cluster in minutes
• Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring
Open
• Get direct access to the Elasticsearch open-source API
• Fully compatible with the open-source Elasticsearch API, for all code and applications
Secure
• Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies with fine-grained access control for users and endpoints
• Automatically applies security patches without disruption, keeping Elasticsearch environments secure
Available
• Provides high availability using Zone Awareness, which replicates data between two Availability Zones
• Monitors the health of clusters and automatically replaces failed instances, without service disruption
AWS Integrated
• Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion
• AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration
Scalable
• Scale clusters from a single instance up to 100 instances
• Configure clusters to meet performance requirements by selecting from a range of instance types and storage options including SSD-powered EBS volumes
Takeaways
1. Elasticsearch is a tool for full-text search, analysis, and visualization of time series data that helps you get the most out of your growing data set
2. Amazon Elasticsearch Service makes it easy to deploy and manage an Elasticsearch cluster in the AWS cloud
3. Amazon Elasticsearch Service is a drop-in replacement for your existing Elasticsearch cluster
Thank you!

Más contenido relacionado

La actualidad más candente

Intro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesIntro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesAmazon Web Services
 
Getting Started with AWS Lambda Serverless Computing
Getting Started with AWS Lambda Serverless ComputingGetting Started with AWS Lambda Serverless Computing
Getting Started with AWS Lambda Serverless ComputingAmazon Web Services
 
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon Web Services
 
Introduction to AWS Lambda and Serverless Applications
Introduction to AWS Lambda and Serverless ApplicationsIntroduction to AWS Lambda and Serverless Applications
Introduction to AWS Lambda and Serverless ApplicationsAmazon Web Services
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Amazon Web Services
 
Introduction to Amazon Relational Database Service (Amazon RDS)
Introduction to Amazon Relational Database Service (Amazon RDS)Introduction to Amazon Relational Database Service (Amazon RDS)
Introduction to Amazon Relational Database Service (Amazon RDS)Amazon Web Services
 
Amazon CloudWatch Logs and AWS Lambda: A Match Made in Heaven
Amazon CloudWatch Logs and AWS Lambda: A Match Made in HeavenAmazon CloudWatch Logs and AWS Lambda: A Match Made in Heaven
Amazon CloudWatch Logs and AWS Lambda: A Match Made in HeavenAmazon Web Services
 
Introduction to Amazon Elastic File System (EFS)
Introduction to Amazon Elastic File System (EFS)Introduction to Amazon Elastic File System (EFS)
Introduction to Amazon Elastic File System (EFS)Amazon Web Services
 
Introduction to AWS Cost Management
Introduction to AWS Cost ManagementIntroduction to AWS Cost Management
Introduction to AWS Cost ManagementAmazon Web Services
 
DAT302_Deep Dive on Amazon Relational Database Service (RDS)
DAT302_Deep Dive on Amazon Relational Database Service (RDS)DAT302_Deep Dive on Amazon Relational Database Service (RDS)
DAT302_Deep Dive on Amazon Relational Database Service (RDS)Amazon Web Services
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon Web Services
 

La actualidad más candente (20)

Intro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesIntro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute Services
 
Getting Started with AWS Lambda Serverless Computing
Getting Started with AWS Lambda Serverless ComputingGetting Started with AWS Lambda Serverless Computing
Getting Started with AWS Lambda Serverless Computing
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS SummitAmazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
Amazon RDS: Deep Dive - SRV310 - Chicago AWS Summit
 
Intro to AWS Lambda
Intro to AWS Lambda Intro to AWS Lambda
Intro to AWS Lambda
 
Introduction to AWS Lambda and Serverless Applications
Introduction to AWS Lambda and Serverless ApplicationsIntroduction to AWS Lambda and Serverless Applications
Introduction to AWS Lambda and Serverless Applications
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
 
Amazon S3 Masterclass
Amazon S3 MasterclassAmazon S3 Masterclass
Amazon S3 Masterclass
 
Introduction to Amazon Relational Database Service (Amazon RDS)
Introduction to Amazon Relational Database Service (Amazon RDS)Introduction to Amazon Relational Database Service (Amazon RDS)
Introduction to Amazon Relational Database Service (Amazon RDS)
 
AWS RDS
AWS RDSAWS RDS
AWS RDS
 
Amazon CloudWatch Logs and AWS Lambda: A Match Made in Heaven
Amazon CloudWatch Logs and AWS Lambda: A Match Made in HeavenAmazon CloudWatch Logs and AWS Lambda: A Match Made in Heaven
Amazon CloudWatch Logs and AWS Lambda: A Match Made in Heaven
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Introduction to Amazon Elastic File System (EFS)
Introduction to Amazon Elastic File System (EFS)Introduction to Amazon Elastic File System (EFS)
Introduction to Amazon Elastic File System (EFS)
 
Amazon EFS
Amazon EFSAmazon EFS
Amazon EFS
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
 
Introduction to AWS Cost Management
Introduction to AWS Cost ManagementIntroduction to AWS Cost Management
Introduction to AWS Cost Management
 
DAT302_Deep Dive on Amazon Relational Database Service (RDS)
DAT302_Deep Dive on Amazon Relational Database Service (RDS)DAT302_Deep Dive on Amazon Relational Database Service (RDS)
DAT302_Deep Dive on Amazon Relational Database Service (RDS)
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for Kubernetes
 

Similar a Introduction to Amazon Elasticsearch Service

How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017Amazon Web Services
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWSAWS Germany
 
Using Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczUsing Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczAmazon Web Services
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Amazon Web Services
 
Adding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAdding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAmazon Web Services
 
Using Search with a Database: Database Week SF
Using Search with a Database: Database Week SFUsing Search with a Database: Database Week SF
Using Search with a Database: Database Week SFAmazon Web Services
 
ABD312_Deep Dive Migrating Big Data Workloads to AWS
ABD312_Deep Dive Migrating Big Data Workloads to AWSABD312_Deep Dive Migrating Big Data Workloads to AWS
ABD312_Deep Dive Migrating Big Data Workloads to AWSAmazon Web Services
 
21st Century Analytics with Zopa
21st Century Analytics with Zopa21st Century Analytics with Zopa
21st Century Analytics with ZopaAmazon Web Services
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersAmazon Web Services
 
STG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsSTG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsAmazon Web Services
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Amazon Web Services
 
Amazon Elasticsearch and Databases
Amazon Elasticsearch and DatabasesAmazon Elasticsearch and Databases
Amazon Elasticsearch and DatabasesAmazon Web Services
 

Similar a Introduction to Amazon Elasticsearch Service (20)

Elasticsearch as a Database?
Elasticsearch as a Database?Elasticsearch as a Database?
Elasticsearch as a Database?
 
Elasticsearch as a Database?
Elasticsearch as a Database?Elasticsearch as a Database?
Elasticsearch as a Database?
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Using Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczUsing Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter Dachnowicz
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
 
Adding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAdding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San Francisco
 
Using Search with a Database: Database Week SF
Using Search with a Database: Database Week SFUsing Search with a Database: Database Week SF
Using Search with a Database: Database Week SF
 
ABD312_Deep Dive Migrating Big Data Workloads to AWS
ABD312_Deep Dive Migrating Big Data Workloads to AWSABD312_Deep Dive Migrating Big Data Workloads to AWS
ABD312_Deep Dive Migrating Big Data Workloads to AWS
 
21st Century Analytics with Zopa
21st Century Analytics with Zopa21st Century Analytics with Zopa
21st Century Analytics with Zopa
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
STG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsSTG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data Workloads
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
 
Amazon Elasticsearch and Databases
Amazon Elasticsearch and DatabasesAmazon Elasticsearch and Databases
Amazon Elasticsearch and Databases
 

Más de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Introduction to Amazon Elasticsearch Service

  • 1. Introduction to Amazon Elasticsearch Service
       Darin Briskman, AWS Technical Evangelist, briskman@amazon.com
       © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Real-time, distributed search and analytics engine:
       • Built on top of Apache Lucene
       • Schema free
       • Developer-friendly RESTful API
  • 3. What is the ELK stack?
       • Logstash: simple tool for transforming and streaming data into Elasticsearch (ES)
       • Elasticsearch: distributed search engine based on Lucene
       • Kibana: easy-to-use tool for visualizing data in Elasticsearch
  • 5. Elasticsearch for trends and patterns
       Aggregations:
       • Buckets (like GROUP BY in SQL)
       • Metrics (like COUNT, SUM, MAX, etc.)
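The bucket and metric ideas on the aggregations slide can be sketched as a search request body. This is a minimal illustration, not code from the deck: the field names `status_code` and `response_time_ms` and the helper name are hypothetical, while `terms` and `max` are standard Elasticsearch aggregation types.

```python
import json


def build_bucket_metric_query(bucket_field: str, metric_field: str) -> dict:
    """Build a search body that groups documents and computes a metric per group."""
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            "by_bucket": {
                # Bucket aggregation: one bucket per distinct value (like GROUP BY)
                "terms": {"field": bucket_field},
                "aggs": {
                    # Metric aggregation evaluated inside each bucket (like MAX)
                    "max_value": {"max": {"field": metric_field}}
                },
            }
        },
    }


# Hypothetical field names, chosen only for illustration.
body = build_bucket_metric_query("status_code", "response_time_ms")
print(json.dumps(body, indent=2))
```

Each `terms` bucket also reports a `doc_count`, which plays the role of COUNT without a separate metric.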
  • 6. Elasticsearch for full-text search
       • Core search is provided through Apache Lucene
       • Elasticsearch handles JSON structure, including nesting
       • Aggregations to provide faceting++
       • Supports core search features
           • Suggestions
           • Highlights
           • Boolean expressions
           • Adjustable ranking
           • Fuzzy search and more
  • 7. Operating Elasticsearch is time-consuming
       “Elasticsearch allows us to easily and quickly build bleeding-edge big data and analytics applications using the ELK stack. By offering direct access to the Elasticsearch API while offloading administrative tasks, Amazon Elasticsearch Service gives us the manageability, flexibility and control we need.”
       Sean Curtis, SVP Engineering at Major League Baseball Advanced Engineering
  • 8. Leading enterprises trust Amazon Elasticsearch Service for their search and analytics applications
       Customer categories shown: Media & Entertainment, Online Services, Technology, Other
  • 9. Adobe Developer Platform (Adobe I/O)
       Problem
       • Cost-effective monitoring for a very large volume of log data
       • Over 200,000 API calls per second at peak: destinations, response times, bandwidth
       • Integrate seamlessly with other components of the AWS ecosystem
       Solution
       • Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana
       • The Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges
       Benefits
       • Management and operational simplicity
       • Flexibility to try out different cluster configurations during dev and test
       Diagram labels: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service
  • 10. McGraw Hill Education
       Problem
       • Supporting a wide catalog across multiple services in multiple jurisdictions
       • Over 100 million learning events each month
       • Tests, quizzes, learning modules begun / completed / abandoned
       Solution
       • Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress
       • Analytics of applications and infrastructure are now integrated to understand operations in real time
       Benefits
       • Confidence to scale throughout the school year: from 0 to 32 TB in 9 months
       • Focus on their business, not their infrastructure
  • 11. Elasticsearch fundamentals
  • 12. Building blocks - documents
       • A document is the core entity of search (a file, database, log line, data structure, etc.), represented with JSON
       • Documents are referenced by an index
       • Documents contain field-value pairs
       • Documents are distributed across shards and replicas
       • You choose which fields in the document to index for search, which to store, and, with each query, which to search
  • 13. Building blocks - shards
       • A shard is an instance of Lucene, using compute, storage and other system resources
       • Elasticsearch deploys shards elastically and dynamically to the instances in the cluster, providing the means for horizontal scale
       • An index always has at least one shard and may have any number
       • Shards are primary (exactly one) or replica (dynamic)
       • Elasticsearch routes requests to the applicable shards
  • 14. Building blocks - replicas
       • A replica is a copy of a shard
       • You can replicate a single shard multiple times for availability or scale
       • Replicas are never allocated on the same instance as the original shard
       • Replicas allow you to scale throughput, as searches can leverage the replicas in parallel
       • The number of replicas can be changed dynamically
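The last bullet on the replicas slide, that the replica count can be changed dynamically, corresponds to an update of the index settings. The sketch below only builds the HTTP request and does not send it; the endpoint and index name are hypothetical, and a domain protected by a principal-based policy would additionally require the request to be SigV4-signed.

```python
import json
import urllib.request


def build_replica_update(endpoint: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT to <endpoint>/<index>/_settings."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    # number_of_replicas is a dynamic index setting, so no reindex is needed.
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{endpoint}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and index name, for illustration only.
req = build_replica_update("https://search-example.us-east-1.es.amazonaws.com", "logs-2017", 2)
print(req.get_method(), req.full_url)
```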
  • 15. Building blocks - index
       • An index is a collection of documents that has fields that can be searched
       • You access an index through any participating instance via a RESTful interface
       • The index spans across instances by leveraging shards and replicas
  • 16. Building blocks - instance
       • For AES, an instance is an Elasticsearch node installed on EC2
       • An instance can work with other AES instances to facilitate distribution of work, or it can stand by itself
       • In a cluster, multiple instances allow you to scale the compute and storage that holds your indexes
       • Instances hold shards and replicas that are referenced by the index
  • 17. Index and mappings/types
       Index: Product
       • Type: cellphone (each document: ID, make, color, etc.)
       • Type: review (each document: ID, author, date, title, text)
  • 18. Documents are distributed to shards randomly
       • Document IDs are hashed
       • You can control this behavior
       Diagram labels: _bulk API; Index; primary shards (Shard 1, Shard 2, Shard 3); replica shards (Shard 1, Shard 2, Shard 3)
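The hashing described on the shard-distribution slide can be sketched as "hash the routing value, then take it modulo the number of primary shards". This is a simplified stand-in, not Elasticsearch's implementation: it uses `zlib.crc32` purely for illustration (Elasticsearch uses a Murmur3 hash), and `pick_shard` is a hypothetical helper name.

```python
import zlib
from collections import Counter


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a primary shard number."""
    if num_primary_shards < 1:
        raise ValueError("an index always has at least one primary shard")
    # Stand-in hash for illustration; the real hash function differs.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# Default behavior: the document ID is the routing value, so IDs spread across shards.
spread = Counter(pick_shard(f"doc-{i}", 3) for i in range(1000))
print(dict(spread))

# "You can control this behavior": supplying the same custom routing value
# (here a hypothetical customer key) sends related documents to one shard.
same_shard = {pick_shard("customer-42", 3) for _ in range(5)}
print(same_shard)
```

Because the shard is derived from the number of primary shards, that number is fixed when the index is created, whereas replicas can be added later.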
  • 19. Amazon Elasticsearch Service
  • 20. Easy Elasticsearch cluster creation and scaling
  • 21. Amazon ES domain overview
       Diagram labels: Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API
  • 22. Amazon ES domain overview: instances under management
  • 23. Amazon ES domain overview: single endpoint, REST API
  • 24. Amazon ES domain overview: IAM integration
  • 25. Amazon ES domain overview: CloudWatch/CloudTrail for monitoring
  • 29. AWS CLI commands
       • add-tags
       • create-elasticsearch-domain
       • delete-elasticsearch-domain
       • describe-elasticsearch-domain
       • describe-elasticsearch-domain-config
       • describe-elasticsearch-domains
       • list-domain-names
       • list-tags
       • remove-tags
       • update-elasticsearch-domain-config
       Example:
           aws es create-elasticsearch-domain \
             --domain-name my-domain \
             --elasticsearch-cluster-config InstanceType=m4.xlarge.elasticsearch,InstanceCount=4 \
             --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=512
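The same domain creation can be sketched from Python. This assumes the third-party boto3 SDK is installed and credentials are configured; `domain_request` and `create_domain` are hypothetical helper names, and only the parameter dictionary is exercised here, so nothing is created unless `create_domain` is called explicitly.

```python
def domain_request(domain_name: str, instance_type: str,
                   instance_count: int, volume_gib: int) -> dict:
    """Build keyword arguments mirroring the CLI example on the slide."""
    return {
        "DomainName": domain_name,
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp2",
            "VolumeSize": volume_gib,  # size in GiB
        },
    }


def create_domain(params: dict) -> dict:
    """Send the request. Requires boto3 and AWS credentials; this creates a billable resource."""
    import boto3  # imported lazily so the sketch loads without the SDK

    client = boto3.client("es")
    return client.create_elasticsearch_domain(**params)


params = domain_request("my-domain", "m4.xlarge.elasticsearch", 4, 512)
print(params)
```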
  • 30. Cluster reconfiguration
       • Console or single API call
       • You can change any parameters of the domain: instance types and counts, storage options, dedicated master configuration, zone awareness, etc.
       • Amazon ES non-disruptively makes the changes requested
           • Expands the cluster with new instances
           • Elasticsearch replicates data
           • Old instances are dropped
  • 31. Dedicated masters improve stability
  • 32. Cluster with no dedicated masters
       Diagram: Amazon ES cluster with three instances holding shards numbered 1 to 3; Instance 1 also acts as master
  • 33. Cluster with dedicated masters
       Diagram: Amazon ES cluster with dedicated master instances plus three data instances (Instance 1 to Instance 3) holding shards numbered 1 to 3; the data instances handle queries and updates
       • Adding dedicated masters improves stability by removing the master function from data instances
       • Amazon ES provides 1-click deployment
  • 34. Master node recommendations
       Number of data nodes | Master node instance type
       < 10                 | m3.medium+
       < 20                 | m4.large+
       <= 50                | c4.xlarge+
       50-100               | c4.2xlarge+
       Always use an odd number of masters, >= 3
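The recommendation table above can be expressed as a small lookup. This is only a sketch of the slide's 2017 guidance: the function names are hypothetical, and because the table's ranges overlap at exactly 50 data nodes, the sketch assumes 50 maps to the c4.xlarge row.

```python
def recommend_master_type(data_nodes: int) -> str:
    """Return the minimum master instance type the slide suggests for a cluster size."""
    if data_nodes < 1 or data_nodes > 100:
        raise ValueError("the table only covers 1 to 100 data nodes")
    if data_nodes < 10:
        return "m3.medium"
    if data_nodes < 20:
        return "m4.large"
    if data_nodes <= 50:  # assumption: exactly 50 falls in this row
        return "c4.xlarge"
    return "c4.2xlarge"


def is_valid_master_count(count: int) -> bool:
    """Slide rule: always use an odd number of masters, at least 3."""
    return count >= 3 and count % 2 == 1


for nodes in (4, 12, 50, 80):
    print(nodes, recommend_master_type(nodes))
```

An odd count matters because master election needs a majority; with an even count, losing half the masters leaves no majority to elect from.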
  • 35. Zone awareness for high availability
  • 36. Cluster with zone awareness
       Diagram: Amazon ES cluster with four instances (Instance 1 to Instance 4) spread across Availability Zone 1 and Availability Zone 2, holding shards numbered 1 to 3
       • Cluster instances are split evenly between two zones
       • Primary and replica shards go to different zones
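The zone-awareness rule, that a primary and its replica go to different zones, can be checked with a few lines. The placement data below is invented for illustration and is not read from a real cluster; `shards_in_single_zone` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# (shard number, role, availability zone) for each shard copy
Placement = Tuple[int, str, str]


def shards_in_single_zone(placements: Iterable[Placement]) -> List[int]:
    """Return shard numbers whose copies all sit in one zone (a zone-awareness violation)."""
    zones = defaultdict(set)
    copies = defaultdict(int)
    for shard, _role, zone in placements:
        zones[shard].add(zone)
        copies[shard] += 1
    # A shard with two or more copies must span at least two zones.
    return sorted(s for s in zones if copies[s] >= 2 and len(zones[s]) < 2)


# Hypothetical layout: shard 3 has both copies in zone "az-1".
layout = [
    (1, "primary", "az-1"), (1, "replica", "az-2"),
    (2, "primary", "az-2"), (2, "replica", "az-1"),
    (3, "primary", "az-1"), (3, "replica", "az-1"),
]
print(shards_in_single_zone(layout))
```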
  • 37. Scale for your workload
  • 38. Instance type recommendations
       Instance | Workload
       T2       | Entry point. Dev and test. OK for dedicated masters.
       M3, M4   | Equal read and write volumes.
       R3, R4   | Read-heavy or workloads with high memory demands (e.g., aggregations).
       C4       | High concurrency/indexing workloads.
       I2       | Up to 1.6 TB of SSD instance storage.
  • 39. Secure access to your domain
  • 40. Cluster Security
       • AWS manages security of the cluster
       • You manage access to the domain via policies
  • 41. Security through IAM A&AC
       • IP-based policies limit access to CIDR blocks for anonymous requests
       • Principal-based policies limit access to particular IAM users or roles, requiring AWS SigV4 signing
       • Policies
           • Can control the HTTP methods allowed, to create differential access for e.g. queries and updates
           • Can specify resources at the index level
  • 42. Example IAM Policy
       {
         "Version": "2012-10-17",
         "Statement": [
           {
             "Sid": "",
             "Effect": "Allow",
             "Principal": {
               "AWS": "arn:aws:iam::123456789012:user/susan"
             },
             "Action": [
               "es:ESHttpGet",
               "es:ESHttpPut",
               "es:ESHttpPost",
               "es:CreateElasticsearchDomain",
               "es:ListDomainNames"
             ],
             "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
           }
         ]
       }
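The example policy above is principal-based. As a counterpart, the sketch below builds an IP-based policy of the kind the security slide describes, allowing read-only, anonymous requests from given CIDR blocks. The domain ARN, account number, and CIDR block are placeholders, and `ip_based_policy` is a hypothetical helper; the `IpAddress` condition with `aws:SourceIp` is a standard IAM policy element.

```python
import ipaddress
import json
from typing import List


def ip_based_policy(domain_arn: str, cidr_blocks: List[str]) -> dict:
    """Build a policy allowing unsigned HTTP GETs from the given CIDR blocks only."""
    for block in cidr_blocks:
        ipaddress.ip_network(block)  # raises ValueError on a malformed CIDR block
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},  # anonymous callers, restricted by the condition
                "Action": ["es:ESHttpGet"],  # queries only; no PUT/POST updates
                "Condition": {"IpAddress": {"aws:SourceIp": cidr_blocks}},
                "Resource": f"{domain_arn}/*",
            }
        ],
    }


# Placeholder ARN and documentation-range CIDR block, for illustration only.
policy = ip_based_policy(
    "arn:aws:es:us-east-1:123456789012:domain/logs-domain",
    ["192.0.2.0/24"],
)
print(json.dumps(policy, indent=2))
```

Limiting the action to `es:ESHttpGet` is one way to get the differential access mentioned on the slide: anonymous clients can query, while updates stay with signed principals.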
  • 43. AWS Security Responsibilities
       • Creation of a service VPC that allows limited access to the cluster through a configurable access policy
       • Application of security patches on the instances
       • DDoS protection for the DNS name associated with the domain via Route 53
       • Fronts the transport protocol with an ELB (HTTP:80)
       • Hides ports 9200 and 9300
       • Built on top of AWS secure networking
  • 44. Security Patches
       • There is no defined maintenance window, as patches are applied using blue/green methodologies
       • Critical patches are applied immediately
       • Routine patching is typically applied during customer activities such as cluster resizing or policy changes
  • 45. Load data
  • 46. Integration with the AWS ecosystem
       Diagram labels: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon Elasticsearch Service, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose
  • 47. Monitor and audit
       Diagram labels: CloudWatch, CloudTrail, Elasticsearch APIs
  • 48. Monitoring
  • 49. Data Durability
       • Read Replicas + Zone Awareness
       • Automatic daily snapshots
       • Manual index snapshots
  • 50. Built-in Kibana
  • 51. Application overview
       Diagram labels: Application instances / Logstash forwarders, Logstash indexer, Amazon Elasticsearch Service
  • 52. Pay for what you use
  • 53. Pay for compute and storage you use
       With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. AWS Free Tier for qualifying customers.
  • 54. Amazon Elasticsearch Service Benefits
       Easy to Use
       • Deploy a production-ready Elasticsearch cluster in minutes
       • Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring
       Open
       • Get direct access to the Elasticsearch open-source API
       • Fully compatible with the open-source Elasticsearch API, for all code and applications
       Secure
       • Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies, with fine-grained access control for users and endpoints
       • Automatically applies security patches without disruption, keeping Elasticsearch environments secure
       Available
       • Provides high availability using Zone Awareness, which replicates data between two Availability Zones
       • Monitors the health of clusters and automatically replaces failed instances, without service disruption
       AWS Integrated
       • Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion
       • AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration
       Scalable
       • Scale clusters from a single instance up to 100 instances
       • Configure clusters to meet performance requirements by selecting from a range of instance types and storage options, including SSD-powered EBS volumes
  • 55. Takeaways
       1. Elasticsearch is a tool for full-text search, analysis, and visualization of time series data that helps you get the most out of your growing data set
       2. Amazon Elasticsearch Service makes it easy to deploy and manage an Elasticsearch cluster in the AWS cloud
       3. Amazon Elasticsearch Service is a drop-in replacement for your existing Elasticsearch cluster
  • 56. Thank you!