Log Analytics with
Amazon Elasticsearch
Service
Jon Handler
(handler@amazon.com)
What we'll cover
• Understanding Elasticsearch capabilities
• Elasticsearch, the technology
• Aggregations; ad-hoc analysis
• Amazon Elasticsearch Service is a drop-in
replacement for self-managed Elasticsearch
• Q&A
Understanding Elasticsearch capabilities
CloudTrail delivers API calls to you
• AWS API call monitoring
• You need to understand the changing
landscape of your AWS resources
• You need to do security analysis and
compliance auditing
• You want the ability to dig into your logs
in an intuitive, fine-grained way
How Elasticsearch can help
• Combined with Kibana, Elasticsearch provides a
tool for search, real-time analytics, and data
visualization
Demo Architecture
[Diagram: AWS Resources, log lines, CloudTrail Logs, Amazon CloudWatch Logs, Amazon Elasticsearch Service]
Demo
Scenario: Log data analytics
• Application monitoring and
event diagnosis
• You need to monitor the performance of
your application, web servers, and
hardware
• You need easy to use, yet powerful
data visualization tools to detect issues
in near real-time
• You want the ability to dig into your logs
in an intuitive, fine-grained way
• Kibana provides fast, easy visualization
Scenario: Batch data analytics
• Reporting and Analysis
• You are a mobile app developer
• You have to monitor/manage users
across multiple app versions
• You want to analyze and report on
usage and migration between app
versions
• Use Kibana for dashboarding. Use the
query API for deeper analysis
Scenario: Full-text search
• Traditional search
• Your application or website provides
search capabilities over diverse
documents
• You are tasked with making this
knowledge base searchable and
accessible
• You need key search features including
text matching, faceting, filtering, fuzzy
search, auto complete, and highlighting
• Use the query API to support
application search
Elasticsearch the technology
Elasticsearch is like a database
Search      Database
Value       Value
Field       Column
Document    Row
Index       Table
Cluster     Database
Queries     SQL
Documents are the core entity
ID
F1 Value
F2 Value
{
  "eventVersion": "1.03",
  "eventTime": "2016-06-01T00:16:19Z",
  "eventSource": "dynamodb.amazonaws.com",
  "eventName": "DescribeStream",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "52.51.24.XX",
  "userAgent": "leb-kcl-580935a6-5f94-4ce0-ac69-cdeb609ba16a,amazon-kinesis-client-library-java-lambda_1.2.1, aws-internal/3",
  "requestParameters": {
    "streamArn": "arn:aws:dynamodb:eu-west-1:17816119XXXX:table/restaurant/stream/2016-04-08T18:07:53.837"
  },
  "responseElements": null,
  "requestID": "KC608PH8POAF2I184E2SL1PS2FVV4KQNSO5AEMVJF66Q9ASUAAJG",
  "eventID": "49b56379-903b-4f04-8ce5-d21bbfcf8ab3",
  "eventType": "AwsApiCall",
  "apiVersion": "2012-08-10",
  "recipientAccountId": "17816119XXXX",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAJBQVRM7LN25CAHX7Y:awslambda_338_20160531233813522",
    "arn": "arn:aws:sts::178161197791:assumed-role/geospatial-rec-engine-ApplicationExecutionRole-9LPKB77QMR97/awslambda_338_20160531233813522", ...
Lucene provides text analysis and indexing
Term ID   Term    Postings
0         quick   1,3,5
1         brown   2,3,4,6
2         fox     1,7,9
3         lazy    2,8
4         dog     24
Index
Writer
Index
Searcher
Segment
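The term/postings table above can be sketched in a few lines of Python. This is only an illustration of the inverted-index idea, not how Lucene is implemented: the sample documents and the whitespace-and-lowercase "analysis" are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis (an assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "The quick fox",
    2: "The lazy brown dog",
    3: "Quick brown fox",
}
index = build_inverted_index(docs)
print(index["quick"])  # [1, 3]
print(index["brown"])  # [2, 3]
```

Looking up a term is then a dictionary access that returns its posting list, which is why term lookups stay fast as the number of documents grows.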
Elasticsearch query processing
[Diagram: Query -> Index lookup (terms: quick, brown, fox, lazy, lorem, ipsum, dolor, sit) -> Matches (id: 216, 305, 486, 713) -> Query logic and post-filtering -> Scoring, aggs -> Sorted matches (results) (id: 713, 305, 486, 216)]
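The lookup, match, score, and sort stages can be sketched as a toy search function. The posting lists below are made up for the example, and the score (the number of query terms a document matches) is a stand-in for real relevance scoring, which is considerably more involved.

```python
from collections import Counter


def search(index, query):
    """Return (doc_id, score) pairs sorted by descending score."""
    scores = Counter()
    for term in query.lower().split():
        # Index lookup: every document in the term's posting list is a match.
        for doc_id in index.get(term, []):
            scores[doc_id] += 1  # toy score: number of matching query terms
    # Sort by score (highest first), breaking ties by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical posting lists: term -> document IDs.
index = {
    "quick": [216, 713],
    "brown": [305, 713],
    "fox": [486, 713],
    "lazy": [305],
}
results = search(index, "quick brown fox")
print(results)  # [(713, 3), (216, 1), (305, 1), (486, 1)]
```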
Aggregations; ad-hoc analysis
Faceting: basic aggregation
• Query: shirt
Facets
Carhartt (1092)
Russell Athletic (1087)
Dickies (954)
RALPH LAUREN (823)
Wrangler (701)
Doublju (259)
Levi's (12)
ID
F1 Value
F2 Value
Elasticsearch Aggregations
• Buckets – a collection of documents meeting
some criterion
• Metrics – calculations on the content of buckets.
Bucket: time
Metric: count
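A time bucket with a count metric corresponds to a date histogram aggregation. The sketch below only builds the request body; it assumes the `eventTime` field from the CloudTrail document shown earlier, and the aggregation name `events_over_time` is arbitrary. The `interval` parameter matches Elasticsearch versions of this deck's era; newer releases use `calendar_interval` or `fixed_interval` instead.

```python
import json


def count_over_time(field="eventTime", interval="hour"):
    """Build a search body that counts documents per time bucket."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "aggs": {
            "events_over_time": {
                # Each bucket reports a doc_count, which is the count metric.
                "date_histogram": {"field": field, "interval": interval}
            }
        },
    }


body = count_over_time()
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a `_search` request against the index that holds the log events.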
A more complicated aggregation
Bucket: ARN
Bucket: Region
Bucket: eventName
Metric: Count
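Nested buckets like these can be expressed as terms aggregations inside one another. The helper below is a sketch that builds such a request body; the field names are taken from the CloudTrail document shown earlier, and the `by_...` aggregation names are arbitrary. Terms aggregations generally expect fields that are not analyzed into separate tokens.

```python
import json


def nested_terms(fields):
    """Nest one terms aggregation per field, outermost field first."""
    body = None
    for field in reversed(fields):
        agg = {"terms": {"field": field}}
        if body is not None:
            agg["aggs"] = body  # the previously built level becomes the child
        body = {"by_" + field.replace(".", "_"): agg}
    return {"size": 0, "aggs": body}


request = nested_terms(["userIdentity.arn", "awsRegion", "eventName"])
print(json.dumps(request, indent=2))
```

Each innermost bucket carries its own document count, so the response reads as a count of events per event name, per region, per ARN.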
More kinds of aggregations
Buckets
• Date histogram
• Histogram
• Range
• Terms
• Filters
• Significant terms
Metrics
• Count
• Average
• Sum
• Min
• Max
• Std. Dev
• Unique Count
• Percentiles
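To make the bucket and metric vocabulary concrete, here is a small in-memory sketch of a terms bucket followed by a few of the metrics listed above. It runs locally rather than against Elasticsearch, and the `latency_ms` field and the sample documents are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def terms_buckets(docs, field):
    """Group documents into buckets keyed by the value of one field."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def metrics(bucket, field):
    """Compute simple metrics over one numeric field of a bucket."""
    values = [doc[field] for doc in bucket]
    return {
        "count": len(values),
        "avg": mean(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical documents; latency_ms is not part of the CloudTrail sample.
docs = [
    {"eventName": "DescribeStream", "latency_ms": 12},
    {"eventName": "DescribeStream", "latency_ms": 18},
    {"eventName": "GetRecords", "latency_ms": 40},
]
buckets = terms_buckets(docs, "eventName")
for key, bucket in buckets.items():
    print(key, metrics(bucket, "latency_ms"))
```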
Setting up your cluster
Shards: independent collections of documents
[Diagram: the documents (Id, Id, Id, ...) of an Index/Type, divided among Shard 1, Shard 2, Shard 3, and Shard 4]
Deployment of indices to a cluster
• Index 1
– Shard 1
– Shard 2
– Shard 3
• Index 2
– Shard 1
– Shard 2
– Shard 3
[Diagram: an Amazon ES cluster of three instances (Instance 1 is the master), with the primary and replica copies of shards 1, 2, and 3 of each index distributed across Instance 1, Instance 2, and Instance 3]
Determining storage
• Data:Index ratio is typically close to 1:1
• Add a replica, double the storage
• Figure out data node count based on storage
– Current limits: 10 TB EBS, 32 TB instance store
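The storage rules of thumb above reduce to simple arithmetic. The sketch below assumes the 1:1 data-to-index ratio from the slide; the 2,000 GB of source data and the 512 GB of usable storage per data node are made-up inputs for illustration, not service limits.

```python
import math


def required_storage_gb(source_gb, replicas=1, index_ratio=1.0):
    """Storage needed for the primary copy plus each replica copy."""
    return source_gb * index_ratio * (1 + replicas)


def data_node_count(total_gb, per_node_gb):
    """Smallest number of data nodes whose combined storage holds the data."""
    return math.ceil(total_gb / per_node_gb)


total = required_storage_gb(2000, replicas=1)    # 2000 GB of source data, 1 replica
nodes = data_node_count(total, per_node_gb=512)  # hypothetical 512 GB per node
print(total, nodes)  # 4000.0 8
```

In practice you would also leave headroom for growth and operating overhead, which this sketch does not model.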
Determining instance type
• Instance type is workload-dependent
• T2: dev, test, QA
• M3: solid performance
• R3: heavier queries, aggs
• I2: largest storage option
Best practices
• Use the minimum number of shards that keeps
each shard at or below 50 GB of data
• Number of replicas = 1
• For all prod workloads: use 3 dedicated masters
• Use the _bulk API. Some ingest mechanisms do
this automatically
• Increase index.refresh_interval for higher
throughput
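For the _bulk recommendation above, a request body is newline-delimited JSON: an action line followed by the document source, repeated, with a trailing newline at the end. The helper below is a sketch that only builds that body; the `blog` index and `post` type mirror the examples later in this deck, and `_type` reflects Elasticsearch versions of that era.

```python
import json


def bulk_body(index, doc_type, docs):
    """Build a newline-delimited _bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(source))  # document source line
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body("blog", "post", [
    ("2", {"title": "Amazon ES for search", "author": "carl meadows"}),
    ("3", {"title": "Analytics too", "author": "vivek sriram"}),
])
print(body)
```

Batching many documents into one request this way avoids paying the per-request overhead for every log line.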
Indexing strategy
Integration with the AWS ecosystem
[Diagram: sources and ingest paths feeding Amazon Elasticsearch Service: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose, Amazon ECS]
Indexing strategy for streaming data
• Use an index per time period, typically
index-per-day; high-volume workloads can go to index-per-hour
• Shard the index according to data size; use
50GB as a soft limit per shard
• Master nodes increase cluster stability
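The index-per-period and shard-sizing guidance can be sketched as two small helpers. The `cwl` prefix and the dotted date format are assumptions (the format follows a common Logstash-style convention), and 50 GB is the soft limit quoted on the slide.

```python
import math
from datetime import datetime, timezone


def index_name(prefix, timestamp, hourly=False):
    """Name the index that an event with this timestamp belongs to."""
    fmt = "%Y.%m.%d.%H" if hourly else "%Y.%m.%d"
    return f"{prefix}-{timestamp.strftime(fmt)}"


def primary_shard_count(index_size_gb, max_shard_gb=50):
    """Fewest primary shards that keep each shard at or under the limit."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))


ts = datetime(2016, 6, 1, 0, 16, 19, tzinfo=timezone.utc)
print(index_name("cwl", ts))               # cwl-2016.06.01
print(index_name("cwl", ts, hourly=True))  # cwl-2016.06.01.00
print(primary_shard_count(120))            # 3
```

Because each period gets its own index, the shard count only has to fit one period's data, and old periods can be dropped by deleting whole indices.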
Index settings control sharding and more
# number_of_shards is fixed when the index is created, so set it at creation time
curl -XPUT <endpoint>/<index> -d '{
"settings" : {
"number_of_shards" : 5,
"number_of_replicas" : 1,
"refresh_interval" : "5s" }
}'
Mappings control how data is indexed
curl -XPUT <endpoint>/<index> -d '{
"mappings" : {
<type> : {
"properties" : {
"eventName" : {
"type" : "string",
"index" : "not_analyzed" } } } }
}'
Index templates simplify mapping creation
curl -XPUT <endpoint>/_template/<name> -d '{
"template" : "<wildcard e.g. cwl-*>",
"settings" : { "number_of_shards" : 2 },
"mappings" : {
<type, e.g. _default_> : {
"dynamic_templates" : [ {
<template name> : {
"mapping" : {
"index" : "not_analyzed"
},
"match" : "*" } } ],
"properties" : {
"@timestamp" : { "type" : "date" } } } }
}'
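A template applies to any new index whose name matches its wildcard pattern. The sketch below approximates that matching locally with Python's `fnmatch`, which is an assumption made for illustration: `fnmatch` also gives special meaning to `?` and `[...]`, which may not match how Elasticsearch interprets patterns. The template names are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_templates(index_name, templates):
    """Return the names of templates whose pattern matches the index name."""
    return [
        name
        for name, pattern in templates.items()
        if fnmatchcase(index_name, pattern)
    ]


# Hypothetical template names mapped to their wildcard patterns.
templates = {"cwl_template": "cwl-*", "trail_template": "cloudtrail-*"}
print(matching_templates("cwl-2016.06.01", templates))  # ['cwl_template']
```

With daily indices, this is what lets one template supply the settings and mappings for every index created from then on.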
Don't forget the query API!
Direct access to the Elasticsearch API
$ curl -XPUT https://<endpoint>/blog -d '{
"settings" : { "number_of_shards" : 3, "number_of_replicas" : 1 } }'
$ curl -XPOST https://<endpoint>/blog/post/1 -d '{
"author":"jon handler",
"title":"Amazon ES Launch" }'
$ curl -XPOST https://<endpoint>/blog/post/_bulk -d '{ "index" : { "_index" : "blog", "_type" : "post", "_id" : "2" } }
{ "title":"Amazon ES for search", "author": "carl meadows" }
{ "index" : { "_index" : "blog", "_type" : "post", "_id" : "3" } }
{ "title":"Analytics too", "author": "vivek sriram" }
'
$ curl -XGET "https://<endpoint>/_search?q=ES"
{"took":16,"timed_out":false,"_shards":{"total":3,"successful":3,"failed":0},"hits":{"total":2,"max_score":0.13424811,"hits":[{"_index":"blog","_type":"post","_id":"1","_score":0.13424811,"_source":{"author":"jon handler","title":"Amazon ES Launch"}},{"_index":"blog","_type":"post","_id":"2","_score":0.11506981,"_source":{"title":"Amazon ES for search","author":"carl meadows"}}]}}
Elasticsearch is a full-featured search engine
• Built on Lucene, the popular, open-source library
• Search structured and unstructured data with
complex, boolean queries
• Supports common search features: geo search,
aggregations, highlighting, search suggestions,
and more
Challenges with self-managed Elasticsearch
• Easy to get started, challenging to scale
• Scaling ingest pipelines is difficult
• Undifferentiated heavy lifting
Amazon Elasticsearch Service
Amazon ES overview
[Diagram: Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, Elasticsearch API, CloudTrail]
Easy cluster configuration and reconfiguration
AWS
• Elasticsearch Version
• Data nodes, count and type
• Master nodes, count and type
• Storage option – EBS/instance
• HA option
• Advanced options
High availability with Zone Awareness
[Diagram: an Amazon ES cluster of four instances (Instance 1 to Instance 4) spread across Availability Zone 1 and Availability Zone 2, with copies of shards 1, 2, and 3 distributed across the instances]
Monitor with CloudWatch metrics
• FreeStorageSpace – monitor and alarm before the
cluster runs out of space
• CPUUtilization – alarm at 80% CPU to signal the need to
scale up
• ClusterStatus.yellow – check whether replication
requires additional nodes
• JVMMemoryPressure – check instance type and count
for sufficient resources
• MasterCPUUtilization – monitoring for master nodes is
separated from data nodes
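The checks above can be sketched as a function over one snapshot of metric values. This does not call CloudWatch; the snapshot dictionary, the storage threshold, and the alert wording are assumptions. Only the 80% CPU figure comes from the slide, and the storage threshold is in whatever unit the metric is reported in.

```python
def check_metrics(metrics, min_free_storage=20480, max_cpu_pct=80.0):
    """Return a list of human-readable alerts for one metrics snapshot."""
    alerts = []
    if metrics["FreeStorageSpace"] < min_free_storage:
        alerts.append("FreeStorageSpace is low")
    if metrics["CPUUtilization"] >= max_cpu_pct:
        alerts.append("CPUUtilization is high; consider scaling up")
    if metrics["ClusterStatus.yellow"] >= 1:
        alerts.append("Cluster is yellow; check replica allocation")
    return alerts


# Hypothetical snapshot of the latest metric values.
snapshot = {
    "FreeStorageSpace": 10240,
    "CPUUtilization": 85.0,
    "ClusterStatus.yellow": 0,
}
alerts = check_metrics(snapshot)
for alert in alerts:
    print(alert)
```

In a real deployment the same thresholds would more likely be configured as CloudWatch alarms than evaluated in application code.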
Security with IAM
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::123456789012:user/susan"
},
"Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost",
"es:CreateElasticsearchDomain",
"es:ListDomainNames" ],
"Resource":
"arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
} ] }
Pay for compute and storage you use
• With Amazon Elasticsearch Service, you pay
only for the compute and storage resources you
use. AWS Free Tier for qualifying customers.
Wrap up
• Combined with Kibana, Elasticsearch provides search and
visualization for streaming data and full-text use cases.
• Elasticsearch is based on Lucene, which reads and writes
search indices
• Aggregations allow you to analyze your data, splitting into
Buckets and computing Metrics
• Amazon Elasticsearch Service makes it easy to set up and
manage your Elasticsearch cluster on AWS
• Amazon ES is a great way to get started with Elasticsearch!
Q&A
• Jon Handler: handler@amazon.com
• Vivek Sriram: Business Development Manager:
vsriram@amazon.com
• https://run.qwiklab.com/searches/elasticsearch

Log Analytics with Amazon Elasticsearch Service - September Webinar Series

  • 1. Log Analytics with Amazon Elasticsearch Service Jon Handler (handler@amazon.com)
  • 2. What we'll cover • Understanding Elasticsearch capabilities • Elasticsearch, the technology • Aggregations; ad-hoc analysis • Amazon Elasticsearch Service is a drop-in replacement for self-managed Elasticsearch • Q&A
  • 4. CloudTrail delivers API calls to you • AWS API call monitoring • You need to understand the changing landscape of your AWS resources • You need to do security analysis and compliance auditing • You want the ability to dig into your logs in an intuitive, fine-grained way
  • 6. How Elasticsearch can help • Combined with Kibana, Elasticsearch provides a tool for search, real-time analytics, and data visualization
  • 10. Scenario: Log data analytics • Application monitoring and event diagnosis • You need to monitor the performance of your application, web servers, and hardware • You need easy to use, yet powerful data visualization tools to detect issues in near real-time • You want the ability to dig into your logs in an intuitive, fine-grained way • Kibana provides fast, easy visualization
  • 11. Scenario: Batch data analytics • Reporting and Analysis • You are a mobile app developer • You have to monitor/manage users across multiple app versions • You want to analyze and report on usage and migration between app versions • Use Kibana for dashboarding. Use the query API for deeper analysis
  • 12. Scenario: Full-text search • Traditional search • Your application or website provides search capabilities over diverse documents • You are tasked with making this knowledge base searchable and accessible • You need key search features including text matching, faceting, filtering, fuzzy search, auto complete, and highlighting • Use the query API to support application search
  • 14. Elasticsearch is like a database • Search | Database • Value | Value • Field | Column • Document | Row • Index | Table • Cluster | Database • Queries | SQL
  • 15. Documents are the core entity ID F1 Value F2 Value { "eventVersion": "1.03", "eventTime": "2016-06-01T00:16:19Z", "eventSource": "dynamodb.amazonaws.com", "eventName": "DescribeStream", "awsRegion": "eu-west-1", "sourceIPAddress": "52.51.24.XX", "userAgent": "leb-kcl-580935a6-5f94-4ce0-ac69-cdeb609ba16a,amazon-kinesis-client-library-java-lambda_1.2.1, aws-internal/3", "requestParameters": { "streamArn": "arn:aws:dynamodb:eu-west-1:17816119XXXX:table/restaurant/stream/2016-04-08T18:07:53.837" }, "responseElements": null, "requestID": "KC608PH8POAF2I184E2SL1PS2FVV4KQNSO5AEMVJF66Q9ASUAAJG", "eventID": "49b56379-903b-4f04-8ce5-d21bbfcf8ab3", "eventType": "AwsApiCall", "apiVersion": "2012-08-10", "recipientAccountId": "17816119XXXX", "userIdentity": { "type": "AssumedRole", "principalId": "AROAJBQVRM7LN25CAHX7Y:awslambda_338_20160531233813522", "arn": "arn:aws:sts::178161197791:assumed-role/geospatial-rec-engine-ApplicationExecutionRole-9LPKB77QMR97/awslambda_338_20160531233813522", ...
  • 16. Lucene provides text analysis and indexing • Index Writer • Index Searcher • Segment (Term ID | Term | Postings) • 0 | quick | 1,3,5 • 1 | brown | 2,3,4,6 • 2 | fox | 1,7,9 • 3 | lazy | 2,8 • 4 | dog | 24
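
The postings table on the slide above is an inverted index: each term maps to the IDs of the documents that contain it. A minimal sketch of the idea follows, assuming whitespace tokenization and lowercase folding. Lucene's real analyzers and segment format are far richer, and the sample documents are hypothetical, not the ones behind the slide's posting lists.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy analysis step: lowercase, then split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical documents used only for illustration.
docs = {
    1: "quick fox",
    2: "brown lazy dog",
    3: "quick brown",
}
index = build_inverted_index(docs)
print(index["quick"])  # [1, 3]
print(index["brown"])  # [2, 3]
```

Looking up a term is then a single dictionary access, which is why search over an inverted index does not need to scan every document.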
  • 17. Elasticsearch query processing Query quick brown fox lazy lorem ipsum dolor sit Index Lookup id: 216 id: 305 id: 486 id: 713 Matches Query logic and post-filtering Scoring, aggs id: 713 id: 305 id: 486 id: 216 Sorted matches (results)
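
The stages named on the slide above (index lookup, post-filtering, scoring, sorting) can be sketched in a few lines. This is an illustration under stated assumptions, not Elasticsearch's implementation. The score here is simply the number of distinct query terms a document contains, whereas Lucene uses a relevance formula. The document texts are hypothetical and were chosen so that the result order matches the IDs shown on the slide.

```python
from collections import defaultdict


def build_index(docs):
    """Toy inverted index: term -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query, keep=lambda doc_id: True):
    """Return matching document IDs, best match first."""
    terms = set(query.lower().split())
    # 1. Index lookup: union of the postings for every query term.
    matches = set()
    for term in terms:
        matches |= index.get(term, set())
    # 2. Post-filtering: drop matches rejected by the caller's predicate.
    matches = {doc_id for doc_id in matches if keep(doc_id)}

    # 3. Scoring (toy): count the distinct query terms present in the document.
    def score(doc_id):
        return sum(1 for term in terms if doc_id in index.get(term, set()))

    # 4. Sort by descending score, breaking ties by document ID.
    return sorted(matches, key=lambda doc_id: (-score(doc_id), doc_id))


# Hypothetical document texts keyed by the IDs shown on the slide.
docs = {
    216: "lazy dog",
    305: "quick brown fox",
    486: "brown fox",
    713: "quick brown fox lazy",
}
index = build_index(docs)
results = search(index, "quick brown fox lazy")
print(results)  # [713, 305, 486, 216]
```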
  • 19. Faceting: basic aggregation • Query: shirt • Facets: Carhartt (1092), Russell Athletic (1087), Dickies (954), RALPH LAUREN (823), Wrangler (701), Doublju (259), Levi's (12) • ID F1 Value F2 Value
  • 20. Elasticsearch Aggregations • Buckets – a collection of documents meeting some criterion • Metrics – calculations on the content of buckets • Example: Bucket: time; Metric: count
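
The bucket and metric idea from the slide above can be illustrated in plain Python: group events into one bucket per hour (the bucket criterion), then count the events in each bucket (the metric). This is only a sketch of the concept. The events are hypothetical and reuse the `eventTime` format of the CloudTrail document shown earlier.

```python
from collections import Counter
from datetime import datetime


def count_per_hour(events):
    """Bucket events by the hour of their eventTime and count each bucket."""
    buckets = Counter()
    for event in events:
        ts = datetime.strptime(event["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
        # Bucket key: the timestamp truncated to the hour.
        buckets[ts.strftime("%Y-%m-%dT%H:00")] += 1
    return dict(sorted(buckets.items()))


# Hypothetical events used only for illustration.
events = [
    {"eventTime": "2016-06-01T00:16:19Z"},
    {"eventTime": "2016-06-01T00:47:02Z"},
    {"eventTime": "2016-06-01T01:05:40Z"},
]
hourly = count_per_hour(events)
print(hourly)  # {'2016-06-01T00:00': 2, '2016-06-01T01:00': 1}
```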
  • 21. A more complicated aggregation Bucket: ARN Bucket: Region Bucket: eventName Metric: Count
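
A nested aggregation like the one on the slide above could be requested with a search body along these lines. The sketch below only builds the JSON body; it does not contact a cluster. The field names `userIdentity.arn`, `awsRegion`, and `eventName` come from the CloudTrail document shown earlier. The aggregation names (`by_...`), the bucket `size` of 10, and the top-level `"size": 0` (which suppresses individual hits on versions that support it) are assumptions. The count metric needs no extra clause because each terms bucket reports a `doc_count`.

```python
import json


def nested_terms_aggs(fields, size=10):
    """Build a nested terms-aggregation body, outermost field first."""
    body = {}
    level = body
    for field in fields:
        name = "by_" + field.replace(".", "_")  # hypothetical aggregation name
        level["aggs"] = {name: {"terms": {"field": field, "size": size}}}
        # Descend so the next field nests inside this bucket.
        level = level["aggs"][name]
    return body


request = {
    "size": 0,  # assumption: return only aggregation results, no hits
    **nested_terms_aggs(["userIdentity.arn", "awsRegion", "eventName"]),
}
print(json.dumps(request, indent=2))
```

The printed body could then be sent to the `_search` endpoint of the index that holds the log documents.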
  • 22. More kinds of aggregations Buckets • Date histogram • Histogram • Range • Terms • Filters • Significant terms Metrics • Count • Average • Sum • Min • Max • Std. Dev • Unique Count • Percentiles
  • 23. Setting up your cluster
  • 24. Shards: independent collections of documents • Diagram: the documents of an Index/Type (Id, Id, Id, ...) divided among Shard 1, Shard 2, Shard 3, and Shard 4
  • 25. Deployment of indices to a cluster • Index 1 – Shard 1 – Shard 2 – Shard 3 • Index 2 – Shard 1 – Shard 2 – Shard 3 • Diagram: Amazon ES cluster holding primary and replica copies of shards 1, 2, and 3 of both indices • Instance 1, Master: shards 1, 3, 3, 1 • Instance 2: shards 2, 1, 1, 2 • Instance 3: shards 3, 2, 2, 3
  • 26. Determining storage • Data:Index ratio is typically close to 1:1 • Add a replica, double the storage • Figure out data node count based on storage – Current limits: 10 TB EBS, 32 TB instance store
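
The sizing rules on the slide above reduce to simple arithmetic: index size is roughly source size times the data-to-index ratio, each replica adds another full copy, and the node count follows from per-node storage. A rough sketch follows. The 500 GB source size and the 512 GB per-node figure are hypothetical, and the calculation leaves no headroom for operating overhead or growth.

```python
import math


def required_storage_gb(source_gb, replicas=1, index_ratio=1.0):
    """Total storage for the primary copy plus `replicas` replica copies."""
    return source_gb * index_ratio * (1 + replicas)


def data_node_count(total_gb, per_node_gb):
    """Smallest number of data nodes whose combined storage holds total_gb."""
    return math.ceil(total_gb / per_node_gb)


# Hypothetical workload: 500 GB of source data, one replica.
total = required_storage_gb(500, replicas=1)
nodes = data_node_count(total, per_node_gb=512)
print(total)  # 1000.0
print(nodes)  # 2
```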
  • 27. Determining instance type • Instance type is workload-dependent • T2: dev, test, QA • M3: solid performance • R3: heavier queries, aggs • I2: largest storage option
  • 28. Best practices • Use the minimum number of shards that keeps each shard at 50 GB of data or less • Number of replicas = 1 • For all prod workloads: use 3 dedicated masters • Use the _bulk API. Some ingest mechanisms do this automatically • Increase index.refresh_interval for higher throughput
  • 31. Indexing strategy for streaming data • Use an index per time period, typically index-per-day; high volume can go to index-per-hour • Shard the index according to data size; use 50GB as a soft limit per shard • Master nodes increase cluster stability
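
The index-per-period strategy on the slide above can be sketched as two small helpers: one derives the index name from an event timestamp, the other picks a shard count from the expected index size and the 50 GB soft limit. The `cwl` prefix echoes the `cwl-*` wildcard used in the template slide. The dotted date format and the 120 GB daily volume are assumptions for illustration.

```python
import math
from datetime import datetime


def index_name(prefix, ts, hourly=False):
    """Name of the time-based index an event with timestamp ts belongs to."""
    fmt = "%Y.%m.%d.%H" if hourly else "%Y.%m.%d"  # assumed naming scheme
    return f"{prefix}-{ts.strftime(fmt)}"


def shard_count(index_gb, max_shard_gb=50):
    """Minimum number of primary shards keeping each at or below max_shard_gb."""
    return max(1, math.ceil(index_gb / max_shard_gb))


ts = datetime(2016, 6, 1, 0, 16, 19)
print(index_name("cwl", ts))               # cwl-2016.06.01
print(index_name("cwl", ts, hourly=True))  # cwl-2016.06.01.00
print(shard_count(120))                    # 3
```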
  • 32. Index settings control sharding and more curl -XPUT <endpoint>/<index> -d '{ "settings" : { "number_of_shards" : 5, "number_of_replicas" : 1, "refresh_interval" : "5s" } }'
  • 33. Mappings control how data is indexed curl -XPUT <endpoint>/<index> -d '{ "mappings" : { <type> : { "properties" : { "eventName" : { "type" : "string", "index" : "not_analyzed" } } } } }'
  • 34. Index templates simplify mapping creation curl -XPUT <endpoint>/_template/<name> -d '{ "template" : "<wildcard e.g. cwl-*>", "settings" : { "number_of_shards" : 2 }, "mappings" : { <type, e.g. _default_> : { "dynamic_templates" : [ { <template name> : { "mapping" : { "index" : "not_analyzed" }, "match" : "*" } } ], "properties" : { "@timestamp" : { "type" : "date" } } } } }'
  • 35. Don't forget the query API!
  • 36. Direct access to the Elasticsearch API • $ curl -XPUT https://<endpoint>/blog -d '{ "settings" : { "number_of_shards" : 3, "number_of_replicas" : 1 } }' • $ curl -XPOST https://<endpoint>/blog/post/1 -d '{ "author":"jon handler", "title":"Amazon ES Launch" }' • $ curl -XPOST https://<endpoint>/blog/post/_bulk -d ' { "index" : { "_index" : "blog", "_type" : "post", "_id" : "2"}} {"title":"Amazon ES for search", "author": "carl meadows"} { "index" : { "_index":"blog", "_type":"post", "_id":"3" } } { "title":"Analytics too", "author": "vivek sriram"}' • $ curl -XGET https://<endpoint>/_search?q=ES • {"took":16,"timed_out":false,"_shards":{"total":3,"successful":3,"failed":0 },"hits":{"total":2,"max_score":0.13424811,"hits":[{"_index":"blog","_type": "post","_id":"1","_score":0.13424811,"_source":{"author":"jon handler", "title":"Amazon ES Launch" }},{"_index":"blog","_type":"post","_id":"2","_score":0.11506981,"_source":{ "title":"Amazon ES for search", "author": "carl meadows"}}]}}
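
The `_bulk` request on the slide above takes newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line. A small sketch of building that body follows, reusing the slide's `blog` index, `post` type, and sample documents. The helper name `bulk_body` is hypothetical, and the sketch only builds the payload without sending it.

```python
import json


def bulk_body(index, doc_type, docs):
    """Build a newline-delimited _bulk payload from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(source))  # document line
    # The bulk format requires a newline after the final line as well.
    return "\n".join(lines) + "\n"


body = bulk_body("blog", "post", [
    ("2", {"title": "Amazon ES for search", "author": "carl meadows"}),
    ("3", {"title": "Analytics too", "author": "vivek sriram"}),
])
print(body, end="")
```

Because every line is serialized separately, no commas appear between documents, which is the usual mistake when a bulk body is written by hand.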
  • 37. Elasticsearch is a full-featured search engine • Built on Lucene, the popular, open-source library • Search structured and unstructured data with complex, boolean queries • Supports common search features: geo search, aggregations, highlighting, search suggestions, and more
  • 38. Challenges with self-managed Elasticsearch • Easy to get started, challenging to scale • Scaling ingest pipelines is difficult • Undifferentiated heavy lifting
  • 40. Amazon ES overview • Amazon Route 53 • Elastic Load Balancing • IAM • CloudWatch • Elasticsearch API • CloudTrail
  • 41. Easy cluster configuration and reconfiguration AWS • Elasticsearch Version • Data nodes, count and type • Master nodes, count and type • Storage option – EBS/instance • HA option • Advanced options
  • 42. High availability with Zone Awareness • Diagram: Amazon ES cluster with copies of shards 1, 2, and 3 spread over Instance 1, Instance 2, Instance 3, and Instance 4 across Availability Zone 1 and Availability Zone 2
  • 43. Monitor with CloudWatch metrics • FreeStorageSpace – monitor and alarm before the cluster runs out of space • CPUUtilization – alarm at 80% CPU to signal the need to scale up • ClusterStatus.yellow – check whether replication requires additional nodes • JVMMemoryPressure – check instance type and count for sufficient resources • MasterCPUUtilization – monitoring for master nodes is separated from data nodes
  • 44. Security with IAM { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*" } ] }
  • 45. Pay for compute and storage you use • With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. AWS Free Tier for qualifying customers.
  • 46. Wrap up • Combined with Kibana, Elasticsearch provides search and visualization for streaming data and full-text use cases. • Elasticsearch is based on Lucene, which reads and writes search indices • Aggregations allow you to analyze your data, splitting into Buckets and computing Metrics • Amazon Elasticsearch Service makes it easy to set up and manage your Elasticsearch cluster on AWS • Amazon ES is a great way to get started with Elasticsearch!
  • 47. Q&A • Jon Handler: handler@amazon.com • Vivek Sriram: Business Development Manager: vsriram@amazon.com • https://run.qwiklab.com/searches/elasticsearch