Amazon Elasticsearch Service (Amazon ES) gives customers many options for log analytics. From small environments with a single application to large environments where multiple teams log five terabytes or more per day with retention periods that span months, Amazon ES provides a toolkit that gives organizations a holistic view of their application logs. In this session, we discuss effective patterns used by organizations across the AWS ecosystem and give you the foundational knowledge and deployment architectures to accelerate your goal of building a cost-effective logging solution.
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS Summit
1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Searching for patterns:
Log analytics using Amazon ES
Kevin Fallis
Senior Specialist Solutions Architect
AWS - Search Services
ADB205
2.
What is Elasticsearch?
• Sometimes referred to as the "ELK Stack": Elasticsearch, Logstash, & Kibana
• Distributed search and analytics engine built on Apache Lucene
• Easy ingestion and visualization
Source: TechCrunch survey of popular open source software from April '17
3.
Machine data driving Elasticsearch growth
Machine-generated data is growing 10x faster than business data… Logs, logs, and more logs
IT & DevOps: Databases, servers, storage, networking
Increase in IoT and mobile devices: Gaming, sensors, web content
Cloud-based architectures
Source: insideBigData, "The Exponential Growth of Data," February 16, 2017
4.
Popular use cases
Application log monitoring
Security event information monitoring
Data visualization
Full text search
5.
Amazon Elasticsearch Service
Amazon Elasticsearch Service (Amazon ES) is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud
6.
Benefits of Amazon ES
Supports open-source APIs and tools: Drop-in replacement with no need to learn new APIs or skills
Easy to use: Deploy a production-ready Elasticsearch cluster in minutes
Scalable: Resize your cluster with a few clicks or a single API call
Secure: Deploy into your VPC and restrict access using security groups and IAM policies
Highly available: Replicate across Availability Zones, with monitoring and automated self-healing
Tightly integrated: Seamless data ingestion, security, auditing, and orchestration
7.
Elasticsearch runs on a cluster of instances
[Diagram: an Amazon ES domain in the AWS Cloud, with data nodes and master nodes in a VPC and Elastic Load Balancing (ELB). Management and access: AWS Management Console, AWS Command Line Interface, AWS Tools and SDKs, AWS CloudFormation, AWS Identity and Access Management (IAM). Related services: AWS CloudTrail, Amazon CloudWatch, AWS Database Migration Service, Amazon Kinesis Data Firehose, Amazon CloudWatch Logs.]
8.
Provides Kibana real-time visualization tool
9.
Build actionable insights
Amazon ES empowers you with the data to understand and intelligently react to your business needs
Security information and event management (SIEM)
IoT & mobile
Application monitoring & root-cause analysis
Business and web analytics
• End-to-end visibility: Better understanding of customers' behavior to improve user experience and react to demand
• Improve reliability: Increased operational efficiencies by identifying, solving, and preventing system failures in real time
• Faster time-to-value: Accelerate time to market with application delivery and performance monitoring
• Security: Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
10.
Case study: Autodesk
Central Log Management System
https://www.youtube.com/watch?v=fSjAfp-uqSs
CHALLENGE
Highly distributed organization. No consistent way to collect and measure metrics.
Small ops team.
Must integrate easily with other AWS services.
Scale: Accommodate current and future requirements.
Must be cost effective with no data lock-in.
TBs of log data to sift through to find and fix issues that impact customers.
SOLUTION
Unified log data management solution built on AWS. Single interface for log analytics across applications. Annotate log records to enable distributed tracing states.
Streaming application logs via Kinesis Data Firehose to Amazon S3, Amazon Athena, and Amazon ES.
10 i3.4xlarge Amazon ES data nodes - 33 TB. Will grow to 110 TB.
Kibana, built-in within Amazon ES, for near real-time analytics and dashboards
BENEFITS
All managed services: "Manage less to gain more." Focus on developing awesome products.
Common vocabulary for diagnosing and solving problems. Eliminated silos.
Scalable and cost-effective - i3s delivering great value per TB.
Improving customer experience by reducing the time to find and fix customer issues.
12.
How it works
[Diagram: application data and server, application, network, AWS, and other logs flow into an Amazon ES domain with an index, which serves application users, analysts, DevOps, and security.]
1. Send data as JSON via REST APIs
2. Data is indexed: All fields searchable, including nested JSON
3. Queries, via REST APIs, allow fielded matching, Boolean expressions, sorting, and analysis
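The three steps above can be sketched as request URLs and JSON bodies. This is a minimal illustration, not the presenter's code: the endpoint, index name, and field names are hypothetical, and the `_doc` path assumes an Elasticsearch 6.x or 7.x domain. Nothing is sent over the network here.

```python
import json

# Hypothetical endpoint and index name, used only for illustration.
ENDPOINT = "https://my-domain.example.com"
INDEX = "logs_2019.07.01"

# Step 1: a log event serialized as JSON, including a nested object.
event = {
    "@timestamp": "2019-07-01T12:00:00Z",
    "status": 500,
    "request": {"method": "GET", "path": "/checkout"},
}
index_url = f"{ENDPOINT}/{INDEX}/_doc"   # target of an HTTP POST
index_body = json.dumps(event)

# Step 3: a search that combines fielded matching, a Boolean
# expression, and sorting.
search_url = f"{ENDPOINT}/{INDEX}/_search"
search_body = json.dumps({
    "query": {
        "bool": {
            "must": [{"match": {"request.path": "/checkout"}}],
            "filter": [{"range": {"status": {"gte": 500}}}],
        }
    },
    "sort": [{"@timestamp": {"order": "desc"}}],
})

print(index_url)
print(search_url)
```

Step 2 needs no client code: once the document is accepted, its fields (including `request.path`) become searchable.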
13.
You use the query APIs to retrieve data from Elasticsearch
[Diagram: within the Amazon ES domain, the query engine produces matches, which pass through scoring & sorting to yield ranked results.]
14.
The query engine matches requested field values
[Diagram: a query for Field1:value1 and Field2:value2 runs against the logs_11.28.2018 index. Each field has its own index (F1 index, F2 index) of values V1, V2, ... Vn that lead to documents, each made up of an ID and Field: value pairs.]
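A toy model of the per-field lookup pictured above: each field gets its own index from value to document IDs, and a multi-field query intersects the ID sets. This is a simplified sketch of the inverted-index idea with made-up documents; it is not how Lucene is implemented.

```python
from collections import defaultdict

# Made-up documents keyed by ID.
docs = {
    1: {"status": "200", "method": "GET"},
    2: {"status": "500", "method": "GET"},
    3: {"status": "500", "method": "POST"},
}

def build_field_indexes(docs):
    """Build one value -> {doc IDs} index per field."""
    indexes = defaultdict(lambda: defaultdict(set))
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            indexes[field][value].add(doc_id)
    return indexes

def match(indexes, **criteria):
    """Return IDs of documents matching every field:value pair."""
    id_sets = [indexes[field][value] for field, value in criteria.items()]
    return set.intersection(*id_sets) if id_sets else set()

indexes = build_field_indexes(docs)
print(sorted(match(indexes, status="500", method="GET")))  # [2]
```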
15.
You use aggregations to analyze log data
[Diagram: within the Amazon ES domain, matches from the query engine feed the analysis engine (aggregations).]
• Histogram
• Numeric: sum, min., max.
• Terms: bucketing
• Nesting
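As an illustration of the aggregation types listed above, here is one possible search body that buckets matches by a terms aggregation and nests a numeric sum and a histogram inside each bucket. The field names (`method`, `status`, `bytes`, `latency_ms`) are hypothetical.

```python
import json

search_body = {
    "size": 0,  # return only aggregation results, no individual hits
    "query": {"match": {"method": "GET"}},
    "aggs": {
        # Terms: one bucket per distinct status value.
        "by_status": {
            "terms": {"field": "status"},
            # Nesting: sub-aggregations computed inside each bucket.
            "aggs": {
                "bytes_sent": {"sum": {"field": "bytes"}},
                "latency_buckets": {
                    "histogram": {"field": "latency_ms", "interval": 100}
                },
            },
        }
    },
}

print(json.dumps(search_body, indent=2))
```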
16.
Data is stored in an index comprised of shards
[Diagram: all documents (each an ID plus Field: value pairs) are divided across the five shards of an index, each holding 1/5 of the documents.]
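A rough sketch of how documents can be spread across five shards: hash a routing value (by default the document ID) and take it modulo the number of primary shards. Elasticsearch uses its own hash function; `zlib.crc32` is only a stand-in here, and the document IDs are made up.

```python
import zlib
from collections import Counter

NUM_PRIMARY_SHARDS = 5

def shard_for(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Pick a shard as hash(routing value) modulo the shard count."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

# Route 10,000 made-up document IDs and count how many land on each shard.
counts = Counter(shard_for(f"doc-{i}") for i in range(10_000))
for shard in sorted(counts):
    print(shard, counts[shard])
```

Because the shard is derived from the ID, the same document always lands on the same shard, which is also why the primary shard count is fixed when the index is created.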
17.
Shards are primary or replica
[Diagram: an index holding documents (ID plus Field: value pairs), with a row of primary shards and a row of replica shards.]
18.
Elasticsearch distributes shards to data nodes
[Diagram: shards spread across data nodes, which serve queries and updates.]
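A simplified model of shard distribution: primaries are placed round-robin and each replica goes to a different node than its primary, so losing one node never loses every copy of a shard. This is an illustrative placement rule, not Elasticsearch's actual allocation algorithm; `place_shards` is a hypothetical helper.

```python
def place_shards(num_primaries, num_nodes, replicas=1):
    """Assign each shard copy to a node, keeping copies on distinct nodes."""
    if num_nodes <= replicas:
        raise ValueError("need more nodes than replicas to keep copies apart")
    placement = {node: [] for node in range(num_nodes)}
    for shard in range(num_primaries):
        for copy in range(replicas + 1):
            node = (shard + copy) % num_nodes  # shift each copy by one node
            kind = "primary" if copy == 0 else "replica"
            placement[node].append((shard, kind))
    return placement

layout = place_shards(num_primaries=5, num_nodes=3)
for node, shards in layout.items():
    print(node, shards)
```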
20.
Overview of delivering logs to Amazon ES
Collect → Buffer → Aggregate → Store
21.
Log collectors: Popular options
22.
Log collectors: Properties
• Typically read files on a file system
• But can receive events with data from things other than file systems
• Configuration driven
• Can be "lightweight" or "heavyweight"
• Lightweight: Consumes as few system resources as possible
• Written in C, Ruby, or another efficient language
• Agent based: Runs as a service on the OS
• Config-driven
• Heavyweight: Requires a JVM or other execution engine
• Purpose built or leverage "plugins" via configuration
• Can perform data transformation
23.
Log buffers: Popular options
24.
Log buffers: Properties
• Allow you to decouple producers from consumers
• Control the ingest pipeline
• Metered consumption of data from consumer fleets
• Have "data durability"
• Individual events can have a lifecycle outside of Elasticsearch when dealing with sliding windows
• Can allow you to replay events
• Give you options to involve other business functions
• Machine learning
• Big data and analytics
• Data science
• Promote "Lambda" architectures (batch + near-real time)
25.
Log aggregators: Popular options
26.
Log aggregators: Properties
• Aggregate events into one payload for Amazon ES
• Give you control of the ingest activity
• Allow you to "throttle" the volume of requests to Elasticsearch because:
• Data nodes have limited space in processing queues
• You need to balance query activity with ingest activity
• Use the _bulk API to push JSON-formatted, grouped events to Elasticsearch
• Can be "lightweight" or "heavyweight," just like forwarders
• Can act as interim buffers
• Use AWS Auto Scaling to throttle Amazon EC2 or container fleets
• Lambda should leverage the "concurrency" setting to throttle indexing
• In some cases, can "fan out" to multiple destinations other than Elasticsearch for additional business value
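A minimal sketch of what an aggregator does before calling the _bulk API: group events into fixed-size batches and serialize each batch as newline-delimited JSON, with one action line followed by one document line per event. The helper names and batch size are hypothetical. The action line shown assumes an Elasticsearch 7.x domain; on 6.x the action line or URL also carries a type.

```python
import json

def build_bulk_body(index, events):
    """Serialize events as an NDJSON _bulk payload (action line + document)."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The _bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"

def batches(events, size):
    """Yield successive fixed-size groups so each request stays bounded."""
    for start in range(0, len(events), size):
        yield events[start:start + size]

events = [{"msg": f"event {i}"} for i in range(5)]
for batch in batches(events, size=2):
    body = build_bulk_body("logs_2019.07.01", batch)
    print(len(batch), "events,", len(body), "bytes")
```

Each body would be sent as one POST to the domain's `_bulk` endpoint with the `application/x-ndjson` content type; capping the batch size is one simple way to throttle ingest.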
28.
Patterns help you build solutions quickly
• Asserted
• Others have done this
• Extensible
• Prescriptive
• Repeatable
• Verifiable
• Natively on AWS if using AWS CloudFormation and AWS Config
29.
VPC option presents challenges to architectures
• Elastic Network Interfaces (ENIs) are presented to consumers of the Amazon ES domain
• This means all traffic to your domain is private and must be accessed from within the VPC
• ENIs cannot be presented to external services without a proxy, AWS PrivateLink, or VPC peering
• DNS resolution of the endpoint is private
• You cannot present one Amazon ES domain to more than one VPC
• Kibana access via Amazon Cognito must be brokered with a proxy
• NGINX
• Apache
• Amazon Kinesis Data Firehose will eventually support VPC endpoints
30.
Amazon S3 event notifications approach
31.
Amazon Kinesis approach
33.
Data retention is directly proportional to cost
• Do you really need to log it?
• Remove irrelevant fields
• For example, are you really using that user-agent field in your access logs?
• Transform string values into integers
• For example, VPC Flow Logs contain fields called "action" and "status." You could transform these character fields to enumerations
• Do your customers need longer retention periods?
• Most data is actionable in a "hot" time period
• Consider shorter retention periods unless the business case dictates otherwise
• Use a "forensic cluster" that is populated from manual snapshots as needed
• Audits
• Historical trend analysis
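The field-trimming and string-to-integer ideas above could look like this in an ingest step. The numeric codes and the `user_agent` and `status` field names are assumptions chosen for illustration; the string values are the ones VPC Flow Logs emit for the action and log status.

```python
# Hypothetical enumerations; any stable mapping would do.
ACTION_CODES = {"ACCEPT": 0, "REJECT": 1}
STATUS_CODES = {"OK": 0, "NODATA": 1, "SKIPDATA": 2}

# Fields assumed to be unused by any dashboard or query.
DROP_FIELDS = {"user_agent"}

def slim(record):
    """Drop unused fields and replace known strings with small integers."""
    out = {k: v for k, v in record.items() if k not in DROP_FIELDS}
    if "action" in out:
        out["action"] = ACTION_CODES[out["action"]]
    if "status" in out:
        out["status"] = STATUS_CODES[out["status"]]
    return out

record = {
    "srcaddr": "10.0.0.1",
    "action": "REJECT",
    "status": "OK",
    "user_agent": "example-agent/1.0",
}
print(slim(record))
```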
34.
Pattern: Time-based indexes for log analytics
• You use a root string, e.g., logs_.
• Depending on volume, rotate at regular intervals, normally daily.
• Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
• Use aliases to query aggregate indices.
logs_2019.07.01
logs_2019.07.02
logs_2019.07.03
logs_2019.07.04
logs_2019.07.05
logs_2019.07.06
logs_2019.07.07
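A small sketch of the daily naming scheme and of picking which indexes have aged out. The seven-day retention and the helper names are assumptions; the `logs_` root and date format follow the index names listed above.

```python
from datetime import date, timedelta

PREFIX = "logs_"

def index_name(day):
    """Name of the daily index for a given date, e.g. logs_2019.07.01."""
    return f"{PREFIX}{day:%Y.%m.%d}"

def expired_indexes(existing, today, retention_days):
    """Return indexes older than the retention window, oldest first.

    Zero-padded dates sort lexicographically in chronological order,
    so a plain string comparison against the cutoff name is enough.
    """
    cutoff = index_name(today - timedelta(days=retention_days - 1))
    return sorted(
        name for name in existing
        if name.startswith(PREFIX) and name < cutoff
    )

existing = [index_name(date(2019, 7, d)) for d in range(1, 9)]
print(expired_indexes(existing, today=date(2019, 7, 8), retention_days=7))
# ['logs_2019.07.01']
```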
35.
Going deeper on index management: aliases
Aliases enable you to query multiple indices using a reference name
• Begin by creating a new index that fits the defined pattern, using settings
• Adjust the alias to include the new index name, for example "logs_2019.07.08"
• Remove the oldest index from the alias, for example "logs_2019.07.01"
• Take a manual snapshot of the oldest index
• Drop the oldest index
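The alias adjustment in the steps above can be expressed as a single request body for the `_aliases` API, which applies its add and remove actions together. The alias name `logs_current` and the index names are illustrative, and `rotate_alias_actions` is a hypothetical helper.

```python
import json

def rotate_alias_actions(alias, new_index, oldest_index):
    """Body for POST /_aliases: add the new index, remove the oldest."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": oldest_index, "alias": alias}},
        ]
    }

body = rotate_alias_actions("logs_current", "logs_2019.07.08", "logs_2019.07.01")
print(json.dumps(body, indent=2))
```

Queries keep targeting the alias name throughout, so readers never need to know which daily indexes currently sit behind it.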
36.
Forensic cluster pattern
• Amazon CloudWatch Events trigger Lambda, which invokes Curator to manage indices
• Create a schedule in Amazon CloudWatch for the event
• Create a snapshot repository
• AWS Lambda creates a metadata record in Amazon DynamoDB for the snapshot with a state of "starting"
• Lambda calls Curator to manage the indexes via API
• Snapshot is kicked off asynchronously
• Lambda updates the metadata record to a state of "started"
• Another scheduled event checks the snapshot using the _snapshot API to query the status. When it reports a "SUCCESS" state, you can mark the snapshot "complete"
• Code for error scenarios
• Create a new cluster
• Restore snapshots based on metadata records in Amazon DynamoDB
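The metadata bookkeeping in this pattern can be sketched as a small state machine. This is an illustration under stated assumptions: a plain dictionary stands in for the Amazon DynamoDB table, the function names are hypothetical, the "error" state is an addition for the error-scenario bullet, and the status response is modeled on the shape the snapshot status call returns (a `snapshots` list with `snapshot` and `state` fields).

```python
# In-memory stand-in for the Amazon DynamoDB metadata table (assumption).
metadata = {}

def start_snapshot(snapshot_name):
    """Record the snapshot as 'starting', then 'started' once requested."""
    metadata[snapshot_name] = {"snapshot": snapshot_name, "state": "starting"}
    # The asynchronous snapshot request would be issued here.
    metadata[snapshot_name]["state"] = "started"
    return metadata[snapshot_name]

def check_snapshot(snapshot_name, api_response):
    """Update the record from a snapshot status response."""
    record = metadata[snapshot_name]
    states = {s["snapshot"]: s["state"] for s in api_response.get("snapshots", [])}
    state = states.get(snapshot_name)
    if state == "SUCCESS":
        record["state"] = "complete"
    elif state != "IN_PROGRESS":
        # FAILED, PARTIAL, or a missing snapshot all need attention.
        record["state"] = "error"
    return record["state"]

start_snapshot("logs_2019.07.01")
running = {"snapshots": [{"snapshot": "logs_2019.07.01", "state": "IN_PROGRESS"}]}
done = {"snapshots": [{"snapshot": "logs_2019.07.01", "state": "SUCCESS"}]}
print(check_snapshot("logs_2019.07.01", running))  # started
print(check_snapshot("logs_2019.07.01", done))     # complete
```

Records marked "complete" are the ones a later restore into a new forensic cluster would read.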
37.
Wrap up
• Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
• Logs contain valuable insights: what your users are doing, whether you have bad actors, and what's happening on your devices
• Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
38. Thank you!
Kevin Fallis
kffallis@amazon.com