
Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018


Real Time Data Ingestion and Analysis

In this session you will learn how to perform real time analytics on streaming data using Amazon Kinesis Streams and how to run prediction algorithms on it. Learn how to stream your AWS CloudTrail logs to Amazon Kinesis Streams and identify anomalies using Spark Streaming and Amazon Kinesis Data Analytics.

Ganesh Raja, Big Data Solutions Architect, Amazon Web Services


  1. Real Time Data Ingestion And Analysis
      • Ganesh Raja, Solutions Architect, Data & Analytics, Amazon Web Services
      • © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. Streams Are Everywhere
      • Most data is continuously produced as a stream
      • Processing data as it arrives is becoming very popular
      • Many diverse applications and use cases
      • Streaming Ingest-Transform-Load: deliver data to analytics tools faster and cheaper
      • Continuous Metric Generation: compute analytics as the data is generated
      • Actionable Insights: react to analytics based off of insights
  3. It’s All About The Pace
      • Batch Processing: hourly server logs; weekly or monthly bills; daily web-site clickstream; daily fraud reports
      • Stream Processing: real time metrics; real time spending alerts/caps; real time clickstream analysis; real time detection
  4. The Diminishing Value Of Data
      • Recent data is highly valuable, if you act on it in time: Perishable Insights (M. Gualtieri, Forrester)
      • Old + recent data is even more valuable, if you have the means to combine them
  5. Simple Pattern For Streaming Data
      • Data Producer (for example, mobile clients): continuously creates data; continuously writes data to a stream; can be almost anything
      • Streaming Service (Amazon Kinesis): durably stores data; provides a temporary buffer that prepares data; supports very high throughput
      • Data Consumer (Amazon Kinesis app): continuously processes data; cleans, prepares, and aggregates data; transforms data into information
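The producer, streaming service, and consumer roles on this slide can be sketched in a few lines of Python. This is a minimal illustration only: a standard-library `queue.Queue` stands in for the streaming service (it is in-memory and not durable, unlike Amazon Kinesis), and the `user` and `clicks` fields are hypothetical.

```python
import json
import queue


def produce(stream, events):
    """Data producer: serialise each event and write it to the stream."""
    for event in events:
        stream.put(json.dumps(event).encode("utf-8"))


def consume(stream):
    """Data consumer: drain the buffer, parse each record, aggregate per user."""
    totals = {}
    while True:
        try:
            raw = stream.get_nowait()
        except queue.Empty:
            break  # nothing left in the buffer
        record = json.loads(raw)
        totals[record["user"]] = totals.get(record["user"], 0) + record["clicks"]
    return totals


# In-memory stand-in for the streaming service (assumption, not Kinesis).
stream = queue.Queue()
produce(stream, [
    {"user": "alice", "clicks": 2},
    {"user": "bob", "clicks": 1},
    {"user": "alice", "clicks": 3},
])
totals = consume(stream)
print(totals)
```

In a real deployment the producer and consumer would run continuously and independently; here they run once, back to back, to keep the sketch self-contained.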
  6. Amazon Kinesis
      • Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data
      • Amazon Kinesis Data Firehose: easily load streaming data into AWS
  7. Amazon Kinesis Data Streams
      • Easy administration and low cost
      • Build real time applications with a framework of choice
      • Secure and durable storage
  8. Amazon Kinesis Data Firehose
      • Zero administration and seamless elasticity
      • Direct-to-data store integration
      • Serverless and continuous data transformations
  9. Anomaly Detection on AWS CloudTrail Logs
  10. Anomaly Detection
  11. Analysing CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics
      • Deliver to a real time dashboard and archive
  12. Ingest And Deliver AWS CloudTrail Events
      • AWS CloudTrail provides continuous account activity logging
      • Events are sent in near real time to Amazon Kinesis Data Firehose and Amazon Kinesis Data Streams
      • Each event includes a timestamp, the AWS IAM user or AWS service name, the API call, the response, and more
      • Diagram: Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
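To make the event contents concrete, here is a minimal sketch that pulls those fields out of one CloudTrail-style record. The field names (`eventTime`, `eventSource`, `eventName`, `sourceIPAddress`, `userIdentity`, `errorCode`) follow the CloudTrail record format as I understand it; the sample values and the `summarise` helper are invented for illustration.

```python
import json

# Hypothetical sample record; values are invented for illustration.
SAMPLE_EVENT = json.dumps({
    "eventTime": "2018-04-11T02:15:30Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "GetObject",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {"type": "IAMUser", "userName": "example-user"},
    "errorCode": "AccessDenied",
})


def summarise(raw_event):
    """Extract the fields the slide mentions from one JSON-encoded event."""
    event = json.loads(raw_event)
    identity = event.get("userIdentity", {})
    return {
        "timestamp": event.get("eventTime"),
        # IAM user name if present, else the invoking AWS service, else the identity type.
        "caller": identity.get("userName") or identity.get("invokedBy") or identity.get("type"),
        "service": event.get("eventSource"),
        "api_call": event.get("eventName"),
        "source_ip": event.get("sourceIPAddress"),
        # Treat the presence of an errorCode field as a failed call.
        "failed": "errorCode" in event,
    }


summary = summarise(SAMPLE_EVENT)
print(summary)
```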
  13. Stream Data To Amazon Kinesis
      • Categories shown: automatic ingestion; easy setup; write your own
      • Sources and tools listed: Amazon VPC Flow Logs, Elastic Load Balancing, Amazon RDS, Amazon CloudWatch Logs, AWS CloudTrail Event Logs, Amazon Pinpoint, Amazon API Gateway, AWS IoT events, AWS SDKs, Amazon DynamoDB, Amazon Kinesis Agent, Amazon Kinesis Producer Library
      • Labels on the slide: "As a proxy" and "For change data capture"
      • Just a sample: there are many more ways to stream data to Amazon Kinesis
  14. Analysing CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon EMR Data Analytics
      • Deliver to a real time dashboard and archive
  15. Compute Operational Metrics In Real Time
      • Compute metrics using SQL in real time, such as:
      • Total calls by IP, service, API call, AWS IAM user
      • Amazon S3 API failures (or any other service)
      • Anomalous behavior of Amazon S3 API (or any other service)
      • Top 10 API calls across all services
      • Diagram: raw data, Amazon EMR Data Analytics, real time analytics
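The slide computes these metrics with SQL; the sketch below uses plain Python as a stand-in to show what three of them mean on a small batch of events. The event dictionaries, their keys, and the helper names are hypothetical and are not taken from the session's demo.

```python
from collections import Counter

# Hypothetical, already-parsed events (keys are assumptions for this sketch).
events = [
    {"source_ip": "203.0.113.10", "service": "s3.amazonaws.com", "api_call": "GetObject", "failed": False},
    {"source_ip": "203.0.113.10", "service": "s3.amazonaws.com", "api_call": "GetObject", "failed": True},
    {"source_ip": "198.51.100.7", "service": "ec2.amazonaws.com", "api_call": "DescribeInstances", "failed": False},
    {"source_ip": "203.0.113.10", "service": "s3.amazonaws.com", "api_call": "PutObject", "failed": True},
]


def total_calls_by(events, field):
    """Total calls grouped by one field (IP, service, API call, ...)."""
    return Counter(event[field] for event in events)


def failures_for(events, service):
    """Number of failed API calls for a single service."""
    return sum(1 for event in events if event["service"] == service and event["failed"])


def top_api_calls(events, n=10):
    """The n most frequent API calls across all services."""
    return Counter(event["api_call"] for event in events).most_common(n)


calls_by_ip = total_calls_by(events, "source_ip")
s3_failures = failures_for(events, "s3.amazonaws.com")
top_call = top_api_calls(events, 1)
print(calls_by_ip, s3_failures, top_call)
```

In a streaming setting the same aggregations would be evaluated per time window rather than over a fixed list.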
  16. How Do We Aggregate Streaming Data?
      • A common requirement in streaming analytics is to perform set-based operations (count, average, max, min, ...) over events that arrive within a specified period of time
      • You cannot simply aggregate over an entire table, as you would with a typical static database
  17. Windowing Concepts
      • Windows can be tumbling or sliding
      • Windows are of fixed length
      • Figure: event values on a time axis (t1 to t6) grouped into Window1, Window2, and Window3; the aggregate function (Sum) emits output events 18 and 14
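A tumbling window can be sketched as follows. The values are chosen so that the first two windows sum to 18 and 14, echoing the output events on the slide, but the timestamps, the 10-second window length, and the grouping are assumptions made for this example, not a reconstruction of the figure.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_seconds):
    """Sum event values per fixed-length, non-overlapping window.

    events: iterable of (timestamp_in_seconds, value) pairs.
    Returns a dict mapping each window's start time to the sum of its values.
    """
    sums = defaultdict(int)
    for timestamp, value in events:
        # Every timestamp belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    return dict(sorted(sums.items()))


# Hypothetical (timestamp, value) pairs.
events = [(1, 1), (3, 5), (5, 4), (7, 2), (9, 6), (12, 8), (15, 6), (21, 4)]
result = tumbling_window_sums(events, 10)
print(result)  # {0: 18, 10: 14, 20: 4}
```

This version processes a finished list; a streaming engine would instead emit each window's result once that window closes. A sliding window differs in that windows overlap, so one event can contribute to several results.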
  18. Analysing CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon EMR Data Analytics
  19. Persist Data For Real Time Dashboards
      • Use Amazon Kinesis Data Firehose to archive processed data to Amazon S3
      • Use AWS Lambda to deliver data to Amazon DynamoDB (or another database)
      • Use open source or other tools to visualise the data
      • Diagram: real time analytics, Amazon Kinesis Data Stream, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Redash Dashboard
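The Lambda step might look roughly like the sketch below. It relies on the shape of the event that Lambda receives from a Kinesis stream, where each record carries base64-encoded data under `Records[].kinesis.data`. The metric payload fields (`name`, `window_start`, `value`) and the injected `put_item` callable are assumptions; in a real function `put_item` could wrap a DynamoDB write, for example a boto3 table's `put_item(Item=...)`.

```python
import base64
import json


def records_to_items(event):
    """Decode Kinesis records from a Lambda event into database-ready items."""
    items = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        metric = json.loads(payload)  # hypothetical payload: name, window_start, value
        items.append({
            "metric_name": metric["name"],
            "window_start": metric["window_start"],
            "value": metric["value"],
        })
    return items


def make_handler(put_item):
    """Build a Lambda-style handler that writes each item with put_item."""
    def handler(event, context):
        items = records_to_items(event)
        for item in items:
            put_item(item)
        return {"written": len(items)}
    return handler


# Local dry run with a hand-built event and a list standing in for the table.
sample_payload = json.dumps({"name": "s3_failures", "window_start": "2018-04-11T02:15:00Z", "value": 42})
sample_event = {"Records": [{"kinesis": {"data": base64.b64encode(sample_payload.encode("utf-8")).decode("ascii")}}]}
stored = []
handler = make_handler(stored.append)
outcome = handler(sample_event, None)
print(outcome, stored)
```

Injecting the writer keeps the decoding logic testable without AWS credentials; error handling and batching are left out for brevity.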
  20. Analysing CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon EMR Data Analytics
      • Deliver to a real time dashboard and archive: Amazon Kinesis Data Streams, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Chart.JS Dashboard
  21. DEMO: Analyse AWS CloudTrail Logs using Amazon EMR
  22. Analysing CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon EMR Data Analytics
      • Deliver to a real time dashboard and archive: Amazon Kinesis Data Streams, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Chart.JS Dashboard
  23. Invent And Simplify
  24. Amazon Kinesis
      • Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data
      • Amazon Kinesis Data Firehose: easily load streaming data into AWS
      • Amazon Kinesis Data Analytics: easily process and analyse streaming data with standard SQL
  25. Amazon Kinesis Data Analytics
      • Powerful real time applications
      • Easy to use, fully managed
      • Automatic elasticity
  26. Amazon Kinesis Data Analytics Applications
      • Connect to a streaming source
      • Easily write SQL code to process streaming data
      • Continuously deliver SQL results
  27. Analysing AWS CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon EMR Data Analytics
      • Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Chart.JS Dashboard
  28. Analysing AWS CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon Kinesis Data Analytics
      • Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Redash Dashboard
  29. DEMO: Amazon Kinesis Data Analytics
  30. Analysing AWS CloudTrail Event Logs (architecture diagram)
      • Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon S3 bucket for raw data, Amazon Kinesis Data Streams
      • Compute operational metrics: Amazon Kinesis Data Analytics
      • Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon S3 bucket for processed data, Amazon DynamoDB Table(s), Chart.JS Dashboard
  31. Try It Out Yourself
      • Go to aws.amazon.com/kinesis/
      • Some good examples:
      • A click-through template for AWS CloudTrail Event Log Analytics: https://tinyurl.com/RTInsights
      • A click-through template for Real-Time Web Analytics with Kinesis Data Analytics: https://tinyurl.com/RTWebAnalytics
      • Blog posts on Kinesis: https://tinyurl.com/KinesisBlogs
  32. Thank You
