SlideShare una empresa de Scribd logo
1 de 32
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Ray Zhu, Sr. Product Manager, Amazon Kinesis
9/1/2016
Streaming Data Analytics
with Amazon Kinesis
Firehose and Redshift
Agenda
• Kinesis Firehose and Redshift
• Stream Data to Redshift
o Step 1 Set Up Redshift DB and Table
o Step 2 Create Firehose Delivery Stream
o Step 3 Send Data to Firehose Delivery Stream
o Step 4 Query and Analyze the Data from Redshift
o Step 5 Monitor Streaming Data Pipeline
Load streaming data into Amazon S3,
Amazon Redshift, and Amazon
Elasticsearch Service
Kinesis Firehose
Petabyte-scale data warehouse
Amazon Redshift
Stream Data to Redshift
Zero administration: Capture and deliver streaming data into Amazon S3, Redshift, and
Elasticsearch Service without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for
delivery into data destinations in as little as 60 secs using simple configurations.
Seamless elasticity: Seamlessly scale to match data throughput without intervention.
Capture and submit
streaming data to Firehose
Firehose loads streaming data
continuously into Amazon S3,
Redshift, or Elasticsearch Service
Analyze streaming data using
your favorite analytical tools
Data Flow Overview
Step 1 Set Up Redshift DB and
Table
Cluster Details
Node Configuration
Additional Configuration
Review
Configure VPC Security Group
Configure VPC Security Group
US East (N. Virginia)
52.70.63.192/27
US West (Oregon)
52.89.255.224/27
EU (Ireland)
52.19.239.192/27
Connect to Redshift DB
Create a Redshift Table
create table TrafficViolation(
dateofstop date,
timeofstop timestamp,
agency varchar(100),
subagency varchar(100),
description varchar(300),
location varchar(100),
latitude varchar(100),
longtitude varchar(100),
accident varchar(100),
belts varchar(100),
personalinjury varchar(100),
propertydamage varchar(100),
fatal varchar(100),
commlicense varchar(100),
hazmat varchar(100),
commvehicle varchar(100),
alcohol varchar(100),
workzone varchar(100),
state varchar(100),
veichletype varchar(100),
year varchar(100),
make varchar(100),
model varchar(100),
color varchar(100),
violation varchar(100),
type varchar(100),
charge varchar(100),
article varchar(100),
contributed varchar(100),
race varchar(100),
gender varchar(100),
drivercity varchar(100),
driverstate varchar(100),
dlstate varchar(100),
arresttype varchar(100),
geolocation varchar(100));
Create a Redshift Table
Step 2 Set Up Firehose
Delivery Stream
Destination
Configuration
Review
Step 3 Send Data to Firehose
Delivery Stream
Sample Data
09/30/2014,23:51:00,MCP,"1st district, Rockville",DRIVER FAILURE TO STOP AT STEADY CIRCULAR RED
SIGNAL,PARK RD AT HUNGERFORD DR,,,No,No,No,No,No,No,No,No,No,No,MD,02 -
Automobile,2014,FORD,MUSTANG,BLACK,Citation,21-202(h1),Transportation
Article,No,BLACK,M,ROCKVILLE,MD,MD,A - Marked Patrol,
03/31/2015,23:59:00,MCP,"2nd district, Bethesda",HEADLIGHTS (*),CONNECTICUT AT METROPOLITAN
AVE,,,No,No,No,No,No,No,No,No,No,No,MD,02 -
Automobile,2003,HONDA,2S,BLUE,ESERO,55*,,No,HISPANIC,M,SILVER SPRING,MD,MD,A - Marked Patrol,
09/30/2014,23:30:00,MCP,"5th district, Germantown",FAILURE TO DISPLAY TWO LIGHTED FRONT LAMPS
WHEN REQUIRED,OBSERVATION @ RIDGE ROAD,,,No,No,No,No,No,No,No,No,No,No,MD,02 -
Automobile,2009,TOYOTA,CAMRY,RED,Warning,22-226(a),Transportation
Article,No,BLACK,F,GERMANTOWN,MD,MD,A - Marked Patrol,
03/31/2015,23:59:00,MCP,"5th district, Germantown",DRIVER FAILURE TO STOP AT STOP SIGN LINE,W/B
PLYERS MILL RD AT METROPOLITAN AVE,39.0338233333333,-
77.07544,No,No,No,No,No,No,No,No,No,No,MD,02 - Automobile,2007,ACURA,MDX,BLACK,Warning,21-
707(a),Transportation Article,No,WHITE,F,KENSINGTON,MD,MD,A - Marked Patrol,"(39.0338233333333, -
77.07544)"
US Government Open Data: https://catalog.data.gov/dataset/traffic-violations-56dda
Send Data
PutRecordRequest putRecordRequest = new PutRecordRequest();
putRecordRequest.setDeliveryStreamName(deliveryStreamName);
String data = line + "n";
Record record = createRecord(data);
putRecordRequest.setRecord(record);
FirehoseClient.putRecord(putRecordRequest);
Step 4 Query and Analyze the
Data from Redshift
Query Data
Connect via QuickSight
Visualize Data
Step 5 Monitor Streaming Data
Pipeline
Monitor with CloudWatch Metrics
Monitor with CloudWatch Logs
Q & A
Thank you!

Más contenido relacionado

Destacado

AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
Amazon Web Services
 

Destacado (20)

AWS re:Invent 2016: Chalk Talk: Applying Security-by-Design to Drive Complian...
AWS re:Invent 2016: Chalk Talk: Applying Security-by-Design to Drive Complian...AWS re:Invent 2016: Chalk Talk: Applying Security-by-Design to Drive Complian...
AWS re:Invent 2016: Chalk Talk: Applying Security-by-Design to Drive Complian...
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
AWS Elemental Services for Video Processing and Delivery
AWS Elemental Services for Video Processing and DeliveryAWS Elemental Services for Video Processing and Delivery
AWS Elemental Services for Video Processing and Delivery
 
Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
 
Webinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning PlatformsWebinar - Comparative Analysis of Cloud based Machine Learning Platforms
Webinar - Comparative Analysis of Cloud based Machine Learning Platforms
 
Building prediction models with Amazon Redshift and Amazon Machine Learning -...
Building prediction models with Amazon Redshift and Amazon Machine Learning -...Building prediction models with Amazon Redshift and Amazon Machine Learning -...
Building prediction models with Amazon Redshift and Amazon Machine Learning -...
 
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)
 
Machine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel AvivMachine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel Aviv
 
ECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine LearningECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine Learning
 
Voxxed Days Thesaloniki 2016 - Machine Learning for Developers
Voxxed Days Thesaloniki 2016 - Machine Learning for DevelopersVoxxed Days Thesaloniki 2016 - Machine Learning for Developers
Voxxed Days Thesaloniki 2016 - Machine Learning for Developers
 
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
 
Die Bedeutung von Machine Learning für den e-Commerce am Beispiel von Amazon
Die Bedeutung von Machine Learning für den e-Commerce am Beispiel von AmazonDie Bedeutung von Machine Learning für den e-Commerce am Beispiel von Amazon
Die Bedeutung von Machine Learning für den e-Commerce am Beispiel von Amazon
 
AWS Business Essentials
AWS Business EssentialsAWS Business Essentials
AWS Business Essentials
 
Using Amazon Machine Learning to Identify Trends in IoT Data - Technical 201
Using Amazon Machine Learning to Identify Trends in IoT Data - Technical 201Using Amazon Machine Learning to Identify Trends in IoT Data - Technical 201
Using Amazon Machine Learning to Identify Trends in IoT Data - Technical 201
 
AWS SQS SNS
AWS SQS SNSAWS SQS SNS
AWS SQS SNS
 
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
 
Predictive Analytics: Getting started with Amazon Machine Learning
Predictive Analytics: Getting started with Amazon Machine LearningPredictive Analytics: Getting started with Amazon Machine Learning
Predictive Analytics: Getting started with Amazon Machine Learning
 
Intro to AWS Machine Learning
Intro to AWS Machine LearningIntro to AWS Machine Learning
Intro to AWS Machine Learning
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
Data Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAMLData Workflows for Machine Learning - Seattle DAML
Data Workflows for Machine Learning - Seattle DAML
 

Similar a Streaming Data Analytics with Kinesis Firehouse and Redshift

How we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
How we cooked Elasticsearch, Consul, HAproxy and DNS-recursorHow we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
How we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
Oleg Tokarev
 
Standard Provenance Reporting and Scientific Software Management in Virtual L...
Standard Provenance Reporting and Scientific Software Management in Virtual L...Standard Provenance Reporting and Scientific Software Management in Virtual L...
Standard Provenance Reporting and Scientific Software Management in Virtual L...
njcar
 
A walk in the clouds
A walk in the cloudsA walk in the clouds
A walk in the clouds
DaeMyung Kang
 

Similar a Streaming Data Analytics with Kinesis Firehouse and Redshift (20)

Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
How we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
How we cooked Elasticsearch, Consul, HAproxy and DNS-recursorHow we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
How we cooked Elasticsearch, Consul, HAproxy and DNS-recursor
 
(SOV204) Scaling Up to Your First 10 Million Users | AWS re:Invent 2014
(SOV204) Scaling Up to Your First 10 Million Users | AWS re:Invent 2014(SOV204) Scaling Up to Your First 10 Million Users | AWS re:Invent 2014
(SOV204) Scaling Up to Your First 10 Million Users | AWS re:Invent 2014
 
AWS Cloud Kata 2014 | Jakarta - 2-3 Big Data
 AWS Cloud Kata 2014 | Jakarta - 2-3 Big Data AWS Cloud Kata 2014 | Jakarta - 2-3 Big Data
AWS Cloud Kata 2014 | Jakarta - 2-3 Big Data
 
Standard Provenance Reporting and Scientific Software Management in Virtual L...
Standard Provenance Reporting and Scientific Software Management in Virtual L...Standard Provenance Reporting and Scientific Software Management in Virtual L...
Standard Provenance Reporting and Scientific Software Management in Virtual L...
 
Introduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf TrainingIntroduction to Riak - Red Dirt Ruby Conf Training
Introduction to Riak - Red Dirt Ruby Conf Training
 
Confluent-Ably-AWS-ID-2023 - GSlide.pptx
Confluent-Ably-AWS-ID-2023 - GSlide.pptxConfluent-Ably-AWS-ID-2023 - GSlide.pptx
Confluent-Ably-AWS-ID-2023 - GSlide.pptx
 
(ADV402) Beating the Speed of Light with Your Infrastructure in AWS | AWS re:...
(ADV402) Beating the Speed of Light with Your Infrastructure in AWS | AWS re:...(ADV402) Beating the Speed of Light with Your Infrastructure in AWS | AWS re:...
(ADV402) Beating the Speed of Light with Your Infrastructure in AWS | AWS re:...
 
A walk in the clouds
A walk in the cloudsA walk in the clouds
A walk in the clouds
 
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당::  AWS Summit Online Korea 2020
AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
 
Embrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with RippleEmbrace NoSQL and Eventual Consistency with Ripple
Embrace NoSQL and Eventual Consistency with Ripple
 
AWS Cloud Kata 2014 | Jakarta - 2-1 AWS Intro and Scale 2014
AWS Cloud Kata 2014 | Jakarta - 2-1 AWS Intro and Scale 2014AWS Cloud Kata 2014 | Jakarta - 2-1 AWS Intro and Scale 2014
AWS Cloud Kata 2014 | Jakarta - 2-1 AWS Intro and Scale 2014
 
What's new with Amazon Redshift - ADB203 - New York AWS Summit
What's new with Amazon Redshift - ADB203 - New York AWS SummitWhat's new with Amazon Redshift - ADB203 - New York AWS Summit
What's new with Amazon Redshift - ADB203 - New York AWS Summit
 
Analyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkAnalyzing Log Data With Apache Spark
Analyzing Log Data With Apache Spark
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR a...
AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR a...AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR a...
AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR a...
 
How Disney+ uses fast data ubiquity to improve the customer experience
 How Disney+ uses fast data ubiquity to improve the customer experience  How Disney+ uses fast data ubiquity to improve the customer experience
How Disney+ uses fast data ubiquity to improve the customer experience
 
(ARC317) Maintaining a Resilient Front Door at Massive Scale | AWS re:Invent ...
(ARC317) Maintaining a Resilient Front Door at Massive Scale | AWS re:Invent ...(ARC317) Maintaining a Resilient Front Door at Massive Scale | AWS re:Invent ...
(ARC317) Maintaining a Resilient Front Door at Massive Scale | AWS re:Invent ...
 

Más de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

Streaming Data Analytics with Kinesis Firehouse and Redshift

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Ray Zhu, Sr. Product Manager, Amazon Kinesis 9/1/2016 Streaming Data Analytics with Amazon Kinesis Firehose and Redshift
  • 2. Agenda • Kinesis Firehose and Redshift • Stream Data to Redshift o Step 1 Set Up Redshift DB and Table o Step 2 Create Firehose Delivery Stream o Step 3 Send Data to Firehose Delivery Stream o Step 4 Query and Analyze the Data from Redshift o Step 5 Monitor Streaming Data Pipeline
  • 3. Load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service Kinesis Firehose
  • 5. Stream Data to Redshift
  • 6. Zero administration: Capture and deliver streaming data into Amazon S3, Redshift, and Elasticsearch Service without writing an application or managing infrastructure. Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 secs using simple configurations. Seamless elasticity: Seamlessly scale to match data throughput without intervention. Capture and submit streaming data to Firehose Firehose loads streaming data continuously into Amazon S3, Redshift, or Elasticsearch Service Analyze streaming data using your favorite analytical tools Data Flow Overview
  • 7. Step 1 Set Up Redshift DB and Table
  • 13. Configure VPC Security Group US East (N. Virginia) 52.70.63.192/27 US West (Oregon) 52.89.255.224/27 EU (Ireland) 52.19.239.192/27
  • 15. Create a Redshift Table create table TrafficViolation( dateofstop date, timeofstop timestamp, agency varchar(100), subagency varchar(100), description varchar(300), location varchar(100), latitude varchar(100), longtitude varchar(100), accident varchar(100), belts varchar(100), personalinjury varchar(100), propertydamage varchar(100), fatal varchar(100), commlicense varchar(100), hazmat varchar(100), commvehicle varchar(100), alcohol varchar(100), workzone varchar(100), state varchar(100), veichletype varchar(100), year varchar(100), make varchar(100), model varchar(100), color varchar(100), violation varchar(100), type varchar(100), charge varchar(100), article varchar(100), contributed varchar(100), race varchar(100), gender varchar(100), drivercity varchar(100), driverstate varchar(100), dlstate varchar(100), arresttype varchar(100), geolocation varchar(100));
  • 17. Step 2 Set Up Firehose Delivery Stream
  • 21. Step 3 Send Data to Firehose Delivery Stream
  • 22. Sample Data 09/30/2014,23:51:00,MCP,"1st district, Rockville",DRIVER FAILURE TO STOP AT STEADY CIRCULAR RED SIGNAL,PARK RD AT HUNGERFORD DR,,,No,No,No,No,No,No,No,No,No,No,MD,02 - Automobile,2014,FORD,MUSTANG,BLACK,Citation,21-202(h1),Transportation Article,No,BLACK,M,ROCKVILLE,MD,MD,A - Marked Patrol, 03/31/2015,23:59:00,MCP,"2nd district, Bethesda",HEADLIGHTS (*),CONNECTICUT AT METROPOLITAN AVE,,,No,No,No,No,No,No,No,No,No,No,MD,02 - Automobile,2003,HONDA,2S,BLUE,ESERO,55*,,No,HISPANIC,M,SILVER SPRING,MD,MD,A - Marked Patrol, 09/30/2014,23:30:00,MCP,"5th district, Germantown",FAILURE TO DISPLAY TWO LIGHTED FRONT LAMPS WHEN REQUIRED,OBSERVATION @ RIDGE ROAD,,,No,No,No,No,No,No,No,No,No,No,MD,02 - Automobile,2009,TOYOTA,CAMRY,RED,Warning,22-226(a),Transportation Article,No,BLACK,F,GERMANTOWN,MD,MD,A - Marked Patrol, 03/31/2015,23:59:00,MCP,"5th district, Germantown",DRIVER FAILURE TO STOP AT STOP SIGN LINE,W/B PLYERS MILL RD AT METROPOLITAN AVE,39.0338233333333,- 77.07544,No,No,No,No,No,No,No,No,No,No,MD,02 - Automobile,2007,ACURA,MDX,BLACK,Warning,21- 707(a),Transportation Article,No,WHITE,F,KENSINGTON,MD,MD,A - Marked Patrol,"(39.0338233333333, - 77.07544)" US Government Open Data: https://catalog.data.gov/dataset/traffic-violations-56dda
  • 23. Send Data PutRecordRequest putRecordRequest = new PutRecordRequest(); putRecordRequest.setDeliveryStreamName(deliveryStreamName); String data = line + "n"; Record record = createRecord(data); putRecordRequest.setRecord(record); FirehoseClient.putRecord(putRecordRequest);
  • 24. Step 4 Query and Analyze the Data from Redshift
  • 28. Step 5 Monitor Streaming Data Pipeline
  • 31. Q & A