SlideShare una empresa de Scribd logo
1 de 27
Descargar para leer sin conexión
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Apache Flink with Amazon
Kinesis
Greg Finch
Senior Product Manager
John Deere
A N T 3 9 5
Ryan Nienhuis
Senior Technical Product Manager
AWS, Amazon Kinesis
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
Streaming and Amazon Kinesis overview
New Capability: Amazon Kinesis Data Analytics for Java
Streaming data at John Deere
Architectural choices in streaming data
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is streaming data?
Low-latencyContinuous Ordered,
incremental
High volume
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Streaming with Amazon Kinesis
Easily collect, process, and analyze video and data streams in real time
Capture, process,
and store video
streams
Load data streams
into AWS data
stores
Analyze data
streams in real time
Capture, process,
and store data
streams
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Kinesis Data Streams overview
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Data processing from a variety of consumers
Fully managed service for real-time processing of streaming data
Cost-effective: $0.014 per 1,000,000 PUT Payload Units
Millions of sources
producing 100’s of
terabytes per hour
Amazon Web Services
Front
End
AZ AZ AZAuthentic
authorization
Durable, highly consistent storage replicas data
across three data centers (availability zones)
Ordered stream of
events supports
multiple readers
Amazon Kinesis
Client Library
on EC2
Amazon Kinesis
Data Firehose
Amazon Kinesis
Data Analytics
AWS Lambda
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Apache Flink
Framework and distributed engine for stateful processing of data
streams.
Simple
programming
High performance
Stateful
Processing
Strong data
integrity
Easy to use and
flexible APIs make
building apps fast
In-memory
computing provides
low latency & high
throughput
Durable
application state
saves
Exactly-once
processing and
consistent state
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How do you build a Flink application?
Streaming operators are applied to data streams in a pipeline
Source
Sink
DataStream
KeyedDataStream
DataStream
Sink
keyBy,
window
filter
apply
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What does your code look like?
DataStream <GameEvent> rawEvents = env.addSource(
New KinesisStreamSource(“input_events”));
DataStream <UserPerLevel> gameStream =
rawEvents.map(event - > new UserPerLevel(event.gameMetadata.gameId,
event.gameMetadata.levelId,event.userId));
gameStream.keyBy(event -> event.gameId)
.keyBy(1)
.window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
.apply(...) - > {...};
gameStream.addSink(new KinesisStreamSink("myGameStateStream"));
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018 Deere & Company, All rights reserved.
About John Deere
© 2018 Deere & Company, All rights reserved.
Sophisticated machines produce massive data streams
© 2018 Deere & Company, All rights reserved.
Machine Sync - real-time multi-machine coordination
© 2018 Deere & Company, All rights reserved.
Remote monitoring and adjustments
© 2018 Deere & Company, All rights reserved.
Operations Center
© 2018 Deere & Company, All rights reserved.
John Deere Data Platform
Processing millions
of sensor
measurements per
second.
Serving more than
one billion field
maps.
Supports monitoring,
tracking,
dashboarding, and
deep analysis
applications.
© 2018 Deere & Company, All rights reserved.
A simple solution for many applications
© 2018 Deere & Company, All rights reserved.
Managing state can get complicated
© 2018 Deere & Company, All rights reserved.
Keeping up: Over-sharding the stream
© 2018 Deere & Company, All rights reserved.
Keeping up: Fan-out
© 2018 Deere & Company, All rights reserved.
System complexity increases with use case complexity
© 2018 Deere & Company, All rights reserved.
Shifting to Apache Flink
© 2018 Deere & Company, All rights reserved.
An example
Source
Sessions
Stream
Source
Sensors
Stream
Map
Decode
Sessions
Map
Decode
Sensors
Join
Session &
Sensors
Window
Aggregate
Totals
Flat Map
Compute
Tile Keys
Window
Rasterize
Sink
Tiles to
S3
Sink
Totals to
DynamoDB
Key by
Key by
High
Parallelism
© 2018 Deere & Company, All rights reserved.
Powerful solution for sophisticated applications
Thank you!
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Ryan Nienhuis
Greg Finch
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Shift-Left SRE: Self-Healing with AWS Lambda Functions (DEV313-S) - AWS re:In...
Shift-Left SRE: Self-Healing with AWS Lambda Functions (DEV313-S) - AWS re:In...Shift-Left SRE: Self-Healing with AWS Lambda Functions (DEV313-S) - AWS re:In...
Shift-Left SRE: Self-Healing with AWS Lambda Functions (DEV313-S) - AWS re:In...
 
Automate & Audit Cloud Governance & Compliance in Your Landing Zone (ENT315-R...
Automate & Audit Cloud Governance & Compliance in Your Landing Zone (ENT315-R...Automate & Audit Cloud Governance & Compliance in Your Landing Zone (ENT315-R...
Automate & Audit Cloud Governance & Compliance in Your Landing Zone (ENT315-R...
 
Cost Optimization Tooling (ARC301) - AWS re:Invent 2018
Cost Optimization Tooling (ARC301) - AWS re:Invent 2018Cost Optimization Tooling (ARC301) - AWS re:Invent 2018
Cost Optimization Tooling (ARC301) - AWS re:Invent 2018
 
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
 
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
 
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
 
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
 
MassMutual Goes Cloud First with Hybrid Cloud on AWS (ENT210) - AWS re:Invent...
MassMutual Goes Cloud First with Hybrid Cloud on AWS (ENT210) - AWS re:Invent...MassMutual Goes Cloud First with Hybrid Cloud on AWS (ENT210) - AWS re:Invent...
MassMutual Goes Cloud First with Hybrid Cloud on AWS (ENT210) - AWS re:Invent...
 
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
 
Analyze Amazon CloudFront and Lambda@Edge Logs to Improve Customer Experience...
Analyze Amazon CloudFront and Lambda@Edge Logs to Improve Customer Experience...Analyze Amazon CloudFront and Lambda@Edge Logs to Improve Customer Experience...
Analyze Amazon CloudFront and Lambda@Edge Logs to Improve Customer Experience...
 
Driving Machine Learning and Analytics Use Cases with AWS Storage (STG302) - ...
Driving Machine Learning and Analytics Use Cases with AWS Storage (STG302) - ...Driving Machine Learning and Analytics Use Cases with AWS Storage (STG302) - ...
Driving Machine Learning and Analytics Use Cases with AWS Storage (STG302) - ...
 
Architecture Patterns of Serverless Microservices (ARC304-R1) - AWS re:Invent...
Architecture Patterns of Serverless Microservices (ARC304-R1) - AWS re:Invent...Architecture Patterns of Serverless Microservices (ARC304-R1) - AWS re:Invent...
Architecture Patterns of Serverless Microservices (ARC304-R1) - AWS re:Invent...
 
[NEW LAUNCH!] Building modern applications using Amazon DynamoDB transactions...
[NEW LAUNCH!] Building modern applications using Amazon DynamoDB transactions...[NEW LAUNCH!] Building modern applications using Amazon DynamoDB transactions...
[NEW LAUNCH!] Building modern applications using Amazon DynamoDB transactions...
 
Policy Verification and Enforcement at Scale with AWS (SEC320) - AWS re:Inven...
Policy Verification and Enforcement at Scale with AWS (SEC320) - AWS re:Inven...Policy Verification and Enforcement at Scale with AWS (SEC320) - AWS re:Inven...
Policy Verification and Enforcement at Scale with AWS (SEC320) - AWS re:Inven...
 
Don’t Wait Until Tomorrow: From Batch to Streaming (ANT360) - AWS re:Invent 2018
Don’t Wait Until Tomorrow: From Batch to Streaming (ANT360) - AWS re:Invent 2018Don’t Wait Until Tomorrow: From Batch to Streaming (ANT360) - AWS re:Invent 2018
Don’t Wait Until Tomorrow: From Batch to Streaming (ANT360) - AWS re:Invent 2018
 
How GumGum Migrated from Cassandra to Amazon DynamoDB (DAT345) - AWS re:Inven...
How GumGum Migrated from Cassandra to Amazon DynamoDB (DAT345) - AWS re:Inven...How GumGum Migrated from Cassandra to Amazon DynamoDB (DAT345) - AWS re:Inven...
How GumGum Migrated from Cassandra to Amazon DynamoDB (DAT345) - AWS re:Inven...
 
Dissecting Media Asset Management Architecture and Media Archive TCO (MAE301)...
Dissecting Media Asset Management Architecture and Media Archive TCO (MAE301)...Dissecting Media Asset Management Architecture and Media Archive TCO (MAE301)...
Dissecting Media Asset Management Architecture and Media Archive TCO (MAE301)...
 
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
 
Serverless Video Ingestion & Analytics with Amazon Kinesis Video Streams (ANT...
Serverless Video Ingestion & Analytics with Amazon Kinesis Video Streams (ANT...Serverless Video Ingestion & Analytics with Amazon Kinesis Video Streams (ANT...
Serverless Video Ingestion & Analytics with Amazon Kinesis Video Streams (ANT...
 
Real-Time Web Analytics with Amazon Kinesis Data Analytics (ADT401) - AWS re:...
Real-Time Web Analytics with Amazon Kinesis Data Analytics (ADT401) - AWS re:...Real-Time Web Analytics with Amazon Kinesis Data Analytics (ADT401) - AWS re:...
Real-Time Web Analytics with Amazon Kinesis Data Analytics (ADT401) - AWS re:...
 

Similar a Using Apache Flink with Amazon Kinesis (ANT395) - AWS re:Invent 2018

Similar a Using Apache Flink with Amazon Kinesis (ANT395) - AWS re:Invent 2018 (20)

Analyzing Streaming Data in Real Time
Analyzing Streaming Data in Real TimeAnalyzing Streaming Data in Real Time
Analyzing Streaming Data in Real Time
 
Analyzing Streams
Analyzing StreamsAnalyzing Streams
Analyzing Streams
 
Analyzing Streams
Analyzing StreamsAnalyzing Streams
Analyzing Streams
 
Analyzing Streams
Analyzing StreamsAnalyzing Streams
Analyzing Streams
 
Analyzing Streams
Analyzing StreamsAnalyzing Streams
Analyzing Streams
 
Analyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SFAnalyzing Streams: Data Analytics Week SF
Analyzing Streams: Data Analytics Week SF
 
Analyzing Streams: Data Analytics Week at the SF Loft
Analyzing Streams: Data Analytics Week at the SF LoftAnalyzing Streams: Data Analytics Week at the SF Loft
Analyzing Streams: Data Analytics Week at the SF Loft
 
Analyzing Streams
Analyzing StreamsAnalyzing Streams
Analyzing Streams
 
BDA307 Analyzing Data Streams in Real Time with Amazon Kinesis
BDA307 Analyzing Data Streams in Real Time with Amazon KinesisBDA307 Analyzing Data Streams in Real Time with Amazon Kinesis
BDA307 Analyzing Data Streams in Real Time with Amazon Kinesis
 
Build an AWS Analytics Solution to Monitor the Video Streaming Experience (MA...
Build an AWS Analytics Solution to Monitor the Video Streaming Experience (MA...Build an AWS Analytics Solution to Monitor the Video Streaming Experience (MA...
Build an AWS Analytics Solution to Monitor the Video Streaming Experience (MA...
 
WildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing WorkshopWildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing Workshop
 
Serverless Stream Processing Tips & Tricks - BDA311 - Chicago AWS Summit
Serverless Stream Processing Tips & Tricks - BDA311 - Chicago AWS SummitServerless Stream Processing Tips & Tricks - BDA311 - Chicago AWS Summit
Serverless Stream Processing Tips & Tricks - BDA311 - Chicago AWS Summit
 
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
AWS Floor28 - WildRydes Serverless Data Processsing workshop (Ver2)
 
Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018
Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018
Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018
 
ABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis
ABD301-Analyzing Streaming Data in Real Time with Amazon KinesisABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis
ABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
 
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
Introduction to Real-Time Streaming Analytics - Amazon Kinesis State Of Union...
 
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
 
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
 
Keeping the Pace with Data Ingestion (GPSCT402) - AWS re:Invent 2018
Keeping the Pace with Data Ingestion (GPSCT402) - AWS re:Invent 2018Keeping the Pace with Data Ingestion (GPSCT402) - AWS re:Invent 2018
Keeping the Pace with Data Ingestion (GPSCT402) - AWS re:Invent 2018
 

Más de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Using Apache Flink with Amazon Kinesis (ANT395) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Using Apache Flink with Amazon Kinesis Greg Finch Senior Product Manager John Deere A N T 3 9 5 Ryan Nienhuis Senior Technical Product Manager AWS, Amazon Kinesis
  • 3. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agenda Streaming and Amazon Kinesis overview New Capability: Amazon Kinesis Data Analytics for Java Streaming data at John Deere Architectural choices in streaming data
  • 4. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. What is streaming data? Low-latencyContinuous Ordered, incremental High volume
  • 5. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Streaming with Amazon Kinesis Easily collect, process, and analyze video and data streams in real time Capture, process, and store video streams Load data streams into AWS data stores Analyze data streams in real time Capture, process, and store data streams
  • 6. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Kinesis Data Streams overview
  • 7. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Data processing from a variety of consumers Fully managed service for real-time processing of streaming data Cost-effective: $0.014 per 1,000,000 PUT Payload Units Millions of sources producing 100’s of terabytes per hour Amazon Web Services Front End AZ AZ AZAuthentic authorization Durable, highly consistent storage replicas data across three data centers (availability zones) Ordered stream of events supports multiple readers Amazon Kinesis Client Library on EC2 Amazon Kinesis Data Firehose Amazon Kinesis Data Analytics AWS Lambda
  • 8. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Apache Flink Framework and distributed engine for stateful processing of data streams. Simple programming High performance Stateful Processing Strong data integrity Easy to use and flexible APIs make building apps fast In-memory computing provides low latency & high throughput Durable application state saves Exactly-once processing and consistent state
  • 9. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. How do you build a Flink application? Streaming operators are applied to data streams in a pipeline Source Sink DataStream KeyedDataStream DataStream Sink keyBy, window filter apply
  • 10. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. What does your code look like? DataStream <GameEvent> rawEvents = env.addSource( New KinesisStreamSource(“input_events”)); DataStream <UserPerLevel> gameStream = rawEvents.map(event - > new UserPerLevel(event.gameMetadata.gameId, event.gameMetadata.levelId,event.userId)); gameStream.keyBy(event -> event.gameId) .keyBy(1) .window(TumblingProcessingTimeWindows.of(Time.minutes(1))) .apply(...) - > {...}; gameStream.addSink(new KinesisStreamSink("myGameStateStream"));
  • 11. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 12. © 2018 Deere & Company, All rights reserved. About John Deere
  • 13. © 2018 Deere & Company, All rights reserved. Sophisticated machines produce massive data streams
  • 14. © 2018 Deere & Company, All rights reserved. Machine Sync - real-time multi-machine coordination
  • 15. © 2018 Deere & Company, All rights reserved. Remote monitoring and adjustments
  • 16. © 2018 Deere & Company, All rights reserved. Operations Center
  • 17. © 2018 Deere & Company, All rights reserved. John Deere Data Platform Processing millions of sensor measurements per second. Serving more than one billion field maps. Supports monitoring, tracking, dashboarding, and deep analysis applications.
  • 18. © 2018 Deere & Company, All rights reserved. A simple solution for many applications
  • 19. © 2018 Deere & Company, All rights reserved. Managing state can get complicated
  • 20. © 2018 Deere & Company, All rights reserved. Keeping up: Over-sharding the stream
  • 21. © 2018 Deere & Company, All rights reserved. Keeping up: Fan-out
  • 22. © 2018 Deere & Company, All rights reserved. System complexity increases with use case complexity
  • 23. © 2018 Deere & Company, All rights reserved. Shifting to Apache Flink
  • 24. © 2018 Deere & Company, All rights reserved. An example Source Sessions Stream Source Sensors Stream Map Decode Sessions Map Decode Sensors Join Session & Sensors Window Aggregate Totals Flat Map Compute Tile Keys Window Rasterize Sink Tiles to S3 Sink Totals to DynamoDB Key by Key by High Parallelism
  • 25. © 2018 Deere & Company, All rights reserved. Powerful solution for sophisticated applications
  • 26. Thank you! © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Ryan Nienhuis Greg Finch
  • 27. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.