Serverless architecture can eliminate the need to provision and manage servers required to process files or streaming data in real time.
In this session, we will cover the fundamentals of using AWS Lambda to process data from sources such as Amazon DynamoDB Streams, Amazon Kinesis, and Amazon S3. We will walk through sample use cases for real-time data processing and discuss best practices on using these services together. We will then demonstrate how to set up a real-time stream processing solution using just Amazon Kinesis and AWS Lambda, all without the need to run or manage servers.
Real-time Data Processing Using AWS Lambda
1. AWS Cloud Kata for Start-Ups and Developers
Hong Kong
Real-time Data Processing Using AWS Lambda
KJ Wu, Solutions Architect
2. Agenda
• AWS Services for Data Processing
▪ AWS Lambda
▪ Amazon Kinesis
• Architecture & Workflow for Streaming Data Processing
• Demo
• Best Practices in Building Data Processing Solutions
3. AWS Services for Data Processing
Amazon Kinesis and AWS Lambda
4. AWS Lambda: Overview
A serverless compute service that runs code in response to events, with no servers to manage
No Servers to Manage
• Runs your code automatically, without provisioning or managing servers.
• Just write the code and upload it to Lambda.
Continuous Scaling
• Scales the application automatically to match the size of the workload.
• Code runs in parallel and processes each trigger individually.
Subsecond Metering
• Starts code within milliseconds of an event.
• Executes only when an event is triggered.
5. AWS Lambda: Serverless Compute in the Cloud
• Easy to author, deploy, maintain, secure, and manage
• Lets you focus on business logic, not infrastructure
• Stateless, event-driven code with native support for Node.js, Java, and Python
• Compute and code without managing infrastructure such as EC2 instances and Auto Scaling groups
• Makes it easy to build back-end services that perform at scale
6. Amazon Kinesis: Overview
A managed service for streaming data ingestion and processing
Amazon Kinesis Streams
• Build your own custom applications that process or analyze streaming data
Amazon Kinesis Firehose
• Easily load massive volumes of streaming data into Amazon S3, Amazon Elasticsearch Service, and Amazon Redshift
• Buffer size/interval
• Data compression
• Data encryption with KMS
7. Data Processing/Streaming Architecture & Workflow
Data sources: smart devices, clickstream, log data
8. AWS Lambda and Amazon Kinesis Integration
Stream-based model:
▪ Lambda polls the stream and invokes your function when it detects new records.
▪ The new records are passed to the function as the event parameter.
▪ The Kinesis stream is mapped as an event source in Lambda.
Synchronous invocation:
▪ Lambda invokes the function with the RequestResponse invocation type.
▪ The function is executed once for each batch of records.
Event structure:
▪ The event received by the Lambda function is a collection of records from the Kinesis stream.
▪ The batch size configured on the Kinesis event source sets the maximum number of records received per invocation.
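As a rough illustration of the event structure described above, here is a minimal Python handler sketch. It assumes producers send UTF-8 JSON payloads; `make_test_event` is a hypothetical helper, not part of any AWS library, that builds a local event in the same shape so the handler can be tried without a stream.

```python
import base64
import json


def handler(event, context):
    """Decode every Kinesis record in the batch and report how many were seen."""
    decoded = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded under record["kinesis"]["data"].
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers send UTF-8 JSON; adjust for other formats.
        decoded.append(json.loads(payload.decode("utf-8")))
    return {"processed": len(decoded), "items": decoded}


def make_test_event(items):
    """Hypothetical helper: wrap dicts in the shape of a Kinesis batch event."""
    return {
        "Records": [
            {
                "eventSource": "aws:kinesis",
                "kinesis": {
                    "partitionKey": "demo",
                    "data": base64.b64encode(
                        json.dumps(item).encode("utf-8")
                    ).decode("ascii"),
                },
            }
            for item in items
        ]
    }
```

Calling `handler(make_test_event([{"clicks": 3}]), None)` locally exercises the same decoding path that a batch delivered by Lambda would take.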
9. Streaming Architecture Workflow: Lambda + Kinesis
Kinesis captures the stream; Lambda processes the stream.
Data Input               | Action         | Data Output
IT application activity  | Audit          | SNS
Metering records         | Condense       | Redshift
Change logs              | Backup         | S3
Financial data**         | Store          | RDS
Transaction orders**     | Process        | SQS
Server health metrics    | Monitor        | EC2
User clickstream         | Analyze        | EMR
IoT device data          | Respond        | Backend endpoint
Custom data              | Custom action  | Custom application
10. Common Architecture: Lambda + Kinesis
Real-Time Data Processing
[Diagram: Amazon Kinesis feeds AWS Lambda 1 (to Amazon CloudWatch and Amazon DynamoDB) and AWS Lambda 2 (to Amazon S3)]
1. Real-time event data is sent to Amazon Kinesis, which allows multiple AWS Lambda functions to process the same events.
2. Lambda function 1 processes the incoming events and stores the event data in Amazon DynamoDB.
3. Lambda function 1 also sends values to Amazon CloudWatch for simple monitoring of metrics.
4. Lambda function 2 stores the incoming event data in Amazon S3.
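The two-function flow above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code used in the talk: the namespace `DemoStream`, the metric name `RecordsProcessed`, and the `batches/` key prefix are hypothetical, and the AWS clients are passed in as arguments so the logic can be exercised with stand-ins. In a deployment they would typically come from `boto3.resource("dynamodb").Table(...)`, `boto3.client("cloudwatch")`, and `boto3.client("s3")`.

```python
import base64
import json


def decode_records(event):
    """Return the JSON payloads carried by a Kinesis batch event."""
    return [
        json.loads(base64.b64decode(r["kinesis"]["data"]).decode("utf-8"))
        for r in event["Records"]
    ]


def function_1(event, table, cloudwatch):
    """Store each event in DynamoDB, then publish a simple count metric."""
    items = decode_records(event)
    for item in items:
        # Assumption: each item carries the table's key attributes and has
        # no float values (boto3 expects Decimal for DynamoDB numbers).
        table.put_item(Item=item)
    cloudwatch.put_metric_data(
        Namespace="DemoStream",  # hypothetical namespace
        MetricData=[
            {"MetricName": "RecordsProcessed", "Value": len(items), "Unit": "Count"}
        ],
    )
    return len(items)


def function_2(event, s3, bucket):
    """Archive the batch as one newline-delimited JSON object in S3."""
    items = decode_records(event)
    if not items:
        return 0
    # Name the object after the first sequence number in the batch.
    first_seq = event["Records"][0]["kinesis"]["sequenceNumber"]
    body = "\n".join(json.dumps(item) for item in items)
    s3.put_object(
        Bucket=bucket,
        Key=f"batches/{first_seq}.json",  # hypothetical key layout
        Body=body.encode("utf-8"),
    )
    return len(items)
```

Because each function has its own event source mapping on the stream, both receive the same records independently, which is what lets one stream feed several consumers.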
11. Demo: Real-time processing of Amazon Kinesis data streams with AWS Lambda
12. Data Processing: Best Practices & Tips
13. Best Practices: Kinesis
Performance tuning Kinesis as an event source
Batch size:
▪ The maximum number of records that AWS Lambda retrieves from Kinesis for each invocation of your function.
▪ Increasing the batch size results in fewer Lambda function invocations, with more data processed per invocation.
Starting position:
▪ The position in the stream where Lambda starts reading.
▪ Set to “Trim Horizon” to read from the start of the stream (all available data).
▪ Set to “Latest” to read only the most recent data, that is, records added after Lambda starts reading.
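The two tuning knobs above map onto parameters of the Lambda `create_event_source_mapping` API. The sketch below builds those parameters in Python; the helper names, the stream ARN, the function name, and the batch size of 100 are illustrative assumptions, and only the two starting positions named on this slide are accepted.

```python
VALID_POSITIONS = {"TRIM_HORIZON", "LATEST"}


def mapping_params(stream_arn, function_name, batch_size=100,
                   starting_position="TRIM_HORIZON"):
    """Build keyword arguments for create_event_source_mapping."""
    if starting_position not in VALID_POSITIONS:
        raise ValueError(f"unsupported starting position: {starting_position}")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        # Larger batches mean fewer invocations, each handling more records.
        "BatchSize": batch_size,
        # TRIM_HORIZON reads from the oldest available record; LATEST reads
        # only records that arrive after the mapping starts.
        "StartingPosition": starting_position,
        "Enabled": True,
    }


def create_mapping(lambda_client, **kwargs):
    """Create the mapping; lambda_client is e.g. boto3.client("lambda")."""
    return lambda_client.create_event_source_mapping(**mapping_params(**kwargs))
```

Keeping the parameter construction separate from the API call makes it easy to review or validate the settings before anything is created.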
14. Best Practices: Lambda
Cost & Performance
• Write your Lambda function code in a stateless style.
• Be aware of how Lambda retries in different error scenarios:
• Synchronous invocation: the invoking application receives a 429 error and is responsible for retries.
• Stream-based event sources: Lambda attempts to process the erring batch of records until the data expires.
• Minimize startup code that is not directly related to processing the current event, e.g. loading third-party libraries.
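Since the caller owns retries for synchronous invocations, one common approach is exponential backoff. The sketch below is a generic Python illustration, not an AWS-provided utility: `ThrottledError` is a hypothetical stand-in for the 429 response (with boto3 the corresponding exception would be `lambda_client.exceptions.TooManyRequestsException`), and the attempt count and delays are arbitrary.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 (too many requests) response."""


def invoke_with_retry(invoke, max_attempts=4, base_delay=0.2,
                      throttled=(ThrottledError,), sleep=time.sleep):
    """Call invoke(), retrying with exponential backoff when throttled."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except throttled:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With boto3, `invoke` might be a small closure around `lambda_client.invoke(FunctionName=..., InvocationType="RequestResponse", Payload=...)`, with `throttled` set to the client's throttling exception.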
15. Next Steps
Get Started: Data Processing with AWS
1. Create your first Kinesis stream. Configure hundreds of thousands of data producers to put data into an Amazon Kinesis stream, e.g. data from social media feeds.
2. Create and test your first Lambda function. Use any third-party library, even native ones. The first 1M requests each month are on us (free tier)!
3. Read the Developer Guide, the AWS Lambda and Kinesis tutorial, and the resources on GitHub at AWS Labs:
• http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html
• https://github.com/awslabs/lambda-streams-to-firehose
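For the first step, a data producer can be as small as the Python sketch below. It assumes JSON payloads; the function name `put_event`, the stream name, and the partition key in the usage note are hypothetical. The `kinesis` argument is expected to behave like `boto3.client("kinesis")`, whose `put_record` call takes `StreamName`, `Data`, and `PartitionKey`.

```python
import json


def put_event(kinesis, stream_name, event, partition_key):
    """Serialize one event as JSON and put it on the stream."""
    return kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        # Records with the same partition key land on the same shard.
        PartitionKey=partition_key,
    )
```

A call such as `put_event(boto3.client("kinesis"), "demo-stream", {"user": "u1", "action": "click"}, "u1")` would send a single clickstream-style record.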