Ingesting, storing, processing, and delivering a large library of content is a complex undertaking. This session walks through sample code that uses AWS services to perform all of these tasks while coordinating the activities with Amazon Simple Workflow Service (SWF). Along the way, you are introduced to best practices for cost optimization, monitoring, reporting, and exception and error handling. In addition to the sample workflow, a guest speaker from Netflix takes the audience on a deep dive into their “digital supply chain,” where you learn how they have automated the movement of data all the way from the studios to the last mile. Services covered include Amazon SWF, Amazon Simple Storage Service (S3), Amazon Glacier, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Transcoder, Amazon Mechanical Turk, and Amazon CloudFront.
4. Scenario
• At any given time, company X produces 10 broadcast-quality shows
• Each show consists of 200 30-minute episodes per year
• High-res post-production copies of each show are
temporarily stored at company X’s studio in Tokyo
• The content must be made available for distribution to
consumers via web, mobile devices, and media players
• The high-res content must be archived for future access
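To get a feel for the scale of this scenario, the yearly volume can be worked out from the figures above. This is a minimal sketch; the variable names are illustrative, and storage sizes are left out because the slides do not state a bitrate.

```python
# Yearly content volume implied by the scenario figures above.
SHOWS = 10                 # shows in production at any given time
EPISODES_PER_SHOW = 200    # episodes per show per year
EPISODE_MINUTES = 30       # running time of each episode

episodes_per_year = SHOWS * EPISODES_PER_SHOW
hours_per_year = episodes_per_year * EPISODE_MINUTES / 60

print(episodes_per_year)   # 2000
print(hours_per_year)      # 1000.0
```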
25. Media Workflow
Amazon Simple Workflow Service (SWF)
Ingest, Processing, Discovery & Delivery
Amazon Storage Services: Amazon S3 (Standard & RRS), Amazon Glacier
26. Amazon Simple Workflow Service (SWF)
• SWF
– Maintains distributed application state
– Tracks workflow executions
– Dispatches tasks (activities & deciders)
– Retains history
– Provides visibility
• Activity tasks
– Do the “work” associated with a workflow step
• Decider tasks
– Determine which activity task should come next
• Activities & deciders can run anywhere (on premises, in the cloud)
27. Decider Logic
• Start
• Task = GetDecision
• Task exists?
– No: poll again (Task = GetDecision)
– Yes: build EventList with [‘ActivityTaskCompleted’, ‘WorkflowExecutionStarted’] events
• All activities completed?
– Yes: signal completion of execution
– No: NextActivity = ACTIVITIES[len(EventList)]
• Is first activity?
– Yes: NextActivity.Input = Execution Input
– No: NextActivity.Input = PreviousActivity.Result
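The decision step can be sketched as a pure function over the workflow history. This is an illustration under stated assumptions, not the presenters' decider: the activity names are hypothetical, the flat event dictionaries simplify the real SWF history format, and the sketch indexes by the number of completed activities rather than by the length of the whole event list.

```python
# Hypothetical activity names standing in for the workflow steps in this deck.
ACTIVITIES = ["Ingest", "Transcode", "SelectThumbnail", "Archive"]

def decide(events, execution_input):
    """Return the next decision given the workflow's history events.

    `events` is a list of dicts with an 'eventType' key and, for completed
    activities, a 'result' key (a simplified shape, assumed for this sketch).
    """
    # Keep only the two event types the flowchart looks at.
    relevant = [e for e in events
                if e["eventType"] in ("WorkflowExecutionStarted",
                                      "ActivityTaskCompleted")]
    completed = [e for e in relevant
                 if e["eventType"] == "ActivityTaskCompleted"]

    # All activities done: close the workflow execution.
    if len(completed) == len(ACTIVITIES):
        return {"decision": "CompleteWorkflowExecution"}

    next_activity = ACTIVITIES[len(completed)]
    if not completed:
        # First activity receives the input of the workflow execution.
        next_input = execution_input
    else:
        # Later activities receive the previous activity's result.
        next_input = completed[-1]["result"]
    return {"decision": "ScheduleActivityTask",
            "activity": next_activity,
            "input": next_input}
```

Called with a history that holds only a WorkflowExecutionStarted event, the function schedules the first activity with the execution input; once every activity has a completion event, it returns the completion decision.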
28. Activity Worker – Code Snippet
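import json

# swf, domain, workflow_type, activities, get_rand and doWork_INGEST are
# expected to come from the mwf_Ingest module.
from mwf_Ingest import *

swf_l1 = swf.Layer1()

while True:
    # Poll SWF for the next activity task on this task list
    task = swf_l1.poll_for_activity_task(domain['name'], workflow_type['task_list'])
    if 'taskToken' in task:
        task_token = task['taskToken']
        task_input = json.loads(task['input'])
        try:
            if task['activityType']['name'] == activities[0]['name']:
                remoteIP = task_input['remoteIP']
                remoteFileName = task_input['remoteFileName']
                # Random S3 key that keeps the source file's extension
                s3Key_HighRes = get_rand() + remoteFileName[remoteFileName.rindex('.'):]
                doWork_INGEST(remoteIP, remoteFileName, s3Key_HighRes)
                # The result is handed to the next activity by the decider
                dataToPass = {'s3Key_HighRes': s3Key_HighRes}
                task_status_s = json.dumps(dataToPass)
                out = swf_l1.respond_activity_task_completed(task_token, task_status_s)
        except Exception:
            # Report the failure so the decider can react to it
            out = swf_l1.respond_activity_task_failed(task_token, '', '')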
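The worker builds the S3 key from a random prefix plus the source file's extension. Note that remoteFileName.rindex('.') raises ValueError when the name contains no dot. A possible alternative is sketched below; make_s3_key is a hypothetical helper, and the uuid-based prefix merely stands in for get_rand(), whose implementation is not shown in the slides.

```python
import os
import uuid

def make_s3_key(remote_file_name):
    """Return a random key that keeps the file's extension, if it has one."""
    # splitext returns an empty extension instead of raising when no dot exists
    _, ext = os.path.splitext(remote_file_name)
    return uuid.uuid4().hex + ext   # assumption: stand-in for get_rand() + ext
```

For a name such as "episode01.mov" the key ends in ".mov"; for a name without an extension the key is just the random prefix.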
29. Workflow Steps
• Start workflow execution
• Ingest (transfer file to Amazon EC2 using
Tsunami UDP & upload to Amazon S3)
• Transcode file (multiple output formats)
• Select thumbnail
• Archive high-res file
• Signal completion of execution
30. Scalability & Fault Tolerance Analysis
Steps:
• Ingest
• Transcode
• Archive to Amazon Glacier
• Amazon Mechanical Turk for thumbnails
• Delivery with Amazon CloudFront
Criteria: Automation elements, Is Scalable?, Is Fault Tolerant?
36. Rules of the Game
• 200 MPH!
• Purchase only
• Quantities limited
• It breaks, you fix it
• Pay for parking
• Obsolete in 1 year

• 85 MPH
• Lease, cancel anytime
• Unlimited quantity
• It breaks, replace it, no charge
• No parking, just walk away
• Brand new each year
37. Industry Heritage : Optimize for Latency
• Interactive editing
– Master creation
– DVD/Blu-ray authoring
– Edits for television
38. Netflix 2008
• Custom data center
• Custom GPU encoders
• Fixed size
• New format needed
– PC, Mac, Xbox
• Content library doubled
• Frequent HW failures
• Fail! Catalog incomplete
39. Fall 2009 – Launch Netflix PS3 Player
• First 100% AWS
transcode
• New format, unique to
Netflix PS3 player
• Encode recipe nailed
down late
• 3 weeks, transcode
entire catalog
40. Netflix 2009 to Present
• US East AWS
• Variable sized EC2 farm
• S3 for storage
• Optimized for throughput, not latency
• No more missed deadlines
– Devices, catalogs, countries
41. Spring 2010 – Launch Netflix iPad Player
• Launch April 10th
• Apple approached us in
mid February
• Grew EC2 farm to 4,000
instances
• Entire library transcoded
in 2 weeks
• New format ready for
launch
43. For Netflix, Throughput Trumps Latency
• Think horizontal, not vertical
• Priuses move more people than Ferraris
• Frequent re-encodes of growing libraries
• Netflix is nimble because of AWS
44. More Proof That Horizontal Wins
• New countries, new content
• Codec innovation
45. AWS Handles Netflix Scale
• 6 regional catalogs
• 4 formats supported today
– 1 VC-1, 3 H.264
– Multiple bit rates per format
• Tens of thousands of hours of content
• Petabytes of S3 storage
46. Netflix Media in AWS
• Matrix: The Netflix media pipeline
• MAPLE: New generation media
pipeline
• Concluding thoughts
47. New Generation : Address Faults and Latency
• More than 1 week for 4K transcode
• 2–3 days for HD transcode
• Fault intolerant
• Maintenance is challenging
• Often too slow
– Day after broadcast
– Redelivery of damaged content
[Diagram labels: EC2 (C1 Medium), S3; ~700 Mbps, 10-16 Mbps]
48. MAPLE : Massively Parallel Encoding
• 5-minute chunks
– Close to real time
• Fault tolerant
• Easy maintenance
• Address low latency use cases
– Day after broadcast
– Redelivery of damaged content
[Diagram labels: EC2, S3]
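The chunking idea can be illustrated with a short sketch that splits a title's running time into 5-minute segments, each of which could be handed to a separate encoder. This is not Netflix's implementation; the function name and the seconds-based representation are assumptions made for illustration.

```python
CHUNK_SECONDS = 5 * 60   # 5-minute chunks, as on the slide

def chunk_ranges(duration_seconds, chunk_seconds=CHUNK_SECONDS):
    """Return (start, end) offsets in seconds that cover the whole duration."""
    return [(start, min(start + chunk_seconds, duration_seconds))
            for start in range(0, duration_seconds, chunk_seconds)]

chunks = chunk_ranges(32 * 60)   # a 32-minute title
print(len(chunks))               # 7
print(chunks[-1])                # (1800, 1920)
```

Because each range is independent, a failed chunk could in principle be retried on its own rather than restarting the whole title, which is one general way chunking can help with fault tolerance.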
49. Netflix Media in AWS
• Matrix : The Netflix media pipeline
• MAPLE : New generation media
pipeline
• Concluding thoughts
50. We Would Do It All Over Again
• Don’t be fooled by IT cost
comparisons
– We don’t administer the gear
• 6,000 EC2 instances
• Petabytes of storage
• High network traffic
– Storage is durable
– It is a moving target
• You cannot put a price on nimble
51. Please give us your feedback on this
presentation
MED304
As a thank you, we will select prize
winners daily for completed surveys!