Workshop part2 – Big Data
2. THE MORE DATA YOU COLLECT
THE MORE VALUE YOU CAN
DERIVE FROM IT
11. + ELASTIC AND HIGHLY SCALABLE
+ NO UPFRONT CAPITAL EXPENSE
+ ONLY PAY FOR WHAT YOU USE
+ AVAILABLE ON-DEMAND
= REMOVE CONSTRAINTS
13. AWS Import / Export
AWS Direct Connect
GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
15. Amazon S3,
Amazon Glacier,
Amazon DynamoDB,
Amazon RDS,
Amazon Redshift,
AWS Storage Gateway,
Data on Amazon EC2
GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
21. NO ADMINISTRATION
28. AMAZON REDSHIFT LETS YOU
START SMALL AND GROW BIG
Extra Large Node (HS1.XL)
Single Node (2 TB)
Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL)
Cluster 2-100 Nodes (32 TB – 1.6 PB)
[Figure: grids of XL and 8XL node icons illustrating the cluster sizes above]
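The capacity ranges on this slide follow from multiplying a per-node size by the node count. The sketch below reproduces that arithmetic. The 16 TB figure for an 8XL node is inferred from the slide (32 TB for 2 nodes), and the function and dictionary names are illustrative, not part of any AWS API.

```python
# Per-node storage implied by the slide: one XL node holds 2 TB, and
# two 8XL nodes hold 32 TB, which gives 16 TB per 8XL node (inferred).
TB_PER_NODE = {"HS1.XL": 2, "HS1.8XL": 16}

# Cluster size limits as stated on the slide (XL also runs as a single node).
NODE_LIMITS = {"HS1.XL": (1, 32), "HS1.8XL": (2, 100)}


def cluster_capacity_tb(node_type: str, node_count: int) -> int:
    """Return total cluster storage in TB for a node type and node count."""
    low, high = NODE_LIMITS[node_type]
    if not low <= node_count <= high:
        raise ValueError(
            f"{node_type} supports {low}-{high} nodes, got {node_count}"
        )
    return TB_PER_NODE[node_type] * node_count


if __name__ == "__main__":
    for node_type, count in [
        ("HS1.XL", 1),
        ("HS1.XL", 32),
        ("HS1.8XL", 2),
        ("HS1.8XL", 100),
    ]:
        capacity = cluster_capacity_tb(node_type, count)
        print(f"{node_type} x {count}: {capacity} TB")
```

The largest configuration, 100 nodes at 16 TB each, gives 1,600 TB, which is the 1.6 PB upper bound shown on the slide.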
43. GPU INSTANCES
G2: 1x NVIDIA Kepler GK104, 8 vCPU (Intel Xeon E5-2670), $0.65/h
CG1: 2x NVIDIA Fermi M2050, 16 vCPU (Intel Xeon X5570), $2.10/h
44. ON A SINGLE INSTANCE
COMPUTE TIME: 4h
COST: 4h x $2.10 = $8.40
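The cost on this slide is simply hours multiplied by the hourly rate. A minimal sketch of that calculation follows. The `instances` parameter is a hypothetical extension, not something stated on this slide: it assumes the job parallelizes perfectly and that every started instance-hour is billed in full.

```python
import math


def job_cost(single_instance_hours: float, hourly_rate: float, instances: int = 1):
    """Return (wall_clock_hours, cost_in_dollars) for a job.

    Assumption (not from the slide): work splits evenly across instances,
    and each instance is billed for every started hour.
    """
    wall_clock_hours = single_instance_hours / instances
    billed_hours = math.ceil(wall_clock_hours) * instances
    return wall_clock_hours, round(billed_hours * hourly_rate, 2)


if __name__ == "__main__":
    # The slide's case: 4 hours on one instance at $2.10/h.
    hours, cost = job_cost(4, 2.10)
    print(f"{hours}h, ${cost:.2f}")
```

Under those assumptions, spreading the same 4 instance-hours over more instances shortens the wall-clock time while the bill stays the same, as long as each instance runs for whole hours.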
51. CASE STUDY:
"WITH AMAZON EMR WE CAN ANALYZE 100% OF THE DATA,
NOT JUST A SAMPLE"
- Sanjeevan Bala, Head of Data Planning & Analytics, Channel 4
52. Amazon S3,
Amazon DynamoDB,
Amazon RDS,
Amazon Redshift,
Data on Amazon EC2
GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
60. Hourly server logs: how your systems went wrong an hour ago
Weekly / Monthly Bill: what you spent this past billing cycle
Daily customer report from your website: tells you what deal or ad to try next time
Daily fraud reports: tells you if there was fraud yesterday
Daily business reports: tells me how customers used AWS services yesterday
Real-time metrics: what just went wrong now
Real-time spending alerts/caps: guaranteeing you can’t overspend
Real-time analysis: what to offer the current customer now
Real-time detection: blocks fraudulent use now
Fast ETL into Amazon Redshift: how are customers using services now
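To make the contrast between a monthly bill and a real-time spending cap concrete, here is a small sketch that checks a running total as each charge arrives instead of summing everything at the end of the cycle. The event format, the function name, and the sample figures are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional, Tuple


def first_cap_breach(
    charges: Iterable[Tuple[str, float]], cap: float
) -> Optional[Tuple[str, float]]:
    """Return (timestamp, running_total) for the first charge that pushes
    the running total to or past the cap, or None if the cap is never hit.

    Each charge is processed as it arrives, so the breach is known
    immediately rather than at the end of the billing cycle.
    """
    total = 0.0
    for timestamp, amount in charges:
        total += amount
        if total >= cap:
            return timestamp, total
    return None


if __name__ == "__main__":
    # Hypothetical charge events: (time of day, dollars).
    events = [("09:00", 40.0), ("09:05", 35.0), ("09:10", 30.0), ("09:15", 10.0)]
    print(first_cap_breach(events, cap=100.0))
```

A batch report over the same events would only show the final total after the fact; the streaming check can trigger an alert or a block at the moment the cap is crossed.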
63. Amazon S3,
Amazon DynamoDB,
Amazon RDS,
Amazon Redshift,
Data on Amazon EC2
Amazon S3,
Amazon Glacier,
Amazon DynamoDB,
Amazon RDS,
Amazon Redshift,
AWS Storage Gateway,
Data on Amazon EC2
GENERATE ➔ STORE ➔ ANALYZE ➔ SHARE
Amazon EC2
Amazon Elastic MapReduce
AWS Import / Export
AWS Direct Connect
65. STREAM PROCESSING
Amazon S3,
Amazon DynamoDB,
Amazon RDS,
Amazon Redshift,
Data on Amazon EC2
GENERATE ➔ ➔ SHARE
Amazon Kinesis
Stream Processing on Amazon EC2
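As a rough illustration of the generating side of this pipeline, the sketch below serializes an event and hands it to a Kinesis client through `put_record`, using the `StreamName`, `Data`, and `PartitionKey` parameters of the boto3 Kinesis client. The stream name, event fields, and the `RecordingClient` stand-in are hypothetical; the stand-in only exists so the example runs without AWS credentials.

```python
import json


def send_event(kinesis_client, stream_name: str, event: dict, partition_key: str):
    """Serialize an event as JSON and put it on a Kinesis stream.

    kinesis_client is expected to expose put_record with the same keyword
    arguments as the boto3 Kinesis client.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )


class RecordingClient:
    """Hypothetical stand-in for a real client; it only records calls."""

    def __init__(self):
        self.records = []

    def put_record(self, **kwargs):
        self.records.append(kwargs)
        return {"SequenceNumber": str(len(self.records))}


if __name__ == "__main__":
    client = RecordingClient()
    send_event(client, "clickstream", {"user": "u1", "action": "view"}, "u1")
    print(client.records)
```

With real credentials, the stand-in could be swapped for `boto3.client("kinesis")`. Records that share a partition key are routed to the same shard, so a per-user key keeps each user's events in order.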