As a result of the advantages shown on the previous slides, more people run their data lakes and analytics on AWS than anywhere else. This includes customers such as FINRA, Netflix, Nasdaq, Amazon.com, Atlassian, Sysco, Airbnb, iRobot, CrowdStrike, Viber, 21st Century Fox, Vanguard, Takeda, Movable Inc, Expedia.com, Zillow, Yelp, Amgen, JustGiving, NTT Docomo, to name a few. In later slides, we will share details on a number of these customers and how they use specific services to achieve their goals.
To enable this for customers, AWS provides a broader and deeper portfolio of database and analytics services than any other cloud provider, many of them named on the previous slide alongside the function they perform.
AWS offers at least:
10 data movement services
13 analytics services
18 machine learning and AI services
17 security and governance services
Maybe more since this slide was created!
<timing 2 minutes>
Backup notes
The way to deal with massive amounts of data is to use S3 to store it. S3 can store exabytes of data without breaking a sweat, so you can keep all your relational and non-relational data and never have to throw anything away because your database doesn't scale, as was the case on premises. We call this your Data Lake: it has all your data.

To get data into the Data Lake, we have data movement services and devices: Snowball devices bring in data from on-premises systems, Database Migration Service migrates databases into S3, and Kinesis handles IoT and real-time ingestion.

And we have a set of purpose-built tools that work directly on data in S3. Redshift is our exabyte-scale data warehouse that can query data directly in S3. QuickSight is our BI tool that can query data directly in S3. Athena is our ad hoc query service for data in S3, which you can use in place of Redshift when you need to run ad hoc queries against data sitting in S3. EMR supports Spark and Hadoop to make sense of data in S3 directly.
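To make the Athena piece concrete, here is a minimal sketch of submitting an ad hoc query against data already in S3. The table name (`weblogs`), database (`analytics`), and result bucket are placeholder assumptions, not names from the talk; the example assumes boto3 is installed and AWS credentials are configured.

```python
# Sketch: ad hoc Athena query over data sitting in S3 (placeholder names).

def build_top_pages_query(table: str, limit: int = 10) -> str:
    """Compose a simple aggregation query in Athena's SQL dialect."""
    return (
        f"SELECT page, COUNT(*) AS hits "
        f"FROM {table} "
        f"GROUP BY page "
        f"ORDER BY hits DESC "
        f"LIMIT {limit}"
    )

def run_query(sql: str, database: str, output_bucket: str) -> str:
    """Submit the query to Athena; returns the query execution id."""
    import boto3  # imported here so the pure helper above needs no AWS SDK
    athena = boto3.client("athena")
    resp = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": f"s3://{output_bucket}/results/"},
    )
    return resp["QueryExecutionId"]

if __name__ == "__main__":
    sql = build_top_pages_query("weblogs")
    print(sql)
    # run_query(sql, database="analytics", output_bucket="my-data-lake")
```

Because Athena reads S3 directly, there is no cluster to provision: you pay per query for the data scanned.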
And these tools are priced so inexpensively that you can make sense of all your data without deleting any of it to save money. No need to do an ROI analysis on data. Redshift costs $1,000/TB per year instead of the $10K to $50K of on-premises Teradata and Oracle systems. Athena can query data for half a cent per GB, and S3 can store a GB of data for a whole month for 2.3 cents. QuickSight can query data for 30 cents per 30-minute session, whereas legacy BI tools can cost far more per user per month. You can give all your users access to all your data without breaking the bank and truly be a data-driven organization.
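A quick back-of-the-envelope check of those numbers, using the rates quoted in this talk (verify against the current AWS pricing pages before reuse); the 10 TB / 2 TB workload sizes are illustrative assumptions:

```python
# Rates as quoted in the talk, converted to $ per GB.
S3_STANDARD_PER_GB_MONTH = 0.023   # 2.3 cents per GB-month
ATHENA_PER_GB_SCANNED = 0.005      # half a cent per GB scanned ($5/TB)

def monthly_lake_cost(stored_gb: float, scanned_gb: float) -> float:
    """Storage plus ad hoc query cost for one month, in dollars."""
    return (stored_gb * S3_STANDARD_PER_GB_MONTH
            + scanned_gb * ATHENA_PER_GB_SCANNED)

# Illustrative month: 10 TB stored, 2 TB scanned by ad hoc queries.
cost = monthly_lake_cost(10_000, 2_000)
print(f"${cost:,.2f}/month")  # 230 storage + 10 queries = $240.00
```

At these rates, storing and querying a 10 TB lake costs a few hundred dollars a month, which is the point of the "no ROI analysis on data" line above.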
AWS continues to innovate and find more efficient ways for you to analyze data. We have an emerging serverless analytics stack: you can put all these systems together with zero infrastructure to manage. This lets you pay per use, with close to zero cost when systems are idle. They scale automatically and are highly available and fault tolerant by default.
IoT data is a great example where an “always on” system provides continuous sensor data, but the analytics is on-demand and you pay for those services only when you use them.
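A sketch of the ingestion side of that pattern: sensors push readings into a Kinesis stream continuously, while the analytics downstream runs on demand. The stream name and record fields are placeholder assumptions; the example assumes boto3 and configured credentials.

```python
# Sketch: continuous sensor ingestion into Kinesis (placeholder names).
import json
from datetime import datetime, timezone

def make_sensor_record(sensor_id: str, temperature_c: float) -> str:
    """Serialize one sensor reading as a JSON document."""
    return json.dumps({
        "sensor_id": sensor_id,
        "temperature_c": temperature_c,
        "ts": datetime.now(timezone.utc).isoformat(),
    })

def send_reading(stream_name: str, sensor_id: str, temperature_c: float) -> None:
    """Put one reading on the stream, keyed by sensor for ordering."""
    import boto3  # imported here so the serializer stays SDK-free
    kinesis = boto3.client("kinesis")
    kinesis.put_record(
        StreamName=stream_name,
        Data=make_sensor_record(sensor_id, temperature_c).encode(),
        PartitionKey=sensor_id,
    )

if __name__ == "__main__":
    print(make_sensor_record("sensor-1", 21.5))
    # send_reading("iot-telemetry", "sensor-1", 21.5)
```

The "always on" part is just this producer loop; the query services that read the landed data bill only when you invoke them.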
We also deliver our AI/ML services this way. While we offer GPUs and frameworks for experts, we also make higher-level services for image and video analysis, transcription, translation, etc., available via APIs with nothing for you to manage. Simply put, we do a bunch of work behind the scenes so you don't have to.
As cloud service usage increases, cloud spending increases.
We have to find where we can improve and optimize.
Provide visibility to prove the insight and lead to action.
Consumption model: turn resources off when work stops -> pay only for what you use.
Measure: business output
Stop?
Analyze:
Managed services: lower operational cost