
Cloud Data Migration Strategies - AWS May 2016 Webinar Series


AWS offers a variety of methods to migrate your data into the cloud. You may want to perform regular backups, start collecting device streams, migrate a single large datastore, or simply establish dedicated connectivity and figure out what to do next. Which AWS cloud data migration offering is right for your needs?

This webinar will give you an overview of the six data migration tools we offer, including the strengths and weaknesses of each, as well as their complementary opportunities.

Learning Objectives:
• An overview of cloud data migration
• The basics of the six services (Direct Connect, Storage Gateway, Snowball, Transfer Acceleration, Firehose, 3rd party partners)
• An overview of the Amazon Content Distribution network and how it can help with long distance transfers into and out of the cloud
• Special emphasis on the new Amazon S3 Transfer Acceleration feature


Cloud Data Migration Strategies - AWS May 2016 Webinar Series

  1. 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS Business Development May 2016 Cloud Data Migration: Eight Strategies for Getting Data into AWS
  2. 2. Storage is the Gravity for Cloud Applications
  3. 3. Amazon EFS File Amazon EBS Amazon EC2 Instance Store Block Amazon S3 Amazon Glacier Object Data Transfer AWS Direct Connect ISV Connectors Amazon Kinesis Firehose Storage Gateway S3 Transfer Acceleration AWS Storage is a Platform AWS Snowball Amazon CloudFront Internet/VPN
  4. 4. Internet / VPN Ingest
  5. 5. What is Internet/VPN… Globally Available Default method of ingesting content into Amazon S3 Simple standards based (HTTP) connection Use your existing internet connection Available within a VPC for VPN connectivity Acceleration via Multipart Upload Data Transfer Into AWS is free VPN Connections using VPC Virtual Private Gateway •$0.05 per VPN Connection-hour •$0.048 per VPN Connection-hour for connections to the Tokyo region
  6. 6. How does Internet / VPN ingest work? Accelerate Data Transfer using Multipart Upload Ingest Data Directly Into S3 Buckets with existing internet connectivity S3 Bucket AWS Region and Via Management Console or API customer gateway endpoints VPN connection Internet Internet via VPN + VPC
  7. 7. Amazon S3 Transfer Acceleration
  8. 8. What is Amazon S3 Transfer Acceleration… Network and Protocol Based Data Transfer Service Acceleration of Data Ingress / Egress with S3 Buckets Typically 50% to 400% faster Feature of S3 Enabled at the Bucket Level Available at All S3 Regions Worldwide No Client / Server Software Required No Code Changes to Your Application No Firewall Exceptions Simple Pricing Model
  9. 9. Ingest & Egress with S3 transfer acceleration S3 Bucket AWS Edge Location Uploader Optimized Throughput! Uses AWS 55 global edge locations AWS determines best edge location Data transfer optimized between edge and customer, and edge and S3 Data is not stored on the edge cache
  10. 10. Amazon Route 53 Resolve HTTPS PUT/POST HTTP/S PUT/POST “” Service traffic flow Client to S3 Bucket example S3 Bucket EC2 Proxy AWS Region AWS Edge Location Customer Client 1 2 3 4 Data is not cached on the AWS Edge location Fully Managed File Transfer Acceleration using all AWS Edge Locations
  11. 11. Using the Service is as easy as 1, 2, 3… Update Application to Point to new S3 URL • Update“” to “<bucket-name>” • Original bucket location and contents are the same, only namespace changes Or Use Permissions via API s3:PutAccelerateConfiguration Enable the Service in the Management Console Start Uploading Data to Amazon S3 1 2 3
  12. 12. How fast is S3 transfer acceleration? [Chart: time in hours for a 500 GB upload to a bucket in Singapore from these edge locations, Public Internet vs. S3 Transfer Acceleration: Rio De Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore]
  13. 13. How much will it help me?
  14. 14. Speed Checker Demo
  15. 15. Pricing* (Dimension: Price / GB): Data Transfer In from Internet**: $0.04 (Edge location in US, EU, JP), $0.08 (Edge location in rest of the world); Data Transfer Out to Internet: $0.04; Data Transfer Out to Another AWS Region: $0.04; Amazon S3 Charges: Standard data transfer charges apply. *Plus standard Amazon S3 data transfer charges apply. **Accelerated performance or there is no bandwidth charge.
  16. 16. Amazon CloudFront
  17. 17. Global Content Delivery Network 55 Edge Locations Worldwide Supports Ingest via PUT/POST methods Works with S3 Multi-part upload Supports SSL SNI and TLS connections Integrated with ACM and AWS WAF for additional security Proxy ingest to S3, EC2 and even your own origins Tiered and Custom Pricing Models What is Amazon CloudFront…
  18. 18. Using CloudFront to Ingest Data into AWS AWS Region Customer Client HTTP/S PUT/POST “” Amazon EC2 S3 Bucket ELB Custom Origin CloudFront Edge Location Ingest content into S3, EC2, ELB or your own custom origin with Amazon CloudFront Use cache behaviors to direct to the correct origin based on PATH pattern matching Restrict Access via Geo Restriction or AWS WAF Web ACL
  19. 19. Amazon CloudFront Pricing Data Transfer out of Amazon CloudFront to your origin server billed at the “Regional Data Transfer Out to Origin” rates listed in the Regional Data Transfer Out to Origin (per GB) table. Data Transfer out of Amazon CloudFront to Internet will be charged at rates listed in “Regional Data Transfer Out to Internet (per GB)” table. Amazon CloudFront offers additional pricing options via a CloudFront Reserved Capacity (CFRC) contract. Contact sales for additional details and pricing.
  20. 20. AWS Direct Connect
  21. 21. Dedicated, 1 or 10 GE private pipes into AWS Create private (VPC) or public virtual interfaces to AWS Reduced data-out rates (data-in still free) Consistent network performance At least 1 location to each AWS region Option for redundant connections Uses BGP to exchange routing information over a VLAN What is AWS Direct Connect…
  22. 22. Physical Connection • Cross Connect at the location • Single Mode Fiber - 1000Base-LX or 10GBASE-LR • Potential onward Delivery via Direct Connect Partner • Customer Router
  23. 23. At the Direct Connect Location CORP AWS Direct Connect Routers Customer Router Colocation DX Location Customer Network AWS Backbone Network Cross Connect Customer Router Customers Network Demarcation
  24. 24. Dedicated Port via Direct Connect Partner AWS Direct Connect Routers Colocation DX Location Partner Network AWS Backbone Network Cross Connect Customer Router Partner Network Access Circuit Demarcation Partner Equipment CORP
  25. 25. Direct Connect - Locations (AWS Region: AWS Direct Connect Location): Asia Pacific (Singapore): Equinix SG2, GPX, Mumbai; Asia Pacific (Seoul): KINX, Seoul; Asia Pacific (Sydney): Equinix SY3, Global Switch; Asia Pacific (Tokyo): Equinix OS1, Equinix TY2; China (Beijing): Sinnet JiuXianqiao IDC, CIDS Jiachuang IDC; EU (Frankfurt): Equinix FR5, Interxion Frankfurt; EU (Ireland): TelecityGroup, London Docklands, Eircom Clonshaugh, Equinix LD4 - LD6, London; South America (Sao Paulo): Terremark NAP do Brasil, Tivit; US East (Virginia): CoreSite NY1 & NY2, Equinix DC1 - DC6 & DC10; US West (Northern California): CoreSite One Wilshire & 900 North Alameda, CA, Equinix SV1 & SV5; US West (Oregon): Equinix SE2 & SE3, Switch SUPERNAP, Las Vegas; AWS GovCloud (US): Equinix SV1 & SV5
  26. 26. Amazon Kinesis Firehose
  27. 27. Amazon Kinesis Platform Amazon Kinesis streaming data on the AWS cloud • Amazon Kinesis Streams • Amazon Kinesis Firehose • Amazon Kinesis Analytics
  28. 28. Amazon Kinesis Firehose Load massive volumes of streaming data into Amazon S3 and Amazon Redshift Zero administration: Capture and deliver streaming data into S3, Redshift, and other destinations without writing an application or managing infrastructure. Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 secs using simple configurations. Seamless elasticity: Seamlessly scales to match data throughput w/o intervention Capture and submit streaming data to Firehose Firehose loads streaming data continuously into S3 and Redshift Analyze streaming data using your favorite BI tools
  29. 29. Vertical/Use Case Accelerated Ingest- Load to final destination for Analytics Ad Tech/ Marketing Analytics Advertising data aggregation Consumer Online/Gaming Online customer engagement data aggregation Financial Services Market/ Financial Transaction order data collection IoT / Sensor Data Fitness device , vehicle Sensor, telemetry data ingestion Amazon Kinesis Firehose Use Cases
  30. 30. AWS Storage Gateway
  31. 31. What is AWS Storage Gateway? Works with your existing applications Secure and durable storage in AWS Low-latency for frequently used data Scalable and cost-effective on-premises storage - $125 per gateway per month + S3/Glacier storage fees Service connecting an on-premises software appliance with cloud-based storage
  32. 32. Common uses for AWS Storage Gateway Backup and archive Disaster recovery Data migration
  33. 33. How does AWS Storage Gateway work? Amazon EBS snapshots Amazon S3 Amazon Glacier AWS Storage Gateway appliance Application server AWS Storage Gateway backend Customer premises S3 Transfer Acceleration AWS Direct Connect Internet
  34. 34. AWS Storage Gateway configurations iSCSI block storage Gateway-stored volumes iSCSI virtual tape storage Low-latency for all your data with point-in-time backups to AWS Replacement for on-premises physical tape infrastructure for backup and archive Gateway-cached volumes Gateway-virtual tape library (VTL) Low-latency for frequently used data with all data stored in AWS
  35. 35. Gateway-virtual tape library (VTL) • Replace or augment your aging tape infrastructure with durable object storage • Virtual tapes stored in AWS. Frequently accessed data cached on-premises • Up to 1,500 tapes, up to 2.5 TB each, for up to 150 TB per gateway-VTL • Unlimited number of tapes in virtual tape shelf (VTS) Customer data center VTS storage backed by Amazon Glacier AWS Storage Gateway VM Backup Server INITIATOR AWS Storage Gateway service MEDIA CHANGER Upload Buffer Cache Storage Gateway-VTL storage backed by Amazon S3 VTS TAPE DRIVE
  36. 36. AWS Snowball
  37. 37. What is AWS Snowball? Petabyte-scale data transport E-ink shipping label Ruggedized case “8.5G impact” All data encrypted end-to-end Rain- and dust- resistant Tamper-resistant case and electronics 80 TB 10 GE network
  38. 38. How it works
  39. 39. How fast is Snowball? • Less than 1 day to transfer 200 TB via 3x10G connections with 3 Snowballs, less than 1 week including shipping • Number of days to transfer 200 TB via the Internet at typical utilizations, by Internet connection speed (1 Gbps, 500 Mbps, 300 Mbps, 150 Mbps): at 25% utilization: 71, 141, 236, 471; at 50% utilization: 36, 71, 118, 236; at 75% utilization: 24, 47, 79, 157
  40. 40. Use cases: AWS Snowball Cloud Migration Disaster Recovery Data Center Decommission Content Distribution
  41. 41. Pricing (Dimension: Price): Usage Charge per Job: $200.00 (50 TB), $250.00 (80 TB); Extra Day Charge (First 10 days* are free): $15.00; Data Transfer In: $0.00/GB; Data Transfer Out: $0.03/GB; Shipping**: Varies; Amazon S3 Charges: Standard storage and request fees apply. * Starts one day after the appliance is delivered to you. The first day the appliance is received at your site and the last day the appliance is shipped out are also free and not included in the 10-day free usage time. ** Shipping charges are based on your shipment destination and the shipping option (e.g., overnight, 2-day) you choose.
  42. 42. AWS Technology Partnerships
  43. 43. Amazon Storage Partner Ecosystem: Gateway/NAS, Data Management, Sync and Share, Backup/DR, Content and Acceleration, Archive, File System
  44. 44. Example of Data Transfer with Partner Solution: Attunity Cloudbeam for AWS S3 EMR Hourly Model, BYOL, and Free Trial Available
  45. 45. Backup to AWS Approaches Amazon S3 Amazon Glacier AWS Direct Connect Internet Amazon S3-IA Application servers Cloud Gateway Local disk Media Server Cloud Gateway Application servers Backup SW cloud connector Local disk Media Server with cloud connector
  46. 46. CommVault Ties Together On Premise and Cloud Data Strategies Commvault Orchestrates the Enterprise • Back up in the Cloud: Keep backups of cloud workloads internal to the cloud • Back up to the Cloud: Allow on-premises workloads to leverage AWS • Disaster Recovery to the Cloud: Automate disaster recovery to the cloud on a scheduled basis • Workload Portability: Rest assured that virtual servers can be moved from on-premises to the cloud and back; keep your data available wherever you need it • Archiving to the Cloud: Move legacy data to tier 2 storage in the cloud for long-term archive AWS and Commvault together minimize networking, storage and infrastructure costs, while providing the business a sound data protection and disaster recovery strategy.
  47. 47. Backup to AWS Approaches Amazon S3 Amazon Glacier AWS Direct Connect Internet Amazon S3-IA Application servers Cloud Gateway Local disk Media Server Cloud Gateway Application servers Backup SW cloud connector Local disk Media Server with cloud connector
  48. 48. NetApp AltaVault Backup from On-premises to S3/Glacier Common backup applications integrated with AltaVault Solve backup & archive headaches with cloud-integrated storage • 90% reduction in time, cost, and data volumes • Shrink recovery times from days to minutes • 85% of backup & software providers supported Glacier On Premises AWS Cloud-integrated storage appliance NetApp AltaVault FAS E-Series Non-NetApp Storage • NetApp SnapProtect • Arcserve • CommVault Simpana • EMC NetWorker • HP Data Protector • IBM Tivoli Storage Mgr • Symantec Backup Exec • Symantec NetBackup • Veeam • Microsoft SQL Server • Oracle RMAN S3 AltaVault also available on marketplace to protect cloud-native workloads Seamlessly integrates into existing storage and backup software environment Caches recent backups locally, vaults older copies to the cloud Store data in the public or private cloud of choice Deduplicates, compresses, and encrypts
  49. 49. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. unified file services that extend from endpoints, to remote offices, to the cloud. snapshots, file versioning and file sync runs across all access points via the cloud data is secured and optimized at the source all stored in your AWS VPC, data is stored on AWS S3-IA Integrated with trusted enterprise security and management tools ROBO NAS Gateways Endpoint Apps Cloud Server Agents Data Protection Engine File Sync Engine with centralized automation, management and multi-tenancy Identity management data governance cloud orchestration S3 Infrequent Access CTERA Global Deduplication Ctera: Enterprise File Services Platform
  50. 50. Summary – When to Use each Service (IF YOU NEED: CONSIDER). An optimized or replacement Internet connection to: connect directly into an AWS regional datacenter: Direct Connect; migrate TB or PB of data to the cloud: Snowball; accelerate data transfer: S3 Transfer Acceleration, CloudFront, AWS Partner. A friendly interface into S3 to: cache data locally in a hybrid model (for performance reasons): Storage Gateway, AWS Partner; redirect backups or archives with minimal disruption: Storage Gateway, AWS Partner; aggregate data streams from multiple devices: Kinesis Firehose
  51. 51. Thank you!
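
Slides 5 and 6 name multipart upload as the way to accelerate Internet/VPN ingest into S3. The following is a minimal sketch of how an uploader might choose a part size and split an object into parts. The limits used here (a 5 MiB minimum part size and at most 10,000 parts) are assumptions based on the publicly documented S3 limits, not something stated in the deck, and the function names and the 8 MiB default are hypothetical.

```python
import math

MIN_PART_SIZE = 5 * 1024 ** 2   # assumed minimum size for every part except the last
MAX_PARTS = 10_000              # assumed maximum number of parts per upload


def plan_parts(object_size, preferred_part_size=8 * 1024 ** 2):
    """Return (part_size, part_count) for splitting an object into upload parts."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Grow the part size when the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    part_count = math.ceil(object_size / part_size)
    return part_size, part_count


def part_ranges(object_size, part_size):
    """Yield (part_number, start_offset, length) for each part, numbered from 1."""
    for number, start in enumerate(range(0, object_size, part_size), start=1):
        yield number, start, min(part_size, object_size - start)
```

Each range can then be uploaded independently and in parallel, which is where the speed-up over a single sequential stream comes from.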

Notas del editor

  • Hi everyone and thanks for joining the webinar. Today we’re going to discuss eight effective methods to move data into the AWS cloud and which use cases each one is best suited for. The audience is expected to have basic knowledge of core AWS principles and services, such as VPC and S3.
  • Storage is more than just the protocol or interface. It’s the lifeblood of application design and renewed architectures. Our customers have taught us that they need two things: scale and trust. 1. Make sure I can grow. 2. Make sure I can access what I need when I need it (and of course help me keep costs down). Today we’re focusing on data migration options, but I wanted to give you a quick taste of the different storage services we provide once you transfer the data over.

    Data comes in all shapes and sizes, with different requirements for performance, cost and access interface. AWS provides several services that address those different needs: S3 is a globally accessible and highly scalable object storage service. Glacier has a similar object architecture but caters for deep archives and long-term backup. EBS is high-performance block storage dedicated to EC2, and EFS is a high-performance NFS-based file system dedicated to EC2.

    What’s common to all these services is that they’re all run on AWS which means there’s no need for you to put up any upfront investment or commitment, you pay only for what you use and don’t have to perform any complex or risky capacity planning modeling. It’s easy to use, all of them were designed to very high levels of availability and durability and allow you to reduce time to market by easily provisioning PBs of capacity in minutes without ever worrying about running out of space or hitting performance bottlenecks.

    The suite of transfer services that support customers in their migrations means more choice. Large batches, incremental changes, constant streams or seamless integration are all part of the storage offering. Today we’re going to talk about all 6 of these data migration strategies, including two of the newest ways to do cloud data migration, Snowball and S3 Transfer Acceleration.

    Note to presenters: Disk Transfer service is not EOL but has been deprecated out of the transfer services story in favor of Snowball. Snowball has already surpassed the amount of data imported over the lifetime of the disk transfer service.

    EFS is in preview and due before the end of the year
  • Let’s start with DX
  • S3 Transfer Acceleration Service provides our customers with an easy to use option of accelerating content ingest and egress to or from an S3 bucket.
    The service makes use of protocol optimization and takes advantage of AWS private network assets to speed up data transfer.

    Customers can enable this feature on an S3 bucket by bucket basis and elect to use the service on an asset by asset basis.

    You can use all of the Amazon S3 operations through the Transfer Acceleration endpoint, except for the following operations:
    GET Service (list buckets), PUT Bucket (create bucket), and DELETE Bucket. Also, Amazon S3 Transfer Acceleration does not support cross-region copies using PUT Object - Copy.
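
As a rough illustration of the rule in this note, a client could route each request to either the accelerate endpoint or the regular endpoint depending on the operation. This is only a sketch: the operation labels are hypothetical names invented for the example, and the host name patterns and the restriction on periods in bucket names are assumptions drawn from the public S3 documentation rather than from this deck.

```python
# Operations this note lists as unavailable through the accelerate endpoint.
# The string labels are hypothetical names chosen for this sketch.
UNSUPPORTED_ON_ACCELERATE = {
    "ListBuckets",            # GET Service
    "CreateBucket",           # PUT Bucket
    "DeleteBucket",           # DELETE Bucket
    "CrossRegionCopyObject",  # PUT Object - Copy across regions
}


def use_accelerate(operation):
    """Return True when the operation may go through the accelerate endpoint."""
    return operation not in UNSUPPORTED_ON_ACCELERATE


def bucket_host(bucket, operation):
    """Pick the host name for a bucket-level request (assumed host patterns)."""
    if use_accelerate(operation):
        # Assumption: accelerated buckets need DNS-style names without periods.
        if "." in bucket:
            raise ValueError("bucket name must not contain periods")
        return f"{bucket}.s3-accelerate.amazonaws.com"
    return f"{bucket}.s3.amazonaws.com"
```

For example, `bucket_host("my-bucket", "PutObject")` selects the accelerate host, while `bucket_host("my-bucket", "DeleteBucket")` falls back to the regular one.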
  • This is why we’re happy to introduce S3 Transfer Acceleration, a way to move data faster over long geographic distances. “Long distances” means between continents, not across town. It ensures that your data moves as fast as your first mile, and removes the vagaries of intermediate networks.

    In internal testing and beta customer results, S3-XA has shown typical performance benefits of up to 400% (5x) in optimal conditions.
    S3-XA is extremely simple to use. As it is a feature of S3, you simply need to enable it on your bucket with a checkbox and change your endpoint.
    To mitigate the problem we described earlier about the long paths a file transfer takes, S3-XA leverages our 54 POP locations to ensure your transfers travel a shorter distance on the public internet and then travel the remaining portion over an optimized route via the Amazon backbone.
    Since S3-XA is an extension of S3, it uses standard TCP and HTTP and thus does not require any firewall exceptions or custom software installation.
  • This is what the flow of a request transferred through S3-XA looks like:
    The client’s request hits Route 53, which resolves the accelerate endpoint to the best POP in terms of latency. From there, S3 Transfer Acceleration selects the fastest path to send data over persistent HTTPS connections to an EC2 proxy fleet in the same AWS region as the S3 bucket. We maximize the send and receive windows here to maximize the customer’s utilization of the available bandwidth. From here, the request is finally sent to S3.

    The service achieves acceleration thanks to:
    Routing optimized to maximize routing on AMZN network
    TCP optimizations along the path to maximize data transfer
    Persistent connections to minimize connection setup and maximize connection reuse
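
To see why window sizes matter on long paths, recall the general rule that a single TCP connection cannot move more than one window of data per round trip. The sketch below applies that rule; it is standard networking arithmetic, not a description of how the service is implemented, and the window and round-trip values in the usage note are hypothetical.

```python
def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on single-connection throughput: one window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def window_needed_bytes(target_mbps, rtt_ms):
    """Window size (bandwidth-delay product) needed to sustain target_mbps."""
    return target_mbps * 1e6 / 8 * (rtt_ms / 1000)
```

With a 64 KiB window and a 200 ms round trip, `max_tcp_throughput_mbps(65536, 200)` comes out at roughly 2.6 Mbps regardless of link speed, while sustaining 100 Mbps over the same path would need a window of about 2.5 MB.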

  • See how much geography hurts?

    In general, the farther your bucket, the more benefit from moving over the AWS network.
  • To determine if S3-XA is something that will benefit you and your customers, we developed an S3-XA Speed Checker to compare the likely transfer speed for a given endpoint. The tool compares the upload speed of S3 and S3-XA from the location where the tool is running to the other S3 regions. Depending on where your S3 bucket lives, you can determine if S3-XA will give you the performance benefits you desire before turning the feature on.
  • In general, the farther away you are from an Amazon S3 region, the higher the speed improvement you can expect from using Amazon S3 Transfer Acceleration. If you see similar speed results with and without the acceleration, your upload bandwidth or a system constraint might be limiting your speed.
  • Data Transfer In from Internet depends on the location from where the request originated.
    No request fees.
    Simple per GB pricing.

    Legal-approved language on fast or free: For uploads only, each time you use Amazon S3 Transfer Acceleration to transfer an object, we will check whether Amazon S3 Transfer Acceleration likely will be faster than a regular Amazon S3 transfer. To do this, we will use the origin location of the object transferred and the location of the edge location processing the accelerated transfer relative to the destination AWS region. If we determine, in our sole discretion, that Amazon S3 Transfer Acceleration likely was not faster than a regular Amazon S3 transfer of the same object to the same destination AWS region, we will not charge for that use of Amazon S3 Transfer Acceleration.
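
The pricing note can be turned into a small estimate of the acceleration charge for an upload. The sketch below uses the per-GB prices from the pricing slide ($0.04 for edge locations in the US, EU and Japan, $0.08 elsewhere) and models "fast or free" as a simple boolean. The function name and the area codes are hypothetical, and the result excludes the standard Amazon S3 charges that still apply.

```python
# Per-GB acceleration prices for uploads, taken from the pricing slide.
ACCEL_UPLOAD_PRICE_PER_GB = {"US": 0.04, "EU": 0.04, "JP": 0.04}
REST_OF_WORLD_PRICE_PER_GB = 0.08


def acceleration_upload_charge(gigabytes, edge_area, likely_faster):
    """Estimate the acceleration charge (USD) for one upload.

    edge_area is a hypothetical label for where the edge location sits.
    likely_faster stands in for the service's own fast-or-free determination.
    """
    if gigabytes < 0:
        raise ValueError("gigabytes must not be negative")
    if not likely_faster:
        return 0.0  # no acceleration charge when the transfer was likely not faster
    price = ACCEL_UPLOAD_PRICE_PER_GB.get(edge_area, REST_OF_WORLD_PRICE_PER_GB)
    return gigabytes * price
```

Under these assumptions, a 500 GB upload through a US edge location would carry an acceleration charge of about $20, and the same upload through an edge location elsewhere about $40.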
  • Let’s start with DX
  • DX provides you with a dedicated pipe into AWS, and we’ll cover how that works over the next few slides. It can either extend your on-prem network into your own Amazon VPC or give you direct access to regional services, such as S3.

    As with sending data over the Internet, data transfer in is free, but data transfer out is usually 2 or 3 cents per GB, which is significantly cheaper than Internet Out.

    Because you have a dedicated pipe you get consistent throughput and latency.
    Each AWS Region is associated with at least 1 Direct Connect Location. These locations are large co-lo facilities like Equinix or CoreSite with large concentrations of customers.

    You can deploy multiple 1 and 10 GE connections for bandwidth aggregation and redundancy and also use the Internet to backup your DX links.

    Finally we use the BGP protocol to peer over the VLANs in the dedicated link.
  • We extend private fiber from our AWS region to the Direct Connect locations associated with the same region and set up a patch panel in a Meet Me Room.
    If you already have a footprint in these locations, it’s an easy cross connect into Direct Connect. If not, you can work with one of our Telco Partners to establish last mile connectivity. We recommend redundant fiber connections, or at least a VPN connection as a backup.
  • There are 2 main deployment models. The first, depicted in this diagram, is for a customer that already has infrastructure deployed at a colocation provider, such as Equinix or CoreSite, where we offer DX connectivity. Here you can place a router in the colocation and the colo provider will cross connect it to the AWS backbone. What you buy is the cross connect service from the colo and access to GE or 10 GE ports from AWS, and you get dedicated connectivity from your network to ours.

  • Alternatively, a Direct Connect Partner, such as AT&T or Level3 can arrange a cross-connect on your behalf to their network equipment at the facility.
    They can then transport it using one of the previously mentioned services to your location and router where you still run the BGP over VLANs. This is a good choice if you don’t have a presence in any of our supported colocations.
  • We have more than 20 DX locations at this time, and we keep expanding the service based on customer feedback and according to our region expansion plans. You can see we also have a DX location for our recently launched region in South Korea. You can view the complete list of locations, associated regions and partners globally on our website.
  • Finally we have Kinesis Firehose, a new and exciting service we introduced late last year.
  • Back in November 2013, at re:Invent, we introduced Amazon Kinesis to the world.
    Since then, the Amazon Kinesis platform has become a core foundation service in AWS that other services use (e.g., CloudWatch, DynamoDB Streams).

    Amazon Kinesis is a platform for streaming data on AWS, offering powerful services to make it easy to load and analyze streaming data, and also providing the ability for you to build custom streaming data applications for specialized needs. Web applications, mobile devices, wearables, industrial sensors, and many software applications and services can generate staggering amounts of streaming data – sometimes TBs per hour – that need to be collected, stored, and processed continuously. Amazon Kinesis services enable you to do that simply and at a low cost.
  • Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture and automatically load streaming data into S3 and Redshift enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security. You can easily create a Firehose delivery stream from the AWS Management Console, configure it with a few clicks, and start sending data to the stream from hundreds of thousands of data sources to be loaded continuously to AWS – all in just a few minutes.
    With Amazon Kinesis Firehose, you only pay for the amount of data you transmit through the service. There is no minimum fee or setup cost.
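
The batch-and-compress behavior described in this note can be pictured with a small in-memory buffer that collects records and emits one compressed blob when a size or age threshold is reached. This is a conceptual sketch only, not the Firehose API or its implementation: the class name and the 5 MiB size threshold are hypothetical, and the 60-second age threshold echoes the "as little as 60 secs" figure from the slide.

```python
import gzip
import time


class BatchBuffer:
    """Collect small records and emit one gzip-compressed batch at a threshold."""

    def __init__(self, max_bytes=5 * 1024 * 1024, max_age_seconds=60,
                 clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._records = []
        self._size = 0
        self._first_at = None

    def add(self, record):
        """Buffer a bytes record; return a compressed batch if one is due, else None.

        The age check only runs when a record arrives, which keeps the sketch
        simple; a real delivery stream would also flush on a timer.
        """
        if not self._records:
            self._first_at = self._clock()
        self._records.append(record)
        self._size += len(record)
        too_big = self._size >= self.max_bytes
        too_old = self._clock() - self._first_at >= self.max_age_seconds
        return self.flush() if (too_big or too_old) else None

    def flush(self):
        """Compress and return the buffered records, or None if the buffer is empty."""
        if not self._records:
            return None
        payload = gzip.compress(b"\n".join(self._records) + b"\n")
        self._records, self._size = [], 0
        return payload
```

Each returned payload corresponds to one larger object that would be written to the destination, instead of one write per record.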
  • Let’s look at how customers adopt Kinesis Firehose and use Amazon S3 as a data lake for analytics purposes.
    The primary use case is accelerated ingest, when vast amounts of streaming data need to be collected. Each record is pretty small (bytes to KB), but millions or even billions of these records need to be aggregated into larger files and persisted in S3.
    In AdTech and Marketing, customers are using it for advertising data aggregation for advanced analytics, for example optimizing bid scores and email campaign management. Clickstream data analytics is also popular as we try to analyze user behavior by looking at data assets.
    In Financial Services, customers are running audits and capturing massive amounts of real-time financial transactions.
    Finally, IoT has become popular as well over the last couple of years, with data flowing from fitness devices, vehicle sensors, telemetry data, machine data and more.
  • Moving on to ingest services, let’s start with the AWS SGW.
  • SGW expands or replaces your on-premises storage, using a software appliance that connects to AWS storage.
    SGW works with your existing applications using industry standard storage protocols so you don’t need to modify your applications or use new data management tools.
    In Windows, SGW can appear as a new drive letter, and in Linux it can be mounted as a file system. Applications are unaware they are reading and writing data to and from cloud-backed storage.
    Data is encrypted before it leaves your data center and while stored in AWS, and storage is backed by Amazon S3, Glacier, and EBS.
    The data you access most frequently is kept on-premises giving low-latency access to data which is durably stored in the cloud.
    Like many AWS services, you grow as needed and only pay for the storage you use.
    There’s no need to over-provision or over purchase storage and you can expand or reduce storage without needing to add or remove hardware.
  • We see customers adopting Storage Gateway for 3 common use cases:

    The first is backup and archive which is a natural fit for cost-effective cloud-storage. The SGW can be added to your existing backup process without impacting production operations.

    The second common use is Disaster Recovery (DR).
    Customers use the service for secure and durable off-site storage in the event of a disaster.
    Using SGW allows for data recovery to a secondary data center or into AWS which is increasingly becoming the DR site for many customers.

    The next use is customers who are migrating or need to mirror datasets into AWS. SGW simplifies this process moving or copying your data into or out of AWS.
  • Let’s take a look at how the service works.

    On-premises, your Application server connects to the AWS Storage Gateway appliance.
    Your Application could be a Backup Server

    The Appliance – which we often refer to as the gateway – securely connects to the service backend.
    The connection from the Appliance is outbound towards AWS using HTTPS so it’s firewall friendly.
    You can route this over the open Internet or use a Direct Connect if you have one.
    Additionally, you can leverage S3-XA to accelerate data transfer from on prem to AWS.
    The backend in turn integrates with Amazon S3, Glacier, and EBS to provide secure and durable storage.
  • The AWS SGW has 3 modes of operation that you can configure: Gateway-Cached volumes, Gateway-Stored volumes, and Gateway-VTL.

    Gateway-Cached volumes allow you to utilize Amazon S3 for your primary data, while retaining some portion of it locally in a cache for frequently accessed data.

    Gateway-Stored volumes store your primary data locally, while asynchronously backing up that data to AWS. These volumes provide your on-premises applications with low-latency access to their entire data sets, while providing durable, off-site backups. 
    Both Stored and Cached Volumes provide iSCSI block storage

    Gateway-VTL provides virtual tape storage. Let’s take a deeper look at how the VTL mode works….
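
    As a rough illustration of the cached-volume idea described above, here is a toy model in which every block lives in a backing store while a small local cache holds recently used blocks. The class name, the least-recently-used eviction policy, and the synchronous write to the backing store are all assumptions made for this sketch; the notes do not say how the gateway actually manages its cache or uploads.

```python
from collections import OrderedDict


class CachedVolume:
    """Toy model of a cached volume (hypothetical, not the gateway's real logic).

    All blocks are kept in `backend`, which stands in for cloud storage.
    A bounded local cache keeps the most recently used blocks.
    """

    def __init__(self, capacity):
        self.capacity = capacity      # max number of blocks held locally
        self.backend = {}             # stands in for durable cloud storage
        self.cache = OrderedDict()    # local cache, most recently used last
        self.remote_reads = 0         # how often a read had to go to the backend

    def write(self, block_id, data):
        # Simplification: write to the backend immediately, then cache locally.
        self.backend[block_id] = data
        self._remember(block_id, data)

    def read(self, block_id):
        if block_id in self.cache:
            # Cache hit: served locally, mark as most recently used.
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Cache miss: fetch from the backend and keep a local copy.
        self.remote_reads += 1
        data = self.backend[block_id]
        self._remember(block_id, data)
        return data

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        if len(self.cache) > self.capacity:
            # Evict the least recently used block (assumed policy).
            self.cache.popitem(last=False)
```

    A stored volume would be the reverse arrangement in this model: the full data set stays local and the backend only receives backup copies.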
  • VTL is ideal for replacing on-premises physical tape infrastructure for backup and archive.

    Similar to cached volumes all your data is durably stored in AWS with your frequently accessed data cached on the gateway

    In your VTL you can configure up to 1,500 tapes of up to 2.5 TB each (LTO-6 size), with the combined size of all tapes capped at 150 TB per VTL.

    For longer-term storage you can archive your tapes to a Virtual Tape Shelf (VTS), which is backed by Glacier and provides lower-cost long-term storage. There are no limits on the number of tapes you can move to VTS.

    Gateway-VTL is integrated with 9 backup applications from Symantec/Veritas, Dell, Microsoft, and Veeam.
    And we’re actively working on adding more.
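
    The VTL limits quoted above interact with each other, so a quick check of a tape plan can help. The sketch below assumes the 150 TB figure is a cap on the combined size of all tapes in one VTL (1,500 tapes at 2.5 TB each would otherwise add up to far more). The function and constant names are made up for this example.

```python
# Limits as quoted in the notes above (per VTL).
MAX_TAPES = 1500        # maximum number of virtual tapes
MAX_TAPE_TB = 2.5       # maximum size of a single tape, in TB
MAX_TOTAL_TB = 150      # assumed cap on the combined size of all tapes, in TB


def check_tape_plan(tape_sizes_tb):
    """Return a list of problems with the plan; an empty list means it fits."""
    problems = []
    if len(tape_sizes_tb) > MAX_TAPES:
        problems.append(f"too many tapes: {len(tape_sizes_tb)} > {MAX_TAPES}")
    oversized = [size for size in tape_sizes_tb if size > MAX_TAPE_TB]
    if oversized:
        problems.append(f"{len(oversized)} tape(s) larger than {MAX_TAPE_TB} TB")
    total = sum(tape_sizes_tb)
    if total > MAX_TOTAL_TB:
        problems.append(f"total size {total} TB exceeds {MAX_TOTAL_TB} TB")
    return problems
```

    Under that assumption, 60 full-size tapes already fill a VTL, and the 1,500-tape limit only matters when tapes average 100 GB or less.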
  • Sometimes using DX or Internet is either not a possibility, too expensive or not fast enough to transfer a large amount of data in time for our business purpose. For these reasons we offer an offline service called Snowball that provides both import and export capabilities.
  • Import/Export Snowball, a new service we introduced late last year, just got 60% more capacity. Snowball units are now 80 TB rather than 50 TB, and the service improves upon the traditional Import/Export Disk in several ways, based on the customer feedback we received. Snowball offers a petabyte-scale data transfer service using Amazon-provided storage devices for transport. With the original Import/Export Disk service, customers purchase their own portable storage devices. With Snowball, customers leverage a highly secure, shippable, rugged NAS device owned by Amazon.

    Snowball scales as you like: once the device is received and set up, you can copy up to 80 TB of on-prem files to it using the client software that comes with it. It supports common network interfaces, and you can use multiple Snowballs in parallel to support large migration jobs.

    It’s highly secure: before data is transferred to the Snowball appliance, the client software encrypts all of it with 256-bit encryption. The appliance is also equipped with tamper-resistant seals and a built-in Trusted Platform Module that uses a dedicated processor designed to detect any unauthorized modifications to the hardware, firmware, or software.

    From a durability perspective, now please don’t intentionally drop it, but this thing is amazing: it can sustain 8.5 Gs of force, and it’s dust proof and water resistant, so it can sustain some rain when it’s sitting on a shipping dock. The device itself is shippable, which means you don’t need to deal with the hassle of packing and unpacking it, and it automatically displays the right shipping information on the embedded Kindle-based E-ink display.

    When customers finish transferring data to the device, they simply ship it back to an AWS facility, where the data is ingested at high speed into Amazon S3. We’re working with our partners to enable integration with the solutions you use today on-prem, so that your application-oriented workloads will be fully aware of the data at every step of the process.
  • The Snowball service is completely driven by the AWS console, like our other services. In the console a customer is able to access the Snowball service under the AWS Import/Export Snowball link. Once there, a customer simply needs to create a data transfer job, specifying the S3 bucket(s) to use, the KMS encryption keys and the location they need a device shipped to. Once the device is received, the customer needs to connect the Snowball to power and the network, providing an IP address either manually or via DHCP. From there, data is copied to the Snowball via the client software, a command line tool loaded on a host in the environment, which encrypts all data before it is transferred to the Snowball. Once the data transfer is complete, simply power down the device and the return shipping information will update on the E-ink display automatically. Once the device is returned to Amazon, we will complete the data transfer from the Snowball to the specified S3 buckets. During this entire process the customer is notified at each step through the console, Amazon SNS, and/or text message.
  • So how fast can you throw this Snowball? Let’s compare the Internet with three Snowballs. With 3 of these, you can transfer 200 TB, a respectable amount of data, in less than a week. In comparison, if we use a 1 GE link at 50% capacity, we’re talking about 36 days…

    Customers have moved Petabytes of data in batches
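
    The comparison above can be checked with a back-of-the-envelope calculation. This sketch assumes decimal units (1 TB = 10^12 bytes), a steady transfer rate, and no protocol overhead; the function names are made up for this example. Under those assumptions the 1 GE link at 50% capacity comes out at about 37 days for 200 TB, in the same ballpark as the figure quoted above, and 200 TB needs three 80 TB devices.

```python
import math


def transfer_days(data_tb, link_gbps, utilization):
    """Days needed to send data_tb terabytes over a link at the given utilization."""
    data_bits = data_tb * 1e12 * 8                # decimal terabytes -> bits
    usable_bps = link_gbps * 1e9 * utilization    # usable bits per second
    seconds = data_bits / usable_bps
    return seconds / 86_400                       # seconds -> days


def snowballs_needed(data_tb, capacity_tb=80):
    """Number of Snowball devices needed for data_tb terabytes, rounded up."""
    return math.ceil(data_tb / capacity_tb)
```

    For example, `transfer_days(200, 1, 0.5)` gives the Internet estimate and `snowballs_needed(200)` gives the device count. Shipping and ingest time for the devices is not modeled here.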
  • Cloud Migration
    If you have large quantities of data you need to migrate into AWS – as part of an application server, file server, database, or backup/archive migration – AWS Snowball is often much faster and more cost-effective than transferring that data over the Internet.

    Scripps Networks Interactive is a leading developer of engaging lifestyle content in the home, food and travel categories for television, digital, mobile and publishing. The company, whose portfolio contains popular television and Internet brands HGTV, DIY Network, Food Network, Cooking Channel, Travel Channel and Great American Country, has used the Snowball service to migrate large sets of media assets to AWS, accelerating its migration beyond what it could have done with bandwidth alone.

    Disaster Recovery
    In the event that you need to quickly retrieve a large quantity of data stored in Amazon S3, AWS Snowball appliances can help retrieve the data much quicker than high-bandwidth Internet.

    Philips’ HealthSuite recently announced a new rapid, secure data backup service, powered by Snowball, that can move the entire data repository for any healthcare organization to the cloud in days instead of months. Philips is already familiar with the challenges of large-scale data ingest, with over 15 PB in AWS and growing at the pace of 1 PB a month. With this new cloud-enabled storage service, Philips aims to remove time and cost barriers for healthcare organizations adopting cloud-based data recovery.

    Zoolz was one of the first services to leverage Amazon Glacier to provide easy data tiering and a complete user management system using their Cold Storage Technology, making it easy for businesses to back up and archive within the same solution at an extremely low cost. Recently Zoolz announced the availability of a large-scale import/export service that enables their customers to securely send their data directly via Snowball instead of transferring it over the Internet.

    Datacenter Decommission
    There are many steps involved in decommissioning a datacenter to ensure valuable data is not lost. AWS Snowball can help ensure that your data is securely and cost-effectively transferred to AWS during this process.

    DevFactory is a Dubai-based provider of software and services founded in 2008. A spinoff of Trilogy Inc., one of the world’s largest privately held software companies, DevFactory develops software solutions for global enterprises. The company relies on extensive standardization and automation to eliminate manual processes and reduce costs. A primary focus for DevFactory is acquiring inefficient software-as-a-service (SaaS) firms and optimizing their business applications for performance. Using Snowball, DevFactory has been able to move over 1PB of data out of datacenters from acquisitions. The organization can also migrate new customers to the DevFactory environment 60 percent faster than before by using Snowball, saving the company time and money.

    Content Distribution
    Across several verticals, many customers have found Snowball invaluable if they need to regularly receive or share large amounts of data with clients, customers, or business associates. Appliances can be sent directly from AWS to client or customer locations.
  • Let’s talk about the AWS partners related to our topic today
  • AWS has a large ecosystem of partners that allow us to complement and close gaps in our product line. You can see a high level list of our storage and ingest related partners in this slide. I’d like to focus on a few select partners that provide compelling solutions to replace traditional backup, recovery and archive environments and provide ease of use, lower costs and scalability by leveraging S3, S3-IA and Glacier as cloud storage targets.
  • Let me spend a minute walking you through 2 cloud backup architecture alternatives and then talk about the partners that enable them.
  • Commvault is a great example of a comprehensive backup and archive solution on AWS. They’ve supported S3 and Glacier as cloud targets for several years now and last year added our new storage class, S3-IA, which is optimized for infrequently accessed data such as backup and archive. Using Commvault you can apply your organization’s data protection objectives to a policy that leverages on-prem disk for very rapid recoveries, usually of your most recent backup, while offloading all other backup sets into S3-IA and Glacier, leveraging superior durability and economics. It’s a frictionless solution that doesn’t require any additional HW or SW and can be enabled by simply pointing the SW to a bucket in your AWS account and adjusting the tiered data protection policy. Commvault also provides DR capabilities with their workflow and scheduling capabilities. Other leading backup software products that recently added S3-IA support are Veritas NetBackup version 7.7 and Oracle Secure Backup.
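
    To make the tiered data protection policy described above concrete, here is a minimal sketch that picks a storage tier from the age of a backup set. It is not Commvault's policy engine: the function name, the purely age-based rule, and the 7-day and 90-day thresholds are assumptions chosen for illustration.

```python
from datetime import date


def choose_tier(backup_date, today, local_days=7, ia_days=90):
    """Pick a storage tier for a backup set based on its age in days.

    Thresholds are hypothetical; a real policy would follow the
    organization's own recovery objectives.
    """
    age_days = (today - backup_date).days
    if age_days <= local_days:
        return "on-prem disk"   # most recent backups, fastest recovery
    if age_days <= ia_days:
        return "S3-IA"          # infrequently accessed, still online
    return "Glacier"            # long-term archive, lowest cost
```

    For example, with the defaults a two-day-old backup stays on local disk, a 30-day-old one goes to S3-IA, and a year-old one goes to Glacier.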
  • The second approach involves inserting a gateway, such as the AWS Storage Gateway or a partner gateway, that provides an IT-friendly interface and communicates with the AWS object storage services over HTTPS through the Internet or DX.
  • If your existing backup SW brand or version doesn’t support native cloud capabilities, or you support a heterogeneous backup environment with multiple products, you can easily insert NetApp’s AltaVault series of cloud gateways. The gateway presents a NAS interface, which is supported by virtually any major backup software. It also has some valuable additional features: the ability to cache your most recent backup sets locally for rapid restoration, inline dedupe, compression and encryption. It comes in both virtual and physical flavors and supports all 3 AWS object storage classes.

  • Ctera is a unique combination of backup SW and infrastructure, all managed from a single pane of glass. It includes a cost-effective NAS device you can deploy in remote offices and the ability to protect endpoint applications and devices both on-prem and running on our compute service, EC2. It applies data reduction and stores the data encrypted on S3-IA. It also provides a suite of data governance, identity management and cloud orchestration tools. Another partner that provides a comprehensive endpoint and app server backup solution is Druva.
  • To summarize, AWS provides a variety of transport services and ingest applications to simplify and speed up data migration projects. You can use DX to provide dedicated bandwidth and consistent latency from your corporate DCs to AWS regions instead of or in addition to an Internet connection. Use Snowball for one-time or recurring migrations to speed up data transfer or complement an Internet or DX link. Use I/E disk for the same purpose where you’d like to use your own storage devices or where Snowball is not yet available.
    The Storage Gateway provides an inexpensive way to perform backup, archive and data migration without needing to make significant changes to the existing environment. Partner solutions provide comprehensive backup and archive solutions, adding advanced capabilities such as deduplication and endpoint protection. Kinesis Firehose is the simplest, most effective way to collect streaming data and load it into S3 and Redshift for close to real-time analysis.
  • Thanks everyone for your time today. We’ll now take some questions and answers.