This document provides an overview of Amazon S3 beyond its basic storage capabilities. It discusses how S3 is a sophisticated distributed system that can scale to exabyte-level storage with high durability and availability. It also summarizes key S3 concepts like storage classes, namespaces, access controls, encryption, lifecycle management, and transitions. S3 provides flexible options for storing and managing objects at scale for many applications and use cases.
2. Masterclass
A technical deep dive beyond the basics
Help educate you on how to get the best from AWS technologies
Show you how things work and how to get things done
Broaden your knowledge in ~45 mins
3. Amazon S3
It’s more than just a ‘simple’ storage platform
A sophisticated 21st century distributed system
A bedrock architectural component for many applications
5. “Spotify needed a storage solution that could scale very quickly without incurring long lead times for upgrades. This led us to cloud storage, and in that market, Amazon Simple Storage Service (Amazon S3) is the most mature large-scale product. Amazon S3 gives us confidence in our ability to expand storage quickly while also providing high data durability.”
Emil Fredriksson, Operations Director
6. You put it in S3
AWS stores with 99.999999999% durability
7. You put it in S3
AWS stores with 99.999999999% durability
Highly scalable web access to objects
Multiple redundant copies in a region
9. Objects in S3
[Chart: growth in the number of objects stored in S3, reaching 1.3 Trillion]
835k+ peak transactions per second
10. What is S3?
Highly scalable data storage
A web store, not a file system
Access via APIs
Fast
Economical
Highly available & durable
11. A web store, not a file system
Write once, read many (WORM)
Eventually consistent
12. A web store, not a file system
Write once, read many (WORM)
Eventually consistent
[Conceptual diagram only: one Region containing two Availability Zones, each with load balancers, web servers, indexing and storage]
21. A web store, not a file system
Write once, read many (WORM)
Eventually consistent
New objects
Synchronously stores your data across multiple facilities before returning SUCCESS
Read-after-write consistency*
Updates
Write then read: could report key does not exist
Write then list: might not include key in list
Overwrite then read: old data could be returned
Deletes
Delete then read: could still get old data
Delete then list: deleted key could be included in list
*except US-STANDARD region
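The eventual-consistency cases above suggest that a reader should tolerate a key that is briefly reported as missing. A minimal sketch of that idea, assuming a hypothetical `fetch` callable that stands in for an S3 GET and raises `KeyError` while the key is not yet visible (the helper names, retry count and delay are illustrative and not part of any AWS SDK):

```python
import time


def read_with_retry(fetch, key, attempts=5, delay=0.0):
    """Call fetch(key) until it succeeds or the attempts run out.

    fetch is a hypothetical stand-in for an S3 GET; it is assumed to
    raise KeyError while the key is not yet visible.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(key)
        except KeyError as err:
            last_error = err
            # Wait a little longer after each miss before retrying.
            time.sleep(delay * (attempt + 1))
    raise last_error


def make_lagging_store(data, misses):
    """Simulated store whose first `misses` reads report a missing key."""
    state = {"remaining": misses}

    def fetch(key):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise KeyError(key)
        return data[key]

    return fetch
```

A retry like this only covers the "write then read" case. It cannot detect the "overwrite then read" case, where a read succeeds but returns old data.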
24. Amazon S3 storage classes
Standard
Designed to provide
99.999999999% durability and
99.99% availability of objects
over a given year
Designed to sustain the
concurrent loss of data in two
facilities
25. Amazon S3 storage classes
Standard
Designed to provide 99.999999999% durability and 99.99% availability of objects over a given year
Designed to sustain the concurrent loss of data in two facilities
Reduced Redundancy Storage
Reduces costs by storing data at lower levels of redundancy than the Standard storage
Designed to provide 99.99% durability and 99.99% availability of objects over a given year
26. Amazon S3 storage classes
Standard
Designed to provide 99.999999999% durability and 99.99% availability of objects over a given year
Designed to sustain the concurrent loss of data in two facilities
Reduced Redundancy Storage
Reduces costs by storing data at lower levels of redundancy than the Standard storage
Designed to provide 99.99% durability and 99.99% availability of objects over a given year
Glacier
Suitable for archiving data, where data access is infrequent and a retrieval time of several hours is acceptable
Uses the very low-cost Amazon Glacier storage service, but managed through Amazon S3
27. Amazon S3 storage classes
Standard
Designed to provide 99.999999999% durability and 99.99% availability of objects over a given year
Designed to sustain the concurrent loss of data in two facilities
Objects you want to have high durability
e.g. master copy of movie media
Reduced Redundancy Storage
Reduces costs by storing data at lower levels of redundancy than the Standard storage
Designed to provide 99.99% durability and 99.99% availability of objects over a given year
Objects you can afford to lose or can recreate
e.g. different encodings of movie media
Glacier
Suitable for archiving data, where data access is infrequent and a retrieval time of several hours is acceptable
Uses the very low-cost Amazon Glacier storage service, but managed through Amazon S3
Objects you want to keep in archive for a long time
e.g. digital archive of old movies & broadcasts
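To make the durability figures concrete, the arithmetic can be sketched as below. This is an illustration only: it assumes the "designed for" durability can be read as an average annual loss rate per object, and the function name and object counts are made up for the example.

```python
from decimal import Decimal


def expected_annual_losses(object_count, durability_percent):
    """Average number of objects lost per year at a given durability.

    Assumes durability is an average annual per-object figure, which is
    a simplification made for this example.
    """
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_fraction * Decimal(object_count)


# Standard: 10 million objects at 99.999999999% durability.
standard = expected_annual_losses(10_000_000, "99.999999999")

# Reduced Redundancy Storage: 10 thousand objects at 99.99% durability.
rrs = expected_annual_losses(10_000, "99.99")
```

Under that assumption, `standard` works out to 0.0001 objects per year (roughly one object every 10,000 years), while `rrs` works out to one object per year.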
33. Amazon S3 namespace
Object key
Max 1024 bytes UTF-8
Including ‘path’ prefixes
Unique within a bucket
34. Amazon S3 namespace
Object key
Max 1024 bytes UTF-8
Including ‘path’ prefixes
Unique within a bucket
assets/js/jquery/plugins/jtables.js
this is an object key
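The key rules above can be checked locally before an upload. A small sketch, in which the helper names are hypothetical and only the 1024-byte UTF-8 limit and the example key come from the slide:

```python
MAX_KEY_BYTES = 1024  # limit applies to the UTF-8 encoded key, not characters


def is_valid_key(key):
    """Return True if the key is non-empty and fits the UTF-8 byte limit."""
    size = len(key.encode("utf-8"))
    return 0 < size <= MAX_KEY_BYTES


def path_prefixes(key, delimiter="/"):
    """List the 'path' prefixes embedded in a key, shortest first."""
    parts = key.split(delimiter)[:-1]
    return [
        delimiter.join(parts[: i + 1]) + delimiter
        for i in range(len(parts))
    ]


example = "assets/js/jquery/plugins/jtables.js"
prefixes = path_prefixes(example)
```

The prefixes are part of the key itself. There are no real directories behind them, which is why the whole string counts towards the byte limit.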
41. Server side encryption
Automatic encryption of data at rest
Simple
Additional PUT header
Durable
S3 key storage
Strong
AES-256
Secure
3-way simultaneous access
Self managed
No need to manage a key store
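The "additional PUT header" can be sketched as a plain dictionary of request headers. The header name `x-amz-server-side-encryption` and its `AES256` value are the ones S3 documents for this feature; the helper function is hypothetical, and how the headers are passed depends on the HTTP client or SDK in use.

```python
SSE_HEADER = "x-amz-server-side-encryption"


def with_server_side_encryption(headers=None):
    """Return a copy of the PUT headers that requests AES-256 encryption."""
    merged = dict(headers or {})
    merged[SSE_HEADER] = "AES256"
    return merged


put_headers = with_server_side_encryption({"Content-Type": "text/plain"})
```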
43. Server side encryption
Encrypted object
Data bucket Encrypted data
Per-object key
High level design
44. Server side encryption
Encrypted object
Data bucket Encrypted data
Encrypted per-object key
Per-object key
Master key
High level design
45. Server side encryption
Encrypted object
Data bucket Encrypted data
Encrypted per-object key
Per-object key
Key management (monthly rotation)
Master key
High level design
48. Secure by default
You decide what to share
Apply policies to buckets and objects
Policies, ACLs & IAM
Use S3 policies, ACLs or IAM to define rules
49. IAM
Fine grained
Administer as part of role based access
Apply policies to S3 at role, user & group level
Allow
Actions: PutObject
Resource: arn:aws:s3:::mybucket/*
Attached to: Bob, Jane
50. IAM VS Bucket Policies
IAM
Fine grained
Administer as part of role based access
Apply policies to S3 at role, user & group level
Allow
Actions: PutObject
Resource: arn:aws:s3:::mybucket/*
Attached to: Bob, Jane
Bucket Policies
Fine grained
Apply policies at the bucket level in S3
Incorporate user restrictions without using IAM
Allow: Bob, Jane
Actions: PutObject
Resource: arn:aws:s3:::mybucket/*
Attached to: mybucket
51. IAM VS Bucket Policies VS ACLs
IAM
Fine grained
Administer as part of role based access
Apply policies to S3 at role, user & group level
Allow
Actions: PutObject
Resource: arn:aws:s3:::mybucket/*
Attached to: Bob, Jane
Bucket Policies
Fine grained
Apply policies at the bucket level in S3
Incorporate user restrictions without using IAM
Allow: Bob, Jane
Actions: PutObject
Resource: arn:aws:s3:::mybucket/*
Attached to: mybucket
ACLs
Coarse grained
Apply access control rules at the bucket and/or object level in S3
Allow: Everyone, Bob, Jane
Actions: Read
Attached to: mybucket, myobject
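The difference between the IAM and bucket policy columns can be sketched as two policy documents. In this sketch the account ID `111122223333` and the user ARNs for Bob and Jane are placeholders invented for illustration; only the action and resource come from the slide.

```python
import json


def iam_policy(bucket):
    """Policy attached to a user, group or role: no Principal element."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }


def bucket_policy(bucket, principal_arns):
    """Policy attached to the bucket: names who it applies to."""
    policy = iam_policy(bucket)
    policy["Statement"][0]["Principal"] = {"AWS": list(principal_arns)}
    return policy


# Hypothetical principals; the account ID is a placeholder.
users = [
    "arn:aws:iam::111122223333:user/Bob",
    "arn:aws:iam::111122223333:user/Jane",
]
document = json.dumps(bucket_policy("mybucket", users), indent=2)
```

The IAM version says what its holder may do, so the principal is implied by the attachment. The bucket version lives on `mybucket`, so it has to list Bob and Jane explicitly.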
61. Long term Glacier archive
Durable
Designed for 99.999999999% durability of archives
Cost effective
Write-once, read-never. Cost effective for long term storage. Pay for accessing data
63. Expiry
[Timeline: logs are accessible from S3 until the objects expire and are deleted]
64. Transition
[Timeline: transactions (Txns) are accessible from S3 until the object transition to Glacier is invoked]
65. [Timeline: after the transition to Glacier, restoration of the object is requested for x hrs]
66. [Timeline: restoration takes 3-5hrs, after which the object is held in S3 RRS for x hrs]
69. using (var client = new AmazonS3Client())
{
    var lifeCycleConfiguration = new LifecycleConfiguration()
    {
        Rules = new List<LifecycleRule>
        {
            new LifecycleRule
            {
                Id = "Archive and delete rule",
                // Applies only to keys under this prefix
                Prefix = "projectdocs/",
                Status = LifecycleRuleStatus.Enabled,
                // Move objects to Glacier after one year
                Transition = new LifecycleTransition()
                {
                    Days = 365,
                    StorageClass = S3StorageClass.Glacier
                },
                // Delete objects after roughly ten years
                Expiration = new LifecycleRuleExpiration()
                {
                    Days = 3650
                }
            }
        }
    };
}
70. Transition to Glacier after 1 year (Transition: Days = 365, StorageClass = S3StorageClass.Glacier)
71. Delete object after 10 years (Expiration: Days = 3650)
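The two `Days` values in the lifecycle rule are counted from object creation. A rough sketch of the resulting dates, assuming plain calendar arithmetic (S3 applies its own scheduling, so the real transition and deletion times may differ slightly; the function name is hypothetical):

```python
from datetime import date, timedelta


def lifecycle_dates(created, transition_days=365, expiration_days=3650):
    """Approximate dates on which the rule's two actions become due."""
    return {
        "transition": created + timedelta(days=transition_days),
        "expiration": created + timedelta(days=expiration_days),
    }


dates = lifecycle_dates(date(2013, 1, 1))
```

Note that 3650 days is slightly under ten calendar years, because leap days are not counted.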
73. POST /ObjectName?restore HTTP/1.1
Host: BucketName.s3.amazonaws.com
Date: date
Authorization: signatureValue
Content-MD5: MD5
<RestoreRequest xmlns="http://s3.amazonaws.com/doc/2006-3-01">
<Days>NumberOfDays</Days>
</RestoreRequest>
74. Response codes:
202 Accepted: Restore request accepted
200 OK: Object already restored, number of days updated
409 Conflict: Restoration already in progress
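The restore request body, its Content-MD5 value and the response codes can be sketched with the standard library alone. This sketch builds the pieces but does not sign or send the request. The function names are hypothetical, and the XML namespace is copied as shown on the slide.

```python
import base64
import hashlib
from xml.etree import ElementTree as ET

RESTORE_XMLNS = "http://s3.amazonaws.com/doc/2006-3-01"  # as shown on the slide

RESTORE_RESPONSES = {
    202: "Restore request accepted",
    200: "Object already restored, number of days updated",
    409: "Restoration already in progress",
}


def build_restore_body(days):
    """Build the RestoreRequest XML body as UTF-8 bytes."""
    if days < 1:
        raise ValueError("days must be at least 1")
    root = ET.Element("RestoreRequest", {"xmlns": RESTORE_XMLNS})
    ET.SubElement(root, "Days").text = str(days)
    return ET.tostring(root, encoding="unicode").encode("utf-8")


def content_md5(body):
    """Base64 encoded 128-bit MD5 digest of the request body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def describe_restore_response(status_code):
    """Map a restore response code to the meaning given on the slide."""
    return RESTORE_RESPONSES.get(status_code, "Unexpected response")


body = build_restore_body(7)
md5_header = content_md5(body)
```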
81. Record set for: aws-examples.info
[Diagram: Route 53 (R53) in front of two website buckets, with index.html and error.html]
Website bucket name: www.aws-examples.info
Website bucket name: aws-examples.info
82. Website bucket www.aws-examples.info: website redirect to aws-examples.info
83. A Record ‘Alias’ to S3 website: aws-examples.info @ s3-website-eu-west-1.amazonaws.com
84. CNAME for www. to: www.aws-examples.info.s3-website-eu-west-1.amazonaws.com
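The two DNS records follow a naming pattern that can be sketched as string formatting. The helper names are hypothetical, and the endpoint format is the one shown on the slide for eu-west-1, so it is an assumption that other regions use the same form.

```python
def website_endpoint(region):
    """Regional S3 website endpoint, in the form shown on the slide."""
    return f"s3-website-{region}.amazonaws.com"


def dns_records(apex_domain, region):
    """Records for an apex website bucket plus a 'www.' redirect bucket."""
    www_bucket = f"www.{apex_domain}"
    return [
        # The apex cannot be a CNAME, so it uses an A record 'Alias'.
        ("A (Alias)", apex_domain, website_endpoint(region)),
        # The www name points at its own bucket's website endpoint.
        ("CNAME", www_bucket, f"{www_bucket}.{website_endpoint(region)}"),
    ]


records = dns_records("aws-examples.info", "eu-west-1")
```

Both bucket names match their hostnames exactly, which is what lets S3 pick the right bucket from the request's Host header.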
87. Bucket level
Automatically preserves all copies of objects
Persistent
Even deleted object history is held
88. >>> import boto
>>> conn = boto.connect_s3()
>>> from boto.s3.bucket import Bucket
>>> from boto.s3.key import Key
>>> bucket = conn.get_bucket('mybucket')
>>> versions = bucket.list_versions()
>>> for version in versions:
...     print(version.name + ' ' + version.version_id)
...
89. Object version IDs
myfile.txt jU9eVv800OlP4PQx6zskMEyPIoExne57
myfile.txt xOJzMvMmGv0Bx2v4QpIypbkkH2XE2yyq
myfile.txt 8cjozv9Hmkzum8xj.8q8BZxR5CuXnzon
90. Grabbing the contents of a version
>>> key = bucket.get_key('myfile.txt',
...     version_id='8cjozv9Hmkzum8xj.8q8BZxR5CuXnzon')
>>> key.get_contents_as_string()
'this is version 1 of my file'
91. >>> key = bucket.get_key('myfile.txt',
...     version_id='xOJzMvMmGv0Bx2v4QpIypbkkH2XE2yyq')
>>> key.get_contents_as_string()
'this is version 2 of my file'
92. Generating a ‘10 minute time bombed’ url for an older version
>>> key.generate_url(600)
'https://mybucket.s3.amazonaws.com/myfile.txt?Signature=ABCD&Expires=1358857379&AWSAccessKeyId=AB&versionId=xOJzMvMmGv0Bx2v4QpIypbkkH2XE2yyq'
95. System metadata
Name | Description | Editable?
Date | Object creation date | No
Content-Length | Object size in bytes | No
Content-MD5 | Base64 encoded 128bit MD5 digest | No
x-amz-server-side-encryption | Server side encryption enabled for object | Yes
x-amz-version-id | Object version | No
x-amz-delete-marker | Indicates a version enabled object is deleted | No
x-amz-storage-class | Storage class for the object | Yes
x-amz-website-redirect-location | Redirects request for the object to another object or external URL | Yes
99. Global content distribution
Download
Enable static and dynamic assets to be served from edge locations
Streaming
Serve RTMP directly from media files in buckets
107. Stop doing these:
Capacity planning
Management of storage
Worrying about backing up the backup
Fixing broken hardware
108. and start doing these
Application backends
Incorporate S3 SDKs into your applications
Bootstrapping
Store scripts and drive EC2 instances on startup
Backups & archive
Storage gateway, 3rd party tools
Application logs
Store logs and analyse with EMR
Web content
Serve content and distribute globally
Documentation
Store documents with versioning and security models