SlideShare una empresa de Scribd logo
1 de 45
Descargar para leer sin conexión
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Uses & Best Practices
for Amazon Redshift
Rahul Pathak, AWS (rahulpathak@)
Daniel Mintz, Upworthy (danielmintz@)
July 10, 2014
Fast, simple, petabyte-scale data warehousing for less than $1,000/TB/Year
Amazon Redshift
Redshift
EMR
EC2
Analyze
Glacier
S3
DynamoDB
Store
Direct Connect
Collect
Kinesis
Petabyte scale
Massively parallel
Relational data warehouse
Fully managed; zero admin
Amazon
Redshift
a lot faster
a lot cheaper
a whole lot simpler
Common Customer Use Cases
• Reduce costs by
extending DW rather than
adding HW
• Migrate completely from
existing DW systems
• Respond faster to
business
• Improve performance by
an order of magnitude
• Make more data
available for analysis
• Access business data via
standard reporting tools
• Add analytic functionality
to applications
• Scale DW capacity as
demand grows
• Reduce HW & SW costs
by an order of magnitude
Traditional Enterprise DW Companies with Big Data SaaS Companies
Amazon Redshift Customers
Growing Ecosystem
AWS Marketplace
• Find software to use with
Amazon Redshift
• One-click deployments
• Flexible pricing options
http://aws.amazon.com/marketplace/redshift
Data Loading Options
• Parallel upload to Amazon S3
• AWS Direct Connect
• AWS Import/Export
• Amazon Kinesis
• Systems integrators
Data Integration Systems Integrators
Amazon Redshift Architecture
• Leader Node
– SQL endpoint
– Stores metadata
– Coordinates query execution
• Compute Nodes
– Local, columnar storage
– Execute queries in parallel
– Load, backup, restore via
Amazon S3; load from
Amazon DynamoDB or SSH
• Two hardware platforms
– Optimized for data processing
– DW1: HDD; scale from 2TB to 1.6PB
– DW2: SSD; scale from 160GB to 256TB
10 GigE
(HPC)
Ingestion
Backup
Restore
JDBC/ODBC
Amazon Redshift Node Types
• Optimized for I/O intensive workloads
• High disk density
• On demand at $0.85/hour
• As low as $1,000/TB/Year
• Scale from 2TB to 1.6PB
DW1.XL: 16 GB RAM, 2 Cores
3 Spindles, 2 TB compressed storage
DW1.8XL: 128 GB RAM, 16 Cores, 24
Spindles 16 TB compressed, 2 GB/sec scan
rate
• High performance at smaller storage size
• High compute and memory density
• On demand at $0.25/hour
• As low as $5,500/TB/Year
• Scale from 160GB to 256TB
DW2.L *New*: 16 GB RAM, 2 Cores,
160 GB compressed SSD storage
DW2.8XL *New*: 256 GB RAM, 32 Cores,
2.56 TB of compressed SSD storage
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage • With row storage you do
unnecessary I/O
• To get total amount, you have
to read everything
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• With column storage, you
only read the data you need
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage • COPY compresses automatically
• You can analyze and override
• More performance, less cost
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
10 | 13 | 14 | 26 |…
… | 100 | 245 | 324
375 | 393 | 417…
… 512 | 549 | 623
637 | 712 | 809 …
… | 834 | 921 | 959
10
324
375
623
637
959
• Track the minimum and
maximum value for each block
• Skip over blocks that don’t
contain relevant data
Amazon Redshift dramatically reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Use local storage for
performance
• Maximize scan rates
• Automatic replication and
continuous backup
• HDD & SSD platforms
Amazon Redshift parallelizes and distributes everything
• Query
• Load
• Backup/Restore
• Resize
• Load in parallel from Amazon S3,
Amazon EMR, Amazon DynamoDB
or any SSH connection
• Data automatically distributed and
sorted according to DDL
• Scales linearly with number of nodes
Amazon Redshift parallelizes and distributes everything
• Query
• Load
•
• Backup/Restore
• Resize
• Backups to Amazon S3 are automatic,
continuous and incremental
• Configurable system snapshot retention period.
Take user snapshots on-demand
• Cross region backups for disaster recovery
• Streaming restores enable you to resume
querying faster
Amazon Redshift parallelizes and distributes everything
• Query
• Load
• Backup/Restore
• Resize
• Resize while remaining online
• Provision a new cluster in the
background
• Copy data in parallel from node to node
• Only charged for source cluster
Amazon Redshift parallelizes and distributes everything
• Query
• Load
• Backup/Restore
• Resize
• Automatic SQL endpoint
switchover via DNS
• Decommission the source cluster
• Simple operation via Console or
API
Amazon Redshift parallelizes and distributes everything
• Query
• Load
• Backup/Restore
• Resize
Amazon Redshift is priced to let you analyze all your data
• Number of nodes x cost
per hour
• No charge for leader
node
• No upfront costs
• Pay as you go
DW1 (HDD)
Price Per Hour for
DW1.XL Single Node
Effective Annual
Price per TB
On-Demand $ 0.850 $ 3,723
1 Year Reservation $ 0.500 $ 2,190
3 Year Reservation $ 0.228 $ 999
DW2 (SSD)
Price Per Hour for
DW2.L Single Node
Effective Annual
Price per TB
On-Demand $ 0.250 $ 13,688
1 Year Reservation $ 0.161 $ 8,794
3 Year Reservation $ 0.100 $ 5,498
Amazon Redshift is easy to use
• Provision in minutes
• Monitor query performance
• Point and click resize
• Built in security
• Automatic backups
Amazon Redshift has security built-in
• SSL to secure data in transit
• Encryption to secure data at rest
– AES-256; hardware accelerated
– All blocks on disks and in Amazon S3
encrypted
– HSM Support so you control keys
• Audit logging & AWS CloudTrail
integration
• Amazon VPC support
• SOC 1/2/3, PCI DSS Level 1, Fedramp
and more
10 GigE
(HPC)
Ingestion
Backup
Restore
Customer VPC
Internal
VPC
JDBC/ODBC
Amazon Redshift continuously backs up your
data and recovers from failures
• Replication within the cluster and backup to Amazon S3 to maintain multiple copies of
data at all times
• Backups to Amazon S3 are continuous, automatic, and incremental
– Designed for eleven nines of durability
• Continuous monitoring and automated recovery from failures of drives and nodes
• Able to restore snapshots to any Availability Zone within a region
• Easily enable backups to a second region for disaster recovery
60+ new features since launch
• Regions – N. Virginia, Oregon, Dublin, Tokyo, Singapore, Sydney
• Certifications – PCI, SOC 1/2/3, FedRAMP, PCI-DSS Level 1, others
• Security – Load/unload encrypted files, Resource-level IAM, Temporary credentials, HSM, ECDHE
for perfect forward security
• Manageability – Snapshot sharing, backup/restore/resize progress indicators, Cross-region backups
• Query – Regex, Cursors, MD5, SHA1, Time zone, workload queue timeout, HLL, Concurrency to 50
• Ingestion – S3 Manifest, LZOP/LZO, JSON built-ins, UTF-8 4byte, invalid character substitution,
CSV, auto datetime format detection, epoch, Ingest from SSH, JSON, EMR
Amazon Redshift Feature Delivery
Service Launch (2/14)
PDX (4/2)
Temp Credentials (4/11)
DUB (4/25)
SOC1/2/3 (5/8)
Unload Encrypted Files
NRT (6/5)
JDBC Fetch Size (6/27)
Unload logs (7/5)
SHA1 Builtin (7/15)
4 byte UTF-8 (7/18)
Sharing snapshots (7/18)
Statement Timeout (7/22)
Timezone, Epoch, Autoformat (7/25)
WLM Timeout/Wildcards (8/1)
CRC32 Builtin, CSV, Restore Progress
(8/9)
Resource Level IAM (8/9)
PCI (8/22)
UTF-8 Substitution (8/29)
JSON, Regex, Cursors (9/10)
Split_part, Audit tables (10/3)
SIN/SYD (10/8)
HSM Support (11/11)
Kinesis EMR/HDFS/SSH copy,
Distributed Tables, Audit
Logging/CloudTrail, Concurrency,
Resize Perf., Approximate Count
Distinct, SNS Alerts, Cross Region
Backup (11/13)
Distributed Tables, Single Node Cursor
Support, Maximum Connections to 500
(12/13)
EIP Support for VPC Clusters (12/28)
New query monitoring system tables
and diststyle all (1/13)
Redshift on DW2 (SSD) Nodes (1/23)
Compression for COPY from SSH,
Fetch size support for single node
clusters, new system tables with commit
stats, row_number(), strotol() and query
termination (2/13)
Resize progress indicator & Cluster
Version (3/21)
Regex_Substr, COPY from JSON (3/25)
50 slots, COPY from EMR, ECDHE
ciphers (4/22)
3 new regex features, Unload to single
file, FedRAMP(5/6)
Rename Cluster (6/2)
Copy from multiple regions,
percentile_cont, percentile_disc (6/30)
Free Trial (7/1)
New Features
• UNLOAD to single file
• COPY from multiple regions
• Percentile_cont & percentile_disc window
functions
Try Amazon Redshift with BI & ETL for Free!
• http://aws.amazon.com/redshift/free-trial
• 2 months, 750 hours/month of free DW2.Large usage
• Also try BI & ETL for free from nine partners
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
A Year With Amazon Redshift
Why Upworthy Chose Redshift And How It’s Going
July 10, 2014
What’s Upworthy
• We’ve been called:
– “Social media with a mission” by our About Page
– “The fastest growing media site of all time” by Fast Company
– “The Fastest Rising Startup” by The Crunchies
– “That thing that’s all over my newsfeed” by my annoyed friends
– “The most data-driven media company in history” by me,
optimistically
What We Do
• We aim to drive
massive amounts of
attention to things that
really matter.
• We do that by finding,
packaging, and
distributing great,
meaningful content.
Our Use Case
When We Started
• Building a data warehouse from scratch
• One engineer on the project
• Object data in MongoDB
• Had discovered MoSQL
• Knew which two we’d choose:
– Comprehensive
– Ad Hoc
– Real-Time
The Decision
• Downsides to self-hosting
– Cost
– Maintenance
• Tried Redshift by the hour
Building it out initially
• ~50 events/second
• Snowflake or denormalized? Both.
Raw
S3
Events
Redshift
MongoDB
Browser Web Service S3 Drain
MoSQL PostgreSQL Objects
Master
EMR
Processed
Objects
Our system now
• Stats:
– ~5 TB of compressed data.
– Two main tables = 13 billion rows
– Average: ~1085 events/second
– Peak: ~2500 events/second
• 5-10 minute ETL cycle (Kinesis session later)
• Lots of rollup tables
• COPY happens quickly (5-10s) every 1-2
minutes
What We’ve Learned
Initial Lessons
• Columnar can be disorienting:
– What’s in the SELECT matters. A lot.
– SELECT COUNT(*) is really, really fast.
– SELECT * is really, really slow.
– Wide, tall tables with lots of NULLs = Fine.
• Sortkeys are hugely powerful.
• Bulk Operations (COPY, not INSERT) FTW
Staging to Master Pattern
INSERT INTO master
WITH to_insert as (
SELECT
col1,
col2,
col3
FROM
staging
)
SELECT
s.*
FROM
to_insert s
LEFT JOIN master m on m.col1 = s.col1
WHERE
m.col1 is NULL;
Hash multiple join keys into one
...
FROM
table1 t1
LEFT JOIN table2 t2 on
t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 = t2.col3
md5(col1 || col2 || col3) as hashed_join_key
...
FROM
table1 t1
LEFT JOIN table2 t2 on t1.hashed_join_key = t2.hashed_join_key
Pain Points
• Data loading can be a bit tricky at first
– Actually read the documentation. It’ll help.
• COUNT(DISTINCT col1) can be painful
– Use APPROXIMATE COUNT(DISTINCT col1) instead
• No null-safe operator. NULL = NULL returns NULL
– NVL(t1.col1, 0) = NVL(t2.col1, 0) is null-safe
• Error messages can be less than informative
– AWS Redshift Forum is fantastic. Someone else has probably seen that error
before you.
Where We’re Going Next
In The Next Few Months
• More structure to our ETL (implementing luigi)
• Need to avoid Serializable Isolation Violations
• Dozens of people querying a cluster of 4 dw1.xlarge
nodes can get ugly
• “Write” node of dw1 nodes and “Read” cluster of dw2
“dense compute” nodes
Thank You!

Más contenido relacionado

La actualidad más candente

(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseAmazon Web Services
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queriestimonk
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features Amazon Web Services
 
Introduction to Database Services
Introduction to Database ServicesIntroduction to Database Services
Introduction to Database ServicesAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Building your data warehouse with Redshift
Building your data warehouse with RedshiftBuilding your data warehouse with Redshift
Building your data warehouse with RedshiftAmazon Web Services
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesAmazon Web Services
 
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon Web Services
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedFlyData Inc.
 

La actualidad más candente (20)

(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queries
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Amazon RDS Deep Dive
Amazon RDS Deep DiveAmazon RDS Deep Dive
Amazon RDS Deep Dive
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Introduction to Database Services
Introduction to Database ServicesIntroduction to Database Services
Introduction to Database Services
 
Intro to AWS: Database Services
Intro to AWS: Database ServicesIntro to AWS: Database Services
Intro to AWS: Database Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Building your data warehouse with Redshift
Building your data warehouse with RedshiftBuilding your data warehouse with Redshift
Building your data warehouse with Redshift
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA 302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
 
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
Amazon RDS for Microsoft SQL: Performance, Security, Best Practices (DAT303) ...
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 

Destacado

InSales - Рынок интернет-торговли в России в 2012 году
InSales - Рынок интернет-торговли в России в 2012 годуInSales - Рынок интернет-торговли в России в 2012 году
InSales - Рынок интернет-торговли в России в 2012 годуAnton Lapkin
 
Мобильный маркетинг: только практика, только хардкор
Мобильный маркетинг: только практика,только хардкорМобильный маркетинг: только практика,только хардкор
Мобильный маркетинг: только практика, только хардкорAnton Lapkin
 
PwC - 10 мифов о многоканальной розничной торговле
PwC - 10 мифов о многоканальной розничной торговлеPwC - 10 мифов о многоканальной розничной торговле
PwC - 10 мифов о многоканальной розничной торговлеAnton Lapkin
 
Email benchmarking report UK 2010
Email benchmarking report UK 2010Email benchmarking report UK 2010
Email benchmarking report UK 2010Anton Lapkin
 
The Forrester Wave: Demand- Side Platforms, Q4 2011
The Forrester Wave: Demand- Side Platforms, Q4 2011The Forrester Wave: Demand- Side Platforms, Q4 2011
The Forrester Wave: Demand- Side Platforms, Q4 2011Anton Lapkin
 
iConText: От контекстной рекламы к performance marketing
iConText: От контекстной рекламы к performance marketingiConText: От контекстной рекламы к performance marketing
iConText: От контекстной рекламы к performance marketingAnton Lapkin
 
Tinkoff Digital: Мобильная революция – завтра уже наступило
Tinkoff Digital: Мобильная революция – завтра уже наступилоTinkoff Digital: Мобильная революция – завтра уже наступило
Tinkoff Digital: Мобильная революция – завтра уже наступилоAnton Lapkin
 
Presentation sainsbury's
Presentation sainsbury'sPresentation sainsbury's
Presentation sainsbury'sAmélie Chzl
 

Destacado (8)

InSales - Рынок интернет-торговли в России в 2012 году
InSales - Рынок интернет-торговли в России в 2012 годуInSales - Рынок интернет-торговли в России в 2012 году
InSales - Рынок интернет-торговли в России в 2012 году
 
Мобильный маркетинг: только практика, только хардкор
Мобильный маркетинг: только практика,только хардкорМобильный маркетинг: только практика,только хардкор
Мобильный маркетинг: только практика, только хардкор
 
PwC - 10 мифов о многоканальной розничной торговле
PwC - 10 мифов о многоканальной розничной торговлеPwC - 10 мифов о многоканальной розничной торговле
PwC - 10 мифов о многоканальной розничной торговле
 
Email benchmarking report UK 2010
Email benchmarking report UK 2010Email benchmarking report UK 2010
Email benchmarking report UK 2010
 
The Forrester Wave: Demand- Side Platforms, Q4 2011
The Forrester Wave: Demand- Side Platforms, Q4 2011The Forrester Wave: Demand- Side Platforms, Q4 2011
The Forrester Wave: Demand- Side Platforms, Q4 2011
 
iConText: От контекстной рекламы к performance marketing
iConText: От контекстной рекламы к performance marketingiConText: От контекстной рекламы к performance marketing
iConText: От контекстной рекламы к performance marketing
 
Tinkoff Digital: Мобильная революция – завтра уже наступило
Tinkoff Digital: Мобильная революция – завтра уже наступилоTinkoff Digital: Мобильная революция – завтра уже наступило
Tinkoff Digital: Мобильная революция – завтра уже наступило
 
Presentation sainsbury's
Presentation sainsbury'sPresentation sainsbury's
Presentation sainsbury's
 

Similar a Amazon Redshift Uses, Best Practices, and Customer Case Study

AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)Amazon Web Services
 
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介Amazon Web Services Japan
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Amazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Selecting the Right AWS Database Solution - AWS 2017 Online Tech Talks
Selecting the Right AWS Database Solution - AWS 2017 Online Tech TalksSelecting the Right AWS Database Solution - AWS 2017 Online Tech Talks
Selecting the Right AWS Database Solution - AWS 2017 Online Tech TalksAmazon Web Services
 
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介Amazon Web Services Japan
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftAmazon Web Services
 
Getting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoGetting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoAmazon Web Services
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseAmazon Web Services
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSAmazon Web Services
 
(DAT202) Managed Database Options on AWS
(DAT202) Managed Database Options on AWS(DAT202) Managed Database Options on AWS
(DAT202) Managed Database Options on AWSAmazon Web Services
 

Similar a Amazon Redshift Uses, Best Practices, and Customer Case Study (20)

AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
 
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Selecting the Right AWS Database Solution - AWS 2017 Online Tech Talks
Selecting the Right AWS Database Solution - AWS 2017 Online Tech TalksSelecting the Right AWS Database Solution - AWS 2017 Online Tech Talks
Selecting the Right AWS Database Solution - AWS 2017 Online Tech Talks
 
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
 
Getting started with amazon redshift - Toronto
Getting started with amazon redshift - TorontoGetting started with amazon redshift - Toronto
Getting started with amazon redshift - Toronto
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWS
 
AWS Analytics
AWS AnalyticsAWS Analytics
AWS Analytics
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Aws summit 2014 redshift
Aws summit 2014 redshiftAws summit 2014 redshift
Aws summit 2014 redshift
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWS
 
(DAT202) Managed Database Options on AWS
(DAT202) Managed Database Options on AWS(DAT202) Managed Database Options on AWS
(DAT202) Managed Database Options on AWS
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
 

Más de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 

Último (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 

Amazon Redshift Uses, Best Practices, and Customer Case Study

  • 1. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. Uses & Best Practices for Amazon Redshift Rahul Pathak, AWS (rahulpathak@) Daniel Mintz, Upworthy (danielmintz@) July 10, 2014
  • 2. Fast, simple, petabyte-scale data warehousing for less than $1,000/TB/Year Amazon Redshift
  • 4. Petabyte scale Massively parallel Relational data warehouse Fully managed; zero admin Amazon Redshift a lot faster a lot cheaper a whole lot simpler
  • 5. Common Customer Use Cases • Reduce costs by extending DW rather than adding HW • Migrate completely from existing DW systems • Respond faster to business • Improve performance by an order of magnitude • Make more data available for analysis • Access business data via standard reporting tools • Add analytic functionality to applications • Scale DW capacity as demand grows • Reduce HW & SW costs by an order of magnitude Traditional Enterprise DW Companies with Big Data SaaS Companies
  • 8. AWS Marketplace • Find software to use with Amazon Redshift • One-click deployments • Flexible pricing options http://aws.amazon.com/marketplace/redshift
  • 9. Data Loading Options • Parallel upload to Amazon S3 • AWS Direct Connect • AWS Import/Export • Amazon Kinesis • Systems integrators Data Integration Systems Integrators
  • 10. Amazon Redshift Architecture • Leader Node – SQL endpoint – Stores metadata – Coordinates query execution • Compute Nodes – Local, columnar storage – Execute queries in parallel – Load, backup, restore via Amazon S3; load from Amazon DynamoDB or SSH • Two hardware platforms – Optimized for data processing – DW1: HDD; scale from 2TB to 1.6PB – DW2: SSD; scale from 160GB to 256TB 10 GigE (HPC) Ingestion Backup Restore JDBC/ODBC
  • 11. Amazon Redshift Node Types • Optimized for I/O intensive workloads • High disk density • On demand at $0.85/hour • As low as $1,000/TB/Year • Scale from 2TB to 1.6PB DW1.XL: 16 GB RAM, 2 Cores 3 Spindles, 2 TB compressed storage DW1.8XL: 128 GB RAM, 16 Cores, 24 Spindles 16 TB compressed, 2 GB/sec scan rate • High performance at smaller storage size • High compute and memory density • On demand at $0.25/hour • As low as $5,500/TB/Year • Scale from 160GB to 256TB DW2.L *New*: 16 GB RAM, 2 Cores, 160 GB compressed SSD storage DW2.8XL *New*: 256 GB RAM, 32 Cores, 2.56 TB of compressed SSD storage
  • 12. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • With row storage you do unnecessary I/O • To get total amount, you have to read everything ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375
  • 13. • With column storage, you only read the data you need Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375
  • 14. analyze compression listing; Table | Column | Encoding ---------+----------------+---------- listing | listid | delta listing | sellerid | delta32k listing | eventid | delta32k listing | dateid | bytedict listing | numtickets | bytedict listing | priceperticket | delta32k listing | totalprice | mostly32 listing | listtime | raw Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • COPY compresses automatically • You can analyze and override • More performance, less cost
  • 15. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage 10 | 13 | 14 | 26 |… … | 100 | 245 | 324 375 | 393 | 417… … 512 | 549 | 623 637 | 712 | 809 … … | 834 | 921 | 959 10 324 375 623 637 959 • Track the minimum and maximum value for each block • Skip over blocks that don’t contain relevant data
  • 16. Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Use local storage for performance • Maximize scan rates • Automatic replication and continuous backup • HDD & SSD platforms
  • 17. Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize
  • 18. • Load in parallel from Amazon S3, Amazon EMR, Amazon DynamoDB or any SSH connection • Data automatically distributed and sorted according to DDL • Scales linearly with number of nodes Amazon Redshift parallelizes and distributes everything • Query • Load • • Backup/Restore • Resize
  • 19. • Backups to Amazon S3 are automatic, continuous and incremental • Configurable system snapshot retention period. Take user snapshots on-demand • Cross region backups for disaster recovery • Streaming restores enable you to resume querying faster Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize
  • 20. • Resize while remaining online • Provision a new cluster in the background • Copy data in parallel from node to node • Only charged for source cluster Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize
  • 21. • Automatic SQL endpoint switchover via DNS • Decommission the source cluster • Simple operation via Console or API Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize
  • 22. Amazon Redshift is priced to let you analyze all your data • Number of nodes x cost per hour • No charge for leader node • No upfront costs • Pay as you go DW1 (HDD) Price Per Hour for DW1.XL Single Node Effective Annual Price per TB On-Demand $ 0.850 $ 3,723 1 Year Reservation $ 0.500 $ 2,190 3 Year Reservation $ 0.228 $ 999 DW2 (SSD) Price Per Hour for DW2.L Single Node Effective Annual Price per TB On-Demand $ 0.250 $ 13,688 1 Year Reservation $ 0.161 $ 8,794 3 Year Reservation $ 0.100 $ 5,498
  • 23. Amazon Redshift is easy to use • Provision in minutes • Monitor query performance • Point and click resize • Built in security • Automatic backups
  • 24. Amazon Redshift has security built-in • SSL to secure data in transit • Encryption to secure data at rest – AES-256; hardware accelerated – All blocks on disks and in Amazon S3 encrypted – HSM Support so you control keys • Audit logging & AWS CloudTrail integration • Amazon VPC support • SOC 1/2/3, PCI DSS Level 1, Fedramp and more 10 GigE (HPC) Ingestion Backup Restore Customer VPC Internal VPC JDBC/ODBC
  • 25. Amazon Redshift continuously backs up your data and recovers from failures • Replication within the cluster and backup to Amazon S3 to maintain multiple copies of data at all times • Backups to Amazon S3 are continuous, automatic, and incremental – Designed for eleven nines of durability • Continuous monitoring and automated recovery from failures of drives and nodes • Able to restore snapshots to any Availability Zone within a region • Easily enable backups to a second region for disaster recovery
  • 26. 60+ new features since launch • Regions – N. Virginia, Oregon, Dublin, Tokyo, Singapore, Sydney • Certifications – PCI, SOC 1/2/3, FedRAMP, PCI-DSS Level 1, others • Security – Load/unload encrypted files, Resource-level IAM, Temporary credentials, HSM, ECDHE for perfect forward security • Manageability – Snapshot sharing, backup/restore/resize progress indicators, Cross-region backups • Query – Regex, Cursors, MD5, SHA1, Time zone, workload queue timeout, HLL, Concurrency to 50 • Ingestion – S3 Manifest, LZOP/LZO, JSON built-ins, UTF-8 4byte, invalid character substitution, CSV, auto datetime format detection, epoch, Ingest from SSH, JSON, EMR
  • 27. Amazon Redshift Feature Delivery Service Launch (2/14) PDX (4/2) Temp Credentials (4/11) DUB (4/25) SOC1/2/3 (5/8) Unload Encrypted Files NRT (6/5) JDBC Fetch Size (6/27) Unload logs (7/5) SHA1 Builtin (7/15) 4 byte UTF-8 (7/18) Sharing snapshots (7/18) Statement Timeout (7/22) Timezone, Epoch, Autoformat (7/25) WLM Timeout/Wildcards (8/1) CRC32 Builtin, CSV, Restore Progress (8/9) Resource Level IAM (8/9) PCI (8/22) UTF-8 Substitution (8/29) JSON, Regex, Cursors (9/10) Split_part, Audit tables (10/3) SIN/SYD (10/8) HSM Support (11/11) Kinesis EMR/HDFS/SSH copy, Distributed Tables, Audit Logging/CloudTrail, Concurrency, Resize Perf., Approximate Count Distinct, SNS Alerts, Cross Region Backup (11/13) Distributed Tables, Single Node Cursor Support, Maximum Connections to 500 (12/13) EIP Support for VPC Clusters (12/28) New query monitoring system tables and diststyle all (1/13) Redshift on DW2 (SSD) Nodes (1/23) Compression for COPY from SSH, Fetch size support for single node clusters, new system tables with commit stats, row_number(), strotol() and query termination (2/13) Resize progress indicator & Cluster Version (3/21) Regex_Substr, COPY from JSON (3/25) 50 slots, COPY from EMR, ECDHE ciphers (4/22) 3 new regex features, Unload to single file, FedRAMP(5/6) Rename Cluster (6/2) Copy from multiple regions, percentile_cont, percentile_disc (6/30) Free Trial (7/1)
  • 28. New Features • UNLOAD to single file • COPY from multiple regions • Percentile_cont & percentile_disc window functions
  • 29. Try Amazon Redshift with BI & ETL for Free! • http://aws.amazon.com/redshift/free-trial • 2 months, 750 hours/month of free DW2.Large usage • Also try BI & ETL for free from nine partners
  • 30. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. A Year With Amazon Redshift Why Upworthy Chose Redshift And How It’s Going July 10, 2014
  • 31. What’s Upworthy • We’ve been called: – “Social media with a mission” by our About Page – “The fastest growing media site of all time” by Fast Company – “The Fastest Rising Startup” by The Crunchies – “That thing that’s all over my newsfeed” by my annoyed friends – “The most data-driven media company in history” by me, optimistically
  • 32. What We Do • We aim to drive massive amounts of attention to things that really matter. • We do that by finding, packaging, and distributing great, meaningful content.
  • 34. When We Started • Building a data warehouse from scratch • One engineer on the project • Object data in MongoDB • Had discovered MoSQL • Knew which two we’d choose: – Comprehensive – Ad Hoc – Real-Time
  • 35. The Decision • Downsides to self-hosting – Cost – Maintenance • Tried Redshift by the hour
  • 36. Building it out initially • ~50 events/second • Snowflake or denormalized? Both. Raw S3 Events Redshift MongoDB Browser Web Service S3 Drain MoSQL PostgreSQL Objects Master EMR Processed Objects
  • 37. Our system now • Stats: – ~5 TB of compressed data. – Two main tables = 13 billion rows – Average: ~1085 events/second – Peak: ~2500 events/second • 5-10 minute ETL cycle (Kinesis session later) • Lots of rollup tables • COPY happens quickly (5-10s) every 1-2 minutes
  • 39. Initial Lessons • Columnar can be disorienting: – What’s in the SELECT matters. A lot. – SELECT COUNT(*) is really, really fast. – SELECT * is really, really slow. – Wide, tall tables with lots of NULLs = Fine. • Sortkeys are hugely powerful. • Bulk Operations (COPY, not INSERT) FTW
  • 40. Staging to Master Pattern INSERT INTO master WITH to_insert as ( SELECT col1, col2, col3 FROM staging ) SELECT s.* FROM to_insert s LEFT JOIN master m on m.col1 = s.col1 WHERE m.col1 is NULL;
  • 41. Hash multiple join keys into one ... FROM table1 t1 LEFT JOIN table2 t2 on t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 = t2.col3 md5(col1 || col2 || col3) as hashed_join_key ... FROM table1 t1 LEFT JOIN table2 t2 on t1.hashed_join_key = t2.hashed_join_key
  • 42. Pain Points • Data loading can be a bit tricky at first – Actually read the documentation. It’ll help. • COUNT(DISTINCT col1) can be painful – Use APPROXIMATE COUNT(DISTINCT col1) instead • No null-safe operator. NULL = NULL returns NULL – NVL(t1.col1, 0) = NVL(t2.col1, 0) is null-safe • Error messages can be less than informative – AWS Redshift Forum is fantastic. Someone else has probably seen that error before you.
  • 44. In The Next Few Months • More structure to our ETL (implementing luigi) • Need to avoid Serializable Isolation Violations • Dozens of people querying a cluster of 4 dw1.xlarge nodes can get ugly • “Write” node of dw1 nodes and “Read” cluster of dw2 “dense compute” nodes