Pop-up Loft
Deep Dive on Amazon Redshift
Storage Subsystem and Query Life Cycle
Dhanraj Pondicherry, Solutions Architect Manager
October 2017
Are you a user?
Deep Dive Overview
• Redshift Cluster Architecture
• Storage Deep Dive
• Reduce I/O
• Parallelism
• Query Life Cycle
• Data Ingestion Best Practices
• Recently Released Features
• Open Q&A
Amazon Redshift Cluster Architecture
Redshift Cluster Architecture
• Massively parallel, shared nothing
• Leader node
– SQL endpoint
– Stores metadata
– Coordinates parallel SQL processing
• Compute nodes
– Local, columnar storage
– Executes queries in parallel
– Load, backup, restore
[Diagram: SQL clients / BI tools connect to the leader node over JDBC/ODBC; the leader node talks to the compute nodes over 10 GigE; compute nodes handle ingestion, backup, and restore against S3 / EMR / DynamoDB / SSH. Each compute node in the example: 128 GB RAM, 16 TB disk, 16 cores.]
Leader node:
• Parser	&	Rewriter
• Planner	&	Optimizer	
• Code	Generator
• Input:	Optimized	plan
• Output:	>=1	C++	functions
• Task	Scheduler
• WLM
• Admission
• Scheduling
• PostgreSQL	Catalog	Tables
Compute nodes:
• Query	execution	processes
• Backup	&	restore	processes
• Replication	processes
• Local	Storage
• Disks
• Slices
• Tables
• Columns
• Blocks
Storage Deep Dive
Storage Deep Dive: Compute Node
• Node Types –
– Compute Optimized
– Storage Optimized
• Node Size –
• DC1.large – DC1.8xlarge
• DS2.xlarge – DS2.8xlarge
• Number of Nodes in the cluster
Storage Deep Dive: Disks
• Redshift utilizes locally attached storage devices
• Compute nodes have 2.5-3x the advertised storage capacity
• 1, 3, 8, or 24 disks depending on node type
• Each disk is split into two partitions
– Local data storage, accessed by local CN
– Mirrored data, accessed by remote CN
• Partitions are raw devices
– Local storage devices are ephemeral in nature
– Tolerant to multiple disk failures on a single node
Slices
• A slice can be thought of as a “virtual compute node”
– Unit of data partitioning
– Parallel query processing
• Facts about slices:
– Each compute node has either 2, 16, or 32 slices
– Table rows are distributed to slices
– A slice processes only its own data
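To see the slice layout of your own cluster, the STV_SLICES system view can be queried (a minimal sketch; access to system views depends on your privileges):

-- List the slices on each compute node
SELECT node, slice
FROM stv_slices
ORDER BY node, slice;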
Storage Deep Dive: Columns
• Column: Logical structure accessible via SQL
• Physical structure is a doubly linked list of blocks
• These blockchains exist on each slice for each column
• All sorted & unsorted blockchains compose a column
• Column properties include:
– Distribution Key
– Sort Key
– Compression Encoding
• Columns shrink and grow independently, 1 block at a time
• Three system columns per table-per slice for MVCC
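The per-column properties of an existing table can be checked through PG_TABLE_DEF (a minimal sketch; it only shows tables in your current search_path, and deep_dive is the example table used throughout this deck):

-- Show encoding, distkey, and sortkey flags for each column
SELECT "column", type, encoding, distkey, sortkey
FROM pg_table_def
WHERE tablename = 'deep_dive';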
Storage Deep Dive: Blocks
• Column data is persisted to 1MB immutable blocks
• Each block contains in-memory metadata:
– Zone Maps (MIN/MAX value)
– Location of previous/next block
• Blocks are individually compressed with 1 of 10 encodings
• A full block contains between 16 and 8.4 million values
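Block counts per column can be inspected through the STV_BLOCKLIST and STV_TBL_PERM system views (a sketch; the table name deep_dive is illustrative):

-- Count 1 MB blocks per column for one table
SELECT bl.col, COUNT(*) AS blocks
FROM stv_blocklist bl
JOIN stv_tbl_perm tp
  ON bl.tbl = tp.id
 AND bl.slice = tp.slice
WHERE tp.name = 'deep_dive'
GROUP BY bl.col
ORDER BY bl.col;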
Block Properties: Design Considerations
• Small writes:
• Batch processing system, optimized for processing massive amounts of data
• 1MB size + immutable blocks means that we clone blocks on write so as not to
introduce fragmentation
• Small write (~1-10 rows) has similar cost to a larger write (~100 K rows)
• UPDATE and DELETE:
• Immutable blocks means that we only logically delete rows on UPDATE or DELETE
• Must VACUUM or DEEP COPY to remove ghost rows from table
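For example (a sketch using the illustrative deep_dive table), ghost rows left by UPDATE/DELETE can be removed either in place or with a deep copy:

-- In place: reclaim space and re-sort, then refresh planner statistics
VACUUM deep_dive;
ANALYZE deep_dive;

-- Deep copy: rebuild the table without ghost rows, then swap it in
CREATE TABLE deep_dive_new (LIKE deep_dive);
INSERT INTO deep_dive_new SELECT * FROM deep_dive;
DROP TABLE deep_dive;
ALTER TABLE deep_dive_new RENAME TO deep_dive;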
Column Properties: Design Considerations
• Compression:
• COPY automatically analyzes and compresses data when loading into empty tables
• ANALYZE COMPRESSION checks existing tables and proposes optimal
compression algorithms for each column
• Changing column encoding requires a table rebuild
• DISTKEY and SORTKEY significantly influence performance (orders of magnitude)
• Distribution Keys:
• A poor DISTKEY can introduce data skew and an unbalanced workload
• A query completes only as fast as the slowest slice completes
• Sort Keys:
• A sortkey is only as effective as the data profile allows it to be
• Selectivity needs to be considered
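As noted above, ANALYZE COMPRESSION can be run against a populated table to get per-column encoding recommendations (deep_dive is the illustrative table name):

-- Propose optimal compression encodings based on a sample of the data
ANALYZE COMPRESSION deep_dive;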
Reducing I/O
Reduce I/O w/ Data Sorting
• Goal:
• Make queries run faster by optimizing the effectiveness of zone maps
• Typically on the columns that are filtered on (where clause predicates)
• Impact:
• Enables range restricted scans to prune blocks by leveraging zone maps
• Overall reduction in block I/O
• Achieved with the table property SORTKEY defined on one or more columns
• Optimal SORTKEY is dependent on:
• Query patterns
• Data profile
• Business requirements
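A minimal sketch of defining the SORTKEY table property on the example table, sorting on the date column that queries typically filter on:

CREATE TABLE deep_dive (
  aid INT      --audience_id
 ,loc CHAR(3)  --location
 ,dt DATE      --date
)
SORTKEY (dt);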
Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Accessing dt with row storage:
– Need to read everything
– Unnecessary I/O
aid loc dt
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Accessing dt with columnar storage:
– Only scan blocks for relevant
column
aid loc dt
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Columns	grow	and	shrink	independently
• Effective	compression	ratios	due	to	like	data	
• Reduces	storage	requirements
• Reduces	I/O	
aid loc dt
CREATE TABLE deep_dive (
aid INT ENCODE LZO
,loc CHAR(3) ENCODE BYTEDICT
,dt DATE ENCODE RUNLENGTH
);
Designed for I/O Reduction
• Columnar storage
• Data compression
• Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
aid loc dt
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
• In-memory	block	metadata
• Contains	per-block	MIN	and	MAX	value
• Effectively	prunes	blocks	which	cannot	
contain	data	for	a	given	query
• Eliminates	unnecessary	I/O
Zone Maps

SELECT COUNT(*) FROM deep_dive WHERE dt = '09-JUNE-2013'

Unsorted table (block zone maps):
  MIN: 01-JUNE-2013  MAX: 20-JUNE-2013
  MIN: 08-JUNE-2013  MAX: 30-JUNE-2013
  MIN: 12-JUNE-2013  MAX: 20-JUNE-2013
  MIN: 02-JUNE-2013  MAX: 25-JUNE-2013

Sorted by date (block zone maps):
  MIN: 01-JUNE-2013  MAX: 06-JUNE-2013
  MIN: 07-JUNE-2013  MAX: 12-JUNE-2013
  MIN: 13-JUNE-2013  MAX: 18-JUNE-2013
  MIN: 19-JUNE-2013  MAX: 24-JUNE-2013
Table Design Considerations
• Denormalize/materialize heavily filtered columns into the fact table(s)
– Timestamp/date should be in fact tables rather than using time dimension tables
• Add SORT KEYS to medium and large sized tables on the primary columns that
are filtered on
– Not necessary to add them to small tables
• Keep variable length data types only as long as the data requires (avoid over-sizing)
– VARCHAR, CHAR and NUMERIC
• Add compression to tables
– Optimal compression can be found using ANALYZE COMPRESSION
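Putting these guidelines together, a sketch of a denormalized fact table (table name, column names, and encodings are illustrative, not a prescription):

CREATE TABLE fact_orders (
  order_id   BIGINT   ENCODE LZO
 ,aid        INT      ENCODE LZO       -- denormalized, heavily filtered id
 ,loc        CHAR(3)  ENCODE BYTEDICT  -- right-sized: 3 characters is enough
 ,order_dt   DATE                      -- date kept in the fact table; leading sort key left RAW
)
DISTKEY (aid)
SORTKEY (order_dt);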
Parallelism
Storage Deep Dive: Slices
• Each compute node has either 2, 16, or 32 slices
• A slice can be thought of as a “virtual compute node”
– Unit of data partitioning
– Parallel query processing
• Facts about slices:
– Table rows are distributed to slices
– A slice processes only its own data
– Within a compute node all slices read from and write to all disks
Terminology and Concepts: Data Distribution
• KEY
– Choose a key column whose values create an even distribution of data
– Joins are performed between large fact/dimension tables
– Optimizing joins and group by
• ALL
– Small and medium size dimension tables (< 2-3M rows)
• EVEN
– When key cannot produce an even distribution
Data Distribution
• Distribution style is a table property which dictates how that table’s data is
distributed throughout the cluster:
• KEY: Value is hashed, same value goes to same location (slice)
• ALL: Full table data goes to first slice of every node
• EVEN: Round robin
• Goals:
• Distribute data evenly for parallel processing
• Minimize data movement during query processing
[Diagram: KEY, EVEN, and ALL distribution illustrated across Node 1 (slices 1, 2) and Node 2 (slices 3, 4).]
Data Distribution: Example
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
) DISTSTYLE (EVEN|KEY|ALL);
[Diagram: compute nodes CN1 (slices 0, 1) and CN2 (slices 2, 3). Table deep_dive has user columns aid, loc, dt plus system columns ins, del, row.]
Data Distribution: EVEN Example
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
) DISTSTYLE EVEN;
INSERT INTO deep_dive VALUES
(1, 'SFO', '2016-09-01'),
(2, 'JFK', '2016-09-14'),
(3, 'SFO', '2017-04-01'),
(4, 'JFK', '2017-05-14');
[Diagram: each of the 4 slices holds its own copy of the deep_dive column structure (aid, loc, dt plus system columns ins, del, row). Before the INSERT every slice has 0 rows; afterwards the rows are spread round-robin, 1 row per slice.]
(3 user columns + 3 system columns) x (4 slices) = 24 blocks (24 MB)
Data Distribution: KEY Example #1
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
) DISTSTYLE KEY DISTKEY (loc);
INSERT INTO deep_dive VALUES
(1, 'SFO', '2016-09-01'),
(2, 'JFK', '2016-09-14'),
(3, 'SFO', '2017-04-01'),
(4, 'JFK', '2017-05-14');
[Diagram: with DISTKEY (loc) there are only two distinct values ('SFO', 'JFK'), so the rows hash to just two slices: those two slices end up with 2 rows each and the other two slices hold 0 rows.]
(3 user columns + 3 system columns) x (2 slices) = 12 blocks (12 MB)
Data Distribution: KEY Example #2
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
) DISTSTYLE KEY DISTKEY (aid);
INSERT INTO deep_dive VALUES
(1, 'SFO', '2016-09-01'),
(2, 'JFK', '2016-09-14'),
(3, 'SFO', '2017-04-01'),
(4, 'JFK', '2017-05-14');
[Diagram: with DISTKEY (aid) the four distinct values hash to four different slices, so each of the 4 slices holds 1 row.]
(3 user columns + 3 system columns) x (4 slices) = 24 blocks (24 MB)
Data Distribution: ALL Example
CREATE TABLE deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
) DISTSTYLE ALL;
INSERT INTO deep_dive VALUES
(1, 'SFO', '2016-09-01'),
(2, 'JFK', '2016-09-14'),
(3, 'SFO', '2017-04-01'),
(4, 'JFK', '2017-05-14');
[Diagram: with DISTSTYLE ALL the full table is copied to the first slice of each node, so those two slices hold all 4 rows each and the remaining slices hold 0 rows.]
(3 user columns + 3 system columns) x (2 slices) = 12 blocks (12 MB)
Query Execution
Leader node:
• Parser	&	Rewriter
• Planner	&	Optimizer	
• Code	Generator
• Input:	Optimized	plan
• Output:	>=1	C++	functions
• Compiler
• Task	Scheduler
• WLM
• Admission
• Scheduling
• PostgreSQL	Catalog	Tables
• Redshift	System	Tables	(STV)
Query Execution Terminology
• Step: An individual operation needed during query execution. Steps need to be
combined to allow compute nodes to perform a join. Examples: scan, sort,
hash, aggr
• Segment: A combination of several steps that can be done by a single process.
The smallest compilation unit executable by a slice. Segments within a stream
run in parallel.
• Stream: A collection of combined segments which output to the next stream or
SQL client.
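Streams, segments, and steps for a completed query can be inspected in the SVL_QUERY_SUMMARY system view (a minimal sketch; the query ID is illustrative and would come from STL_QUERY):

-- Streams (stm), segments (seg), and steps of a completed query
SELECT stm, seg, step, label, rows, bytes
FROM svl_query_summary
WHERE query = 12345   -- illustrative query ID
ORDER BY stm, seg, step;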
Visualizing Streams, Segments, and Steps
[Diagram, executed left to right over time:]
Stream 0: Segment 0 (Steps 0-2), Segment 1 (Steps 0-4), Segment 2 (Steps 0-3), Segment 3 (Steps 0-5)
Stream 1: Segment 4 (Steps 0-3), Segment 5 (Steps 0-2), Segment 6 (Steps 0-4)
Stream 2: Segment 7 (Steps 0-1), Segment 8 (Steps 0-1)
Query Lifecycle
[Diagram: client (JDBC/ODBC) -> Leader Node (Parser -> Query Planner -> Code Generator -> final computations; generates code for all segments of one stream; produces explain plans) -> Compute Nodes (receive compiled code, run it, return results to the leader) -> results returned to the client.]
Segments in a stream are executed concurrently. Each step in a segment is executed serially.
Query Execution Deep Dive: Leader Node
1. The leader node receives the query and parses the SQL.
2. The parser produces a logical representation of the original query.
3. This query tree is input into the query optimizer (volt).
4. Volt rewrites the query to maximize its efficiency. Sometimes a single query will be
rewritten as several dependent statements in the background.
5. The rewritten query is sent to the planner, which generates one or more query plans and chooses the plan with the best estimated performance.
6. The query plan is sent to the execution engine, where it’s translated into steps,
segments, and streams.
7. This translated plan is sent to the code generator, which generates a C++ function
for each segment.
8. This generated C++ is compiled with gcc to a .o file and distributed to the compute
nodes.
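The plan the leader node produces can be previewed with EXPLAIN before running a query (a minimal sketch against the illustrative deep_dive table from the earlier slides):

-- Show the query plan without executing the query
EXPLAIN
SELECT loc, COUNT(*)
FROM deep_dive
GROUP BY loc;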
Query Execution Deep Dive: Compute Nodes
• Slices execute the query segments in parallel.
• Executable segments are created for one stream at a time. When the segments
of that stream are complete, the engine generates the segments for the next
stream.
• When the compute nodes are done, they return the query results to the leader
node for final processing.
• The leader node merges the data into a single result set and addresses any
needed sorting or aggregation.
• The leader node then returns the results to the client.
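Per-slice execution can be reviewed afterwards in SVL_QUERY_REPORT (a sketch; the query ID is illustrative):

-- One row per slice per step for a completed query
SELECT slice, segment, step, label, rows, elapsed_time
FROM svl_query_report
WHERE query = 12345   -- illustrative query ID
ORDER BY slice, segment, step;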
Query Execution
[Diagram: the same plan of streams, segments, and steps shown above is executed on slices 0-3; within each stream, every slice runs that stream's segments in parallel over time.]
Amazon Redshift Spectrum
• Extend Redshift Queries To Amazon S3
• Query S3 directly or join data across Redshift and S3
• Support for CSV, Parquet, ORC, Grok and Avro formats
• Scale Redshift compute and storage separately
[Diagram: the Redshift query engine and local Redshift data on the cluster, with the Spectrum layer between the cluster and an Amazon S3 data lake.]
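Before Spectrum can be queried, an external schema is mapped to a Data Catalog database (a sketch; the schema, database, and role names are placeholders):

CREATE EXTERNAL SCHEMA spectrum
FROM DATA CATALOG
DATABASE 'spectrumdb'                                   -- placeholder catalog database
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'   -- placeholder role ARN
CREATE EXTERNAL DATABASE IF NOT EXISTS;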
Life of a query

Example query against an external table in S3:

SELECT COUNT(*)
FROM S3.EXT_TABLE
GROUP BY…

[Diagram: SQL client (JDBC/ODBC) -> Amazon Redshift leader and compute nodes -> Amazon Redshift Spectrum nodes 1...N -> Amazon S3 (exabyte-scale object storage) and Data Catalog (Apache Hive Metastore).]

1. The query is submitted to the Amazon Redshift leader node over JDBC/ODBC.
2. The query is optimized and compiled at the leader node, which determines what gets run locally and what goes to Amazon Redshift Spectrum.
3. The query plan is sent to all compute nodes.
4. Compute nodes obtain partition info from the Data Catalog and dynamically prune partitions.
5. Each compute node issues multiple requests to the Amazon Redshift Spectrum layer.
6. Amazon Redshift Spectrum nodes scan your S3 data.
7. Amazon Redshift Spectrum projects, filters, joins and aggregates.
8. Final aggregations and joins with local Amazon Redshift tables are done in-cluster.
9. The result is sent back to the client.
Running an analytic query over an exabyte in S3
• Roughly 140 TB of customer item order detail records for each day over the past 20 years;
• 190	million	files	across	15,000	partitions	in	
S3.	One	partition	per	day	for	USA	and	rest	
of	world;
• Total	data	size	over	an	exabyte
Optimization:
• Compression: 5X
• Columnar file format: 10X
• Scanning with 2,500 nodes: 2500X
• Static partition elimination: 2X
• Dynamic partition elimination: 350X
• Redshift’s query optimizer: 40X
SELECT
P.ASIN,
P.TITLE,
R.POSTAL_CODE,
P.RELEASE_DATE,
SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
FROM
s3.d_customer_order_item_details D,
asin_attributes A,
products P,
regions R
WHERE
D.ASIN = P.ASIN AND
P.ASIN = A.ASIN AND
D.REGION_ID = R.REGION_ID AND
A.EDITION LIKE '%FIRST%' AND
P.TITLE LIKE '%Potter%' AND
P.AUTHOR = 'J. K. Rowling' AND
R.COUNTRY_CODE = 'US' AND
R.CITY = 'Seattle' AND
R.STATE = 'WA' AND
D.ORDER_DAY :: DATE >= P.RELEASE_DATE AND
D.ORDER_DAY :: DATE < dateadd(day, 3, P.RELEASE_DATE)
GROUP BY P.ASIN, P.TITLE, R.POSTAL_CODE, P.RELEASE_DATE
ORDER BY SALES_sum DESC
LIMIT 20;
Hive (1,000-node cluster): 5 years*   vs.   Redshift Spectrum: 155 s
* Estimated using a 20-node Hive cluster & 1.4 TB, assuming linear scaling
* Query used a 20-node DC1.8XLarge Amazon Redshift cluster
* Not actual sales data - generated for this demo based on the data format used by Amazon Retail.
Multi-Cluster
[Diagram: multiple Amazon Redshift clusters run the same kind of query (SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY…) through the Amazon Redshift Spectrum layer (nodes 1...N) against the same data in Amazon S3, sharing one Data Catalog (Apache Hive Metastore).]
Amazon Redshift Spectrum
Run SQL queries directly against data in S3 using thousands of nodes
• Fast @ exabyte scale
• Elastic & highly available
• On-demand, pay-per-query
• High concurrency: multiple clusters access the same data
• No ETL: query data in-place using open file formats
• Full Amazon Redshift SQL support
Amazon Redshift Spectrum
• Amazon Redshift Spectrum seamlessly integrates with your existing SQL & BI
apps
• Support for complex joins, nested queries & window functions
• Support for data partitioned in S3 by any key
– Date, time, and any other custom keys (e.g., year, month, day, hour), as in the sketch below
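A sketch of an external, partitioned table (the spectrum schema is assumed to exist, as created above; the bucket, paths, and columns are placeholders):

CREATE EXTERNAL TABLE spectrum.ext_events (
  aid INT
 ,loc CHAR(3)
)
PARTITIONED BY (dt DATE)
STORED AS PARQUET
LOCATION 's3://<bucket>/events/';

-- Partitions are registered explicitly
ALTER TABLE spectrum.ext_events
ADD PARTITION (dt = '2017-04-01')
LOCATION 's3://<bucket>/events/dt=2017-04-01/';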
Data Ingestion Best Practices
Data Ingestion: COPY Statement
• Ingestion Throughput:
– Each slice’s query processors
can load one file at a time:
• Streaming decompression
• Parse
• Distribute
• Write
• Realizing only partial node
usage as 6.25% of slices are
active
[Diagram: DS2.8XL compute node with 16 slices (0-15); with a single input file only one slice is active.]
Data Ingestion: COPY Statement
• Number of input files should be
a multiple of the number of
slices
• Splitting the single file into 16
input files, all slices are working
to maximize ingestion
performance
• COPY continues to scale
linearly as you add nodes
[Diagram: DS2.8XL compute node with 16 slices (0-15); with 16 input files every slice is active.]
Recommendation: use delimited files, 1 MB to 1 GB after GZIP compression
Design Considerations: Data Ingestion
• Export Data from Source System
– Delimited Files are recommend
• Pick a simple delimiter '|' or ',' or tabs
• Pick a simple NULL character (e.g. \N)
• Use double quotes and an escape character (e.g. \) for varchars
• UTF-8 varchar columns can take up to 4 bytes per char
– Split Files so there is a multiple of the number of slices
• E.g. 8 Slice Cluster = 8, 16, 24, 32, 40… files
– Files sizes should be 1MB – 1GB after gzip compression
• Useful COPY Options
– MAXERRORS
– ACCEPTINVCHARS
– NULL AS
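A sketch of a COPY that follows these guidelines (the bucket, prefix, IAM role, and null string are placeholders; the files are assumed to be gzipped and pipe-delimited):

COPY deep_dive
FROM 's3://<bucket>/<prefix>/'
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'
DELIMITER '|'
NULL AS 'NULL'    -- assumes the export writes the literal string NULL for nulls
GZIP
MAXERROR 10
ACCEPTINVCHARS;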
Data Ingestion: Deduplication/UPSERT
BEGIN;
CREATE TEMP TABLE staging(LIKE PRODUCTION);
COPY staging FROM 's3://<bucket>/<prefix>/' CREDENTIALS '<creds>' COMPUPDATE OFF;
DELETE FROM production
USING staging WHERE production.id = staging.id;
INSERT INTO production SELECT * FROM staging;
DROP TABLE staging;
COMMIT;
Recently Released Features
Recently Released Features
New Column Encoding ZSTD
Unload – Can now specify file sizes
New Functions - OCTET_LENGTH, APPROXIMATE PERCENTILE_DISC
New Data Type - TIMESTAMPTZ
• Support for Timestamp with Time zone : New TIMESTAMPTZ data type to input complete timestamp values that include the
date, the time of day, and a time zone.
E.g.: 30 Nov 07:37:16 2016 PST
Multi-byte Object Names
• Support for Multi-byte (UTF-8) characters for tables, columns, and other database object names
User Connection Limits
• You can now set a limit on the number of database connections a user is permitted to have open concurrently
Automatic Data Compression for CTAS
• All newly created tables will leverage default encoding
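A few hedged examples of these features (table, bucket, role, and user names are placeholders):

-- ZSTD column encoding and the TIMESTAMPTZ data type
CREATE TABLE events_zstd (payload VARCHAR(512) ENCODE ZSTD, created_at TIMESTAMPTZ);

-- UNLOAD with a maximum output file size
UNLOAD ('SELECT * FROM events_zstd')
TO 's3://<bucket>/<prefix>/'
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'
MAXFILESIZE 256 MB;

-- Per-user connection limit
ALTER USER etl_user CONNECTION LIMIT 10;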
Query Monitoring Rules (QMR)
• Goal - Automatic handling of run-away (poorly written) queries
• Extension to WLM
• Rules applied onto a WLM Queue
– Queries can be LOGGED, CANCELED or HOPPED
Query Monitoring Rules (QMR)
• Metrics with operators and values (e.g. query_cpu_time > 1000) create a
predicate
• Multiple predicates can be AND-ed together to create a rule
• Multiple rules can be defined for a queue in WLM. These rules are OR-ed
together
If { rule } then [action]
{ rule : metric operator value } e.g.: rows_scanned > 100000
• Metric: cpu_time, query_blocks_read, rows scanned,
query execution time, cpu & io skew per slice,
join_row_count, etc.
• Operator: <, >, ==
• Value: integer
[action]: hop, log, abort
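Once rules are in place, their firings can be reviewed in the STL_WLM_RULE_ACTION system table (a minimal sketch):

-- Most recent QMR rule actions
SELECT query, service_class, rule, action, recordtime
FROM stl_wlm_rule_action
ORDER BY recordtime DESC
LIMIT 20;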
IAM Authentication / Single Sign-On
[Diagram: BI tools, SQL clients, and analytics tools connect over ODBC/JDBC. The new Redshift ODBC/JDBC drivers grab the ticket (userid) from the corporate identity provider (e.g. Active Directory via ADFS), obtain a SAML assertion, and authenticate to Amazon Redshift through IAM, for both user groups and individual users.]
SQL Scalar User-Defined Functions
Language SQL support added for scalar UDFs. Example:
CREATE OR REPLACE FUNCTION inet_ntoa(bigint)
RETURNS varchar(15)
AS
$$
SELECT
CASE WHEN $1 BETWEEN 0 AND 4294967295 THEN
(($1 >> 24) & 255) || '.' ||
(($1 >> 16) & 255) || '.' ||
(($1 >> 8) & 255) || '.' ||
(($1 >> 0) & 255)
ELSE
NULL
END
$$ LANGUAGE SQL IMMUTABLE;
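A quick check of the function above (3232235521 is the integer form of 192.168.0.1):

SELECT inet_ntoa(3232235521);   -- expected result: '192.168.0.1'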
Resources
• https://github.com/awslabs/amazon-redshift-utils
• https://github.com/awslabs/amazon-redshift-monitoring
• https://github.com/awslabs/amazon-redshift-udfs
• Admin scripts
Collection of utilities for running diagnostics on your cluster
• Admin views
Collection of utilities for managing your cluster, generating schema DDL, etc.
• ColumnEncodingUtility
Gives you the ability to apply optimal column encoding to an established schema with
data already loaded
• Amazon Redshift Engineering’s Advanced Table Design Playbook
https://aws.amazon.com/blogs/big-data/amazon-redshift-engineerings-advanced-table-design-playbook-preamble-prerequisites-and-prioritization/
Pop-up Loft
aws.amazon.com/activate
Thank you!

Más contenido relacionado

La actualidad más candente

Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Dvir Volk
 

La actualidad más candente (20)

Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWS
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
Apache Hive Tutorial
Apache Hive TutorialApache Hive Tutorial
Apache Hive Tutorial
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 

Similar a Deep Dive on Amazon Redshift

AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
Volodymyr Rovetskiy
 

Similar a Deep Dive on Amazon Redshift (20)

Amazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech Talks
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Data Warehousing in the Era of Big Data
Data Warehousing in the Era of Big DataData Warehousing in the Era of Big Data
Data Warehousing in the Era of Big Data
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
SRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftSRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon Redshift
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
 
Data Warehousing in the Era of Big Data: Deep Dive into Amazon Redshift
Data Warehousing in the Era of Big Data: Deep Dive into Amazon RedshiftData Warehousing in the Era of Big Data: Deep Dive into Amazon Redshift
Data Warehousing in the Era of Big Data: Deep Dive into Amazon Redshift
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
16119 - Get to Know Your Data Sets (1).pdf
16119 - Get to Know Your Data Sets (1).pdf16119 - Get to Know Your Data Sets (1).pdf
16119 - Get to Know Your Data Sets (1).pdf
 
Building your data warehouse with Redshift
Building your data warehouse with RedshiftBuilding your data warehouse with Redshift
Building your data warehouse with Redshift
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 
cache memory introduction, level, function
cache memory introduction, level, functioncache memory introduction, level, function
cache memory introduction, level, function
 

Más de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Deep Dive on Amazon Redshift

  • 1. Pop-up Loft Deep Dive on Amazon Redshift Storage Subsystem and Query Life Cycle Dhanraj Pondicherry, Solutions Architect Manager October 2017
  • 2. Are you a user ?
  • 3. Deep Dive Overview • Redshift Cluster Architecture • Storage Deep Dive • Reduce I/O • Parallelism • Query Life Cycle • Data Ingestion Best Practices • Recently Released Features • Open Q&A
  • 4. Amazon Redshift Cluster Architecture
  • 5. Redshift Cluster Architecture • Massively parallel, shared nothing • Leader node – SQL endpoint – Stores metadata – Coordinates parallel SQL processing • Compute nodes – Local, columnar storage – Executes queries in parallel – Load, backup, restore 10 GigE Ingestion Backup Restore SQL Clients/BI Tools 128GB RAM 16TB disk 16 cores S3 / EMR / DynamoDB / SSH JDBC/ODBC 128GB RAM 16TB disk 16 coresCompute Node 128GB RAM 16TB disk 16 coresCompute Node 128GB RAM 16TB disk 16 coresCompute Node Leader Node
  • 7. 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores • Parser & Rewriter • Planner & Optimizer • Code Generator • Input: Optimized plan • Output: >=1 C++ functions • Task Scheduler • WLM • Admission • Scheduling • PostgreSQL Catalog Tables 128GB RAM 16TB disk 16 cores Compute Node Compute Node Compute Node Leader Node
  • 8. 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores • Query execution processes • Backup & restore processes • Replication processes • Local Storage • Disks • Slices • Tables • Columns • Blocks 128GB RAM 16TB disk 16 cores Compute Node Compute Node Compute Node Leader Node
  • 9. 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores • Query execution processes • Backup & restore processes • Replication processes • Local Storage • Disks • Slices • Tables • Columns • Blocks Leader Node 128GB RAM 16TB disk 16 cores Compute Node Compute Node Compute Node
  • 11. Storage Deep Dive: Compute Node • Node Types – – Compute Optimized – Storage Optimized • Node Size – • DC1.large –.8xLarge • DS2.xlarge - 8xlarge • Number of Nodes in the cluster
  • 12. Storage Deep Dive: Disks • Redshift utilizes locally attached storage devices • Compute nodes have 2.5-3x the advertised storage capacity • 1, 3, 8, or 24 disks depending on node type • Each disk is split into two partitions – Local data storage, accessed by local CN – Mirrored data, accessed by remote CN • Partitions are raw devices – Local storage devices are ephemeral in nature – Tolerant to multiple disk failures on a single node
  • 13. Slices • A slice can be thought of like a “virtual compute node” – Unit of data partitioning – Parallel query processing • Facts about slices: – Each compute node has either 2, 16, or 32 slices – Table rows are distributed to slices – A slice processes only its own data
  • 14. Storage Deep Dive: Columns • Column: Logical structure accessible via SQL • Physical structure is a doubly linked list of blocks • These blockchains exist on each slice for each column • All sorted & unsorted blockchains compose a column • Column properties include: – Distribution Key – Sort Key – Compression Encoding • Columns shrink and grow independently, 1 block at a time • Three system columns per table-per slice for MVCC
  • 15. Storage Deep Dive: Blocks • Column data is persisted to 1MB immutable blocks • Each block contains in-memory metadata: – Zone Maps (MIN/MAX value) – Location of previous/next block • Blocks are individually compressed with 1 of 10 encodings • A full block contains between 16 and 8.4 million values
  • 16. Block Properties: Design Considerations • Small writes: • Batch processing system, optimized for processing massive amounts of data • 1MB size + immutable blocks means that we clone blocks on write so as not to introduce fragmentation • Small write (~1-10 rows) has similar cost to a larger write (~100 K rows) • UPDATE and DELETE: • Immutable blocks means that we only logically delete rows on UPDATE or DELETE • Must VACUUM or DEEP COPY to remove ghost rows from table
  • 17. Column Properties: Design Considerations • Compression: • COPY automatically analyzes and compresses data when loading into empty tables • ANALYZE COMPRESSION checks existing tables and proposes optimal compression algorithms for each column • Changing column encoding requires a table rebuild • DISTKEY and SORTKEY significantly influence performance (orders of magnitude) • Distribution Keys: • A poor DISTKEY can introduce data skew and an unbalanced workload • A query completes only as fast as the slowest slice completes • Sort Keys: • A sortkey is only effective as the data profile allows it to be • Selectivity needs to be considered
  • 19. Reduce I/O w/ Data Sorting • Goal: • Make queries run faster by optimizing the effectiveness of zone maps • Typically on the columns that are filtered on (where clause predicates) • Impact: • Enables range restricted scans to prune blocks by leveraging zone maps • Overall reduction in block I/O • Achieved with the table property SORTKEY defined on one or more columns • Optimal SORTKEY is dependent on: • Query patterns • Data profile • Business requirements
  • 20. Designed for I/O Reduction • Columnar storage • Data compression • Zone maps aid loc dt 1 SFO 2016-09-01 2 JFK 2016-09-14 3 SFO 2017-04-01 4 JFK 2017-05-14 • Accessing dt with row storage: – Need to read everything – Unnecessary I/O aid loc dt CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date );
  • 21. Designed for I/O Reduction • Columnar storage • Data compression • Zone maps aid loc dt 1 SFO 2016-09-01 2 JFK 2016-09-14 3 SFO 2017-04-01 4 JFK 2017-05-14 • Accessing dt with columnar storage: – Only scan blocks for relevant column aid loc dt CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date );
  • 22. Designed for I/O Reduction • Columnar storage • Data compression • Zone maps aid loc dt 1 SFO 2016-09-01 2 JFK 2016-09-14 3 SFO 2017-04-01 4 JFK 2017-05-14 • Columns grow and shrink independently • Effective compression ratios due to like data • Reduces storage requirements • Reduces I/O aid loc dt CREATE TABLE deep_dive ( aid INT ENCODE LZO ,loc CHAR(3) ENCODE BYTEDICT ,dt DATE ENCODE RUNLENGTH );
  • 23. Designed for I/O Reduction • Columnar storage • Data compression • Zone maps aid loc dt 1 SFO 2016-09-01 2 JFK 2016-09-14 3 SFO 2017-04-01 4 JFK 2017-05-14 aid loc dt CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ); • In-memory block metadata • Contains per-block MIN and MAX value • Effectively prunes blocks which cannot contain data for a given query • Eliminates unnecessary I/O
  • 24. SELECT COUNT(*) FROM deep_dive WHERE DATE = '09-JUNE-2013' MIN: 01-JUNE-2013 MAX: 20-JUNE-2013 MIN: 08-JUNE-2013 MAX: 30-JUNE-2013 MIN: 12-JUNE-2013 MAX: 20-JUNE-2013 MIN: 02-JUNE-2013 MAX: 25-JUNE-2013 Unsorted Table MIN: 01-JUNE-2013 MAX: 06-JUNE-2013 MIN: 07-JUNE-2013 MAX: 12-JUNE-2013 MIN: 13-JUNE-2013 MAX: 18-JUNE-2013 MIN: 19-JUNE-2013 MAX: 24-JUNE-2013 Sorted By Date Zone Maps
  • 25. Table Design Considerations • Denormalize/materialize heavily filtered columns into the fact table(s) – Timestamp/date should be in fact tables rather than using time dimension tables • Add SORT KEYS to medium and large sized tables on the primary columns that are filtered on – Not necessary to add them to small tables • Keep variable length data types as long as required – VARCHAR, CHAR and NUMERIC • Add compression to tables – Optimal compression can be found using ANALYZE COMPRESSION
  • 27. Storage Deep Dive: Slices • Each compute node has either 2, 16, or 32 slices • A slice can be thought of like a “virtual compute node” – Unit of data partitioning – Parallel query processing • Facts about slices: – Table rows are distributed to slices – A slice processes only its own data – Within a compute node all slices read from and write to all disks
  • 28. Terminology and Concepts: Data Distribution • KEY – The key creates an even distribution of data – Joins are performed between large fact/dimension tables – Optimizing joins and group by • ALL – Small and medium size dimension tables (< 2-3M) • EVEN – When key cannot produce an even distribution
  • 29. Data Distribution • Distribution style is a table property which dictates how that table’s data is distributed throughout the cluster: • KEY: Value is hashed, same value goes to same location (slice) • ALL: Full table data goes to first slice of every node • EVEN: Round robin • Goals: • Distribute data evenly for parallel processing • Minimize data movement during query processing KEY ALL Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 EVEN
  • 30. Data Distribution: Example CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ) DISTSTYLE (EVEN|KEY|ALL); CN1 Slice 0 Slice 1 CN2 Slice 2 Slice 3 Table: deep_dive User Columns System Columns aid loc dt ins del row
  • 31. Data Distribution: EVEN Example CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ) DISTSTYLE EVEN; CN1 Slice 0 Slice 1 CN2 Slice 2 Slice 3 INSERT INTO deep_dive VALUES (1, 'SFO', '2016-09-01'), (2, 'JFK', '2016-09-14'), (3, 'SFO', '2017-04-01'), (4, 'JFK', '2017-05-14'); Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 0 Rows: 0 Rows: 0 Rows: 0 (3 User Columns + 3 System Columns) x (4 slices) = 24 Blocks (24MB) Rows: 1 Rows: 1 Rows: 1 Rows: 1
  • 32. Data Distribution: KEY Example #1 CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ) DISTSTYLE KEY DISTKEY (loc); CN1 Slice 0 Slice 1 CN2 Slice 2 Slice 3 INSERT INTO deep_dive VALUES (1, 'SFO', '2016-09-01'), (2, 'JFK', '2016-09-14'), (3, 'SFO', '2017-04-01'), (4, 'JFK', '2017-05-14'); Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 2 Rows: 0 Rows: 0 (3 User Columns + 3 System Columns) x (2 slices) = 12 Blocks (12 MB) Rows: 0Rows: 1 Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 2Rows: 0Rows: 1
  • 33. Data Distribution: KEY Example #2 CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ) DISTSTYLE KEY DISTKEY (aid); CN1 Slice 0 Slice 1 CN2 Slice 2 Slice 3 INSERT INTO deep_dive VALUES (1, 'SFO', '2016-09-01'), (2, 'JFK', '2016-09-14'), (3, 'SFO', '2017-04-01'), (4, 'JFK', '2017-05-14'); Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 0 Rows: 0 Rows: 0 Rows: 0 (3 User Columns + 3 System Columns) x (4 slices) = 24 Blocks (24 MB) Rows: 1 Rows: 1 Rows: 1 Rows: 1
  • 34. Data Distribution: ALL Example CREATE TABLE deep_dive ( aid INT --audience_id ,loc CHAR(3) --location ,dt DATE --date ) DISTSTYLE ALL; CN1 Slice 0 Slice 1 CN2 Slice 2 Slice 3 INSERT INTO deep_dive VALUES (1, 'SFO', '2016-09-01'), (2, 'JFK', '2016-09-14'), (3, 'SFO', '2017-04-01'), (4, 'JFK', '2017-05-14'); Rows: 0 Rows: 0 (3 User Columns + 3 System Columns) x (2 slice) = 12 Blocks (12MB) Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 0Rows: 1Rows: 2Rows: 4Rows: 3 Table: deep_dive User Columns System Columns aid loc dt ins del row Rows: 0Rows: 1Rows: 2Rows: 4Rows: 3
  • 36. 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores • Parser & Rewriter • Planner & Optimizer • Code Generator • Input: Optimized plan • Output: >=1 C++ functions • Compiler • Task Scheduler • WLM • Admission • Scheduling • PostgreSQL Catalog Tables • Redshift System Tables (STV) 128GB RAM 16TB disk 16 cores Compute Node Compute Node Compute Node 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores Leader Node
  • 37. 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores Compute Node 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores • Parser & Rewriter • Planner & Optimizer • Code Generator • Input: Optimized plan • Output: >=1 C++ functions • Compiler • Task Scheduler • WLM • Admission • Scheduling • PostgreSQL Catalog Tables • Redshift System Tables (STV) Compute Node Compute Node 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores 128GB RAM 16TB disk 16 cores Leader Node
  • 38. Query Execution Terminology • Step: An individual operation needed during query execution. Steps need to be combined to allow compute nodes to perform a join. Examples: scan, sort, hash, aggr • Segment: A combination of several steps that can be done by a single process. The smallest compilation unit executable by a slice. Segments within a stream run in parallel. • Stream: A collection of combined segments which output to the next stream or SQL client.
  • 39. Visualizing Streams, Segments, and Steps Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Time
  • 40. client JDBC ODBC Leader Node Parser Query Planner Code Generator Final Computations Generate code for all segments of one stream Explain Plans Compute Node Receive Compiled Code Run the Compiled Code Return results to Leader Compute Node Receive Compiled Code Run the Compiled Code Return results to Leader Return results to client Segments in a stream are executed concurrently. Each step in a segment is executed serially. Query Lifecycle
  • 41. Query Execution Deep Dive: Leader Node 1. The leader node receives the query and parses the SQL. 2. The parser produces a logical representation of the original query. 3. This query tree is input into the query optimizer (volt). 4. Volt rewrites the query to maximize its efficiency. Sometimes a single query will be rewritten as several dependent statements in the background. 5. The rewritten query is sent to the planner which generates >= 1 query plans for the execution with the best estimated performance. 6. The query plan is sent to the execution engine, where it’s translated into steps, segments, and streams. 7. This translated plan is sent to the code generator, which generates a C++ function for each segment. 8. This generated C++ is compiled with gcc to a .o file and distributed to the compute nodes.
  • 42. Query Execution Deep Dive: Compute Nodes • Slices execute the query segments in parallel. • Executable segments are created for one stream at a time. When the segments of that stream are complete, the engine generates the segments for the next stream. • When the compute nodes are done, they return the query results to the leader node for final processing. • The leader node merges the data into a single result set and addresses any needed sorting or aggregation. • The leader node then returns the results to the client.
  • 43. Visualizing Streams, Segments, and Steps Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Time
  • 44. Query Execution Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Time Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Stream 0 Segment 0 Step 0 Step 1 Step 2 Segment 1 Step 0 Step 1 Step 2 Step 3 Step 4 Segment 2 Step 0 Step 1 Step 2 Step 3 Segment 3 Step 0 Step 1 Step 2 Step 3 Step 4 Step 5 Stream 1 Segment 4 Step 0 Step 1 Step 2 Step 3 Segment 5 Step 0 Step 1 Step 2 Segment 6 Step 0 Step 1 Step 2 Step 3 Step 4 Stream 2 Segment 7 Step 0 Step 1 Segment 8 Step 0 Step 1 Slices 0 1 2 3
• 45. Amazon Redshift Spectrum: Extend Redshift Queries to Amazon S3
• Query S3 directly, or join data across Redshift and S3
• Support for CSV, Parquet, ORC, Grok and Avro formats
• Scale Redshift compute and storage separately
(Diagram: the Redshift query engine and local Redshift data, the Spectrum layer, and the Amazon S3 data lake.)
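Spectrum queries run against external tables whose metadata lives in a data catalog. A minimal sketch of registering one, with hypothetical schema, catalog database, IAM role, and bucket names:

CREATE EXTERNAL SCHEMA spectrum_demo
FROM DATA CATALOG
DATABASE 'spectrum_demo_db'                               -- hypothetical catalog database
IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole'  -- hypothetical role
CREATE EXTERNAL DATABASE IF NOT EXISTS;

CREATE EXTERNAL TABLE spectrum_demo.ext_orders (
  order_id  BIGINT,
  order_day DATE,
  quantity  INTEGER,
  our_price DECIMAL(8,2)
)
STORED AS PARQUET
LOCATION 's3://my-bucket/orders/';                        -- hypothetical S3 prefix

Once registered, the external table can be referenced in ordinary SELECT statements and joined against local Redshift tables.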
• 46. Life of a query, step 1: The query (e.g. SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY ...) is submitted to the Amazon Redshift cluster over JDBC/ODBC. (Diagram on slides 46-54: the Redshift compute nodes 1 to N, the Amazon Redshift Spectrum layer, Amazon S3 exabyte-scale object storage, and a Data Catalog / Apache Hive Metastore.)
• 47. Life of a query, step 2: The query is optimized and compiled at the leader node, which determines what gets run locally and what goes to Amazon Redshift Spectrum.
• 48. Life of a query, step 3: The query plan is sent to all compute nodes.
• 49. Life of a query, step 4: Compute nodes obtain partition info from the Data Catalog and dynamically prune partitions.
• 50. Life of a query, step 5: Each compute node issues multiple requests to the Amazon Redshift Spectrum layer.
• 51. Life of a query, step 6: Amazon Redshift Spectrum nodes scan your S3 data.
• 52. Life of a query, step 7: Amazon Redshift Spectrum projects, filters, joins and aggregates.
• 53. Life of a query, step 8: Final aggregations and joins with local Amazon Redshift tables are done in-cluster.
• 54. Life of a query, step 9: The result is sent back to the client.
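How much work the Spectrum layer did for a query can be checked afterward in the SVL_S3QUERY_SUMMARY system view; a sketch with a hypothetical query id (exact column set may vary slightly by Redshift version):

SELECT query, segment, elapsed,
       s3_scanned_rows, s3_scanned_bytes,
       s3query_returned_rows, files
FROM svl_s3query_summary
WHERE query = 12345           -- hypothetical query id
ORDER BY segment;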
• 55. Running an analytic query over an exabyte in S3
• Roughly 140 TB of customer item order detail records for each day over the past 20 years
• 190 million files across 15,000 partitions in S3; one partition per day for USA and rest of world
• Total data size over an exabyte
Optimization:
• Compression: 5X
• Columnar file format: 10X
• Scanning with 2500 nodes: 2500X
• Static partition elimination: 2X
• Dynamic partition elimination: 350X
• Redshift's query optimizer: 40X
Query:
SELECT P.ASIN, P.TITLE, R.POSTAL_CODE, P.RELEASE_DATE, SUM(D.QUANTITY * D.OUR_PRICE) AS SALES_sum
FROM s3.d_customer_order_item_details D, asin_attributes A, products P, regions R
WHERE D.ASIN = P.ASIN AND P.ASIN = A.ASIN AND D.REGION_ID = R.REGION_ID
  AND A.EDITION LIKE '%FIRST%' AND P.TITLE LIKE '%Potter%' AND P.AUTHOR = 'J. K. Rowling'
  AND R.COUNTRY_CODE = 'US' AND R.CITY = 'Seattle' AND R.STATE = 'WA'
  AND D.ORDER_DAY::DATE >= P.RELEASE_DATE AND D.ORDER_DAY::DATE < dateadd(day, 3, P.RELEASE_DATE)
GROUP BY P.ASIN, P.TITLE, R.POSTAL_CODE, P.RELEASE_DATE
ORDER BY SALES_sum DESC
LIMIT 20;
Runtime: Hive (1000 node cluster): 5 years*; Redshift Spectrum: 155s
* Estimated using 20 node Hive cluster & 1.4TB, assume linear
* Query used a 20 node DC1.8XLarge Amazon Redshift cluster
* Not actual sales data - generated for this demo based on data format used by Amazon Retail.
• 56. Multi-Cluster (diagram): Multiple Amazon Redshift clusters issue the same kind of query (SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY ...) against shared data in Amazon S3 through the Spectrum layer and a common Data Catalog (Apache Hive Metastore).
• 57. Amazon Redshift Spectrum
• Fast at exabyte scale
• Elastic & highly available
• On-demand, pay-per-query
• High concurrency: multiple clusters access the same data
• No ETL: query data in-place using open file formats
• Full Amazon Redshift SQL support
Run SQL queries directly against data in S3 using thousands of nodes.
• 58. Amazon Redshift Spectrum
• Amazon Redshift Spectrum seamlessly integrates with your existing SQL & BI apps
• Support for complex joins, nested queries & window functions
• Support for data partitioned in S3 by any key: date, time, and any other custom keys (e.g. year, month, day, hour); see the partition sketch below
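A partitioned external table is declared with PARTITIONED BY and then has its partitions registered explicitly. A minimal sketch, reusing the hypothetical spectrum_demo schema and bucket from the earlier sketch:

CREATE EXTERNAL TABLE spectrum_demo.ext_orders_part (
  order_id  BIGINT,
  quantity  INTEGER,
  our_price DECIMAL(8,2)
)
PARTITIONED BY (order_day DATE)
STORED AS PARQUET
LOCATION 's3://my-bucket/orders_part/';             -- hypothetical S3 prefix

ALTER TABLE spectrum_demo.ext_orders_part
ADD PARTITION (order_day = '2017-01-01')
LOCATION 's3://my-bucket/orders_part/2017/01/01/';  -- one registered partition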
  • 59. Data Ingestion Best Practices
• 60. Data Ingestion: COPY Statement
• Ingestion throughput: each slice's query processors can load one file at a time (streaming decompression, parse, distribute, write)
• Loading a single input file on a DS2.8XL compute node realizes only partial node usage, as 6.25% of slices are active
(Diagram: DS2.8XL compute node, slices 0-15, one input file.)
• 61. Data Ingestion: COPY Statement
• Number of input files should be a multiple of the number of slices
• Splitting the single file into 16 input files puts all slices to work, maximizing ingestion performance
• COPY continues to scale linearly as you add nodes
• Recommendation: use delimited files, 1MB to 1GB after GZIP compression
(Diagram: DS2.8XL compute node, slices 0-15, 16 input files.)
• 62. Design Considerations: Data Ingestion (see the COPY sketch below)
• Export data from the source system
– Delimited files are recommended
• Pick a simple delimiter: '|', ',' or tabs
• Pick a simple NULL character (e.g. '\N')
• Use double quotes and an escape character ('\') for varchars
• UTF-8 varchar columns take up to 4 bytes per char
– Split files so the count is a multiple of the number of slices
• E.g. 8-slice cluster = 8, 16, 24, 32, 40... files
– File sizes should be 1MB to 1GB after gzip compression
• Useful COPY options
– MAXERROR
– ACCEPTINVCHARS
– NULL AS
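A minimal COPY sketch following these recommendations, with hypothetical table, bucket, and IAM role names:

COPY production
FROM 's3://my-bucket/stage/orders_part_'   -- hypothetical prefix matching 16 gzipped, pipe-delimited files
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
GZIP
DELIMITER '|'
NULL AS '\N'
ACCEPTINVCHARS
MAXERROR 10;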
• 63. Data Ingestion: Deduplication/UPSERT
BEGIN;
CREATE TEMP TABLE staging (LIKE production);
COPY staging FROM 's3://<bucket>/<prefix>' CREDENTIALS '<creds>' COMPUPDATE OFF;
DELETE FROM production USING staging WHERE production.id = staging.id;
INSERT INTO production SELECT * FROM staging;
DROP TABLE staging;
COMMIT;
• 65. Recently Released Features
• New column encoding: ZSTD
• UNLOAD: can now specify file sizes
• New functions: OCTET_LENGTH, APPROXIMATE PERCENTILE_DISC
• New data type TIMESTAMPTZ: support for timestamp with time zone, to input complete timestamp values that include the date, the time of day, and a time zone (e.g. 30 Nov 07:37:16 2016 PST)
• Multi-byte object names: support for multi-byte (UTF-8) characters for tables, columns, and other database object names
• User connection limits: you can now set a limit on the number of database connections a user is permitted to have open concurrently
• Automatic data compression for CTAS: all newly created tables will leverage default encoding
(A sketch of the ZSTD, TIMESTAMPTZ, and UNLOAD file-size features follows below.)
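A minimal sketch exercising three of these features (ZSTD column encoding, the TIMESTAMPTZ type, and UNLOAD with an explicit maximum file size), using hypothetical table, bucket, and role names:

CREATE TABLE events_tz (
  event_id   BIGINT       ENCODE ZSTD,
  event_name VARCHAR(200) ENCODE ZSTD,
  created_at TIMESTAMPTZ  ENCODE ZSTD   -- timestamp with time zone
);

UNLOAD ('SELECT * FROM events_tz')
TO 's3://my-bucket/unload/events_'      -- hypothetical S3 prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
MAXFILESIZE 256 MB;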
• 67. Query Monitoring Rules (QMR)
• Goal: automatic handling of runaway (poorly written) queries
• Extension to WLM
• Rules are applied to a WLM queue
– Queries can be LOGGED, CANCELED or HOPPED
• 68. Query Monitoring Rules (QMR)
• A metric with an operator and a value (e.g. query_cpu_time > 1000) creates a predicate
• Multiple predicates can be AND-ed together to create a rule
• Multiple rules can be defined for a queue in WLM; these rules are OR-ed together
• Rule form: if { rule } then [action], where { rule : metric operator value }, e.g. rows_scanned > 100000
– Metric: cpu_time, query_blocks_read, rows scanned, query execution time, cpu & io skew per slice, join_row_count, etc.
– Operator: <, >, ==
– Value: integer
– [action]: hop, log, abort
(See the system-view sketch below for inspecting triggered rules.)
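When a rule fires, the action taken is recorded in the STL_WLM_RULE_ACTION system log. A minimal sketch for reviewing recent rule actions:

SELECT userid, query, service_class, rule, action, recordtime
FROM stl_wlm_rule_action
ORDER BY recordtime DESC
LIMIT 20;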
• 69. IAM Authentication (diagram): BI tools, SQL clients, and analytics tools connect through the new Redshift ODBC/JDBC drivers with Single Sign-On. The driver grabs the ticket (userid) from the identity provider (e.g. ADFS backed by corporate Active Directory), obtains a SAML assertion, and uses IAM to map it to Amazon Redshift user groups or an individual user.
• 70. SQL Scalar User-Defined Functions: language SQL support added for scalar UDFs. Example:
CREATE OR REPLACE FUNCTION inet_ntoa(bigint)
RETURNS varchar(15)
IMMUTABLE
AS $$
  SELECT CASE
    WHEN $1 BETWEEN 0 AND 4294967295 THEN
      (($1 >> 24) & 255) || '.' || (($1 >> 16) & 255) || '.' ||
      (($1 >> 8) & 255) || '.' || (($1 >> 0) & 255)
    ELSE NULL
  END
$$ LANGUAGE sql;
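Assuming the function compiles as shown, a call like the following (hypothetical input value) should render an integer-encoded IPv4 address in dotted-quad form:

SELECT inet_ntoa(3232235777);   -- expected result: '192.168.1.1'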
• 71. Resources
• https://github.com/awslabs/amazon-redshift-utils
• https://github.com/awslabs/amazon-redshift-monitoring
• https://github.com/awslabs/amazon-redshift-udfs
• Admin scripts: collection of utilities for running diagnostics on your cluster
• Admin views: collection of utilities for managing your cluster, generating schema DDL, etc.
• ColumnEncodingUtility: gives you the ability to apply optimal column encoding to an established schema with data already loaded
• Amazon Redshift Engineering's Advanced Table Design Playbook: https://aws.amazon.com/blogs/big-data/amazon-redshift-engineerings-advanced-table-design-playbook-preamble-prerequisites-and-prioritization/