Still using
Windows 3.1?
So why stick to
SQL-92?
@ModernSQL - http://modern-sql.com/
@MarkusWinand
SQL:1999
LATERAL
Select-list sub-queries must be scalar[0]:
LATERAL Before SQL:1999
SELECT	…	
					,	(SELECT	column_1	
										FROM	t1	
	...
Select-list sub-queries must be scalar[0]:
LATERAL Before SQL:1999
SELECT	…	
					,	(SELECT	column_1	
										FROM	t1	
	...
Lateral derived tables lift both limitations and can be correlated:
LATERAL Since SQL:1999
SELECT	…	
					,	ldt.*	
		FROM	...
Lateral derived tables lift both limitations and can be correlated:
LATERAL Since SQL:1999
SELECT	…	
					,	ldt.*	
		FROM	...
FROM	t	
	JOIN	LATERAL	(SELECT	…	
																	FROM	…	
																WHERE	t.c=…	
																ORDER	BY	…	
							...
FROM	t	
	JOIN	LATERAL	(SELECT	…	
																	FROM	…	
																WHERE	t.c=…	
																ORDER	BY	…	
							...
LATERAL is the "for each" loop of SQL
LATERAL plays well with outer and cross joins	
LATERAL is great for Top-N subqueries...
LATERAL Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
MySQL
9.3 PostgreSQL
SQLite
9.1 DB2 LUW
11gR...
GROUPING	SETS
Only one GROUP	BY operation at a time:
GROUPING	SETS Before SQL:1999
SELECT	year	
					,	month	
					,	sum(revenue)	
		FRO...
GROUPING	SETS Before SQL:1999
SELECT	year	
					,	month	
					,	sum(revenue)	
		FROM	tbl	
	GROUP	BY	year,	month											...
GROUPING	SETS Before SQL:1999
SELECT	year	
					,	month	
					,	sum(revenue)	
		FROM	tbl	
	GROUP	BY	year,	month											...
GROUPING	SETS Since SQL:1999
SELECT	year	
					,	month	
					,	sum(revenue)	
		FROM	tbl	
	GROUP	BY	year,	month											
...
GROUPING	SETS are multiple GROUP	BYs in one go
() (empty brackets) build a group over all rows
GROUPING (function) disambi...
GROUPING	SETS Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
5.0[0]
MySQL
9.5 PostgreSQL
SQLite
...
WITH
(Common Table Expressions)
WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT	…	
		FROM	(SELECT	…	
										FROM	t1	
									...
Understand
this first
WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT	…	
		FROM	(SELECT	…	
						...
Then this...
WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT	…	
		FROM	(SELECT	…	
										FROM	...
Then this...
WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT	…	
		FROM	(SELECT	…	
										FROM	...
Finally the first line makes sense
WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT	…	
		FROM	(SEL...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
						FROM	t1	
						JOIN	a	
								...
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
						FROM	t1	
						JOIN	a	
								...
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
						FROM	t1	
						JOIN	a	
								...
CTEs are statement-scoped views:
WITH	
	a	(c1,	c2,	c3)	
AS	(SELECT	c1,	c2,	c3	FROM	…),	
	b	(c4,	…)	
AS	(SELECT	c4,	…	
				...
‣ Literate SQL


Organize SQL code to

improve maintainability
‣ Assign column names


to tables produced by values

or un...
WITH are the "private methods" of SQL
WITH is a prefix to SELECT
WITH queries are only visible in the SELECT

				 they pre...
PostgreSQL “issues”WITH (non-recursive)
In PostgreSQL WITH queries are “optimizer fences”:



WITH	cte	AS

(SELECT	*

			F...
CTE	Scan	on	cte	
	(rows=6370)	
	Filter:	topic	=	1	
	CTE	cte	
	->	Seq	Scan	on	news	
				(rows=10000001)
PostgreSQL “issues”...
CTE	Scan	on	cte	
	(rows=6370)	
	Filter:	topic	=	1	
	CTE	cte	
	->	Seq	Scan	on	news	
				(rows=10000001)
PostgreSQL “issues”...
CTE	Scan	on	cte	
	(rows=6370)	
	Filter:	topic	=	1	
	CTE	cte	
	->	Seq	Scan	on	news	
				(rows=10000001)
PostgreSQL “issues”...
Views and derived tables support "predicate pushdown":



SELECT	*

		FROM	(SELECT	*

										FROM	news

							)	n

	WH...
Views and derived tables support "predicate pushdown":



SELECT	*

		FROM	(SELECT	*

										FROM	news

							)	n

	WH...
PostgreSQL 9.1+ allows DML within WITH:


WITH	deleted_rows	AS	(

			DELETE	FROM	source_tbl

			RETURNING	*

)

INSERT	INT...
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
MySQL[1]
8.4 PostgreSQL
3.8.3[2]
SQLite
7 DB2 LUW
9iR2 Oracle
...
WITH	RECURSIVE
(Common Table Expressions)
CREATE	TABLE	t	(	
			id	NUMERIC	NOT	NULL,	
			parent_id	NUMERIC,	
			…	
			PRIMARY	KEY	(id)	
)
Coping with hierarchies in ...
SELECT	*	
		FROM	t	AS	d0	
		LEFT	JOIN	t	AS	d1		
				ON	(d1.parent_id=d0.id)	
		LEFT	JOIN	t	AS	d2		
				ON	(d2.parent_id=d1...
SELECT	*	
		FROM	t	AS	d0	
		LEFT	JOIN	t	AS	d1		
				ON	(d1.parent_id=d0.id)	
		LEFT	JOIN	t	AS	d2		
				ON	(d2.parent_id=d1...
SELECT	*	
		FROM	t	AS	d0	
		LEFT	JOIN	t	AS	d1		
				ON	(d1.parent_id=d0.id)	
		LEFT	JOIN	t	AS	d2		
				ON	(d2.parent_id=d1...
SELECT	*	
		FROM	t	AS	d0	
		LEFT	JOIN	t	AS	d1		
				ON	(d1.parent_id=d0.id)	
		LEFT	JOIN	t	AS	d2		
				ON	(d2.parent_id=d1...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Recursive common table expressions may refer to
themselves in the second leg of a UNION	[ALL]:
WITH	RECURSIVE	cte	(n)	
		A...
Use Cases
‣ Row generators


To fill gaps (e.g., in time series),
generate test data.
‣ Processing graphs


Shortest route ...
WITH	RECURSIVE is the “while” of SQL
WITH	RECURSIVE "supports" infinite loops	
Except PostgreSQL, databases generally don't...
AvailabilityWITH	RECURSIVE
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
MySQL[1]
8.4 PostgreSQL
3.8.3[2]
SQ...
SQL:2003
FILTER
SELECT	YEAR,		
							SUM(CASE	WHEN	MONTH	=	1	THEN	sales

																															ELSE	0

												END)	JAN,	
		...
SELECT	YEAR,	
							SUM(sales)	FILTER	(WHERE	MONTH	=	1)	JAN,	
							SUM(sales)	FILTER	(WHERE	MONTH	=	2)	FEB,	
							…	
...
FILTER Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
MySQL
9.4 PostgreSQL
SQLite
DB2 LUW
Oracle
SQ...
OVER
and
PARTITION	BY
OVER (PARTITION BY) The Problem
Two distinct concepts could not be used independently:
‣ Merge rows with the same key prop...
SELECT	c1	
					,	SUM(c2)	tot	
		FROM	t	
	GROUP	BY	c1
OVER (PARTITION BY) The Problem
Yes⇠Mergerows⇢No
No ⇠ Aggregate ⇢ Ye...
SELECT	c1	
					,	SUM(c2)	tot	
		FROM	t	
	GROUP	BY	c1
OVER (PARTITION BY) The Problem
Yes⇠Mergerows⇢No
No ⇠ Aggregate ⇢ Ye...
SELECT	c1	
					,	SUM(c2)	tot	
		FROM	t	
	GROUP	BY	c1
OVER (PARTITION BY) The Problem
Yes⇠Mergerows⇢No
No ⇠ Aggregate ⇢ Ye...
SELECT	c1	
					,	SUM(c2)	tot	
		FROM	t	
	GROUP	BY	c1
OVER (PARTITION BY) Since SQL:2003
Yes⇠Mergerows⇢No
No ⇠ Aggregate ⇢...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 6000
22 1000 6000
22 1000...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 6000
22 1000 6000
22 1000...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 6000
22 1000 6000
22 1000...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 6000
22 1000 6000
22 1000...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 6000
22 1000 6000
22 1000...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
OVER (PARTITION BY) How it works
)
dep salary ...
SELECT	dep,		
							salary,	
							SUM(salary)	
							OVER()	
		FROM	emp
dep salary ts
1 1000 1000
22 1000 2000
22 1000...
OVER
and
ORDER	BY	
(Framing & Ranking)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The ...
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The ...
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The ...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
SELECT	id,	
							value,
	FROM	transactions	t
SUM(value)	
OVER	(	
)
acnt id value balance
...
OVER (ORDER BY) Since SQL:2003
With OVER	(ORDER	BY	n) a new type of functions make sense:
n ROW_NUMBER RANK DENSE_RANK PER...
‣ Aggregates without GROUP	BY
‣ Running totals,

moving averages
‣ Ranking
‣ Top-N per Group
‣ Avoiding self-joins
[… many...
OVER may follow any aggregate function	
OVER defines which rows are visible at each row	
OVER() makes all rows visible at e...
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
MySQL[1]
8.4 PostgreSQL
SQLite
7 DB2 LUW
8i Oracle
2005 SQL Se...
WITHIN	GROUP
SELECT	d1.val	
		FROM	data	d1	
		JOIN	data	d2	
				ON	(d1.val	<	d2.val	
							OR	(d1.val=d2.val	AND	d1.id<d2.id))	
	GROUP...
SELECT	d1.val	
		FROM	data	d1	
		JOIN	data	d2	
				ON	(d1.val	<	d2.val	
							OR	(d1.val=d2.val	AND	d1.id<d2.id))	
	GROUP...
SELECT	d1.val	
		FROM	data	d1	
		JOIN	data	d2	
				ON	(d1.val	<	d2.val	
							OR	(d1.val=d2.val	AND	d1.id<d2.id))	
	GROUP...
SELECT	d1.val	
		FROM	data	d1	
		JOIN	data	d2	
				ON	(d1.val	<	d2.val	
							OR	(d1.val=d2.val	AND	d1.id<d2.id))	
	GROUP...
SELECT	PERCENTILE_DISC(0.5)	
							WITHIN	GROUP	(ORDER	BY	val)	
		FROM	data
Median
Which value?
WITHIN	GROUP Since 2013
S...
SELECT	PERCENTILE_DISC(0.5)	
							WITHIN	GROUP	(ORDER	BY	val)	
		FROM	data
WITHIN	GROUP Since 2013
SQL:2003 introduced o...
WITHIN	GROUP Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
MySQL
9.4 PostgreSQL
SQLite
DB2 LUW
9iR...
TABLESAMPLE
TABLESAMPLE Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
MySQL
9.5[0]
PostgreSQL
SQLite
8.2[0]
DB...
SQL:2008
FETCH	FIRST
SELECT	*	
		FROM	(SELECT	*	
													,	ROW_NUMBER()	OVER(ORDER	BY	x)	rn	
										FROM	data)	numbered_data	
	WHERE	rn...
SELECT	*	
		FROM	(SELECT	*	
													,	ROW_NUMBER()	OVER(ORDER	BY	x)	rn	
										FROM	data)	numbered_data	
	WHERE	rn...
SELECT	*	
		FROM	data	
	ORDER	BY	x	
	FETCH	FIRST	10	ROWS	ONLY
FETCH	FIRST Since SQL:2008
SQL:2008 introduced the FETCH	FIR...
FETCH	FIRST Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
3.19.3[0]
MySQL
6.5[1]
8.4 PostgreSQL
2....
SQL:2011
OFFSET
SELECT	*	
		FROM	(SELECT	*	
													,	ROW_NUMBER()	OVER(ORDER	BY	x)	rn	
										FROM	data)	numbered_data	
	WHERE	rn...
SELECT	*	
		FROM	data	
	ORDER	BY	x	
OFFSET	10	ROWS	
	FETCH	NEXT	10	ROWS	ONLY
OFFSET Since SQL:2011
SQL:2011 introduced OFF...
SELECT	*	
		FROM	data	
	ORDER	BY	x	
OFFSET	10	ROWS	
	FETCH	NEXT	10	ROWS	ONLY
OFFSET Since SQL:2011
SQL:2011 introduced OFF...
OFFSET Since SQL:2011
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
3.20.3[0]
4.0.6[1]
MySQL
6.5 PostgreSQL
2.1...
OVER
OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible.
(E.g., calculate the difference...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
WITH	numbered_t	AS	(SELECT	*	
																								
																												)	
SELECT	curr.*	
					,	curr.balance	
...
SELECT	*,	balance	

										-	COALESCE(	LAG(balance)

																					OVER(ORDER	BY	x)

																				,	0)

	...
OVER (LEAD, LAG, …) Since SQL:2011
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
MySQL
8.4[1]
PostgreSQL
SQL...
Temporal Tables
(Time Traveling)
INSERT	
UPDATE	
DELETE	
are
DESTRUCTIVE
Temporal Tables The Problem
CREATE	TABLE	t	(...,
	start_ts	TIMESTAMP(9)	GENERATED

										ALWAYS	AS	ROW	START,
	end_ts			TIMESTAMP(9)	GENERATED

		...
ID Data start_ts end_ts
1 X 10:00:00
UPDATE	...	SET	DATA	=	'Y'	...
ID Data start_ts end_ts
1 X 10:00:00 11:00:00
1 Y 11:00...
ID Data start_ts end_ts
1 X 10:00:00
UPDATE	...	SET	DATA	=	'Y'	...
ID Data start_ts end_ts
1 X 10:00:00 11:00:00
1 Y 11:00...
Although multiple versions exist, only the “current”
one is visible per default.
After 12:00:00, SELECT	*	FROM	t doesn’t r...
ID Data start_ts end_ts
1 X 10:00:00 11:00:00
1 Y 11:00:00 12:00:00
With FOR	…	AS	OF you can query anything you like:
SELE...
It isn’t possible to define constraints to avoid overlapping periods.
Workarounds are possible, but no fun: CREATE	TRIGGER
...
SQL:2011 provides means to cope with temporal tables:



PRIMARY	KEY	(id,	period	WITHOUT	OVERLAPS)
Temporal Tables Since S...
Temporal Tables Since SQL:2011
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1 MariaDB
MySQL
PostgreSQL
SQLite
10.1 DB2 L...
SQL:2016(released: 2016-12-14)
MATCH_RECOGNIZE
(Row Pattern Matching)
Row Pattern Matching
Example: Logfile
Row Pattern Matching
Time
30 minutes
Example: Logfile
Row Pattern Matching
Example: Logfile
Time
30 minutes
Session 1 Session 2
Session 3
Session 4
Example problem:
‣ Average se...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
SELECT	COUNT(*)	sessions	
					,	AVG(duration)	avg_duration	
		FROM	log	
							MATCH_RECOGNIZE(	
								ORDER	BY	ts	
			...
Row Pattern Matching Before SQL:2016
Time
30 minutes
Now, let’s try using window functions
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts...
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts...
SELECT	count(*)	sessions,	avg(duration)	avg_duration	
		FROM	(SELECT	MAX(ts)	-	MIN(ts)	duration	
										FROM	(SELECT	ts...
Row Pattern Matching Since SQL:2016
https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016
Row Pattern Matching Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
MariaDB
MySQL
PostgreSQL
SQLite
DB2 LUW
12c...
LIST_AGG
Since SQL:2016
grp val
1 B
1 A
1 C
2 X
LIST_AGG
Since SQL:2016
grp val
1 B
1 A
1 C
2 X
SELECT	grp	
					,	LIST_AGG(val,	',	')

							WITHIN	GROUP	(ORDER	BY	val)	
		FROM	...
Since SQL:2016
grp val
1 B
1 A
1 C
2 X
grp val
1 A, B, C
2 X
SELECT	grp	
					,	LIST_AGG(val,	',	')

							WITHIN	GROUP	(...
Since SQL:2016
grp val
1 B
1 A
1 C
2 X
grp val
1 A, B, C
2 X
SELECT	grp	
					,	LIST_AGG(val,	',	')

							WITHIN	GROUP	(...
LIST_AGG Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
5.1[0]
MariaDB
4.1[0]
MySQL
7.4[1]
8.4[2]
9.0[3]
Postgr...
Also new in SQL:2016
JSON
DATE FORMAT
POLYMORPHIC TABLE FUNCTIONS
About @MarkusWinand
‣Training for Developers
‣ SQL Performance (Indexing)
‣ Modern SQL
‣ On-Site or Online
‣SQL Tuning
‣ I...
About @MarkusWinand
€0,-
€10-30
sql-performance-explained.com
About @MarkusWinand
@ModernSQL
http://modern-sql.com
Próxima SlideShare
Cargando en…5
×

Modern SQL in Open Source and Commercial Databases

346.152 visualizaciones

Publicado el

SQL has gone out of fashion lately—partly due to the NoSQL movement, but mostly because SQL is often still used like 20 years ago. As a matter of fact, the SQL standard continued to evolve during the past decades resulting in the current release of 2016. In this session, we will go through the most important additions since the widely known SQL-92. We will cover common table expressions and window functions in detail and have a very short look at the temporal features of SQL:2011 and row pattern matching from SQL:2016.

Links:
http://modern-sql.com/
http://winand.at/
http://sql-performance-explained.com/

Publicado en: Software, Tecnología
10 comentarios
425 recomendaciones
Estadísticas
Notas
Sin descargas
Visualizaciones
Visualizaciones totales
346.152
En SlideShare
0
De insertados
0
Número de insertados
183.809
Acciones
Compartido
0
Descargas
1.458
Comentarios
10
Recomendaciones
425
Insertados 0
No insertados

No hay notas en la diapositiva.

Modern SQL in Open Source and Commercial Databases

  1. 1. Still using Windows 3.1? So why stick to SQL-92? @ModernSQL - http://modern-sql.com/ @MarkusWinand
  2. 2. SQL:1999
  3. 3. LATERAL
  4. 4. Select-list sub-queries must be scalar[0]: LATERAL Before SQL:1999 SELECT … , (SELECT column_1 FROM t1 WHERE t1.x = t2.y ) AS c FROM t2 … (an atomic quantity that can hold only one value at a time[1]) [0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar
  5. 5. Select-list sub-queries must be scalar[0]: LATERAL Before SQL:1999 SELECT … , (SELECT column_1 FROM t1 WHERE t1.x = t2.y ) AS c FROM t2 … (an atomic quantity that can hold only one value at a time[1]) [0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar ✗, column_2 More than
 one column? Syntax error } More than
 one row? Runtime error!
  6. 6. Lateral derived tables lift both limitations and can be correlated: LATERAL Since SQL:1999 SELECT … , ldt.* FROM t2 LEFT JOIN LATERAL (SELECT column_1, column_2 FROM t1 WHERE t1.x = t2.y ) AS ldt ON (true) …
  7. 7. Lateral derived tables lift both limitations and can be correlated: LATERAL Since SQL:1999 SELECT … , ldt.* FROM t2 LEFT JOIN LATERAL (SELECT column_1, column_2 FROM t1 WHERE t1.x = t2.y ) AS ldt ON (true) … “Derived table” means
 it’s in the
 FROM/JOIN clause Still “correlated” Regular join semantics
  8. 8. FROM t JOIN LATERAL (SELECT … FROM … WHERE t.c=… ORDER BY … LIMIT 10 ) derived_table ‣ Top-N per group

 inside a lateral derived table
 FETCH FIRST (or LIMIT, TOP)
 applies per row from left tables. ‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”). Use-CasesLATERAL Add proper index
 for Top-N query http://use-the-index-luke.com/sql/partial-results/top-n-queries
  9. 9. FROM t JOIN LATERAL (SELECT … FROM … WHERE t.c=… ORDER BY … LIMIT 10 ) derived_table ‣ Top-N per group

 inside a lateral derived table
 FETCH FIRST (or LIMIT, TOP)
 applies per row from left tables. ‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”). ‣ Table function arguments

 (TABLE often implies LATERAL)
 Use-CasesLATERAL FROM t JOIN TABLE (your_func(t.c))
  10. 10. LATERAL is the "for each" loop of SQL LATERAL plays well with outer and cross joins LATERAL is great for Top-N subqueries LATERAL can join table functions (unnest!)
 LATERAL In a Nutshell
  11. 11. LATERAL Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB MySQL 9.3 PostgreSQL SQLite 9.1 DB2 LUW 11gR1[0] 12c Oracle 2005[1] SQL Server [0] Undocumented. Requires setting trace event 22829. [1] LATERAL is not supported as of SQL Server 2016 but [CROSS|OUTER] APPLY can be used for the same effect.
  12. 12. GROUPING SETS
  13. 13. Only one GROUP BY operation at a time: GROUPING SETS Before SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month Monthly revenue Yearly revenue SELECT year , sum(revenue) FROM tbl GROUP BY year
  14. 14. GROUPING SETS Before SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year
  15. 15. GROUPING SETS Before SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year UNION ALL , null
  16. 16. GROUPING SETS Since SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year UNION ALL , null SELECT year , month , sum(revenue) FROM tbl GROUP BY GROUPING SETS ( (year, month) , (year) )
  17. 17. GROUPING SETS are multiple GROUP BYs in one go () (empty brackets) build a group over all rows GROUPING (function) disambiguates the meaning of NULL
 (was the grouped data NULL or is this column not currently grouped?) Permutations can be created using ROLLUP and CUBE
 (ROLLUP(a,b,c) = GROUPING SETS ((a,b,c), (a,b),(a),()) GROUPING SETS In a Nutshell
  18. 18. GROUPING SETS Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB 5.0[0] MySQL 9.5 PostgreSQL SQLite 5 DB2 LUW 9iR1 Oracle 2008 SQL Server [0] Only ROLLUP
  19. 19. WITH (Common Table Expressions)
  20. 20. WITH (non-recursive) The Problem Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)
  21. 21. Understand this first WITH (non-recursive) The Problem Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)
  22. 22. Then this... WITH (non-recursive) The Problem Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)
  23. 23. Then this... WITH (non-recursive) The Problem Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)
  24. 24. Finally the first line makes sense WITH (non-recursive) The Problem Nested queries are hard to read: SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…)
  25. 25. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), WITH (non-recursive) Since SQL:1999
  26. 26. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Keyword WITH (non-recursive) Since SQL:1999
  27. 27. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Name of CTE and (here optional) column names WITH (non-recursive) Since SQL:1999
  28. 28. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Definition WITH (non-recursive) Since SQL:1999
  29. 29. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Introduces another CTE Don't repeat WITH WITH (non-recursive) Since SQL:1999
  30. 30. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), May refer to
 previous CTEs WITH (non-recursive) Since SQL:1999
  31. 31. WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) Third CTE WITH (non-recursive) Since SQL:1999
  32. 32. WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) No comma! WITH (non-recursive) Since SQL:1999
  33. 33. WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) Main query WITH (non-recursive) Since SQL:1999
  34. 34. CTEs are statement-scoped views: WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) Read top down WITH (non-recursive) Since SQL:1999
  35. 35. ‣ Literate SQL

 Organize SQL code to
 improve maintainability ‣ Assign column names

 to tables produced by values
 or unnest. ‣ Overload tables (for testing)

 with queries hide tables
 of the same name. Use-CasesWITH (non-recursive) http://modern-sql.com/use-case/literate-sql http://modern-sql.com/use-case/naming-unnamed-columns http://modern-sql.com/use-case/unit-tests-on-transient-data
  36. 36. WITH are the "private methods" of SQL WITH is a prefix to SELECT WITH queries are only visible in the SELECT
 they precede WITH in detail: http://modern-sql.com/feature/with WITH (non-recursive) In a Nutshell
  37. 37. PostgreSQL “issues”WITH (non-recursive) In PostgreSQL WITH queries are “optimizer fences”:
 
 WITH cte AS
 (SELECT *
 FROM news)
 SELECT * 
 FROM cte
 WHERE topic=1
  38. 38. CTE Scan on cte (rows=6370) Filter: topic = 1 CTE cte -> Seq Scan on news (rows=10000001) PostgreSQL “issues”WITH (non-recursive) In PostgreSQL WITH queries are “optimizer fences”:
 
 WITH cte AS
 (SELECT *
 FROM news)
 SELECT * 
 FROM cte
 WHERE topic=1
  39. 39. CTE Scan on cte (rows=6370) Filter: topic = 1 CTE cte -> Seq Scan on news (rows=10000001) PostgreSQL “issues”WITH (non-recursive) In PostgreSQL WITH queries are “optimizer fences”:
 
 WITH cte AS
 (SELECT *
 FROM news)
 SELECT * 
 FROM cte
 WHERE topic=1
  40. 40. CTE Scan on cte (rows=6370) Filter: topic = 1 CTE cte -> Seq Scan on news (rows=10000001) PostgreSQL “issues”WITH (non-recursive) In PostgreSQL WITH queries are “optimizer fences”:
 
 WITH cte AS
 (SELECT *
 FROM news)
 SELECT * 
 FROM cte
 WHERE topic=1 CTE doesn't know about the outer filter
  41. 41. Views and derived tables support "predicate pushdown":
 
 SELECT *
 FROM (SELECT *
 FROM news
 ) n
 WHERE topic=1; PostgreSQL “issues”WITH (non-recursive)
  42. 42. Views and derived tables support "predicate pushdown":
 
 SELECT *
 FROM (SELECT *
 FROM news
 ) n
 WHERE topic=1; Bitmap Heap Scan on news (rows=6370) ->Bitmap Index Scan on idx (rows=6370) Cond: topic=1 PostgreSQL “issues”WITH (non-recursive)
  43. 43. PostgreSQL 9.1+ allows DML within WITH: 
 WITH deleted_rows AS (
 DELETE FROM source_tbl
 RETURNING *
 )
 INSERT INTO destination_tbl
 SELECT * FROM deleted_rows; PostgreSQL ExtensionWITH (non-recursive)
  44. 44. 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB MySQL[1] 8.4 PostgreSQL 3.8.3[2] SQLite 7 DB2 LUW 9iR2 Oracle 2005 SQL Server [0] Available MariaDB 10.2 alpha [1] Announced for 8.0: http://www.percona.com/blog/2016/09/01/percona-live-europe-featured-talk-manyi-lu [2] Only for top-level SELECT statements AvailabilityWITH (non-recursive)
  45. 45. WITH RECURSIVE (Common Table Expressions)
  46. 46. CREATE TABLE t ( id NUMERIC NOT NULL, parent_id NUMERIC, … PRIMARY KEY (id) ) Coping with hierarchies in the Adjacency List Model[0] WITH RECURSIVE The Problem [0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”
  47. 47. SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) Coping with hierarchies in the Adjacency List Model[0] WITH RECURSIVE The Problem WHERE d0.id = ? [0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”
  48. 48. SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) Coping with hierarchies in the Adjacency List Model[0] WITH RECURSIVE The Problem WHERE d0.id = ? [0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”
  49. 49. SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) Coping with hierarchies in the Adjacency List Model[0] WITH RECURSIVE The Problem WHERE d0.id = ? [0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties”
  50. 50. SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) WITH RECURSIVE Since SQL:1999 WHERE d0.id = ? WITH RECURSIVE d (id, parent, …) AS
 (SELECT id, parent, … FROM tbl WHERE id = ? UNION ALL SELECT id, parent, … FROM d LEFT JOIN tbl
 ON (tbl.parent=d.id) ) SELECT * FROM subtree
  51. 51. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Since SQL:1999WITH RECURSIVE
  52. 52. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Keyword Since SQL:1999WITH RECURSIVE
  53. 53. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Column list mandatory here Since SQL:1999WITH RECURSIVE
  54. 54. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Executed first Since SQL:1999WITH RECURSIVE
  55. 55. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Result sent there Since SQL:1999WITH RECURSIVE
  56. 56. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte Result visible twice Since SQL:1999WITH RECURSIVE
  57. 57. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) Once it becomes part of
 the final
 result Since SQL:1999WITH RECURSIVE
  58. 58. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) Since SQL:1999WITH RECURSIVE
  59. 59. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) Second
 leg of UNION 
 is executed Since SQL:1999WITH RECURSIVE
  60. 60. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) Result sent there again Since SQL:1999WITH RECURSIVE
  61. 61. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) Since SQL:1999WITH RECURSIVE
  62. 62. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) It's a loop! Since SQL:1999WITH RECURSIVE
  63. 63. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) It's a loop! Since SQL:1999WITH RECURSIVE
  64. 64. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) It's a loop! Since SQL:1999WITH RECURSIVE
  65. 65. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) n=3
 doesn't match Since SQL:1999WITH RECURSIVE
  66. 66. Recursive common table expressions may refer to themselves in the second leg of a UNION [ALL]: WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte n --- 1 2 3 (3 rows) n=3
 doesn't match Loop terminates Since SQL:1999WITH RECURSIVE
  67. 67. Use Cases ‣ Row generators

 To fill gaps (e.g., in time series), generate test data. ‣ Processing graphs

 Shortest route from person A to B in LinkedIn/Facebook/Twitter/… ‣ Finding distinct values

 with n*log(N)† time complexity. […many more…] As shown on previous slide http://aprogrammerwrites.eu/?p=1391 “[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf http://wiki.postgresql.org/wiki/Loose_indexscan † n … # distinct values, N … # of table rows. Suitable index required WITH RECURSIVE
  68. 68. WITH RECURSIVE is the “while” of SQL WITH RECURSIVE "supports" infinite loops Except PostgreSQL, databases generally don't require the RECURSIVE keyword. DB2, SQL Server & Oracle don’t even know the keyword RECURSIVE, but allow recursive CTEs anyway. In a NutshellWITH RECURSIVE
  69. 69. AvailabilityWITH RECURSIVE 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB MySQL[1] 8.4 PostgreSQL 3.8.3[2] SQLite 7 DB2 LUW 11gR2 Oracle 2005 SQL Server [0] Expected in 10.2.2 [1] Announced for 8.0: http://www.percona.com/blog/2016/09/01/percona-live-europe-featured-talk-manyi-lu [2] Only for top-level SELECT statements
  70. 70. SQL:2003
  71. 71. FILTER
  72. 72. SELECT YEAR, SUM(CASE WHEN MONTH = 1 THEN sales
 ELSE 0
 END) JAN, SUM(CASE WHEN MONTH = 2 THEN sales
 ELSE 0
 END) FEB, … FROM sale_data GROUP BY YEAR FILTER The Problem Pivot table: Years on the Y axis, month on X:
  73. 73. SELECT YEAR, SUM(sales) FILTER (WHERE MONTH = 1) JAN, SUM(sales) FILTER (WHERE MONTH = 2) FEB, … FROM sale_data GROUP BY YEAR; FILTER Since SQL:2003 SQL:2003 allows FILTER (WHERE…) after aggregates:
  74. 74. FILTER Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB MySQL 9.4 PostgreSQL SQLite DB2 LUW Oracle SQL Server
  75. 75. OVER and PARTITION BY
  76. 76. OVER (PARTITION BY) The Problem Two distinct concepts could not be used independently: ‣ Merge rows with the same key properties ‣ GROUP BY to specify key properties ‣ DISTINCT to use full row as key ‣ Aggregate data from related rows ‣ Requires GROUP BY to segregate the rows ‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows
  77. 77. SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 OVER (PARTITION BY) The Problem Yes⇠Mergerows⇢No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t SELECT c1 , SUM(c2) tot FROM t GROUP BY c1
  78. 78. SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 OVER (PARTITION BY) The Problem Yes⇠Mergerows⇢No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t JOIN ( ) ta ON (t.c1=ta.c1) SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 , tot
  79. 79. SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 OVER (PARTITION BY) The Problem Yes⇠Mergerows⇢No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t JOIN ( ) ta ON (t.c1=ta.c1) SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 , tot
  80. 80. SELECT c1 , SUM(c2) tot FROM t GROUP BY c1 OVER (PARTITION BY) Since SQL:2003 Yes⇠Mergerows⇢No No ⇠ Aggregate ⇢ Yes SELECT c1 , c2 FROM t SELECT DISTINCT c1 , c2 FROM t SELECT c1 , c2 FROM t FROM t , SUM(c2) OVER (PARTITION BY c1)
  81. 81. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
  82. 82. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
  83. 83. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
  84. 84. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
  85. 85. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
  86. 86. SELECT dep, salary, SUM(salary) OVER() FROM emp OVER (PARTITION BY) How it works ) dep salary ts 1 1000 6000 22 1000 6000 22 1000 6000 333 1000 6000 333 1000 6000 333 1000 6000
  87. 87. SELECT dep, salary, SUM(salary) OVER() FROM emp dep salary ts 1 1000 1000 22 1000 2000 22 1000 2000 333 1000 3000 333 1000 3000 333 1000 3000 OVER (PARTITION BY) How it works )PARTITION BY dep
  88. 88. OVER and ORDER BY (Framing & Ranking)
  89. 89. acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem SELECT id, value, FROM transactions t
  90. 90. acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem SELECT id, value, (SELECT SUM(value) FROM transactions t2 WHERE t2.id <= t.id) FROM transactions t
  91. 91. acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem SELECT id, value, (SELECT SUM(value) FROM transactions t2 WHERE t2.id <= t.id) FROM transactions t Range segregation (<=)
 not possible with
 GROUP BY or
 PARTITION BY
  92. 92. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id
  93. 93. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING
  94. 94. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  95. 95. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  96. 96. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  97. 97. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  98. 98. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  99. 99. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW
  100. 100. OVER (ORDER BY) Since SQL:2003 SELECT id, value, FROM transactions t SUM(value) OVER ( ) acnt id value balance 1 1 +10 +10 22 2 +20 +20 22 3 -10 +10 333 4 +50 +50 333 5 -30 +20 333 6 -20 . 0 ORDER BY id ROWS BETWEEN
 UNBOUNDED PRECEDING AND CURRENT ROW PARTITION BY acnt
  101. 101. OVER (ORDER BY) Since SQL:2003 With OVER (ORDER BY n) a new type of functions make sense: n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST 1 1 1 1 0 0.25 2 2 2 2 0.33… 0.75 3 3 2 2 0.33… 0.75 4 4 4 3 1 1
  102. 102. ‣ Aggregates without GROUP BY ‣ Running totals,
 moving averages ‣ Ranking ‣ Top-N per Group ‣ Avoiding self-joins [… many more …] Use Cases SELECT * FROM (SELECT ROW_NUMBER() OVER(PARTITION BY … ORDER BY …) rn , t.* FROM t) numbered_t WHERE rn <= 3 AVG(…) OVER(ORDER BY … ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) moving_avg OVER(SQL:2003)
  103. 103. OVER may follow any aggregate function OVER defines which rows are visible at each row OVER() makes all rows visible at every row OVER(PARTITION BY …) segregates like GROUP BY OVER(ORDER BY … BETWEEN) segregates using <, > In a NutshellOVER(SQL:2003)
  104. 104. 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB MySQL[1] 8.4 PostgreSQL SQLite 7 DB2 LUW 8i Oracle 2005 SQL Server [0] Available MariaDB 10.2 alpha [1] On the roadmap: http://www.slideshare.net/ManyiLu/optimizer-percona-liveams2015/47 OVER(SQL:2003) Availability Hive Impala Spark NuoDB
  105. 105. WITHIN GROUP
  106. 106. SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3) WITHIN GROUP The Problem Grouped rows cannot be ordered prior aggregation. (how to get the middle value (median) of a set)
  107. 107. SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3) WITHIN GROUP The Problem Grouped rows cannot be ordered prior aggregation. (how to get the middle value (median) of a set) Number rows Pick middle one
  108. 108. SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3) WITHIN GROUP The Problem Grouped rows cannot be ordered prior aggregation. (how to get the middle value (median) of a set) Number rows Pick middle one
  109. 109. SELECT d1.val FROM data d1 JOIN data d2 ON (d1.val < d2.val OR (d1.val=d2.val AND d1.id<d2.id)) GROUP BY d1.val HAVING count(*) = (SELECT FLOOR(COUNT(*)/2) FROM data d3) WITHIN GROUP The Problem Grouped rows cannot be ordered prior aggregation. (how to get the middle value (median) of a set) Number rows Pick middle one
  110. 110. SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data Median Which value? WITHIN GROUP Since 2013 SQL:2003 introduced ordered set functions:
  111. 111. SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM data WITHIN GROUP Since 2013 SQL:2003 introduced ordered set functions: SELECT RANK(123)
 WITHIN GROUP (ORDER BY val)
 FROM data …and hypothetical set-functions:
  112. 112. WITHIN GROUP Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB MySQL 9.4 PostgreSQL SQLite DB2 LUW 9iR1 Oracle 2012[0] SQL Server [0] Only as window function (OVER required). Feature request 728969 closed as "won't fix"
  113. 113. TABLESAMPLE
  114. 114. TABLESAMPLE Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB MySQL 9.5[0] PostgreSQL SQLite 8.2[0] DB2 LUW 8i[0] Oracle 2005[0] SQL Server [0] Not for derived tables
  115. 115. SQL:2008
  116. 116. FETCH FIRST
  117. 117. SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 FETCH FIRST The Problem Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary) SQL:2003 introduced ROW_NUMBER() to number rows.
 But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()?
  118. 118. SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 FETCH FIRST The Problem Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary) SQL:2003 introduced ROW_NUMBER() to number rows.
 But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()? Dammit! Let's take
 LIMIT
  119. 119. SELECT * FROM data ORDER BY x FETCH FIRST 10 ROWS ONLY FETCH FIRST Since SQL:2008 SQL:2008 introduced the FETCH FIRST … ROWS ONLY clause:
  120. 120. FETCH FIRST Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB 3.19.3[0] MySQL 6.5[1] 8.4 PostgreSQL 2.1.0[1] SQLite 7 DB2 LUW 12c Oracle 7.0[2] 2012 SQL Server [0] Earliest mention of LIMIT. Probably inherited from mSQL [1] Functionality available using LIMIT [2] SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters
  121. 121. SQL:2011
  122. 122. OFFSET
  123. 123. SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn > 10 and rn <= 20 OFFSET The Problem How to fetch the rows after a limit?
 (pagination anybody?)
  124. 124. SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY OFFSET Since SQL:2011 SQL:2011 introduced OFFSET, unfortunately!
  125. 125. SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY OFFSET Since SQL:2011 SQL:2011 introduced OFFSET, unfortunately! OFFSET Grab coasters & stickers! http://use-the-index-luke.com/no-offset
  126. 126. OFFSET Since SQL:2011 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB 3.20.3[0] 4.0.6[1] MySQL 6.5 PostgreSQL 2.1.0 SQLite 9.7[2] 11.1 DB2 LUW 12c Oracle 2012 SQL Server [0] LIMIT [offset,] limit: "With this it's easy to do a poor man's next page/previous page WWW application." [1] The release notes say "Added PostgreSQL compatible LIMIT syntax" [2] Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS
  127. 127. OVER
  128. 128. OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows)
  129. 129. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t
  130. 130. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t , ROW_NUMBER() OVER(ORDER BY x) rn
  131. 131. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t , ROW_NUMBER() OVER(ORDER BY x) rn
  132. 132. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t , ROW_NUMBER() OVER(ORDER BY x) rn prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4
  133. 133. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t , ROW_NUMBER() OVER(ORDER BY x) rn prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4
  134. 134. WITH numbered_t AS (SELECT * ) SELECT curr.* , curr.balance - COALESCE(prev.balance,0) FROM numbered_t curr LEFT JOIN numbered_t prev ON (curr.rn = prev.rn+1) OVER (SQL:2011) The Problem Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) curr balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 FROM t , ROW_NUMBER() OVER(ORDER BY x) rn prev balance … rn 50 … 1 90 … 2 70 … 3 30 … 4 +50 +40 -20 -40
  135. 135. SELECT *, balance 
 - COALESCE( LAG(balance)
 OVER(ORDER BY x)
 , 0)
 FROM t Available functions: LEAD / LAG
 FIRST_VALUE / LAST_VALUE
 NTH_VALUE(col, n) FROM FIRST/LAST
 RESPECT/IGNORE NULLS OVER (SQL:2011) Since SQL:2011 SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:
  136. 136. OVER (LEAD, LAG, …) Since SQL:2011 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB MySQL 8.4[1] PostgreSQL SQLite 9.5[2] 11.1 DB2 LUW 8i[2] 11gR2 Oracle 2012[2] SQL Server [0] Not yet available in MariaDB 10.2.2 (alpha). MDEV-8091 [1] No IGNORE NULLS and FROM LAST as of PostgreSQL 9.6 [2] No NTH_VALUE
  137. 137. Temporal Tables (Time Traveling)
  138. 138. INSERT UPDATE DELETE are DESTRUCTIVE Temporal Tables The Problem
  139. 139. CREATE TABLE t (..., start_ts TIMESTAMP(9) GENERATED
 ALWAYS AS ROW START, end_ts TIMESTAMP(9) GENERATED
 ALWAYS AS ROW END,
 PERIOD FOR SYSTEM TIME (start_ts, end_ts) ) WITH SYSTEM VERSIONING Temporal Tables Since SQL:2011 Table can be system versioned, application versioned or both.
  140. 140. ID Data start_ts end_ts 1 X 10:00:00 UPDATE ... SET DATA = 'Y' ... ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 DELETE ... WHERE ID = 1 INSERT ... (ID, DATA) VALUES (1, 'X') Temporal Tables Since SQL:2011
  141. 141. ID Data start_ts end_ts 1 X 10:00:00 UPDATE ... SET DATA = 'Y' ... ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 DELETE ... WHERE ID = 1 ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 Temporal Tables Since SQL:2011
  142. 142. Although multiple versions exist, only the “current” one is visible per default. After 12:00:00, SELECT * FROM t doesn’t return anything anymore. ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 Temporal Tables Since SQL:2011
  143. 143. ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 With FOR … AS OF you can query anything you like: SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2015-04-02 10:30:00' ID Data start_ts end_ts 1 X 10:00:00 11:00:00 Temporal Tables Since SQL:2011
  144. 144. It isn’t possible to define constraints to avoid overlapping periods. Workarounds are possible, but no fun: CREATE TRIGGER id begin end 1 8:00 9:00 1 9:00 11:00 1 10:00 12:00 Temporal Tables The Problem
  145. 145. SQL:2011 provides means to cope with temporal tables:
 
 PRIMARY KEY (id, period WITHOUT OVERLAPS) Temporal Tables Since SQL:2011 Temporal support in SQL:2011 goes way further. Please read this paper to get the idea: Temporal features in SQL:2011 http://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf
  146. 146. Temporal Tables Since SQL:2011 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB MySQL PostgreSQL SQLite 10.1 DB2 LUW 10gR1[0] 12cR1[1] Oracle 2016[2] SQL Server [0] Limited system versioning via Flashback [1] Limited application versioning added (e.g. no WITHOUT OVERLAPS) [2] Only system versioning
  147. 147. SQL:2016(released: 2016-12-14)
  148. 148. MATCH_RECOGNIZE (Row Pattern Matching)
  149. 149. Row Pattern Matching Example: Logfile
  150. 150. Row Pattern Matching Time 30 minutes Example: Logfile
  151. 151. Row Pattern Matching Example: Logfile Time 30 minutes Session 1 Session 2 Session 3 Session 4 Example problem: ‣ Average session duration Two approaches: ‣ Row pattern matching ‣ Start-of-group tagging
  152. 152. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes define
 continuation Oracle doesn’t support avg on intervals — query doesn’t work as shown
  153. 153. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes undefined
 pattern variable: matches any row Oracle doesn’t support avg on intervals — query doesn’t work as shown
  154. 154. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes any number
 of “cont”
 rows Oracle doesn’t support avg on intervals — query doesn’t work as shown
  155. 155. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes Very much
 like GROUP BY Oracle doesn’t support avg on intervals — query doesn’t work as shown
  156. 156. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes Very much
 like SELECT Oracle doesn’t support avg on intervals — query doesn’t work as shown
  157. 157. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes Oracle doesn’t support avg on intervals — query doesn’t work as shown
  158. 158. SELECT COUNT(*) sessions , AVG(duration) avg_duration FROM log MATCH_RECOGNIZE( ORDER BY ts MEASURES LAST(ts) - FIRST(ts) AS duration ONE ROW PER MATCH PATTERN ( new cont* ) DEFINE cont AS ts < PREV(ts) + INTERVAL '30' minute ) t Since SQL:2016Row Pattern Matching Time 30 minutes Oracle doesn’t support avg on intervals — query doesn’t work as shown
  159. 159. Row Pattern Matching Before SQL:2016 Time 30 minutes Now, let’s try using window functions
  160. 160. SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE’1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped Row Pattern Matching Before SQL:2016 Time 30 minutes Start-of-group tags
  161. 161. SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE’1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped Row Pattern Matching Before SQL:2016 Time 30 minutes number sessions 2222 2 33 3 44 42 3 4 1
  162. 162. SELECT count(*) sessions, avg(duration) avg_duration FROM (SELECT MAX(ts) - MIN(ts) duration FROM (SELECT ts, COUNT(grp_start) OVER(ORDER BY ts) session_no FROM (SELECT ts, CASE WHEN ts >= LAG( ts, 1, DATE’1900-01-1' ) OVER( ORDER BY ts ) + INTERVAL '30' minute THEN 1 END grp_start FROM log ) tagged ) numbered GROUP BY session_no ) grouped Row Pattern Matching Before SQL:2016 Time 30 minutes 2222 2 33 3 44 42 3 4 1
  163. 163. Row Pattern Matching Since SQL:2016 https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016
  164. 164. Row Pattern Matching Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 MariaDB MySQL PostgreSQL SQLite DB2 LUW 12cR1 Oracle SQL Server
  165. 165. LIST_AGG
  166. 166. Since SQL:2016 grp val 1 B 1 A 1 C 2 X LIST_AGG
  167. 167. Since SQL:2016 grp val 1 B 1 A 1 C 2 X SELECT grp , LIST_AGG(val, ', ')
 WITHIN GROUP (ORDER BY val) FROM t GROUP BY grp LIST_AGG
  168. 168. Since SQL:2016 grp val 1 B 1 A 1 C 2 X grp val 1 A, B, C 2 X SELECT grp , LIST_AGG(val, ', ')
 WITHIN GROUP (ORDER BY val) FROM t GROUP BY grp LIST_AGG
  169. 169. Since SQL:2016 grp val 1 B 1 A 1 C 2 X grp val 1 A, B, C 2 X SELECT grp , LIST_AGG(val, ', ')
 WITHIN GROUP (ORDER BY val) FROM t GROUP BY grp LIST_AGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITH COUNT) ➔ 'A, B, ...(1)' LIST_AGG(val, ', ' ON OVERFLOW ERROR) Default LIST_AGG LIST_AGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITHOUT COUNT) ➔ 'A, B, ...' Default
  170. 170. LIST_AGG Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB 4.1[0] MySQL 7.4[1] 8.4[2] 9.0[3] PostgreSQL 3.5.4[4] SQLite 10.5[5] DB2 LUW 11gR1 12cR2 Oracle SQL Server[6] [0] group_concat [1] array_to_string [2] array_agg [3] [0] group_concat [1] array_to_string [2] array_agg [3] string_agg [4] group_concat w/o ORDER BY [5] No ON OVERFLOW clause [6] string_agg announced for vNext
  171. 171. Also new in SQL:2016 JSON DATE FORMAT POLYMORPHIC TABLE FUNCTIONS
  172. 172. About @MarkusWinand ‣Training for Developers ‣ SQL Performance (Indexing) ‣ Modern SQL ‣ On-Site or Online ‣SQL Tuning ‣ Index-Redesign ‣ Query Improvements ‣ On-Site or Online http://winand.at/
  173. 173. About @MarkusWinand €0,- €10-30 sql-performance-explained.com
  174. 174. About @MarkusWinand @ModernSQL http://modern-sql.com

×