Introduced in Oracle Database 12c, the MATCH_RECOGNIZE clause enables pattern matching across rows and is often associated with Big Data, complex event processing and similar fields. Should SQL developers who are not (yet) faced with such tasks ignore it? No way! The feature is powerful enough to simplify many day-to-day tasks and to solve them in a new, simple and efficient way. This talk introduces the syntax through common examples such as finding gaps, merging temporal intervals and grouping on fuzzy criteria. Because it offers a more straightforward approach to well-known problems, the new functionality deserves a place in every developer's toolbox.
1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF
HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH
12c SQL Pattern Matching –
when will I use it?
Andrej Pashchenko
Senior Consultant
Trivadis GmbH
2. Our company.
12c SQL Pattern Matching – when will I use it? 19.11.2015
Trivadis is a leading provider of IT consulting, system integration, solution
engineering and IT services, focusing on - and - technologies in Switzerland,
Germany, Austria and Denmark. Trivadis delivers its services from the following
strategic business areas:
Trivadis Services takes over the corresponding operation of your IT systems.
OPERATION
3. With more than 600 IT and domain experts on site near you.
Locations: KOPENHAGEN, MÜNCHEN, LAUSANNE, BERN, ZÜRICH, BRUGG, GENF, HAMBURG, DÜSSELDORF, FRANKFURT, STUTTGART, FREIBURG, BASEL, WIEN
14 Trivadis branches with more than 600 employees.
More than 200 Service Level Agreements.
More than 4,000 training participants.
Research and development budget: CHF 5.0 million.
Financially independent and sustainably profitable.
Experience from more than 1,900 projects per year for more than 800 customers.
4. About me
Senior Consultant at Trivadis GmbH, Düsseldorf
Focus on Oracle
– Application Development
– Application Performance
– Data Warehousing
22 years of IT experience, 16 of them with Oracle Database
Course instructor for "Oracle 12c New Features für Entwickler"
and "Beyond SQL and PL/SQL"
Blog: http://blog.sqlora.com
5. Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
6. Introduction
7. Introduction
▪ Analytic functions
▪ Analytic functions enhancements
▪ SQL Model Clause
▪ LISTAGG
▪ NTH_VALUE
▪ PIVOT/UNPIVOT clause
▪ Pattern Matching
▪ Top-N
8. Introduction
Oracle Database 12c supports SQL pattern matching with the new MATCH_RECOGNIZE clause
▪ Pattern matching in a sequence of rows
▪ Nothing to do with string patterns (the REGEXP_... functions)
▪ It is a clause, not a function: it follows the table name in the FROM clause
▪ Patterns are expressed with regular expression syntax over pattern variables
▪ Pattern variables are defined as SQL expressions
9. Introduction
MATCH_RECOGNIZE
( [ PARTITION BY <cols> ]
  [ ORDER BY <cols> ]
  [ MEASURES <cols> ]
  [ ONE ROW PER MATCH | ALL ROWS PER MATCH ]
  [ AFTER MATCH SKIP <option> ]
  PATTERN ( <row pattern> )
  [ SUBSET <subset list> ]
  DEFINE <definition list> )
10. Introduction
Example: find the mappings in the ETL logging table whose runs got faster
every day over a period of four days. Output: start and end dates
of the period, elapsed time at the beginning and at the end of the period,
and average elapsed time.
12. Introduction
SELECT *
FROM   dwh_etl_runs MATCH_RECOGNIZE (
         PARTITION BY mapping_name
         ORDER BY etl_date
         MEASURES FIRST (etl_date) AS start_date
                , LAST (etl_date)  AS end_date
                , FIRST (elapsed)  AS first_elapsed
                , LAST (elapsed)   AS last_elapsed
                , AVG (elapsed)    AS avg_elapsed
         PATTERN (STRT DOWN{3})
         DEFINE DOWN AS elapsed < PREV(elapsed) );
PARTITION BY, ORDER BY: partition and order the rows, as for analytic functions
MEASURES: define measures, which are accessible in the main query
PATTERN: define the search pattern as a regular expression over boolean pattern variables
DEFINE: define the pattern variables
Navigation operators:
▪ PREV, NEXT – physical offset
▪ FIRST, LAST – logical offset
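To see how the rows of a match are mapped to the pattern variables, the same query can be switched to ALL ROWS PER MATCH. The following is a minimal sketch that reuses the table and columns of the example above; the aliases match_no and pattern_var are chosen here only for illustration.

```sql
-- Sketch: show every row of each match together with the match number
-- and the pattern variable (STRT or DOWN) the row was mapped to.
SELECT mapping_name, etl_date, elapsed, match_no, pattern_var
FROM   dwh_etl_runs MATCH_RECOGNIZE (
         PARTITION BY mapping_name
         ORDER BY etl_date
         MEASURES MATCH_NUMBER() AS match_no      -- illustrative alias
                , CLASSIFIER()   AS pattern_var   -- illustrative alias
         ALL ROWS PER MATCH
         PATTERN (STRT DOWN{3})
         DEFINE DOWN AS elapsed < PREV(elapsed)
       );
```

STRT has no entry in DEFINE, so it matches any row; it only marks the starting point of a candidate match.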
13. Introduction
PATTERN: subset of Perl syntax for regular expressions
– * : 0 or more iterations
– + : 1 or more iterations
– ? : 0 or 1 iterations
– {n} : exactly n iterations (n > 0)
– {n,} : n or more iterations (n >= 0)
– {n,m} : between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
– {,m} : between 0 and m (inclusive) iterations (m > 0)
– ( ) : grouping
– | : alternation
– {- … -} : exclusion
– ^ : before the first row in the partition
– $ : after the last row in the partition
– ? appended to a quantifier (*?, +?, …) : “reluctant” instead of “greedy” matching
– ….
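As an illustration of the quantifiers, here is a hedged sketch on the ETL logging table from the earlier example. It looks for mappings that got faster on at least two consecutive runs and then slower again. The measure names are assumptions made for this sketch.

```sql
-- Sketch: {2,} means "two or more", + means "one or more".
SELECT *
FROM   dwh_etl_runs MATCH_RECOGNIZE (
         PARTITION BY mapping_name
         ORDER BY etl_date
         MEASURES strt.etl_date       AS start_date
                , LAST(down.etl_date) AS fastest_date  -- last run of the speed-up phase
                , LAST(up.etl_date)   AS end_date
         ONE ROW PER MATCH
         PATTERN (strt down{2,} up+)
         DEFINE down AS down.elapsed < PREV(down.elapsed)
              , up   AS up.elapsed   > PREV(up.elapsed)
       );
```

Both quantifiers are greedy here, so each phase is extended as far as its condition holds.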
14. Introduction
Patterns are everywhere
▪ Industries: Financial, Telcos, Retail, Traffic, Automotive, Transport / Logistics
▪ Use cases: Fraud Detection, Quality of Service, Trouble Ticketing, Price Trends, Buying Patterns, Stock Market, Money Laundering, Sensor Data, Network Activity, Advertising Campaigns, Sessionization, Frequent Flyer Programs, Process Chain, CRM
15. Introduction
Before 12c, SQL had no efficient way to handle such questions
Pre-12c solutions:
▪ Self-joins, subqueries with (NOT) IN, (NOT) EXISTS
▪ Switch to PL/SQL: “do it yourself”, often with multiple SQL queries
▪ Move some logic to pipelined functions and integrate them into the main query
▪ Analytic (window) functions
– ORA-30483: window functions are not allowed here
– cannot be used in the WHERE clause
– cannot be nested
– cannot access the output of analytic functions in other rows
– often lead to nested queries, self-joins, etc.
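The WHERE clause restriction can be sketched as follows, using the T_GAPS table (single numeric column ID) that appears in the next section of this talk. The alias prev_id is an assumption made for this sketch.

```sql
-- Not allowed: an analytic function directly in the WHERE clause
-- raises ORA-30483.
-- SELECT id FROM t_gaps WHERE id != LAG(id) OVER (ORDER BY id) + 1;

-- Typical workaround: compute the analytic function in an inline view
-- and filter on its result one query level higher.
SELECT id
FROM  (SELECT id
            , LAG(id) OVER (ORDER BY id) AS prev_id  -- illustrative alias
       FROM   t_gaps)
WHERE  id != prev_id + 1;
```

The query returns the IDs that directly follow a gap. The extra query level is exactly the kind of nesting that the list above refers to.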
16. Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
17. Find consecutive ranges and gaps
18. Find Consecutive Ranges / Gaps
SLA, QoS: find the longest period without outage
Table T_GAPS
Find consecutive ranges in the values of column ID
Output: start and end ID of each consecutive range
ID: 1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21, …
mr_consecutive.sql
Start of Range   End of Range
1                3
5                6
10               12
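To try the queries on the following slides, the sample table can be created along these lines. This is a sketch under assumptions: the talk only shows a column ID with the listed values, so the data type and the primary key are guesses, and only the values visible on the slide are loaded.

```sql
-- Assumed definition: the slide shows just a numeric column ID.
CREATE TABLE t_gaps (id NUMBER PRIMARY KEY);

-- Load the values visible on the slide.
INSERT INTO t_gaps (id)
SELECT column_value
FROM   TABLE(sys.odcinumberlist(1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21));

COMMIT;
```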
19. Find Consecutive Ranges / Gaps
Pre-12c solution using analytic functions
WITH groups_marked AS (
  -- flag every row that does not directly continue the previous ID
  SELECT id
       , CASE
           WHEN id != LAG(id, 1, id) OVER (ORDER BY id) + 1 THEN 1
           ELSE 0
         END new_grp
  FROM   t_gaps)
, sum_grp AS (
  -- the running sum of the flags numbers the groups
  SELECT id, SUM(new_grp) OVER (ORDER BY id) grp_sum
  FROM   groups_marked)
SELECT MIN(id) start_of_range
     , MAX(id) end_of_range
FROM   sum_grp
GROUP  BY grp_sum
ORDER  BY grp_sum;
mr_consecutive.sql
20. Find Consecutive Ranges / Gaps
“Tabibitosan” method*
* https://community.oracle.com/message/3991177#3991177
SELECT MIN(id) start_of_range
     , MAX(id) end_of_range
FROM  (SELECT id
            -- constant within a run of consecutive IDs
            , id - ROW_NUMBER() OVER (ORDER BY id) distance
       FROM   t_gaps)
GROUP  BY distance
ORDER  BY distance;
mr_consecutive.sql
21. Find Consecutive Ranges / Gaps
12c solution with MATCH_RECOGNIZE
SELECT *
FROM   t_gaps MATCH_RECOGNIZE (
         ORDER BY id
         MEASURES FIRST(id) start_of_range
                , LAST(id)  end_of_range
                , COUNT(*)  cnt
         ONE ROW PER MATCH
         PATTERN (strt cont*)
         DEFINE cont AS id = PREV(id) + 1
       );
mr_consecutive.sql
22. Find Consecutive Ranges / Gaps
Table T_GAPS, numeric column ID with gaps
Find the gaps in the values of column ID
Output: start and end ID of each gap
ID: 1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21, …
mr_gaps.sql
Start of Gap   End of Gap
4              4
7              9
13             13
15             19
23. Find Consecutive Ranges / Gaps
mr_gaps.sql
Solution with analytic functions
SELECT start_of_gap, end_of_gap
FROM  (SELECT id + 1 start_of_gap
            , LEAD(id) OVER (ORDER BY id) - 1 end_of_gap
            , CASE
                WHEN id + 1 != LEAD(id) OVER (ORDER BY id) THEN 1
                ELSE 0
              END is_gap
       FROM   t_gaps)
WHERE  is_gap = 1;
“Tabibitosan” method*
* https://community.oracle.com/message/3991177#3991177
-- the last range has no following range, so its end_of_gap is NULL
SELECT MAX(id) + 1 start_of_gap
     , LEAD(MIN(id)) OVER (ORDER BY distance) - 1 end_of_gap
FROM  (SELECT id
            , id - ROW_NUMBER() OVER (ORDER BY id) distance
       FROM   t_gaps)
GROUP  BY distance;
24. Find Consecutive Ranges / Gaps
12c solution with MATCH_RECOGNIZE
mr_gaps.sql
SELECT *
FROM   t_gaps MATCH_RECOGNIZE (
         ORDER BY id
         MEASURES PREV(gap.id) + 1 start_of_gap
                , gap.id - 1       end_of_gap
         ONE ROW PER MATCH
         -- the row that closes one gap may open the next one (e.g. 12, 14, 20),
         -- so each match covers exactly one gap and matching resumes on the next row
         AFTER MATCH SKIP TO NEXT ROW
         PATTERN (strt gap)
         DEFINE gap AS id != PREV(id) + 1
       );
25. Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
26. Trouble Ticket roundtrip
27. Trouble Ticket Roundtrip
ID  Assignee  Date
1   SCOTT     01.02.2015
1   SCOTT     02.02.2015
1   ADAMS     03.02.2015
1   SCOTT     04.02.2015
2   ADAMS     01.02.2015
2   ADAMS     02.02.2015
2   SCOTT     03.02.2015
3   KING      01.02.2015
3   ADAMS     02.02.2015
3   ADAMS     03.02.2015
3   KING      04.02.2015
3   ADAMS     05.02.2015
4   KING      01.02.2015
4   ADAMS     02.02.2015
4   SCOTT     03.02.2015
4   KING      05.02.2015
▪ Find the tickets that came back to the same assignee
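A possible setup for this example, as a sketch: the column names are taken from the queries on the following slides, while the data types are assumptions. The rows are the ones shown in the table above.

```sql
-- Assumed definition; only the column names come from the talk.
CREATE TABLE trouble_ticket (
  ticket_id   NUMBER       NOT NULL,
  assignee    VARCHAR2(30) NOT NULL,
  change_date DATE         NOT NULL
);

INSERT INTO trouble_ticket VALUES (1, 'SCOTT', DATE '2015-02-01');
INSERT INTO trouble_ticket VALUES (1, 'SCOTT', DATE '2015-02-02');
INSERT INTO trouble_ticket VALUES (1, 'ADAMS', DATE '2015-02-03');
INSERT INTO trouble_ticket VALUES (1, 'SCOTT', DATE '2015-02-04');
INSERT INTO trouble_ticket VALUES (2, 'ADAMS', DATE '2015-02-01');
INSERT INTO trouble_ticket VALUES (2, 'ADAMS', DATE '2015-02-02');
INSERT INTO trouble_ticket VALUES (2, 'SCOTT', DATE '2015-02-03');
INSERT INTO trouble_ticket VALUES (3, 'KING',  DATE '2015-02-01');
INSERT INTO trouble_ticket VALUES (3, 'ADAMS', DATE '2015-02-02');
INSERT INTO trouble_ticket VALUES (3, 'ADAMS', DATE '2015-02-03');
INSERT INTO trouble_ticket VALUES (3, 'KING',  DATE '2015-02-04');
INSERT INTO trouble_ticket VALUES (3, 'ADAMS', DATE '2015-02-05');
INSERT INTO trouble_ticket VALUES (4, 'KING',  DATE '2015-02-01');
INSERT INTO trouble_ticket VALUES (4, 'ADAMS', DATE '2015-02-02');
INSERT INTO trouble_ticket VALUES (4, 'SCOTT', DATE '2015-02-03');
INSERT INTO trouble_ticket VALUES (4, 'KING',  DATE '2015-02-05');

COMMIT;
```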
28. Trouble Ticket Roundtrip
Pre-12c solution using self-joins
mr_trouble_ticket.sql
-- t1: first assignee, t2: a later, different assignee,
-- t3: the first assignee again, later than t2
SELECT DISTINCT t1.ticket_id
     , t1.assignee    AS first_assignee
     , t3.change_date AS last_change
FROM   trouble_ticket t1
     , trouble_ticket t2
     , trouble_ticket t3
WHERE  t1.ticket_id = t2.ticket_id
AND    t1.assignee != t2.assignee
AND    t2.change_date > t1.change_date
AND    t3.assignee = t1.assignee
AND    t3.ticket_id = t1.ticket_id
AND    t3.change_date > t2.change_date
ORDER  BY t1.ticket_id;
29. Trouble Ticket Roundtrip
12c solution using the MATCH_RECOGNIZE clause
New:
– Row pattern skip to (AFTER MATCH SKIP): where to start over after a match is found?
– Match overlapping patterns
mr_trouble_ticket.sql
SELECT *
FROM   trouble_ticket
MATCH_RECOGNIZE (
  PARTITION BY ticket_id
  ORDER BY change_date
  MEASURES strt.assignee          AS first_assignee
         , LAST(same.change_date) AS letzte_bearbeitung  -- "last edit"
  AFTER MATCH SKIP TO FIRST another
  PATTERN (strt another+ same+)
  DEFINE same    AS same.assignee = strt.assignee
       , another AS another.assignee != strt.assignee
);
30. Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
31. Grouping on fuzzy criteria
32. Grouping over fuzzy criteria
“Sessionization”
– Group rows together where the gap between consecutive timestamps does not
exceed a defined limit (here 10 minutes)
...
PATTERN (STRT SESS+)
DEFINE SESS AS SESS.ins_date - PREV(SESS.ins_date) <= 10/24/60
– Group rows together that lie within a defined interval (here 6 hours) relative to the
first row; otherwise start the next group
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:13946369553642#3478381500346951056
...
PATTERN (A+)
DEFINE A AS ins_date < FIRST(ins_date) + 6/24
Group over running totals
– Split the data into groups of a defined capacity
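The first sessionization fragment could be completed roughly as follows. This is only a sketch: the table page_hits and its columns user_id and ins_date (a DATE) are hypothetical, as are the measure names. It uses sess* instead of sess+ so that a single isolated row also forms a session; that is a deliberate deviation from the fragment above.

```sql
-- Hypothetical table: page_hits(user_id, ins_date DATE)
SELECT *
FROM   page_hits MATCH_RECOGNIZE (
         PARTITION BY user_id
         ORDER BY ins_date
         MEASURES MATCH_NUMBER()  AS session_no
                , FIRST(ins_date) AS session_start
                , LAST(ins_date)  AS session_end
                , COUNT(*)        AS hits
         ONE ROW PER MATCH
         PATTERN (strt sess*)
         -- DATE arithmetic is in days: 10/24/60 is 10 minutes
         DEFINE sess AS sess.ins_date - PREV(sess.ins_date) <= 10/24/60
       );
```

Each output row describes one session per user: its number, first and last timestamp and the number of rows it contains.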
33. Grouping over fuzzy criteria
Example schema SH (Sales History)
Task: split the data into groups of fixed capacity
▪ Fit all customers, ordered by age, into groups such that the total sales in
every group do not exceed $200,000
34. Grouping over fuzzy criteria
12c solution with MATCH_RECOGNIZE clause
mr_group_running_total.sql
WITH q AS (SELECT c.cust_id, c.cust_year_of_birth
                , SUM(s.amount_sold) cust_amount_sold
           FROM   customers c JOIN sales s ON s.cust_id = c.cust_id
           GROUP  BY c.cust_id, c.cust_year_of_birth
          )
SELECT *
FROM   q
MATCH_RECOGNIZE (
  ORDER BY cust_year_of_birth
  MEASURES MATCH_NUMBER() gruppe  -- "group"
         , SUM(cust_amount_sold) running_sum
         , FINAL SUM(cust_amount_sold) final_sum
  ALL ROWS PER MATCH
  PATTERN (gr*)
  DEFINE gr AS SUM(cust_amount_sold) <= 200000
);
We need all matches
Aggregate function in the pattern variable's condition
MATCH_NUMBER() returns the match number
Aggregates in MEASURES: running vs. final
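The difference between running and final aggregates is easier to see on a handful of rows. The following sketch uses made-up values selected from DUAL and a capacity of 100 instead of the SH schema; all names and numbers in it are illustrative.

```sql
-- Made-up sample data: five items to be packed into groups of at most 100.
WITH q (id, amount) AS (
  SELECT 1, 60 FROM dual UNION ALL
  SELECT 2, 30 FROM dual UNION ALL
  SELECT 3, 50 FROM dual UNION ALL
  SELECT 4, 80 FROM dual UNION ALL
  SELECT 5, 20 FROM dual
)
SELECT id, amount, grp, running_sum, final_sum
FROM   q MATCH_RECOGNIZE (
         ORDER BY id
         MEASURES MATCH_NUMBER()    AS grp
                , SUM(amount)       AS running_sum  -- running: up to the current row
                , FINAL SUM(amount) AS final_sum    -- final: over the whole match
         ALL ROWS PER MATCH
         PATTERN (gr*)
         DEFINE gr AS SUM(amount) <= 100
       );
-- Expected rows (id, amount, grp, running_sum, final_sum):
--   1, 60, 1,  60,  90
--   2, 30, 1,  90,  90
--   3, 50, 2,  50,  50
--   4, 80, 3,  80, 100
--   5, 20, 3, 100, 100
```

A group is closed as soon as adding the next row would push the running sum above the capacity, and that row then starts the next group.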
35. Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
36. Merge temporal intervals
37. Merge temporal intervals
Temporal version of the SCOTT schema: the data in EMP, DEPT and
JOB has temporal validity (VALID_FROM - VALID_TO)
38. Merge temporal intervals
Task: query the data for one employee by joining four tables with
respect to temporal validity:
39. Merge temporal intervals
WITH joined AS (
  SELECT e.empno,
         g.valid_from,
         LEAST( e.valid_to, d.valid_to, j.valid_to,
                NVL(m.valid_to, e.valid_to),
                LEAD(g.valid_from - 1, 1, e.valid_to) OVER(
                  PARTITION BY e.empno ORDER BY g.valid_from )
              ) AS valid_to,
         e.ename, j.job, e.mgr, m.ename AS mgr_ename, e.hiredate,
         e.sal, e.comm, e.deptno, d.dname
  FROM   empv e
  INNER JOIN (SELECT valid_from FROM empv
              UNION
              SELECT valid_from FROM deptv
              UNION
              SELECT valid_from FROM jobv
              UNION
              SELECT valid_to + 1 FROM empv
              WHERE  valid_to != DATE '9999-12-31'
              UNION
              SELECT valid_to + 1 FROM deptv
              WHERE  valid_to != DATE '9999-12-31'
              UNION
              SELECT valid_to + 1 FROM jobv
              WHERE  valid_to != DATE '9999-12-31') g
    ON g.valid_from BETWEEN e.valid_from AND e.valid_to
  INNER JOIN deptv d
    ON d.deptno = e.deptno AND g.valid_from BETWEEN d.valid_from AND d.valid_to
  INNER JOIN jobv j
    ON j.jobno = e.jobno AND g.valid_from BETWEEN j.valid_from AND j.valid_to
  LEFT JOIN empv m
    ON m.empno = e.mgr AND g.valid_from BETWEEN m.valid_from AND m.valid_to )
...
Source: Philipp Salvisberg:
http://www.salvis.com/blog/2012/12/28/joining-temporal-intervals-part-2/
40. Merge temporal intervals
...
SELECT empno, valid_from, valid_to, ename, job, mgr,
       mgr_ename, hiredate, sal, comm, deptno, dname
FROM   joined
MATCH_RECOGNIZE (
  PARTITION BY empno, ename, job, mgr,
               mgr_ename, hiredate, sal, comm,
               deptno, dname
  ORDER BY valid_from
  MEASURES FIRST(valid_from) valid_from,
           LAST(valid_to) valid_to
  PATTERN ( strt nxt* )
  DEFINE nxt AS valid_from = PREV(valid_to) + 1
)
WHERE  empno = 7788;
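The merging step itself can be tried in isolation. The sketch below uses three made-up validity periods selected from DUAL; the data and the name periods are purely illustrative, while the pattern and the condition are those of the query above.

```sql
-- Made-up sample: two adjacent periods with identical data, then a change.
WITH periods (empno, job, valid_from, valid_to) AS (
  SELECT 7788, 'ANALYST', DATE '2015-01-01', DATE '2015-03-31' FROM dual UNION ALL
  SELECT 7788, 'ANALYST', DATE '2015-04-01', DATE '2015-06-30' FROM dual UNION ALL
  SELECT 7788, 'MANAGER', DATE '2015-07-01', DATE '9999-12-31' FROM dual
)
SELECT empno, job, valid_from, valid_to
FROM   periods
MATCH_RECOGNIZE (
  -- rows can only be merged if all descriptive columns are identical
  PARTITION BY empno, job
  ORDER BY valid_from
  MEASURES FIRST(valid_from) valid_from,
           LAST(valid_to) valid_to
  PATTERN ( strt nxt* )
  -- a row continues the interval if it starts the day after the previous row ends
  DEFINE nxt AS valid_from = PREV(valid_to) + 1
);
```

The two ANALYST periods are adjacent and should collapse into one row from 2015-01-01 to 2015-06-30, while the MANAGER period stays as it is.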
41. Conclusion
Very powerful feature
Significantly simplifies many queries (self-joins, semi-joins, anti-joins, nested queries),
mostly with a performance benefit
Proposed for the ANSI SQL standard since 2007
Requires thinking in patterns
Complicated syntax (at first sight)
But in many cases the code reads like the requirement in “plain English”
42. Further information...
Database Data Warehousing Guide, SQL for Pattern Matching:
http://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
Stewart Ashton's Blog: https://stewashton.wordpress.com
Oracle Whitepaper, Patterns everywhere - Find them Fast!:
http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/1965433.pdf
43. Trivadis at DOAG 2015
Level 3, right next to the escalator
We look forward to your visit.
Because with Trivadis you always win.