SlideShare una empresa de Scribd logo
1 de 60
Descargar para leer sin conexión
You Can Do That
    Without Perl?
 Lists and Recursion and Trees, Oh My!
 YAPC10, Pittsburgh 2009

Copyright © 2009
David Fetter david.fetter@pgexperts.com
All Rights Reserved
Subject: The YAPC::NA 2009 Survey
Lists:
Windowing Functions
What are Windowing Functions?




        PGCon2009, 21-22 May 2009 Ottawa
What are the Windowing Functions?

• Similar to classical aggregates but does more!
• Provides access to set of rows from the current row
• Introduced SQL:2003 and more detail in SQL:2008
   – SQL:2008 (http://wiscorp.com/sql200n.zip) 02: SQL/Foundation
      • 4.14.9 Window tables
      • 4.15.3 Window functions
      • 6.10 <window function>
      • 7.8 <window clause>
• Supported by Oracle, SQL Server, Sybase and DB2
   – No open source RDBMS so far except PostgreSQL (Firebird trying)
• Used in OLAP mainly but also useful in OLTP
   – Analysis and reporting by rankings, cumulative aggregates




                        PGCon2009, 21-22 May 2009 Ottawa
How they work and what you get




        PGCon2009, 21-22 May 2009 Ottawa
How they work and what do you get




• Windowed table
  – Operates on a windowed table
  – Returns a value for each row
  – Returned value is calculated from the rows in the window




                         PGCon2009, 21-22 May 2009 Ottawa
How they work and what do you get




     • You can use…
        – New window functions
        – Existing aggregate functions
        – User-defined window functions
        – User-defined aggregate functions




            PGCon2009, 21-22 May 2009 Ottawa
How they work and what do you get




• Completely new concept!
  – With Windowing Functions, you can reach outside the current row




                      PGCon2009, 21-22 May 2009 Ottawa
How they work and what you get
[Aggregates]   SELECT key, SUM(val) FROM tbl GROUP BY key;




                    PGCon2009, 21-22 May 2009 Ottawa
How they work and what you get
[Windowing Functions]   SELECT key, SUM(val) OVER (PARTITION BY key) FROM tbl;




                             PGCon2009, 21-22 May 2009 Ottawa
How they work and what you get
Who is the highest paid relatively compared with the department average?


        depname             empno                    salary
        develop             10                       5200
        sales               1                        5000
        personnel           5                        3500
        sales               4                        4800
        sales               6                        550
        personnel           2                        3900
        develop             7                        4200
        develop             9                        4500
        sales               3                        4800
        develop             8                        6000
        develop             11                       5200

                        PGCon2009, 21-22 May 2009 Ottawa
How they work and what you get
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname)::int,

        salary - avg(salary) OVER (PARTITION BY depname)::int AS diff
FROM empsalary ORDER BY diff DESC


          depname      empno         salary         avg        diff
          develop      8             6000           5020       980
          sales        6             5500           5025       475
          personnel    2             3900           3700       200
          develop      11            5200           5020       180
          develop      10            5200           5020       180
          sales        1             5000           5025       -25
          personnel    5             3500           3700       -200
          sales        4             4800           5025       -225
          sales        3             4800           5025       -225
          develop      9             4500           5020       -520
          develop      7             4200           5020       -820


                            PGCon2009, 21-22 May 2009 Ottawa
Anatomy of a Window
• Represents set of rows abstractly as:
• A partition
  – Specified by PARTITION BY clause
  – Never moves
  – Can contain:
• A frame
  – Specified by ORDER BY clause and frame clause
  – Defined in a partition
  – Moves within a partition
  – Never goes across two partitions
• Some functions take values from a partition.
  Others take them from a frame.
                    PGCon2009, 21-22 May 2009 Ottawa
A partition
• Specified by PARTITION BY clause in OVER()
• Allows to subdivide the table, much like
  GROUP BY clause
• Without a PARTITION BY clause, the whole
  table is in a single partition

func
(args)
             OVER   (                                                 )
                        partition-clause    order-
                                            clause
                                                             frame-
                                                             clause


         PARTITION BY expr, …
                          PGCon2009, 21-22 May 2009 Ottawa
A frame
• Specified by ORDER BY clause and frame
  clause in OVER()
• Allows to tell how far the set is applied
• Also defines ordering of the set
• Without order and frame clauses, the whole of
  partition is a single frame
func
(args)
            OVER   (                                          )
                       partition-clause   order-
                                          clause
                                                   frame-
                                                   clause

ORDER BY expr [ASC|DESC]             (ROWS|RANGE) BETWEEN UNBOUNDED
[NULLS FIRST|LAST], …                (PRECEDING|FOLLOWING) AND CURRENT
                                     ROW
                         PGCon2009, 21-22 May 2009 Ottawa
The WINDOW clause
• You can specify common window definitions
  in one place and name it, for convenience
• You can use the window at the same query
  level
SELECT target_list, … wfunc() OVER w
FROM table_list, …
WHERE qual_list, ….
GROUP BY groupkey_list, ….
HAVING groupqual_list, …
WINDOW w      A   (                                                )
              S
                       partition-clause     order-
                                            clause
                                                          frame-
                                                          clause




                       PGCon2009, 21-22 May 2009 Ottawa
Built-in Windowing Functions




         PGCon2009, 21-22 May 2009 Ottawa
List of built-in Windowing Functions

•   row_number()                 •   lag()
•   rank()                       •   lead()
•   dense_rank()                 •   first_value()
•   percent_rank()               •   last_value()
•   cume_dist()                  •   nth_value()
•   ntile()


              PGCon2009, 21-22 May 2009 Ottawa
row_number()
     • Returns number of the current row

   SELECT val, row_number() OVER (ORDER BY val DESC) FROM tbl;

     val                             row_number()
     5                               1
     5                               2
     3                               3
     1                               4
Note: row_number() always incremented values independent of frame



                    PGCon2009, 21-22 May 2009 Ottawa
rank()
• Returns rank of the current row with gap

    SELECT val, rank() OVER (ORDER BY val DESC) FROM tbl;


    val                             rank()
    5                               1
                                                                   gap
    5                               1
    3                               3
    1                               4
    Note: rank() OVER(*empty*) returns 1 for all rows, since all rows
    are peers to each other


                       PGCon2009, 21-22 May 2009 Ottawa
dense_rank()
• Returns rank of the current row without gap

       SELECT val, dense_rank() OVER (ORDER BY val DESC) FROM tbl;


       val                             dense_rank()
       5                               1
                                                                     no gap
       5                               1
       3                               2
       1                               3
Note: dense_rank() OVER(*empty*) returns 1 for all rows, since all rows
are peers to each other

                          PGCon2009, 21-22 May 2009 Ottawa
percent_rank()
• Returns relative rank; (rank() – 1) / (total row – 1)
   SELECT val, percent_rank() OVER (ORDER BY val DESC) FROM tbl;

       val                             percent_rank()
       5                               0
       5                               0
       3                               0.666666666666667
       1                               1

      values are always between 0 and 1 inclusive.

Note: percent_rank() OVER(*empty*) returns 0 for all rows, since all rows
are peers to each other



                         PGCon2009, 21-22 May 2009 Ottawa
cume_dist()
• Returns relative rank; (# of preced. or peers) / (total row)

 SELECT val, cume_dist() OVER (ORDER BY val DESC) FROM tbl;


   val                             cume_dist()
   5                               0.5                          =2/4

   5                               0.5                          =2/4

   3                               0.75                         =3/4

   1                               1                            =4/4
       The result can be emulated by
       “count(*) OVER (ORDER BY val DESC) / count(*) OVER ()”
       Note: cume_dist() OVER(*empty*) returns 1 for all rows, since all rows
       are peers to each other
                          PGCon2009, 21-22 May 2009 Ottawa
ntile()
• Returns dividing bucket number

   SELECT val, ntile(3) OVER (ORDER BY val DESC) FROM tbl;


   val                               ntile(3)
   5                                 1                                    4%3=1

   5                                 1
   3                                 2
   1                                 3
    The results are the divided positions, but if there’s remainder add
    row from the head
    Note: ntile() OVER (*empty*) returns same values as above, since
    ntile() doesn’t care the frame but works against the partition
                        PGCon2009, 21-22 May 2009 Ottawa
lag()
      • Returns value of row above

SELECT val, lag(val) OVER (ORDER BY val DESC) FROM tbl;


val                            lag(val)
5                              NULL
5                              5
3                              5
1                              3
            Note: lag() only acts on a partition.



                  PGCon2009, 21-22 May 2009 Ottawa
lead()
• Returns value of the row below

SELECT val, lead(val) OVER (ORDER BY val DESC) FROM tbl;


val                            lead(val)
5                              5
5                              3
3                              1
1                              NULL

         Note: lead() acts against a partition.


                  PGCon2009, 21-22 May 2009 Ottawa
first_value()
• Returns the first value of the frame

    SELECT val, first_value(val) OVER (ORDER BY val DESC) FROM tbl;


    val                            first_value(val)
    5                              5
    5                              5
    3                              5
    1                              5



                      PGCon2009, 21-22 May 2009 Ottawa
last_value()
• Returns the last value of the frame
    SELECT val, last_value(val) OVER
     (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING
    AND UNBOUNDED FOLLOWING) FROM tbl;

    val                            last_value(val)
    5                              1
    5                              1
    3                              1
    1                              1
    Note: frame clause is necessary since you have a frame between
    the first row and the current row by only the order clause


                      PGCon2009, 21-22 May 2009 Ottawa
nth_value()
• Returns the n-th value of the frame
    SELECT val, nth_value(val, val) OVER
     (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING
    AND UNBOUNDED FOLLOWING) FROM tbl;

    val                               nth_value(val, val)
    5                                 NULL
    5                                 NULL
    3                                 3
    1                                 5
   Note: frame clause is necessary since you have a frame between
   the first row and the current row by only the order clause
                      PGCon2009, 21-22 May 2009 Ottawa
aggregates(all peers)
• Returns the same values along the frame

    SELECT val, sum(val) OVER () FROM tbl;


    val                            sum(val)
    5                              14
    5                              14
    3                              14
    1                              14
   Note: all rows are the peers to each other



                      PGCon2009, 21-22 May 2009 Ottawa
cumulative aggregates
• Returns different values along the frame

    SELECT val, sum(val) OVER (ORDER BY val DESC) FROM tbl;


    val                            sum(val)
    5                              10
    5                              10
    3                              13
    1                              14
    Note: row#1 and row#2 return the same value since they are the peers.
    the result of row#3 is sum(val of row#1…#3)


                      PGCon2009, 21-22 May 2009 Ottawa
Recursion and Trees, Oh My:
Common Table Expressions
Thanks to:
•   The ISO SQL Standards Committee

•   Yoshiyuki Asaba

•   Ishii Tatsuo

•   Jeff Davis

•   Gregory Stark

•   Tom Lane

•   etc., etc., etc.
Recursion
Recursion in General

•
Initial Condition
•
Recursion step
•
Termination
condition
E1 List Table
CREATE TABLE employee(
    id INTEGER NOT NULL,
    boss_id INTEGER,
    UNIQUE(id, boss_id)/*, etc., etc. */
);

INSERT INTO employee(id, boss_id)
VALUES(1,NULL), /* El capo di tutti capi */
(2,1),(3,1),(4,1),
(5,2),(6,2),(7,2),(8,3),(9,3),(10,4),
(11,5),(12,5),(13,6),(14,7),(15,8),
(1,9);
Tree Query Initiation
WITH RECURSIVE t(node, path) AS (
   SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL
   /* Initiation Step */
UNION ALL
    SELECT e1.id, t.path || ARRAY[e1.id]
    FROM employee e1 JOIN t ON (e1.boss_id = t.node)
    WHERE id NOT IN (t.path)
)
SELECT
    CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END ||
    REPEAT('--', array_upper(path,1)-2) ||
    node AS "Branch"
FROM t
ORDER BY path;
Tree Query Recursion
WITH RECURSIVE t(node, path) AS (
   SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL
UNION ALL
    SELECT e1.id, t.path || ARRAY[e1.id] /* Recursion */
    FROM employee e1 JOIN t ON (e1.boss_id = t.node)
    WHERE id NOT IN (t.path)
)
SELECT
    CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END ||
    REPEAT('--', array_upper(path,1)-2) ||
    node AS "Branch"
FROM t
ORDER BY path;
Tree Query
                 Termination
WITH RECURSIVE t(node, path) AS (
    SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL
UNION ALL
    SELECT e1.id, t.path || ARRAY[e1.id]
    FROM employee e1 JOIN t ON (e1.boss_id = t.node)
    WHERE id NOT IN (t.path) /* Termination Condition */
)
SELECT
    CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END ||
    REPEAT('--', array_upper(path,1)-2) ||
    node AS "Branch"
FROM t
ORDER BY path;
Tree Query Display
WITH RECURSIVE t(node, path) AS (
   SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL
UNION ALL
    SELECT e1.id, t.path || ARRAY[e1.id]
    FROM employee e1 JOIN t ON (e1.boss_id = t.node)
    WHERE id NOT IN (t.path)
)
SELECT
    CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END ||
    REPEAT('--', array_upper(path,1)-2) ||
    node AS "Branch" /* Display */
FROM t
ORDER BY path;
Tree Query Initiation
           Branch
        ----------
         1
         +-2
         +---5
         +-----11
         +-----12
         +---6
         +-----13
         +---7
         +-----14
         +-3
         +---8
         +-----15
         +---9
         +-4
         +---10
         +-9
        (16 rows)
Travelling Salesman Problem
Given a number of cities and the costs of travelling
from any city to any other city, what is the least-
cost round-trip route that visits each city exactly
once and then returns to the starting city?
TSP Schema
CREATE TABLE pairs (
    from_city TEXT NOT NULL,
    to_city TEXT NOT NULL,
    distance INTEGER NOT NULL,
    PRIMARY KEY(from_city, to_city),
    CHECK (from_city < to_city)
);
TSP Data
INSERT INTO pairs
VALUES
    ('Bari','Bologna',672),
    ('Bari','Bolzano',939),
    ('Bari','Firenze',723),
    ('Bari','Genova',944),
    ('Bari','Milan',881),
    ('Bari','Napoli',257),
    ('Bari','Palermo',708),
    ('Bari','Reggio Calabria',464),
    ....
TSP Program:
     Symmetric Setup

 WITH RECURSIVE both_ways(
      from_city,
      to_city,
      distance
 )           /* Working Table */
 AS (
      SELECT
           from_city,
           to_city,
           distance
      FROM
           pairs
 UNION ALL
      SELECT
           to_city AS "from_city",
           from_city AS "to_city",
           distance
      FROM
           pairs
 ),
TSP Program:
     Symmetric Setup

 WITH RECURSIVE both_ways(
     from_city,
     to_city,
     distance
 )
 AS (/* Distances One Way */
     SELECT
          from_city,
          to_city,
          distance
     FROM
          pairs
 UNION ALL
     SELECT
          to_city AS "from_city",
          from_city AS "to_city",
          distance
     FROM
          pairs
 ),
TSP Program:
    Symmetric Setup

 WITH RECURSIVE both_ways(
      from_city,
      to_city,
      distance
 )
 AS (
      SELECT
           from_city,
           to_city,
           distance
      FROM
           pairs
 UNION ALL /* Distances Other Way */
      SELECT
           to_city AS "from_city",
           from_city AS "to_city",
           distance
      FROM
           pairs
 ),
TSP Program:
      Path Initialization Step
paths (
     from_city,
     to_city,
     distance,
     path
)
AS (
     SELECT
          from_city,
          to_city,
          distance,
          ARRAY[from_city] AS "path"
     FROM
          both_ways b1
     WHERE
          b1.from_city = 'Roma'
UNION ALL
TSP Program:
             Path Recursion Step

    SELECT
         b2.from_city,
         b2.to_city,
         p.distance + b2.distance,
         p.path || b2.from_city
    FROM
         both_ways b2
    JOIN
         paths p
         ON (
              p.to_city = b2.from_city
         AND
              b2.from_city <> ALL (p.path[
                  2:array_upper(p.path,1)
              ]) /* Prevent re-tracing */
         AND
              array_upper(p.path,1) < 6
         )
)
TSP Program:
          Timely Termination Step

    SELECT
         b2.from_city,
         b2.to_city,
         p.distance + b2.distance,
         p.path || b2.from_city
    FROM
         both_ways b2
    JOIN
         paths p
         ON (
              p.to_city = b2.from_city
         AND
              b2.from_city <> ALL (p.path[
                  2:array_upper(p.path,1)
              ]) /* Prevent re-tracing */
         AND
              array_upper(p.path,1) < 6 /* Timely Termination */
         )
)
TSP Program:
               Filter and Display

SELECT
     path || to_city AS "path",
     distance
FROM
     paths
WHERE
     to_city = 'Roma'
AND
     ARRAY['Milan','Firenze','Napoli'] <@ path
ORDER BY distance, path
LIMIT 1;
TSP Program:
                 Filter and Display



davidfetter@tsp=# i travelling_salesman.sql
               path               | distance
----------------------------------+----------
 {Roma,Firenze,Milan,Napoli,Roma} |     1553
(1 row)

Time: 11679.503 ms
TSP Program:
                                                   Gritty Details

                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=44.14..44.15 rows=1 width=68) (actual time=16131.326..16131.327 rows=1 loops=1)
   CTE both_ways
     -> Append (cost=0.00..4.64 rows=132 width=19) (actual time=0.011..0.090 rows=132 loops=1)
            -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.010..0.026 rows=66 loops=1)
            -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.004..0.029 rows=66 loops=1)
   CTE paths
     -> Recursive Union (cost=0.00..38.72 rows=31 width=104) (actual time=0.102..10631.385 rows=1092883 loops=1)
            -> CTE Scan on both_ways b1 (cost=0.00..2.97 rows=1 width=68) (actual time=0.100..0.251 rows=11 loops=1)
                  Filter: (from_city = 'Roma'::text)
            -> Hash Join (cost=0.29..3.51 rows=3 width=104) (actual time=180.125..958.459 rows=182145 loops=6)
                  Hash Cond: (b2.from_city = p.to_city)
                  Join Filter: (b2.from_city <> ALL (p.path[2:array_upper(p.path, 1)]))
                  -> CTE Scan on both_ways b2 (cost=0.00..2.64 rows=132 width=68) (actual time=0.001..0.056 rows=132 loops=5)
                  -> Hash (cost=0.25..0.25 rows=3 width=68) (actual time=172.292..172.292 rows=22427 loops=6)
                        -> WorkTable Scan on paths p (cost=0.00..0.25 rows=3 width=68) (actual time=123.167..146.659 rows=22427 loops=6)
                              Filter: (array_upper(path, 1) < 6)
   -> Sort (cost=0.79..0.79 rows=1 width=68) (actual time=16131.324..16131.324 rows=1 loops=1)
          Sort Key: paths.distance, ((paths.path || paths.to_city))
          Sort Method: top-N heapsort Memory: 17kB
          -> CTE Scan on paths (cost=0.00..0.78 rows=1 width=68) (actual time=36.621..16127.841 rows=4146 loops=1)
                Filter: (('{Milan,Firenze,Napoli}'::text[] <@ path) AND (to_city = 'Roma'::text))
 Total runtime: 16180.335 ms
(22 rows)
Hausdorf-Besicovich-
          WTF Dimension
WITH RECURSIVE
Z(Ix, Iy, Cx, Cy, X, Y, I)
AS (
     SELECT Ix, Iy, X::float, Y::float, X::float, Y::float, 0
     FROM
          (SELECT -2.2 + 0.031 * i, i FROM generate_series(0,101) AS i) AS xgen(x,ix)
     CROSS JOIN
          (SELECT -1.5 + 0.031 * i, i FROM generate_series(0,101) AS i) AS ygen(y,iy)
     UNION ALL
     SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1
     FROM Z
     WHERE X * X + Y * Y < 16::float
     AND I < 100
),
Filter
Zt (Ix, Iy, I) AS (
    SELECT Ix, Iy, MAX(I) AS I
    FROM Z
    GROUP BY Iy, Ix
    ORDER BY Iy, Ix
)
Display

SELECT array_to_string(
    array_agg(
         SUBSTRING(
             ' .,,,-----++++%%%%@@@@#### ',
             GREATEST(I,1)
    ),''
)
FROM Zt
GROUP BY Iy
ORDER BY Iy;
Questions?
Comments?
Straitjackets?
Thank You!
Copyright © 2009
David Fetter david.fetter@pgexperts.com
All Rights Reserved

Más contenido relacionado

Similar a You can do THAT without Perl?

Enabling Applications with Informix' new OLAP functionality
 Enabling Applications with Informix' new OLAP functionality Enabling Applications with Informix' new OLAP functionality
Enabling Applications with Informix' new OLAP functionalityAjay Gupte
 
Olap Functions Suport in Informix
Olap Functions Suport in InformixOlap Functions Suport in Informix
Olap Functions Suport in InformixBingjie Miao
 
Programmable logic devices
Programmable logic devicesProgrammable logic devices
Programmable logic devicesAmmara Javed
 
R07_Senegacnik_CBO.pdf
R07_Senegacnik_CBO.pdfR07_Senegacnik_CBO.pdf
R07_Senegacnik_CBO.pdfcookie1969
 
Firebird 3 Windows Functions
Firebird 3 Windows  FunctionsFirebird 3 Windows  Functions
Firebird 3 Windows FunctionsMind The Firebird
 
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012Big Science, Big Data: Simon Metson at Eduserv Symposium 2012
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012Eduserv
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 

Similar a You can do THAT without Perl? (10)

SQL Windowing
SQL WindowingSQL Windowing
SQL Windowing
 
Enabling Applications with Informix' new OLAP functionality
 Enabling Applications with Informix' new OLAP functionality Enabling Applications with Informix' new OLAP functionality
Enabling Applications with Informix' new OLAP functionality
 
MySQL 5.1 and beyond
MySQL 5.1 and beyondMySQL 5.1 and beyond
MySQL 5.1 and beyond
 
Olap Functions Suport in Informix
Olap Functions Suport in InformixOlap Functions Suport in Informix
Olap Functions Suport in Informix
 
Mutant Tests Too: The SQL
Mutant Tests Too: The SQLMutant Tests Too: The SQL
Mutant Tests Too: The SQL
 
Programmable logic devices
Programmable logic devicesProgrammable logic devices
Programmable logic devices
 
R07_Senegacnik_CBO.pdf
R07_Senegacnik_CBO.pdfR07_Senegacnik_CBO.pdf
R07_Senegacnik_CBO.pdf
 
Firebird 3 Windows Functions
Firebird 3 Windows  FunctionsFirebird 3 Windows  Functions
Firebird 3 Windows Functions
 
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012Big Science, Big Data: Simon Metson at Eduserv Symposium 2012
Big Science, Big Data: Simon Metson at Eduserv Symposium 2012
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 

Más de PostgreSQL Experts, Inc.

PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALEPostgreSQL Experts, Inc.
 
Elephant Roads: PostgreSQL Patches and Variants
Elephant Roads: PostgreSQL Patches and VariantsElephant Roads: PostgreSQL Patches and Variants
Elephant Roads: PostgreSQL Patches and VariantsPostgreSQL Experts, Inc.
 

Más de PostgreSQL Experts, Inc. (20)

Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
Fail over fail_back
Fail over fail_backFail over fail_back
Fail over fail_back
 
PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALE
 
HowTo DR
HowTo DRHowTo DR
HowTo DR
 
Pg py-and-squid-pypgday
Pg py-and-squid-pypgdayPg py-and-squid-pypgday
Pg py-and-squid-pypgday
 
92 grand prix_2013
92 grand prix_201392 grand prix_2013
92 grand prix_2013
 
Five steps perform_2013
Five steps perform_2013Five steps perform_2013
Five steps perform_2013
 
7 Ways To Crash Postgres
7 Ways To Crash Postgres7 Ways To Crash Postgres
7 Ways To Crash Postgres
 
PWNage: Producing a newsletter with Perl
PWNage: Producing a newsletter with PerlPWNage: Producing a newsletter with Perl
PWNage: Producing a newsletter with Perl
 
Open Source Press Relations
Open Source Press RelationsOpen Source Press Relations
Open Source Press Relations
 
5 (more) Ways To Destroy Your Community
5 (more) Ways To Destroy Your Community5 (more) Ways To Destroy Your Community
5 (more) Ways To Destroy Your Community
 
Development of 8.3 In India
Development of 8.3 In IndiaDevelopment of 8.3 In India
Development of 8.3 In India
 
PostgreSQL and MySQL
PostgreSQL and MySQLPostgreSQL and MySQL
PostgreSQL and MySQL
 
50 Ways To Love Your Project
50 Ways To Love Your Project50 Ways To Love Your Project
50 Ways To Love Your Project
 
8.4 Upcoming Features
8.4 Upcoming Features 8.4 Upcoming Features
8.4 Upcoming Features
 
Elephant Roads: PostgreSQL Patches and Variants
Elephant Roads: PostgreSQL Patches and VariantsElephant Roads: PostgreSQL Patches and Variants
Elephant Roads: PostgreSQL Patches and Variants
 
Writeable CTEs: The Next Big Thing
Writeable CTEs: The Next Big ThingWriteable CTEs: The Next Big Thing
Writeable CTEs: The Next Big Thing
 
PostgreSQL Development Today: 9.0
PostgreSQL Development Today: 9.0PostgreSQL Development Today: 9.0
PostgreSQL Development Today: 9.0
 
9.1 Mystery Tour
9.1 Mystery Tour9.1 Mystery Tour
9.1 Mystery Tour
 
Postgres Open Keynote: The Next 25 Years
Postgres Open Keynote: The Next 25 YearsPostgres Open Keynote: The Next 25 Years
Postgres Open Keynote: The Next 25 Years
 

Último

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 

Último (20)

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 

You can do THAT without Perl?

  • 1. You Can Do That Without Perl? Lists and Recursion and Trees, Oh My! YAPC10, Pittsburgh 2009 Copyright © 2009 David Fetter david.fetter@pgexperts.com All Rights Reserved
  • 2. Subject: The YAPC::NA 2009 Survey
  • 4. What are Windowing Functions? PGCon2009, 21-22 May 2009 Ottawa
  • 5. What are the Windowing Functions? • Similar to classical aggregates but does more! • Provides access to set of rows from the current row • Introduced SQL:2003 and more detail in SQL:2008 – SQL:2008 (http://wiscorp.com/sql200n.zip) 02: SQL/Foundation • 4.14.9 Window tables • 4.15.3 Window functions • 6.10 <window function> • 7.8 <window clause> • Supported by Oracle, SQL Server, Sybase and DB2 – No open source RDBMS so far except PostgreSQL (Firebird trying) • Used in OLAP mainly but also useful in OLTP – Analysis and reporting by rankings, cumulative aggregates PGCon2009, 21-22 May 2009 Ottawa
  • 6. How they work and what you get PGCon2009, 21-22 May 2009 Ottawa
  • 7. How they work and what do you get • Windowed table – Operates on a windowed table – Returns a value for each row – Returned value is calculated from the rows in the window PGCon2009, 21-22 May 2009 Ottawa
  • 8. How they work and what do you get • You can use… – New window functions – Existing aggregate functions – User-defined window functions – User-defined aggregate functions PGCon2009, 21-22 May 2009 Ottawa
  • 9. How they work and what do you get • Completely new concept! – With Windowing Functions, you can reach outside the current row PGCon2009, 21-22 May 2009 Ottawa
  • 10. How they work and what you get [Aggregates] SELECT key, SUM(val) FROM tbl GROUP BY key; PGCon2009, 21-22 May 2009 Ottawa
  • 11. How they work and what you get [Windowing Functions] SELECT key, SUM(val) OVER (PARTITION BY key) FROM tbl; PGCon2009, 21-22 May 2009 Ottawa
  • 12. How they work and what you get Who is the highest paid relatively compared with the department average? depname empno salary develop 10 5200 sales 1 5000 personnel 5 3500 sales 4 4800 sales 6 550 personnel 2 3900 develop 7 4200 develop 9 4500 sales 3 4800 develop 8 6000 develop 11 5200 PGCon2009, 21-22 May 2009 Ottawa
  • 13. How they work and what you get SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname)::int, salary - avg(salary) OVER (PARTITION BY depname)::int AS diff FROM empsalary ORDER BY diff DESC depname empno salary avg diff develop 8 6000 5020 980 sales 6 5500 5025 475 personnel 2 3900 3700 200 develop 11 5200 5020 180 develop 10 5200 5020 180 sales 1 5000 5025 -25 personnel 5 3500 3700 -200 sales 4 4800 5025 -225 sales 3 4800 5025 -225 develop 9 4500 5020 -520 develop 7 4200 5020 -820 PGCon2009, 21-22 May 2009 Ottawa
  • 14. Anatomy of a Window • Represents set of rows abstractly as: • A partition – Specified by PARTITION BY clause – Never moves – Can contain: • A frame – Specified by ORDER BY clause and frame clause – Defined in a partition – Moves within a partition – Never goes across two partitions • Some functions take values from a partition. Others take them from a frame. PGCon2009, 21-22 May 2009 Ottawa
  • 15. A partition • Specified by PARTITION BY clause in OVER() • Allows to subdivide the table, much like GROUP BY clause • Without a PARTITION BY clause, the whole table is in a single partition func (args) OVER ( ) partition-clause order- clause frame- clause PARTITION BY expr, … PGCon2009, 21-22 May 2009 Ottawa
  • 16. A frame • Specified by ORDER BY clause and frame clause in OVER() • Allows to tell how far the set is applied • Also defines ordering of the set • Without order and frame clauses, the whole of partition is a single frame func (args) OVER ( ) partition-clause order- clause frame- clause ORDER BY expr [ASC|DESC] (ROWS|RANGE) BETWEEN UNBOUNDED [NULLS FIRST|LAST], … (PRECEDING|FOLLOWING) AND CURRENT ROW PGCon2009, 21-22 May 2009 Ottawa
  • 17. The WINDOW clause • You can specify common window definitions in one place and name it, for convenience • You can use the window at the same query level SELECT target_list, … wfunc() OVER w FROM table_list, … WHERE qual_list, …. GROUP BY groupkey_list, …. HAVING groupqual_list, … WINDOW w A ( ) S partition-clause order- clause frame- clause PGCon2009, 21-22 May 2009 Ottawa
  • 18. Built-in Windowing Functions PGCon2009, 21-22 May 2009 Ottawa
  • 19. List of built-in Windowing Functions • row_number() • lag() • rank() • lead() • dense_rank() • first_value() • percent_rank() • last_value() • cume_dist() • nth_value() • ntile() PGCon2009, 21-22 May 2009 Ottawa
  • 20. row_number() • Returns number of the current row SELECT val, row_number() OVER (ORDER BY val DESC) FROM tbl; val row_number() 5 1 5 2 3 3 1 4 Note: row_number() always incremented values independent of frame PGCon2009, 21-22 May 2009 Ottawa
  • 21. rank() • Returns rank of the current row with gap SELECT val, rank() OVER (ORDER BY val DESC) FROM tbl; val rank() 5 1 gap 5 1 3 3 1 4 Note: rank() OVER(*empty*) returns 1 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 22. dense_rank() • Returns rank of the current row without gap SELECT val, dense_rank() OVER (ORDER BY val DESC) FROM tbl; val dense_rank() 5 1 no gap 5 1 3 2 1 3 Note: dense_rank() OVER(*empty*) returns 1 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 23. percent_rank() • Returns relative rank; (rank() – 1) / (total row – 1) SELECT val, percent_rank() OVER (ORDER BY val DESC) FROM tbl; val percent_rank() 5 0 5 0 3 0.666666666666667 1 1 values are always between 0 and 1 inclusive. Note: percent_rank() OVER(*empty*) returns 0 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 24. cume_dist() • Returns relative rank; (# of preced. or peers) / (total row) SELECT val, cume_dist() OVER (ORDER BY val DESC) FROM tbl; val cume_dist() 5 0.5 =2/4 5 0.5 =2/4 3 0.75 =3/4 1 1 =4/4 The result can be emulated by “count(*) OVER (ORDER BY val DESC) / count(*) OVER ()” Note: cume_dist() OVER(*empty*) returns 1 for all rows, since all rows are peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 25. ntile() • Returns dividing bucket number SELECT val, ntile(3) OVER (ORDER BY val DESC) FROM tbl; val ntile(3) 5 1 4%3=1 5 1 3 2 1 3 The results are the divided positions, but if there’s remainder add row from the head Note: ntile() OVER (*empty*) returns same values as above, since ntile() doesn’t care the frame but works against the partition PGCon2009, 21-22 May 2009 Ottawa
  • 26. lag() • Returns value of row above SELECT val, lag(val) OVER (ORDER BY val DESC) FROM tbl; val lag(val) 5 NULL 5 5 3 5 1 3 Note: lag() only acts on a partition. PGCon2009, 21-22 May 2009 Ottawa
  • 27. lead() • Returns value of the row below SELECT val, lead(val) OVER (ORDER BY val DESC) FROM tbl; val lead(val) 5 5 5 3 3 1 1 NULL Note: lead() acts against a partition. PGCon2009, 21-22 May 2009 Ottawa
  • 28. first_value() • Returns the first value of the frame SELECT val, first_value(val) OVER (ORDER BY val DESC) FROM tbl; val first_value(val) 5 5 5 5 3 5 1 5 PGCon2009, 21-22 May 2009 Ottawa
  • 29. last_value() • Returns the last value of the frame SELECT val, last_value(val) OVER (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM tbl; val last_value(val) 5 1 5 1 3 1 1 1 Note: frame clause is necessary since you have a frame between the first row and the current row by only the order clause PGCon2009, 21-22 May 2009 Ottawa
  • 30. nth_value() • Returns the n-th value of the frame SELECT val, nth_value(val, val) OVER (ORDER BY val DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM tbl; val nth_value(val, val) 5 NULL 5 NULL 3 3 1 5 Note: frame clause is necessary since you have a frame between the first row and the current row by only the order clause PGCon2009, 21-22 May 2009 Ottawa
  • 31. aggregates(all peers) • Returns the same values along the frame SELECT val, sum(val) OVER () FROM tbl; val sum(val) 5 14 5 14 3 14 1 14 Note: all rows are the peers to each other PGCon2009, 21-22 May 2009 Ottawa
  • 32. cumulative aggregates • Returns different values along the frame SELECT val, sum(val) OVER (ORDER BY val DESC) FROM tbl; val sum(val) 5 10 5 10 3 13 1 14 Note: row#1 and row#2 return the same value since they are the peers. the result of row#3 is sum(val of row#1…#3) PGCon2009, 21-22 May 2009 Ottawa
  • 33. Recursion and Trees, Oh My: Common Table Expressions
  • 34. Thanks to: • The ISO SQL Standards Committee • Yoshiyuki Asaba • Ishii Tatsuo • Jeff Davis • Gregory Stark • Tom Lane • etc., etc., etc.
  • 36. Recursion in General • Initial Condition • Recursion step • Termination condition
  • 37. E1 List Table CREATE TABLE employee( id INTEGER NOT NULL, boss_id INTEGER, UNIQUE(id, boss_id)/*, etc., etc. */ ); INSERT INTO employee(id, boss_id) VALUES(1,NULL), /* El capo di tutti capi */ (2,1),(3,1),(4,1), (5,2),(6,2),(7,2),(8,3),(9,3),(10,4), (11,5),(12,5),(13,6),(14,7),(15,8), (1,9);
  • 38. Tree Query Initiation WITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL /* Initiation Step */ UNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path) ) SELECT CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END || REPEAT('--', array_upper(path,1)-2) || node AS "Branch" FROM t ORDER BY path;
  • 39. Tree Query Recursion WITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL UNION ALL SELECT e1.id, t.path || ARRAY[e1.id] /* Recursion */ FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path) ) SELECT CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END || REPEAT('--', array_upper(path,1)-2) || node AS "Branch" FROM t ORDER BY path;
  • 40. Tree Query Termination WITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL UNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path) /* Termination Condition */ ) SELECT CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END || REPEAT('--', array_upper(path,1)-2) || node AS "Branch" FROM t ORDER BY path;
  • 41. Tree Query Display WITH RECURSIVE t(node, path) AS ( SELECT id, ARRAY[id] FROM employee WHERE boss_id IS NULL UNION ALL SELECT e1.id, t.path || ARRAY[e1.id] FROM employee e1 JOIN t ON (e1.boss_id = t.node) WHERE id NOT IN (t.path) ) SELECT CASE WHEN array_upper(path,1)>1 THEN '+-' ELSE '' END || REPEAT('--', array_upper(path,1)-2) || node AS "Branch" /* Display */ FROM t ORDER BY path;
  • 42. Tree Query Initiation Branch ---------- 1 +-2 +---5 +-----11 +-----12 +---6 +-----13 +---7 +-----14 +-3 +---8 +-----15 +---9 +-4 +---10 +-9 (16 rows)
  • 43. Travelling Salesman Problem Given a number of cities and the costs of travelling from any city to any other city, what is the least- cost round-trip route that visits each city exactly once and then returns to the starting city?
  • 44. TSP Schema CREATE TABLE pairs ( from_city TEXT NOT NULL, to_city TEXT NOT NULL, distance INTEGER NOT NULL, PRIMARY KEY(from_city, to_city), CHECK (from_city < to_city) );
  • 45. TSP Data INSERT INTO pairs VALUES ('Bari','Bologna',672), ('Bari','Bolzano',939), ('Bari','Firenze',723), ('Bari','Genova',944), ('Bari','Milan',881), ('Bari','Napoli',257), ('Bari','Palermo',708), ('Bari','Reggio Calabria',464), ....
  • 46. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) /* Working Table */ AS ( SELECT from_city, to_city, distance FROM pairs UNION ALL SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 47. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) AS (/* Distances One Way */ SELECT from_city, to_city, distance FROM pairs UNION ALL SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 48. TSP Program: Symmetric Setup WITH RECURSIVE both_ways( from_city, to_city, distance ) AS ( SELECT from_city, to_city, distance FROM pairs UNION ALL /* Distances Other Way */ SELECT to_city AS "from_city", from_city AS "to_city", distance FROM pairs ),
  • 49. TSP Program: Path Initialization Step paths ( from_city, to_city, distance, path ) AS ( SELECT from_city, to_city, distance, ARRAY[from_city] AS "path" FROM both_ways b1 WHERE b1.from_city = 'Roma' UNION ALL
  • 50. TSP Program: Path Recursion Step SELECT b2.from_city, b2.to_city, p.distance + b2.distance, p.path || b2.from_city FROM both_ways b2 JOIN paths p ON ( p.to_city = b2.from_city AND b2.from_city <> ALL (p.path[ 2:array_upper(p.path,1) ]) /* Prevent re-tracing */ AND array_upper(p.path,1) < 6 ) )
  • 51. TSP Program: Timely Termination Step SELECT b2.from_city, b2.to_city, p.distance + b2.distance, p.path || b2.from_city FROM both_ways b2 JOIN paths p ON ( p.to_city = b2.from_city AND b2.from_city <> ALL (p.path[ 2:array_upper(p.path,1) ]) /* Prevent re-tracing */ AND array_upper(p.path,1) < 6 /* Timely Termination */ ) )
  • 52. TSP Program: Filter and Display SELECT path || to_city AS "path", distance FROM paths WHERE to_city = 'Roma' AND ARRAY['Milan','Firenze','Napoli'] <@ path ORDER BY distance, path LIMIT 1;
  • 53. TSP Program: Filter and Display davidfetter@tsp=# i travelling_salesman.sql path | distance ----------------------------------+---------- {Roma,Firenze,Milan,Napoli,Roma} | 1553 (1 row) Time: 11679.503 ms
  • 54. TSP Program: Gritty Details QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=44.14..44.15 rows=1 width=68) (actual time=16131.326..16131.327 rows=1 loops=1) CTE both_ways -> Append (cost=0.00..4.64 rows=132 width=19) (actual time=0.011..0.090 rows=132 loops=1) -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.010..0.026 rows=66 loops=1) -> Seq Scan on pairs (cost=0.00..1.66 rows=66 width=19) (actual time=0.004..0.029 rows=66 loops=1) CTE paths -> Recursive Union (cost=0.00..38.72 rows=31 width=104) (actual time=0.102..10631.385 rows=1092883 loops=1) -> CTE Scan on both_ways b1 (cost=0.00..2.97 rows=1 width=68) (actual time=0.100..0.251 rows=11 loops=1) Filter: (from_city = 'Roma'::text) -> Hash Join (cost=0.29..3.51 rows=3 width=104) (actual time=180.125..958.459 rows=182145 loops=6) Hash Cond: (b2.from_city = p.to_city) Join Filter: (b2.from_city <> ALL (p.path[2:array_upper(p.path, 1)])) -> CTE Scan on both_ways b2 (cost=0.00..2.64 rows=132 width=68) (actual time=0.001..0.056 rows=132 loops=5) -> Hash (cost=0.25..0.25 rows=3 width=68) (actual time=172.292..172.292 rows=22427 loops=6) -> WorkTable Scan on paths p (cost=0.00..0.25 rows=3 width=68) (actual time=123.167..146.659 rows=22427 loops=6) Filter: (array_upper(path, 1) < 6) -> Sort (cost=0.79..0.79 rows=1 width=68) (actual time=16131.324..16131.324 rows=1 loops=1) Sort Key: paths.distance, ((paths.path || paths.to_city)) Sort Method: top-N heapsort Memory: 17kB -> CTE Scan on paths (cost=0.00..0.78 rows=1 width=68) (actual time=36.621..16127.841 rows=4146 loops=1) Filter: (('{Milan,Firenze,Napoli}'::text[] <@ path) AND (to_city = 'Roma'::text)) Total runtime: 16180.335 ms (22 rows)
  • 55. Hausdorf-Besicovich- WTF Dimension WITH RECURSIVE Z(Ix, Iy, Cx, Cy, X, Y, I) AS ( SELECT Ix, Iy, X::float, Y::float, X::float, Y::float, 0 FROM (SELECT -2.2 + 0.031 * i, i FROM generate_series(0,101) AS i) AS xgen(x,ix) CROSS JOIN (SELECT -1.5 + 0.031 * i, i FROM generate_series(0,101) AS i) AS ygen(y,iy) UNION ALL SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1 FROM Z WHERE X * X + Y * Y < 16::float AND I < 100 ),
  • 56. Filter Zt (Ix, Iy, I) AS ( SELECT Ix, Iy, MAX(I) AS I FROM Z GROUP BY Iy, Ix ORDER BY Iy, Ix )
  • 57. Display SELECT array_to_string( array_agg( SUBSTRING( ' .,,,-----++++%%%%@@@@#### ', GREATEST(I,1) ),'' ) FROM Zt GROUP BY Iy ORDER BY Iy;
  • 58.
  • 60. Thank You! Copyright © 2009 David Fetter david.fetter@pgexperts.com All Rights Reserved