SlideShare una empresa de Scribd logo
1 de 38
Analytic and Window Functions in Oracle

           Logan Palanisamy
Agenda

 Difference between aggregate and analytic
  functions
 Introduction to various analytic functions
 Functions that are both aggregate and
  analytic
 Break
 More examples
 Enhanced Aggregation (CUBE, ROLLUP)
Meeting Basics

 Put your phones/pagers on vibrate/mute
 Messenger: Change the status to offline or
  in-meeting
 Remote attendees: Mute yourself (*6). Ask
  questions via Adobe Connect.
Aggregates vs. Analytics
 Aggregate functions
   Rows are collapsed. One row per group

   Non-Group-By columns not allowed in SELECT list.

 Analytic functions
   Rows are not collapsed

   As many rows in the output as in the input

   No restrictions on the columns in the SELECT list

   Evaluated after joins, WHERE, GROUP BY, HAVING clauses

   Nesting not allowed

   Can appear only in the SELECT or ORDER BY clause

 analytic_aggr_diff.sql
Analytics vs. other methods
 Show the dept, empno, sal and the sum of all salaries
  in their dept
 Three possible ways
    Using Joins
    Using Scalar Sub-queries
    Using Analytic Functions

 analytics_vs_others.sql
Anatomy of an analytic funcion
 function (arg1, ..., argN) OVER ([partition_by_clause]
    [order_by_clause [windowing_clause]])
   The OVER keyword
   partition_by_clause: Optional. Not related to table/index
    partitions. Analogous to GROUP BY
   order_by_clause: Mandatory for Ranking and Windowing
    functions. Optional or meaningless for others
   windowing_clause: Optional. Should always be preceded by
    ORDER BY clause
Types of analytical functions
 Ranking functions
 FIRST_VALUE/LAST_VALUE/NTH_VALUE
 Windowing functions
 Reporting functions
 LAG/LEAD
 FIRST/LAST
Ranking Functions
 ROW_NUMBER()
 RANK() – Skips ranks after duplicate ranks
 DENSE_RANK() – Doesn't skip rank after duplicate ranks
 NTILE(n) – Sorts the rows into N equi-sized buckets
 CUME_DIST() – % of rows with values lower or equal
 PERCENT_RANK() - (rank of row -1)/(#rows – 1)
 function OVER ([PARTITION BY <c1,c2..>] ORDER BY
  <c3, ..>)
 PARTITION BY clause: Optional
 ORDER BY clause: Mandatory
 rank_dense_rank.sql
FIRST_VALUE/LAST_VALUE/NTH_VALUE

 Returns the first/last/nth value from an ordered set
 FIRST_VALUE(expr, [IGNORE NULLS]) OVER
    ([partitonby_clause] orderby_clause)
   IGNORE NULLS options helps you "carry forward".
    Often used in "Data Densification"
   Operates on Default Window (unbounded preceding
    and current row) when a window is not explicitly
    specified.
   NTH_VALUE introduced in 11gR2
   flnth_value.sql
Window functions

 Used for computing cumulative/running totals (YTD, MTD,
    QTD), moving/centered averages
   function(args) OVER([partition_by_clause] order_by_clause
    [windowing_clause])
   ORDER BY clause: mandatory.
   Windowing Clause: Optional. Defaults to: UNBOUNDED
    PRECEDING and CURRENT ROW
   anchored or sliding windows
   Two ways to specify windows: ROWS, RANGE
   [ROW | RANGE ] BETWEEN <start_exp> AND <end_exp>
   window.sql
ROWS type windows

 Physical offset. Number of rows before or after current
    row
   Non deterministic results if rows are not sorted uniquely
   Any number of columns in the ORDER BY clause
   ORDER BY columns can be of any type
   function(args) OVER ([partition_by_clause] order by c1,
    .., cN ROWS between <start_exp> and <end_exp>)
   windows_rows.sql
RANGE type Windows

 Logical offset
 non-unique rows treated as one logical row
 Only one column allowed in ORDER BY clause
 ORDER BY column should be numeric or date
 function(args) OVER ([partition_by_clause] order
  by c1 RANGE between <start_exp> and <end_exp>)
 windows_range.sql
Reporting function

 Computes the ratio of a value to the sum of a set of
  values
 RATIO_TO_REPORT(arg) OVER ([PARTITION BY
  <c1, .., cN>]
 PARTITION BY clause: Optional.
 ratio_to_report.sql
LAG/LEAD

 Gives the ability to access other rows without self-join.
 Allows you to treat cursor as an array
 Useful for making inter-row calculations (year-over-year
    comparison, time between events)
   LEAD (expr, <offset>, <default value>) [IGNORE
    NULLS] OVER ([partioning_clause] orderby_clause)
   Physical offset. Can be fixed or varying. default offset is 1
   default value: value returned if offset points to a non-
    existent row
   IGNORE NULLS determines whether null values of are
    included or eliminated from the calculation.
   lead_lag.sql
FIRST/LAST

 Very different from FIRST_VALUE/LAST_VALUE
 Returns the results of aggregate/analytic function applied
    on column B on the first or last ranked rows sorted by
    column A
   function (expr_with_colB) KEEP (DENSE_RANK
    FIRST/LAST ORDER BY colA) [OVER
    (<partitioning_clause)>)]
   Slightly different syntax. Note the word KEEP
   analytic clause is optional.
   first_last.sql
Above/Below average calculation
 Find the list of employees whose salary is higher than
  the department average.
 above_average.sql
Top-N queries
 Find the full details of "set of" employees with the top-
  N salaries
 Find the two most recent hires in each department
 List the names and employee count of departments
  with the highest employee count
 top_n.sql
Top-N% queries
 List the top 5% of the customers by revenue
 top_np.sql
Multi Top-N queries
 For each customer, find out
   the maximum sale in the last 7 days

   the date of that sale

   the maximum sale in the last 30 days

   the date of that sale

 multi_top.sql
De-Duping
 Deleting duplicate records
 dedup.txt
Hypothetical Ranking
 CUME_DIST, DENSE_RANK, RANK,
  PERCENT_RANK
 Used for what-if analysis
 hypothetical_rank.sql
Inverse Percentile functions

 Return the value corresponding to a certain
    percentile (opposite of CUME_DIST)
   PERCENTILE_CONT (continuous)
   PERCENTILE_DISC (discrete)
   PERCENTILE_CONT(0.5) is the same as MEDIAN
   inverse_p.sql
String Aggregation: LISTAGG, STRAGG

 Concatenated string of values for a particular group
    (e.g. employees working in a dept)
   Tom Kyte's STRAGG
   11gR2 has LISTAGG
   10g has COLLECT
   listagg.sql
Pivoting/Unpivoting
 Pivoting
   transposes rows to columns

   DECODE/CASE and GROUP BY used

 Unpivoting
   Converts columns to rows

   Join the base table with a one column serial number table



 11gR2 introduced PIVOT and UNPIVOT clauses to
  SELECT
 pivot.sql
Data Densification

 Data normally stored in sparse form (e.g. No rows
    if there is no sales for a particular period)
   Missing data needed for comparison (e.g. month-
    over-month comparison)
   Data Densification comes in handy
   LAG (col, INGORE NULLS), and PARTITION BY
    OUTER JOIN are used.
   http://hoopercharles.wordpress.com/2009/12/07
    /sql-filling-in-gaps-in-the-source-data/
When not to use analytics
 When a simple group by would do the job
 when_not_to_use_analytics.sq
Drawback of analytics
 Lot of sorting.
 Set
  PGA_AGGREGATE_TARGET/SORT_AREA_SIZE
  appropriately
 New versions reduce the number of sorts (same
  partition_by and order_by clauses on multiple
  analytic functions use single sort)
 http://asktom.oracle.com/pls/asktom/f?p=100:11:0::
  NO::P11_QUESTION_ID:1137250200346660664
 http://jonathanlewis.wordpress.com/2009/09/07/an
  alytic-agony/
Recap of Analytic Functions
 Analytic Functions:
   Were introduced in 8.1.6 (~1998)

   Are supported within PL/SQL only from 10g. Use "view" or
    "dynamic sql" older versions.
   Compute the 'aggregates' while preserving the 'details'

   Eliminate the need for self-joins or multiple passes on the same
    table
   Reduce the amount of data transferred between DB and client.

   Can be used only in SELECT and ORDER BY clauses. Use sub-
    queries if there is a need to filter.
   Are computed at the end - after join, where, group by, having
    clauses
Advanced Aggregation

 GROUP BY col1, col2
 GROUP BY ROLLUP(col1, col2)
 GROUP BY CUBE(col1, col2)
 GROUP BY GROUPING SETS ((col1, col2), col1)
ROLLUP

 GROUP BY ROLLUP(col1, col2)
 Generates subtotals automatically
 Generally used in hierarchical dimensions (region,
  state, city), (year, quarter, month, day)
 n + 1 different groupings where n is the number of
  expressions in the ROLLUP operator in the GROUP
  BY clause.
 Order of the columns in ROLLUP matter.
  ROLLUP(col1, col2), ROLLUP(col2, col1) produce
  different outputs
CUBE


 GROUP BY CUBE(col1, col2)
 Gives subtotals automatically for every possible combination
 Used in cross-tabular reports.
 Suitable when dimensions are independent of each other
 2n different groupings where n is the number of expressions
  in the CUBE operator in the GROUP BY clause.
 Have to be careful with higher values for n
 Order of the columns in CUBE doesn’t really matter.
  CUBE(col1, col2), CUBE(col2, col1) produce same results, but
  in a different order.
Grouping Sets

 GROUP BY GROUPING SETS (col1, (col1, col2))
 Explicitly lists the needed groupings
 GROUPING, GROUPING_ID, GROUP_ID functions
  help you differentiate one grouping from the other.
 Advanced aggregation functions more efficient than
  their UNION ALL equivalents (why?)
        Grouping   Equivalent GROUPING SETS
 advanced_agg.sql
       CUBE(a,b)     GROUPING SETS((a,b), (a), (b), ())
       ROLLUP(a,b)   GROUPING SETS((a,b), (a), ())
       ROLLUP(b,a)   GROUPING SETS((a,b), (b), ())
       ROLLUP(a)     GROUPING SETS((a), ())
Composite Columns

 Treat multiple columns as a single column
 Composite_columns.sql
Concatenated Groupings

 GROUP BY GROUPING SETs (a,b), GROUPING
  SETS (c,d)
 The above is same as GROUP BY GROUPING SETS
  ((a,c), (a,d), (b,c), (b,d))
References

 http://orafaq.com/node/55
 http://orafaq.com/node/56
 http://www.orafaq.com/node/1874
 http://www.morganslibrary.org/reference/a
  nalytic_functions.html
 http://morganslibrary.org/reference/rollup.
  html
 http://www.oracle.com/technology/oramag/
  oracle/05-mar/o25dba.html
 http://www.gennick.com/magic.html
References

 Chapter 12 of "Expert Database Architecture"
  by Tom Kyte
 Business-Savy SQL by Ganesh Variar, Oracle
  magazine, Mar/April 2002
 http://asktom.oracle.com/pls/asktom/asktom.searc
  h?p_string=rock+and+roll
 http://forums.oracle.com/forums/search.jspa?q=an
  alytic&objID=f75&dateRange=all&numResults=30&
  forumID=75&rankBy=10001&start=0
Q&A

 devel_oracle@
Predicate merging in views with analytics
 create view v select .. over(partition by ...) from t;
 select ... from v where col1 = 'A'
 In some cases predicates don't get merged.
 Reasons:
     http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QU
      ESTION_ID:12864646978683#30266389821111
     http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_
      QUESTION_ID:1137250200346660664
     http://forums.oracle.com/forums/thread.jspa?messageID=416
      9151&#4169151

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Introduction of sql server indexing
Introduction of sql server indexingIntroduction of sql server indexing
Introduction of sql server indexing
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
 
MySQL partitions tutorial
MySQL partitions tutorialMySQL partitions tutorial
MySQL partitions tutorial
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Stored procedure
Stored procedureStored procedure
Stored procedure
 
Views, Triggers, Functions, Stored Procedures, Indexing and Joins
Views, Triggers, Functions, Stored Procedures,  Indexing and JoinsViews, Triggers, Functions, Stored Procedures,  Indexing and Joins
Views, Triggers, Functions, Stored Procedures, Indexing and Joins
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by example
 
SQL Outer Joins for Fun and Profit
SQL Outer Joins for Fun and ProfitSQL Outer Joins for Fun and Profit
SQL Outer Joins for Fun and Profit
 
Write Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdfWrite Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdf
 
MongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and ProfilingMongoDB Aggregations Indexing and Profiling
MongoDB Aggregations Indexing and Profiling
 
Postgresql
PostgresqlPostgresql
Postgresql
 
Oracle sql high performance tuning
Oracle sql high performance tuningOracle sql high performance tuning
Oracle sql high performance tuning
 
Triggers and Stored Procedures
Triggers and Stored ProceduresTriggers and Stored Procedures
Triggers and Stored Procedures
 
An Introduction To Oracle Database
An Introduction To Oracle DatabaseAn Introduction To Oracle Database
An Introduction To Oracle Database
 
SQL subquery
SQL subquerySQL subquery
SQL subquery
 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuning
 
[2015-06-12] Oracle 성능 최적화 및 품질 고도화 1
[2015-06-12] Oracle 성능 최적화 및 품질 고도화 1[2015-06-12] Oracle 성능 최적화 및 품질 고도화 1
[2015-06-12] Oracle 성능 최적화 및 품질 고도화 1
 
Performance tuning in sql server
Performance tuning in sql serverPerformance tuning in sql server
Performance tuning in sql server
 
Sql Constraints
Sql ConstraintsSql Constraints
Sql Constraints
 
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and AdvisorsYour tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
 

Similar a Analytic & Windowing functions in oracle

Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
paulguerin
 
PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12
Andrew Dunstan
 
SQL select statement and functions
SQL select statement and functionsSQL select statement and functions
SQL select statement and functions
Vikas Gupta
 
Key functions in_oracle_sql
Key functions in_oracle_sqlKey functions in_oracle_sql
Key functions in_oracle_sql
pgolhar
 

Similar a Analytic & Windowing functions in oracle (20)

Oracle_Analytical_function.pdf
Oracle_Analytical_function.pdfOracle_Analytical_function.pdf
Oracle_Analytical_function.pdf
 
Advanced functions in PL SQL
Advanced functions in PL SQLAdvanced functions in PL SQL
Advanced functions in PL SQL
 
Oracle SQL Advanced
Oracle SQL AdvancedOracle SQL Advanced
Oracle SQL Advanced
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
 
Introduction to oracle functions
Introduction to oracle functionsIntroduction to oracle functions
Introduction to oracle functions
 
Advanced Sql Training
Advanced Sql TrainingAdvanced Sql Training
Advanced Sql Training
 
Introduction to Oracle Functions--(SQL)--Abhishek Sharma
Introduction to Oracle Functions--(SQL)--Abhishek SharmaIntroduction to Oracle Functions--(SQL)--Abhishek Sharma
Introduction to Oracle Functions--(SQL)--Abhishek Sharma
 
Simplifying SQL with CTE's and windowing functions
Simplifying SQL with CTE's and windowing functionsSimplifying SQL with CTE's and windowing functions
Simplifying SQL with CTE's and windowing functions
 
Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
 
5. Group Functions
5. Group Functions5. Group Functions
5. Group Functions
 
Sql interview questions
Sql interview questionsSql interview questions
Sql interview questions
 
PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12PostgreSQL 8.4 TriLUG 2009-11-12
PostgreSQL 8.4 TriLUG 2009-11-12
 
Oracle Advanced SQL and Analytic Functions
Oracle Advanced SQL and Analytic FunctionsOracle Advanced SQL and Analytic Functions
Oracle Advanced SQL and Analytic Functions
 
Adv.+SQL+PPT+final.pptx
Adv.+SQL+PPT+final.pptxAdv.+SQL+PPT+final.pptx
Adv.+SQL+PPT+final.pptx
 
Oracle: Functions
Oracle: FunctionsOracle: Functions
Oracle: Functions
 
Oracle: Functions
Oracle: FunctionsOracle: Functions
Oracle: Functions
 
SQL Server 2008 Performance Enhancements
SQL Server 2008 Performance EnhancementsSQL Server 2008 Performance Enhancements
SQL Server 2008 Performance Enhancements
 
SQL select statement and functions
SQL select statement and functionsSQL select statement and functions
SQL select statement and functions
 
Mandatory sql functions for beginners
Mandatory sql functions for beginnersMandatory sql functions for beginners
Mandatory sql functions for beginners
 
Key functions in_oracle_sql
Key functions in_oracle_sqlKey functions in_oracle_sql
Key functions in_oracle_sql
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Analytic & Windowing functions in oracle

  • 1. Analytic and Window Functions in Oracle Logan Palanisamy
  • 2. Agenda  Difference between aggregate and analytic functions  Introduction to various analytic functions  Functions that are both aggregate and analytic  Break  More examples  Enhanced Aggregation (CUBE, ROLLUP)
  • 3. Meeting Basics  Put your phones/pagers on vibrate/mute  Messenger: Change the status to offline or in-meeting  Remote attendees: Mute yourself (*6). Ask questions via Adobe Connect.
  • 4. Aggregates vs. Analytics  Aggregate functions  Rows are collapsed. One row per group  Non-Group-By columns not allowed in SELECT list.  Analytic functions  Rows are not collapsed  As many rows in the output as in the input  No restrictions on the columns in the SELECT list  Evaluated after joins, WHERE, GROUP BY, HAVING clauses  Nesting not allowed  Can appear only in the SELECT or ORDER BY clause  analytic_aggr_diff.sql
  • 5. Analytics vs. other methods  Show the dept, empno, sal and the sum of all salaries in their dept  Three possible ways  Using Joins  Using Scalar Sub-queries  Using Analytic Functions  analytics_vs_others.sql
  • 6. Anatomy of an analytic funcion  function (arg1, ..., argN) OVER ([partition_by_clause] [order_by_clause [windowing_clause]])  The OVER keyword  partition_by_clause: Optional. Not related to table/index partitions. Analogous to GROUP BY  order_by_clause: Mandatory for Ranking and Windowing functions. Optional or meaningless for others  windowing_clause: Optional. Should always be preceded by ORDER BY clause
  • 7. Types of analytical functions  Ranking functions  FIRST_VALUE/LAST_VALUE/NTH_VALUE  Windowing functions  Reporting functions  LAG/LEAD  FIRST/LAST
  • 8. Ranking Functions  ROW_NUMBER()  RANK() – Skips ranks after duplicate ranks  DENSE_RANK() – Doesn't skip rank after duplicate ranks  NTILE(n) – Sorts the rows into N equi-sized buckets  CUME_DIST() – % of rows with values lower or equal  PERCENT_RANK() - (rank of row -1)/(#rows – 1)  function OVER ([PARTITION BY <c1,c2..>] ORDER BY <c3, ..>)  PARTITION BY clause: Optional  ORDER BY clause: Mandatory  rank_dense_rank.sql
  • 9. FIRST_VALUE/LAST_VALUE/NTH_VALUE  Returns the first/last/nth value from an ordered set  FIRST_VALUE(expr, [IGNORE NULLS]) OVER ([partitonby_clause] orderby_clause)  IGNORE NULLS options helps you "carry forward". Often used in "Data Densification"  Operates on Default Window (unbounded preceding and current row) when a window is not explicitly specified.  NTH_VALUE introduced in 11gR2  flnth_value.sql
  • 10. Window functions  Used for computing cumulative/running totals (YTD, MTD, QTD), moving/centered averages  function(args) OVER([partition_by_clause] order_by_clause [windowing_clause])  ORDER BY clause: mandatory.  Windowing Clause: Optional. Defaults to: UNBOUNDED PRECEDING and CURRENT ROW  anchored or sliding windows  Two ways to specify windows: ROWS, RANGE  [ROW | RANGE ] BETWEEN <start_exp> AND <end_exp>  window.sql
  • 11. ROWS type windows  Physical offset. Number of rows before or after current row  Non deterministic results if rows are not sorted uniquely  Any number of columns in the ORDER BY clause  ORDER BY columns can be of any type  function(args) OVER ([partition_by_clause] order by c1, .., cN ROWS between <start_exp> and <end_exp>)  windows_rows.sql
  • 12. RANGE type Windows  Logical offset  non-unique rows treated as one logical row  Only one column allowed in ORDER BY clause  ORDER BY column should be numeric or date  function(args) OVER ([partition_by_clause] order by c1 RANGE between <start_exp> and <end_exp>)  windows_range.sql
  • 13. Reporting function  Computes the ratio of a value to the sum of a set of values  RATIO_TO_REPORT(arg) OVER ([PARTITION BY <c1, .., cN>]  PARTITION BY clause: Optional.  ratio_to_report.sql
  • 14. LAG/LEAD  Gives the ability to access other rows without self-join.  Allows you to treat cursor as an array  Useful for making inter-row calculations (year-over-year comparison, time between events)  LEAD (expr, <offset>, <default value>) [IGNORE NULLS] OVER ([partioning_clause] orderby_clause)  Physical offset. Can be fixed or varying. default offset is 1  default value: value returned if offset points to a non- existent row  IGNORE NULLS determines whether null values of are included or eliminated from the calculation.  lead_lag.sql
  • 15. FIRST/LAST  Very different from FIRST_VALUE/LAST_VALUE  Returns the results of aggregate/analytic function applied on column B on the first or last ranked rows sorted by column A  function (expr_with_colB) KEEP (DENSE_RANK FIRST/LAST ORDER BY colA) [OVER (<partitioning_clause)>)]  Slightly different syntax. Note the word KEEP  analytic clause is optional.  first_last.sql
  • 16. Above/Below average calculation  Find the list of employees whose salary is higher than the department average.  above_average.sql
  • 17. Top-N queries  Find the full details of "set of" employees with the top- N salaries  Find the two most recent hires in each department  List the names and employee count of departments with the highest employee count  top_n.sql
  • 18. Top-N% queries  List the top 5% of the customers by revenue  top_np.sql
  • 19. Multi Top-N queries  For each customer, find out  the maximum sale in the last 7 days  the date of that sale  the maximum sale in the last 30 days  the date of that sale  multi_top.sql
  • 20. De-Duping  Deleting duplicate records  dedup.txt
  • 21. Hypothetical Ranking  CUME_DIST, DENSE_RANK, RANK, PERCENT_RANK  Used for what-if analysis  hypothetical_rank.sql
  • 22. Inverse Percentile functions  Return the value corresponding to a certain percentile (opposite of CUME_DIST)  PERCENTILE_CONT (continuous)  PERCENTILE_DISC (discrete)  PERCENTILE_CONT(0.5) is the same as MEDIAN  inverse_p.sql
  • 23. String Aggregation: LISTAGG, STRAGG  Concatenated string of values for a particular group (e.g. employees working in a dept)  Tom Kyte's STRAGG  11gR2 has LISTAGG  10g has COLLECT  listagg.sql
  • 24. Pivoting/Unpivoting  Pivoting  transposes rows to columns  DECODE/CASE and GROUP BY used  Unpivoting  Converts columns to rows  Join the base table with a one column serial number table  11gR2 introduced PIVOT and UNPIVOT clauses to SELECT  pivot.sql
  • 25. Data Densification  Data normally stored in sparse form (e.g. No rows if there is no sales for a particular period)  Missing data needed for comparison (e.g. month- over-month comparison)  Data Densification comes in handy  LAG (col, INGORE NULLS), and PARTITION BY OUTER JOIN are used.  http://hoopercharles.wordpress.com/2009/12/07 /sql-filling-in-gaps-in-the-source-data/
  • 26. When not to use analytics  When a simple group by would do the job  when_not_to_use_analytics.sq
  • 27. Drawback of analytics  Lot of sorting.  Set PGA_AGGREGATE_TARGET/SORT_AREA_SIZE appropriately  New versions reduce the number of sorts (same partition_by and order_by clauses on multiple analytic functions use single sort)  http://asktom.oracle.com/pls/asktom/f?p=100:11:0:: NO::P11_QUESTION_ID:1137250200346660664  http://jonathanlewis.wordpress.com/2009/09/07/an alytic-agony/
  • 28. Recap of Analytic Functions  Analytic Functions:  Were introduced in 8.1.6 (~1998)  Are supported within PL/SQL only from 10g. Use "view" or "dynamic sql" older versions.  Compute the 'aggregates' while preserving the 'details'  Eliminate the need for self-joins or multiple passes on the same table  Reduce the amount of data transferred between DB and client.  Can be used only in SELECT and ORDER BY clauses. Use sub- queries if there is a need to filter.  Are computed at the end - after join, where, group by, having clauses
  • 29. Advanced Aggregation  GROUP BY col1, col2  GROUP BY ROLLUP(col1, col2)  GROUP BY CUBE(col1, col2)  GROUP BY GROUPING SETS ((col1, col2), col1)
  • 30. ROLLUP  GROUP BY ROLLUP(col1, col2)  Generates subtotals automatically  Generally used in hierarchical dimensions (region, state, city), (year, quarter, month, day)  n + 1 different groupings where n is the number of expressions in the ROLLUP operator in the GROUP BY clause.  Order of the columns in ROLLUP matter. ROLLUP(col1, col2), ROLLUP(col2, col1) produce different outputs
  • 31. CUBE  GROUP BY CUBE(col1, col2)  Gives subtotals automatically for every possible combination  Used in cross-tabular reports.  Suitable when dimensions are independent of each other  2n different groupings where n is the number of expressions in the CUBE operator in the GROUP BY clause.  Have to be careful with higher values for n  Order of the columns in CUBE doesn’t really matter. CUBE(col1, col2), CUBE(col2, col1) produce same results, but in a different order.
  • 32. Grouping Sets  GROUP BY GROUPING SETS (col1, (col1, col2))  Explicitly lists the needed groupings  GROUPING, GROUPING_ID, GROUP_ID functions help you differentiate one grouping from the other.  Advanced aggregation functions more efficient than their UNION ALL equivalents (why?) Grouping Equivalent GROUPING SETS  advanced_agg.sql CUBE(a,b) GROUPING SETS((a,b), (a), (b), ()) ROLLUP(a,b) GROUPING SETS((a,b), (a), ()) ROLLUP(b,a) GROUPING SETS((a,b), (b), ()) ROLLUP(a) GROUPING SETS((a), ())
  • 33. Composite Columns  Treat multiple columns as a single column  Composite_columns.sql
  • 34. Concatenated Groupings  GROUP BY GROUPING SETs (a,b), GROUPING SETS (c,d)  The above is same as GROUP BY GROUPING SETS ((a,c), (a,d), (b,c), (b,d))
  • 35. References  http://orafaq.com/node/55  http://orafaq.com/node/56  http://www.orafaq.com/node/1874  http://www.morganslibrary.org/reference/a nalytic_functions.html  http://morganslibrary.org/reference/rollup. html  http://www.oracle.com/technology/oramag/ oracle/05-mar/o25dba.html  http://www.gennick.com/magic.html
  • 36. References  Chapter 12 of "Expert Database Architecture" by Tom Kyte  Business-Savy SQL by Ganesh Variar, Oracle magazine, Mar/April 2002  http://asktom.oracle.com/pls/asktom/asktom.searc h?p_string=rock+and+roll  http://forums.oracle.com/forums/search.jspa?q=an alytic&objID=f75&dateRange=all&numResults=30& forumID=75&rankBy=10001&start=0
  • 38. Predicate merging in views with analytics  create view v select .. over(partition by ...) from t;  select ... from v where col1 = 'A'  In some cases predicates don't get merged.  Reasons:  http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QU ESTION_ID:12864646978683#30266389821111  http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_ QUESTION_ID:1137250200346660664  http://forums.oracle.com/forums/thread.jspa?messageID=416 9151&#4169151