SlideShare una empresa de Scribd logo
1 de 13
March 2010 ACS-4904 Ron McFadyen 1
Aggregate management
References:
Lawrence Corr
• Aggregate improvement
www.intelligententerprise.com/011004/415warehouse1_1.jhtml
• Lost, shrunken, and collapsed
www.intelligententerprise.com/020101/501warehouse1_1.jhtml
Ralph Kimball
• Aggregate navigation with (almost) no metadata
www.dbmsmag.com/9608d54.html
March 2010 ACS-4904 Ron McFadyen 2
Aggregate Navigation
From “20 criteria for dimensional friendly systems”
#4. Open Aggregate Navigation.
The system uses physically stored aggregates as a way to
enhance performance of common queries. These
aggregates, like indexes, are chosen silently by the
database if they are physically present. End users and
application developers do not need to know what
aggregates are available at any point in time, and
applications are not required to explicitly code the name
of an aggregate. All query processes accessing the data,
even those from different application vendors, realize the
full benefit of aggregate navigation.
March 2010 ACS-4904 Ron McFadyen 3
Aggregation
• Created for performance reasons
• Using one base star schema, many aggregates can be defined
• Some dimensions could be lost, some shrunken or collapsed –
see Corr’s articles
Customer Date
Product
Store
Transaction
Price
Cost
Tax
Qty
Consider the
Transaction-level schema:
March 2010 ACS-4904 Ron McFadyen 4
Aggregation
Customer Date
Product
Store
Transaction
Price
Cost
Tax
Qty
Month
Product
Store
Price
Cost
Tax
Qty
Transaction-level schema Aggregate schema with:
•lost dimensions (transaction, customer)
•shrunken dimension (month)
•metrics are accumulated over the
transactions, customers, month
March 2010 ACS-4904 Ron McFadyen 5
Aggregation
• Suppose you must create the schema:
• What are the steps to do so?
– Product and Store already exist, but the others must be
created
Month
Product
Store
Price
Cost
Tax
Qty
March 2010 ACS-4904 Ron McFadyen 6
Aggregation
Design Goal 1
• Aggregates must be stored in their own fact tables; each
distinct aggregation level must occupy its own unique fact
table
March 2010 ACS-4904 Ron McFadyen 7
Aggregation
Design Goal 2
• The dimension tables attached to the aggregate fact tables
must be shrunken versions of the dimension tables
associated with the base fact table
Product
ProductSK
SKU
Description
Brand
Category
Department
Category
SK?
Category
Department
Category is a
shrunken version
of Product
Attributes names
from Product
What values does
the PK of Category
assume?
What name should
the PK of Category
have?
March 2010 ACS-4904 Ron McFadyen 8
Aggregation
Design Goal 3
• The base atomic fact table and all of its related aggregate
fact tables must be associated together as a “family of
schemas” so that the aggregate navigator knows which
tables are related to one another.
– Kimball’s paper presents an algorithm/technique for
mapping an SQL statement from a base schema to an
aggregate schema
March 2010 ACS-4904 Ron McFadyen 9
Aggregation
Design Goal 4
• Force all SQL created by any end-user data access tool or
application to refer exclusively to the base fact table and
its associated full-size dimension tables.
March 2010 ACS-4904 Ron McFadyen 10
Kimball’s Aggregate Navigation Algorithm
1. For any given SQL statement presented to the DBMS, find the smallest
fact table that has not yet been examined in the family of schemas
referenced by the query. "Smallest" in this case means the least
number of rows. Finding the smallest unexamined schema is a simple
lookup in the aggregate navigator metadata table. Choose the smallest
schema and proceed to step 2.
2. Compare the table fields in the SQL statement to the table fields in the
particular Fact and Dimension tables being examined.
This is a series of lookups in the DBMS system catalog.
If all of the fields in the SQL statement can be found in the Fact
and Dimension tables being examined, alter the original
SQL by simply substituting destination table names for
original table names. No field names need to change.
If any field in the SQL statement cannot be found in the current
Fact and Dimension tables, then go back to step 1 and find
the next larger Fact table.
3. Run the altered SQL. It is guaranteed to return the correct answer
because all of the fields in the SQL statement are present in the
chosen schema.
March 2010 ACS-4904 Ron McFadyen 11
Aggregate Navigation
www.intelligententerprise.com/000929/feat1.jhtml
“Other data mart capabilities worth mentioning are
aggregation and aggregate navigation. The nature of
multidimensional analysis dictates the extensive use of
aggregation.
…
Both UDB and O8i provide a database-resident means for
navigating user queries to the optimum aggregated tables”
March 2010 ACS-4904 Ron McFadyen 12
Materialized Views
An MV is a view where the result set is pre-computed and
stored in the database
Periodically an MV must be re-computed to account for updates
to its source tables
MVs represent a performance enhancement to a DBMS as the
result set is immediately available when the view is referenced
in a SQL statement
MVs are used in some cases to provide summary or aggregates
for data warehousing transparently to the end-user
Query optimization is a DBMS feature that may re-write an
SQL statement supplied by a user.
March 2010 ACS-4904 Ron McFadyen 13
Materialized Views
Oracle example:
CREATE MATERIALIZED VIEW hr.mview_employees AS
SELECT employees.employee_id, employees.email
FROM employees
UNION ALL
SELECT new_employees.employee_id, new_employees.email
FROM new_employees;
SQL Server has “indexed views” … see
http://www.databasejournal.com/features/mssql/article.php/2119721

Más contenido relacionado

La actualidad más candente (20)

Lecture 04 normalization
Lecture 04 normalization Lecture 04 normalization
Lecture 04 normalization
 
Balanced Tree (AVL Tree & Red-Black Tree)
Balanced Tree (AVL Tree & Red-Black Tree)Balanced Tree (AVL Tree & Red-Black Tree)
Balanced Tree (AVL Tree & Red-Black Tree)
 
Data structures - unit 1
Data structures - unit 1Data structures - unit 1
Data structures - unit 1
 
Heaps
HeapsHeaps
Heaps
 
Arrays
ArraysArrays
Arrays
 
SQL Views
SQL ViewsSQL Views
SQL Views
 
Graph representation
Graph representationGraph representation
Graph representation
 
SQL DDL
SQL DDLSQL DDL
SQL DDL
 
SQL Basics
SQL BasicsSQL Basics
SQL Basics
 
Charts and pivot tables
Charts and pivot tablesCharts and pivot tables
Charts and pivot tables
 
Normalization in DBMS
Normalization in DBMSNormalization in DBMS
Normalization in DBMS
 
Sorting Algorithms
Sorting AlgorithmsSorting Algorithms
Sorting Algorithms
 
B+ Tree
B+ TreeB+ Tree
B+ Tree
 
introdution to SQL and SQL functions
introdution to SQL and SQL functionsintrodution to SQL and SQL functions
introdution to SQL and SQL functions
 
Normal forms
Normal formsNormal forms
Normal forms
 
TYPES DATA STRUCTURES( LINEAR AND NON LINEAR)....
TYPES DATA STRUCTURES( LINEAR AND NON LINEAR)....TYPES DATA STRUCTURES( LINEAR AND NON LINEAR)....
TYPES DATA STRUCTURES( LINEAR AND NON LINEAR)....
 
Algorithms Lecture 4: Sorting Algorithms I
Algorithms Lecture 4: Sorting Algorithms IAlgorithms Lecture 4: Sorting Algorithms I
Algorithms Lecture 4: Sorting Algorithms I
 
Normalization in databases
Normalization in databasesNormalization in databases
Normalization in databases
 
database Normalization
database Normalizationdatabase Normalization
database Normalization
 
Stack using Array
Stack using ArrayStack using Array
Stack using Array
 

Destacado

Destacado (6)

Difference between association, aggregation and composition
Difference between association, aggregation and compositionDifference between association, aggregation and composition
Difference between association, aggregation and composition
 
Aggregate fact tables
Aggregate fact tablesAggregate fact tables
Aggregate fact tables
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
Aggregate - Concrete Technology
Aggregate - Concrete TechnologyAggregate - Concrete Technology
Aggregate - Concrete Technology
 
Aggregates
 Aggregates Aggregates
Aggregates
 
Aggregates
AggregatesAggregates
Aggregates
 

Similar a Aggregation

Database performance tuning and query optimization
Database performance tuning and query optimizationDatabase performance tuning and query optimization
Database performance tuning and query optimizationUsman Tariq
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework managermaxonlinetr
 
Process management seminar
Process management seminarProcess management seminar
Process management seminarapurva_naik
 
Solr and ElasticSearch demo and speaker feb 2014
Solr  and ElasticSearch demo and speaker feb 2014Solr  and ElasticSearch demo and speaker feb 2014
Solr and ElasticSearch demo and speaker feb 2014nkabra
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...Dave Stokes
 
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...Karthik K Iyengar
 
Best Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle OptimizerBest Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle OptimizerEdgar Alejandro Villegas
 
Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5BiBoard.Org
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersLucidworks
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cRonald Francisco Vargas Quesada
 
Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...IRJET Journal
 
BigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLMárton Kodok
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfAlexadiaz52
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfAlexadiaz52
 
A Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdfA Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdfAllison Koehn
 
CaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-XcelsiusCaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-XcelsiusMohammed Imran Alam
 
Data warehousing features in oracle
Data warehousing features in oracleData warehousing features in oracle
Data warehousing features in oracleJinal Shah
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSijcsit
 

Similar a Aggregation (20)

Database performance tuning and query optimization
Database performance tuning and query optimizationDatabase performance tuning and query optimization
Database performance tuning and query optimization
 
F04302053057
F04302053057F04302053057
F04302053057
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework manager
 
Process management seminar
Process management seminarProcess management seminar
Process management seminar
 
Solr and ElasticSearch demo and speaker feb 2014
Solr  and ElasticSearch demo and speaker feb 2014Solr  and ElasticSearch demo and speaker feb 2014
Solr and ElasticSearch demo and speaker feb 2014
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
 
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
 
dd presentation.pdf
dd presentation.pdfdd presentation.pdf
dd presentation.pdf
 
Best Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle OptimizerBest Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle Optimizer
 
Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...
 
BigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery ML
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdf
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdf
 
A Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdfA Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdf
 
CaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-XcelsiusCaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-Xcelsius
 
Data warehousing features in oracle
Data warehousing features in oracleData warehousing features in oracle
Data warehousing features in oracle
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
 

Aggregation

  • 1. March 2010 ACS-4904 Ron McFadyen 1 Aggregate management References: Lawrence Corr • Aggregate improvement www.intelligententerprise.com/011004/415warehouse1_1.jhtml • Lost, shrunken, and collapsed www.intelligententerprise.com/020101/501warehouse1_1.jhtml Ralph Kimball • Aggregate navigation with (almost) no metadata www.dbmsmag.com/9608d54.html
  • 2. March 2010 ACS-4904 Ron McFadyen 2 Aggregate Navigation From “20 criteria for dimensional friendly systems” #4. Open Aggregate Navigation. The system uses physically stored aggregates as a way to enhance performance of common queries. These aggregates, like indexes, are chosen silently by the database if they are physically present. End users and application developers do not need to know what aggregates are available at any point in time, and applications are not required to explicitly code the name of an aggregate. All query processes accessing the data, even those from different application vendors, realize the full benefit of aggregate navigation.
  • 3. March 2010 ACS-4904 Ron McFadyen 3 Aggregation • Created for performance reasons • Using one base star schema, many aggregates can be defined • Some dimensions could be lost, some shrunken or collapsed – see Corr’s articles Customer Date Product Store Transaction Price Cost Tax Qty Consider the Transaction-level schema:
  • 4. March 2010 ACS-4904 Ron McFadyen 4 Aggregation Customer Date Product Store Transaction Price Cost Tax Qty Month Product Store Price Cost Tax Qty Transaction-level schema Aggregate schema with: •lost dimensions (transaction, customer) •shrunken dimension (month) •metrics are accumulated over the transactions, customers, month
  • 5. March 2010 ACS-4904 Ron McFadyen 5 Aggregation • Suppose you must create the schema: • What are the steps to do so? – Product and Store already exist, but the others must be created Month Product Store Price Cost Tax Qty
  • 6. March 2010 ACS-4904 Ron McFadyen 6 Aggregation Design Goal 1 • Aggregates must be stored in their own fact tables; each distinct aggregation level must occupy its own unique fact table
  • 7. March 2010 ACS-4904 Ron McFadyen 7 Aggregation Design Goal 2 • The dimension tables attached to the aggregate fact tables must be shrunken versions of the dimension tables associated with the base fact table Product ProductSK SKU Description Brand Category Department Category SK? Category Department Category is a shrunken version of Product Attributes names from Product What values does the PK of Category assume? What name should the PK of Category have?
  • 8. March 2010 ACS-4904 Ron McFadyen 8 Aggregation Design Goal 3 • The base atomic fact table and all of its related aggregate fact tables must be associated together as a “family of schemas” so that the aggregate navigator knows which tables are related to one another. – Kimball’s paper presents an algorithm/technique for mapping an SQL statement from a base schema to an aggregate schema
  • 9. March 2010 ACS-4904 Ron McFadyen 9 Aggregation Design Goal 4 • Force all SQL created by any end-user data access tool or application to refer exclusively to the base fact table and its associated full-size dimension tables.
  • 10. March 2010 ACS-4904 Ron McFadyen 10 Kimball’s Aggregate Navigation Algorithm 1. For any given SQL statement presented to the DBMS, find the smallest fact table that has not yet been examined in the family of schemas referenced by the query. "Smallest" in this case means the least number of rows. Finding the smallest unexamined schema is a simple lookup in the aggregate navigator metadata table. Choose the smallest schema and proceed to step 2. 2. Compare the table fields in the SQL statement to the table fields in the particular Fact and Dimension tables being examined. This is a series of lookups in the DBMS system catalog. If all of the fields in the SQL statement can be found in the Fact and Dimension tables being examined, alter the original SQL by simply substituting destination table names for original table names. No field names need to change. If any field in the SQL statement cannot be found in the current Fact and Dimension tables, then go back to step 1 and find the next larger Fact table. 3. Run the altered SQL. It is guaranteed to return the correct answer because all of the fields in the SQL statement are present in the chosen schema.
  • 11. March 2010 ACS-4904 Ron McFadyen 11 Aggregate Navigation www.intelligententerprise.com/000929/feat1.jhtml “Other data mart capabilities worth mentioning are aggregation and aggregate navigation. The nature of multidimensional analysis dictates the extensive use of aggregation. … Both UDB and O8i provide a database-resident means for navigating user queries to the optimum aggregated tables”
  • 12. March 2010 ACS-4904 Ron McFadyen 12 Materialized Views An MV is a view where the result set is pre-computed and stored in the database Periodically an MV must be re-computed to account for updates to its source tables MVs represent a performance enhancement to a DBMS as the result set is immediately available when the view is referenced in a SQL statement MVs are used in some cases to provide summary or aggregates for data warehousing transparently to the end-user Query optimization is a DBMS feature that may re-write an SQL statement supplied by a user.
  • 13. March 2010 ACS-4904 Ron McFadyen 13 Materialized Views Oracle example: CREATE MATERIALIZED VIEW hr.mview_employees AS SELECT employees.employee_id, employees.email FROM employees UNION ALL SELECT new_employees.employee_id, new_employees.email FROM new_employees; SQL Server has “indexed views” … see http://www.databasejournal.com/features/mssql/article.php/2119721