DATA WAREHOUSING
Multi-Dimensional Data Modeling: Facts and Dimensions

• While an entity-relationship (E/R) modeling approach from relational
  database design could be used, the dimensional modeling approach to
  logical design is more often used for a data warehouse.

• End users cannot understand, remember, or navigate an E/R model
  (not even with a GUI).
  ◦ One reason is that an enterprise-level E/R model would be too complex
    to understand.

• Software cannot usefully query an E/R model.

• E/R modeling doesn't meet the purpose of a data warehouse (DW):
  intuitive, high-performance querying.

                         Employee_Dim
                         EmployeeKey
                         EmployeeID
                         ...

(Dimension Table)        (Fact Table)
Time_Dim                 Sales_Fact                Product_Dim
TimeKey                  TimeKey                   ProductKey
TheDate                  EmployeeKey               ProductID
...                      ProductKey                ...
                         CustomerKey
                         ShipperKey
                         $
                         ...

           Shipper_Dim                    Customer_Dim
           ShipperKey                     CustomerKey
           ShipperID                      CustomerID
           ...                            ...
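
The star layout in the diagram can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table and key names follow the diagram, while the Amount column (standing in for the "$" measure) and all sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim     (TimeKey     INTEGER PRIMARY KEY, TheDate    TEXT);
CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeID TEXT);
CREATE TABLE Product_Dim  (ProductKey  INTEGER PRIMARY KEY, ProductID  TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE Shipper_Dim  (ShipperKey  INTEGER PRIMARY KEY, ShipperID  TEXT);

-- Fact table: one foreign key per dimension plus the measure.
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER REFERENCES Time_Dim(TimeKey),
    EmployeeKey INTEGER REFERENCES Employee_Dim(EmployeeKey),
    ProductKey  INTEGER REFERENCES Product_Dim(ProductKey),
    CustomerKey INTEGER REFERENCES Customer_Dim(CustomerKey),
    ShipperKey  INTEGER REFERENCES Shipper_Dim(ShipperKey),
    Amount      REAL  -- assumed name for the "$" measure
);
""")

# Hypothetical sample rows for each dimension.
conn.executemany("INSERT INTO Time_Dim VALUES (?, ?)",
                 [(1, "2024-01-01"), (2, "2024-01-02")])
conn.executemany("INSERT INTO Product_Dim VALUES (?, ?)",
                 [(10, "P-10"), (20, "P-20")])
conn.execute("INSERT INTO Employee_Dim VALUES (1, 'E-1')")
conn.execute("INSERT INTO Customer_Dim VALUES (1, 'C-1')")
conn.execute("INSERT INTO Shipper_Dim VALUES (1, 'S-1')")

# Fact rows: (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Amount)
conn.executemany(
    "INSERT INTO Sales_Fact VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 10, 1, 1, 100.0),
     (1, 1, 20, 1, 1, 50.0),
     (2, 1, 10, 1, 1, 25.0)],
)

# Total sales per product: join the fact table to one dimension and sum.
sales_by_product = conn.execute("""
    SELECT p.ProductID, SUM(f.Amount)
    FROM Sales_Fact AS f
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.ProductID
    ORDER BY p.ProductID
""").fetchall()
print(sales_by_product)
```

Every query against this shape follows the same pattern: join the fact table to whichever dimensions the question mentions, then aggregate the measure.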

Several distinct dimensions, combined with facts, enable you to answer
business questions.

Dimension tables: Geographic, Product, Time
Fact table columns: Geographic, Product, Time (dimensions); Units, $ (measures, i.e. the facts)
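
As a rough illustration of how dimensions plus facts answer a business question, the sketch below sums the two measures (units and dollars) by any chosen dimension. The rows, the region and product names, and the helper `total_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (geographic, product, time, units, dollars)
facts = [
    ("North", "Widget", "2024-Q1", 10, 100.0),
    ("North", "Gadget", "2024-Q1", 5, 75.0),
    ("South", "Widget", "2024-Q1", 8, 80.0),
    ("South", "Widget", "2024-Q2", 2, 20.0),
]

def total_by(rows, dim_index):
    """Sum units and dollars for each value of the chosen dimension column."""
    totals = defaultdict(lambda: [0, 0.0])
    for row in rows:
        totals[row[dim_index]][0] += row[3]  # units
        totals[row[dim_index]][1] += row[4]  # dollars
    return {key: tuple(value) for key, value in totals.items()}

by_region = total_by(facts, 0)   # "How much did each region sell?"
by_product = total_by(facts, 1)  # "How much of each product was sold?"
print(by_region)
print(by_product)
```

The same fact rows serve both questions; only the dimension used for grouping changes.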

Dimensions
• Dimensions are normally textual, descriptive attributes of the business.
• Dimension tables contain relatively small amounts of relatively static data.
• Dimension tables are usually not normalized.
• Dimensions are independent of each other, not hierarchically related.
• Dimensional attributes (the non-key attributes) help to describe the
  dimensional value.
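
A small sketch of a non-normalized dimension table and its descriptive attributes, using sqlite3. The attribute columns (ProductName, Category, Brand) and the rows are assumptions chosen for illustration; the slides only name the key and ID columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Non-normalized dimension: category and brand sit in the same row as the
-- product instead of being split out into their own tables.
CREATE TABLE Product_Dim (
    ProductKey  INTEGER PRIMARY KEY,
    ProductID   TEXT,
    ProductName TEXT,   -- descriptive (non-key) attributes, assumed names
    Category    TEXT,
    Brand       TEXT
);
CREATE TABLE Sales_Fact (ProductKey INTEGER, Amount REAL);
""")

conn.executemany("INSERT INTO Product_Dim VALUES (?, ?, ?, ?, ?)", [
    (1, "P-1", "Espresso", "Coffee", "Acme"),
    (2, "P-2", "Latte", "Coffee", "Acme"),
    (3, "P-3", "Green Tea", "Tea", "Zen"),
])
conn.executemany("INSERT INTO Sales_Fact VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 5.0), (1, 15.0)])

# A descriptive attribute is used to label and group the numeric facts.
sales_by_category = conn.execute("""
    SELECT p.Category, SUM(f.Amount)
    FROM Sales_Fact AS f
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.Category
    ORDER BY p.Category
""").fetchall()
print(sales_by_category)
```

Because the attributes live in one wide row, grouping by category needs a single join rather than a chain of joins through normalized lookup tables.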

Facts
• Facts are (usually numerical) measures of the business.
• The fact table is the largest table in the star schema and is composed
  of large volumes of data.
• The fact table is (often) normalized.
• The fact table has a composite primary key made up of foreign keys
  (PK = the combination of the FKi).
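
A minimal sketch of the composite key idea, assuming a cut-down Sales_Fact with only three dimension keys and an assumed Amount measure. The dimension tables and the foreign key constraints themselves are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sales_Fact (
        TimeKey     INTEGER NOT NULL,
        ProductKey  INTEGER NOT NULL,
        CustomerKey INTEGER NOT NULL,
        Amount      REAL,
        -- The primary key is the combination of the dimension keys.
        PRIMARY KEY (TimeKey, ProductKey, CustomerKey)
    )
""")

conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 100, 25.0)")
# Differs in CustomerKey only, so it is a different key combination: accepted.
conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 101, 30.0)")

try:
    # Same (TimeKey, ProductKey, CustomerKey) as the first row: rejected.
    conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 100, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sales_Fact").fetchone()[0]
print(duplicate_rejected, row_count)
```

No single key identifies a fact row; only the full combination of dimension keys does.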

• A fact table usually contains one or more numerical facts (the measures)
  that occur for the combination of keys that define each record.
• A fact table contains either detail-level facts or facts that have been
  aggregated (summary tables).
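
One way to picture the difference between detail-level facts and a summary table is to derive the second from the first. The sketch below assumes a TheMonth column on the time dimension and a summary table named Sales_Monthly_Summary; neither name comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim (TimeKey INTEGER PRIMARY KEY, TheDate TEXT, TheMonth TEXT);
CREATE TABLE Sales_Fact (TimeKey INTEGER, ProductKey INTEGER, Amount REAL);

INSERT INTO Time_Dim VALUES (1, '2024-01-15', '2024-01'),
                            (2, '2024-01-20', '2024-01'),
                            (3, '2024-02-03', '2024-02');

-- Detail-level facts (hypothetical rows).
INSERT INTO Sales_Fact VALUES (1, 10, 100.0), (2, 10, 50.0),
                              (3, 10, 70.0),  (3, 20, 30.0);

-- Summary table: the same measure, pre-aggregated to month and product.
CREATE TABLE Sales_Monthly_Summary AS
    SELECT t.TheMonth AS TheMonth, f.ProductKey AS ProductKey,
           SUM(f.Amount) AS Amount
    FROM Sales_Fact AS f
    JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
    GROUP BY t.TheMonth, f.ProductKey;
""")

summary = conn.execute("""
    SELECT TheMonth, ProductKey, Amount
    FROM Sales_Monthly_Summary
    ORDER BY TheMonth, ProductKey
""").fetchall()
print(summary)
```

The summary table holds fewer rows and answers monthly questions without rescanning the detail rows, at the cost of losing the day-level detail.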

• Facts are:
  ◦ additive: they can be summed across all dimensions
  ◦ semi-additive
  ◦ non-additive
• Non-additive facts cannot be added at all.
  ◦ An example of this is averages.
• Semi-additive facts can be aggregated along some of the dimensions and
  not along others:
  ◦ current_Balance is a semi-additive fact. It makes sense to add balances
    up across all accounts (what's the total current balance for all
    accounts in the bank?), but it does not make sense to add them up
    through time (adding up all current balances for a given account for
    each day of the month does not give us any useful information).
• The most useful measures are numeric and additive.
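
The balance and average examples can be worked through with a few hypothetical numbers. The account names, store names, and amounts below are invented purely to show the arithmetic.

```python
# Hypothetical daily balance snapshots: (day, account, current_balance)
balances = [
    ("2024-01-01", "A", 100.0),
    ("2024-01-01", "B", 200.0),
    ("2024-01-02", "A", 150.0),
    ("2024-01-02", "B", 200.0),
]

# Semi-additive, across accounts: the bank-wide total on one day is meaningful.
total_on_jan_2 = sum(b for day, _, b in balances if day == "2024-01-02")

# Semi-additive, across time: this sum is a number account A never held.
sum_over_time_a = sum(b for _, acct, b in balances if acct == "A")

# Along the time dimension an average (or the latest value) is used instead.
a_balances = [b for _, acct, b in balances if acct == "A"]
average_a = sum(a_balances) / len(a_balances)

# Non-additive: averages cannot be combined by adding or averaging them.
sales = {"X": [10.0, 20.0], "Y": [30.0, 40.0, 50.0, 60.0]}  # store -> sale amounts
store_averages = {s: sum(v) / len(v) for s, v in sales.items()}
average_of_averages = sum(store_averages.values()) / len(store_averages)
all_sales = [amount for amounts in sales.values() for amount in amounts]
true_average = sum(all_sales) / len(all_sales)

print(total_on_jan_2, sum_over_time_a, average_a)
print(average_of_averages, true_average)
```

The average of the store averages (30.0) differs from the true average sale (35.0), which is why a design that needs averages usually stores the additive parts (sum and count) and derives the average at query time.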

• Granularity (grain): the atomic level of data of the business process.
• It defines the highest level of detail that is supported in a data
  warehouse.
• A fact table usually contains facts with the same level of aggregation.
• A proper dimensional design allows only facts of a uniform grain (the
  same dimensionality) to coexist in a single fact table.
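
A short, hypothetical illustration of why mixed grains should not share a fact table: if a monthly total is stored next to the daily rows it summarizes, a plain sum counts the same sales twice. The rows and the `is_daily` helper are assumptions for the example.

```python
# Hypothetical fact rows: (period, product, amount).
# The last row is a monthly total stored alongside the daily rows.
mixed_rows = [
    ("2024-01-01", "Widget", 10.0),
    ("2024-01-02", "Widget", 15.0),
    ("2024-01", "Widget", 25.0),  # month-level row: a different grain
]

# Summing everything double counts January.
naive_total = sum(amount for _, _, amount in mixed_rows)

def is_daily(period):
    """True when the period looks like YYYY-MM-DD (the daily grain assumed here)."""
    return len(period) == 10

# Restricting to one grain gives the real total.
daily_total = sum(amount for period, _, amount in mixed_rows if is_daily(period))

print(naive_total, daily_total)
```

With a uniform grain no such filter is needed, because every row in the table means the same thing.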

• Some perfectly good fact tables represent measurements that have no
  facts! This kind of measurement is often called an event. The classic
  example of such a factless fact table is a record representing a student
  attending a class on a specific day. The dimensions are Day, Student,
  Professor, Course, and Location, but there are no obvious numeric facts.
  The tuition paid and grade received are good facts, but not at the grain
  of the daily attendance.
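
The attendance example could look like the sketch below. The table name Attendance_Fact, the key column names, and the rows are assumptions; the point is that the table holds only dimension keys and questions are answered by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Attendance_Fact (
        DayKey       INTEGER,
        StudentKey   INTEGER,
        ProfessorKey INTEGER,
        CourseKey    INTEGER,
        LocationKey  INTEGER
        -- no measure columns: the row itself records the event
    )
""")

# Hypothetical attendance events.
conn.executemany("INSERT INTO Attendance_Fact VALUES (?, ?, ?, ?, ?)", [
    (1, 101, 7, 500, 3),
    (1, 102, 7, 500, 3),
    (1, 101, 8, 501, 4),
    (2, 101, 7, 500, 3),
])

# "How many students attended each course on each day?" is a row count.
attendance = conn.execute("""
    SELECT DayKey, CourseKey, COUNT(*)
    FROM Attendance_Fact
    GROUP BY DayKey, CourseKey
    ORDER BY DayKey, CourseKey
""").fetchall()
print(attendance)
```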

• Dimensions without attributes (such as a transaction number or order
  number), known in Kimball's terminology as degenerate dimensions:
  ◦ Put the value into the fact table, even though it is not an additive
    fact.
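
A minimal sketch of an order number carried directly in the fact table, assuming a hypothetical Order_Line_Fact table with one row per order line. There is no separate order dimension table to join to; the number is still useful for grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Order_Line_Fact (
        OrderNumber TEXT,     -- dimension without attributes, kept in the fact table
        ProductKey  INTEGER,
        Quantity    INTEGER,
        Amount      REAL
    )
""")

# Hypothetical order lines.
conn.executemany("INSERT INTO Order_Line_Fact VALUES (?, ?, ?, ?)", [
    ("SO-1001", 10, 2, 20.0),
    ("SO-1001", 20, 1, 15.0),
    ("SO-1002", 10, 5, 50.0),
])

# The order number groups the lines that belong to one order.
order_totals = conn.execute("""
    SELECT OrderNumber, COUNT(*), SUM(Amount)
    FROM Order_Line_Fact
    GROUP BY OrderNumber
    ORDER BY OrderNumber
""").fetchall()
print(order_totals)
```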

The fact table provides statistics for sales broken down by the product,
time, employee, shipper, and customer dimensions.

                         Employee_Dim
                         EmployeeKey
                         EmployeeID
                         ...

Time_Dim                 Sales_Fact                            Product_Dim
TimeKey                  TimeKey      \                        ProductKey
TheDate                  EmployeeKey   |  dimensional keys     ProductID
...                      ProductKey    |  (multipart key)      ...
                         CustomerKey   |
                         ShipperKey   /
                         $               measures
                         ...

           Shipper_Dim                    Customer_Dim
           ShipperKey                     CustomerKey
           ShipperID                      CustomerID
           ...                            ...

1. Choose the data mart for the small group of end users we deal with.
   • Choose a business process to model, e.g., orders or invoices.
2. Determine the fact table granularity (the smallest defined level of
   data in the table).
3. Select the fact table dimensions.
   • Choose the dimensions that will apply to each fact table record.
   • Add dimensions for "everything you know" about this grain.
4. Determine the facts for the table. In most cases, the granularity is at
   the transaction level, so the fact is the amount.
   • Choose the measures that will populate each fact table record.
   • Add numeric measured facts true to the grain.
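
The four steps can be captured as a small design record before any table is built. The `FactTableDesign` class, its field names, and the example values are hypothetical; this is one possible way to write the decisions down, not a prescribed tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FactTableDesign:
    business_process: str                                 # step 1
    grain: str                                            # step 2
    dimensions: List[str] = field(default_factory=list)   # step 3
    facts: List[str] = field(default_factory=list)        # step 4

    def column_names(self) -> List[str]:
        """Fact table columns: one key per dimension, then the measures."""
        return [f"{dim}Key" for dim in self.dimensions] + self.facts

design = FactTableDesign(
    business_process="orders",
    grain="one row per order line",
    dimensions=["Time", "Employee", "Product", "Customer", "Shipper"],
    facts=["Amount"],
)
columns = design.column_names()
print(columns)
```

Writing the grain down as an explicit sentence makes it easier to check, in steps 3 and 4, that every dimension and every measure is true to that grain.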

• Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete
  Guide to Dimensional Modeling. Second Edition.
