Data Warehousing: Data
Models and OLAP operations
                By
          Kishore Jaladi
    kishorejaladi@yahoo.com
Topics Covered
1. Understanding the term “Data Warehousing”

2. Three-tier Decision Support Systems

3. Approaches to OLAP servers

4. Multi-dimensional data model

5. ROLAP

6. MOLAP


7. HOLAP

8. Which to choose: Compare and Contrast

9. Conclusion
Understanding the term Data Warehousing
• Data Warehouse:
  The term Data Warehouse was coined by Bill Inmon in 1990. He defined
  it as follows: "A warehouse is a subject-oriented, integrated,
  time-variant and non-volatile collection of data in support of
  management's decision making process". He defined the terms in
  the sentence as follows:
• Subject Oriented:
  Data that gives information about a particular subject instead of
  about a company's ongoing operations.
• Integrated:
  Data that is gathered into the data warehouse from a variety of
  sources and merged into a coherent whole.
• Time-variant:
  All data in the data warehouse is identified with a particular time
  period.
• Non-volatile:
  Data is stable in a data warehouse. More data is added but data is
  never removed. This enables management to gain a consistent
  picture of the business.
Data Warehouse Architecture
Other important terminology
• Enterprise Data warehouse
  Collects all information about subjects
  (customers, products, sales, assets, personnel) that span the entire
  organization

• Data Mart
  Departmental subsets that focus on selected subjects

• Decision Support System (DSS)
  Information technology to help the knowledge worker (executive,
  manager, analyst) make faster & better decisions

• Online Analytical Processing (OLAP)
   an element of decision support systems (DSS)
Three-Tier Decision Support Systems
• Warehouse database server
  – Almost always a relational DBMS, rarely flat files
• OLAP servers
  – Relational OLAP (ROLAP): extended relational DBMS
    that maps operations on multidimensional data to
    standard relational operators
  – Multidimensional OLAP (MOLAP): special-purpose
    server that directly implements multidimensional data
    and operations
• Clients
  –   Query and reporting tools
  –   Analysis tools
  –   Data mining tools
The Complete Decision Support System

Information Sources: semistructured sources, operational DBs
   -> extract, transform, load, refresh, etc.
Data Warehouse Server (Tier 1): data warehouse, data marts
   -> serve
OLAP Servers (Tier 2): e.g., MOLAP, e.g., ROLAP
   -> serve
Clients (Tier 3): OLAP, query/reporting, data mining
Approaches to OLAP Servers
Three possibilities for OLAP servers
(1) Relational OLAP (ROLAP)
    – Relational and specialized relational DBMS to store
      and manage warehouse data
    – OLAP middleware to support missing pieces
(2) Multidimensional OLAP (MOLAP)
    – Array-based storage structures
    – Direct access to array data structures
(3) Hybrid OLAP (HOLAP)
   –   Storing detailed data in RDBMS
   –   Storing aggregated data in MDBMS
   –   User access via MOLAP tools
The Multi-Dimensional Data Model
         “Sales by product line over the past six months”
         “Sales by store between 1990 and 1995”
Fact table for measures:
   Key columns joining fact table to dimension tables: Prod Code, Time Code, Store Code
   Numerical measures: Sales, Qty
Dimension tables: Store Info, Product Info, Time Info, ...
ROLAP: Dimensional Modeling Using
Relational DBMS
• Special schema design: star, snowflake

• Special indexes: bitmap, multi-table join

• Proven technology (relational model, DBMS) that tends to
  outperform specialized MDDBs, especially on large
  data sets

• Products
   – IBM DB2, Oracle, Sybase IQ, RedBrick, Informix
Star Schema (in RDBMS)
Star Schema Example
The “Classic” Star Schema
Store Dimension     Fact Table      Time Dimension    Product Dimension
STORE KEY           STORE KEY       PERIOD KEY        PRODUCT KEY
Store Description   PRODUCT KEY     Period Desc       Product Desc.
City                PERIOD KEY      Year              Brand
State               Dollars         Quarter           Color
District ID         Units           Month             Size
District Desc.      Price           Day               Manufacturer
Region_ID                           Current Flag      Level
Region Desc.                        Resolution
Regional Mgr.                       Sequence
Level

• A single fact table, with detail and summary data
• Fact table primary key has only one key column per dimension
• Each key is generated
• Each dimension is a single table, highly de-normalized




Benefits: Easy to understand, easy to define hierarchies, reduces # of physical joins, low
maintenance, very simple metadata
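A minimal sketch of a star-schema query, using Python's built-in sqlite3 module. The table names (store_dim, period_dim, sales_fact), the trimmed column lists and the sample rows are assumptions made for illustration; they are not taken from the slides.

```python
import sqlite3

# Hypothetical, heavily trimmed star schema: one fact table plus two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_dim  (store_key INTEGER PRIMARY KEY, city TEXT, region_desc TEXT);
CREATE TABLE period_dim (period_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE sales_fact (
    store_key  INTEGER REFERENCES store_dim(store_key),
    period_key INTEGER REFERENCES period_dim(period_key),
    dollars    REAL,
    units      INTEGER
);
INSERT INTO store_dim  VALUES (1, 'Austin', 'South'), (2, 'Boston', 'East');
INSERT INTO period_dim VALUES (10, 1995, 1), (11, 1995, 2);
INSERT INTO sales_fact VALUES (1, 10, 100.0, 5), (1, 11, 150.0, 7), (2, 10, 80.0, 4);
""")

# One join per dimension: the de-normalized store row already carries the region,
# so no extra join is needed to group by region.
star_rows = conn.execute("""
    SELECT s.region_desc, p.year, SUM(f.dollars)
    FROM sales_fact AS f
    JOIN store_dim  AS s ON s.store_key  = f.store_key
    JOIN period_dim AS p ON p.period_key = f.period_key
    GROUP BY s.region_desc, p.year
    ORDER BY s.region_desc
""").fetchall()
print(star_rows)
```

Because each dimension is a single table, the hierarchy (store, district, region) is reachable with one join; this is the "reduces # of physical joins" benefit in concrete form.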
Star Schema with Sample Data
The “Snowflake” Schema
Store Dimension
STORE KEY              District_ID       Region_ID
Store Description       District Desc.   Region Desc.
City                    Region_ID        Regional Mgr.
State
District ID
Region_ID
Regional Mgr.
                    Store Fact Table
                    STORE KEY
                    PRODUCT KEY
                    PERIOD KEY
                     Dollars
                     Units
                     Price
Aggregation in a Single Fact Table
Store Dimension     Fact Table      Time Dimension    Product Dimension
STORE KEY           STORE KEY       PERIOD KEY        PRODUCT KEY
Store Description   PRODUCT KEY     Period Desc       Product Desc.
City                PERIOD KEY      Year              Brand
State               Dollars         Quarter           Color
District ID         Units           Month             Size
District Desc.      Price           Day               Manufacturer
Region_ID                           Current Flag      Level
Region Desc.                        Resolution
Regional Mgr.                       Sequence
Level




Drawbacks: Summary data in the fact table yields poorer performance for summary
levels, huge dimension tables a problem
The “Fact Constellation” Schema
Store Dimension     Fact Table      Time Dimension    Product Dimension
STORE KEY           STORE KEY       PERIOD KEY        PRODUCT KEY
Store Description   PRODUCT KEY     Period Desc       Product Desc.
City                PERIOD KEY      Year              Brand
State               Dollars         Quarter           Color
District ID         Units           Month             Size
District Desc.      Price           Day               Manufacturer
Region_ID                           Current Flag
Region Desc.                        Sequence
Regional Mgr.

District Fact Table     Region Fact Table
District_ID             Region_ID
PRODUCT_KEY             PRODUCT_KEY
PERIOD_KEY              PERIOD_KEY
Dollars                 Dollars
Units                   Units
Price                   Price
Schema and Multiple Fact Tables
Store Dimension
STORE KEY              District_ID       Region_ID
Store Description      District Desc.    Region Desc.
City                   Region_ID         Regional Mgr.
State
District ID
District Desc.
Region_ID
Region Desc.
Regional Mgr.

Store Fact Table       District Fact Table     Region Fact Table
STORE KEY              District_ID             Region_ID
PRODUCT KEY            PRODUCT_KEY             PRODUCT_KEY
PERIOD KEY             PERIOD_KEY              PERIOD_KEY
Dollars                Dollars                 Dollars
Units                  Units                   Units
Price                  Price                   Price

• No LEVEL in dimension tables
• Dimension tables are normalized by decomposing at the attribute level
• Each dimension table has one key for each level of the dimension's
  hierarchy
• The lowest level key joins the dimension table to both the fact
  table and the lower level attribute table
  How does it work?
  The best way is for the query to be built by understanding which
  summary levels exist, and finding the proper snowflaked attribute
  tables, constraining there for keys, then selecting from the fact table.
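The two-step query described above (constrain the snowflaked attribute table for keys, then select from the matching summary fact table) might look like the sketch below. The table names region and region_fact, the helper region_dollars and the sample rows are hypothetical; only the column names follow the slides.

```python
import sqlite3

# Hypothetical region attribute table plus a region-level summary fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region      (region_id INTEGER PRIMARY KEY, region_desc TEXT);
CREATE TABLE region_fact (region_id INTEGER, product_key INTEGER,
                          period_key INTEGER, dollars REAL);
INSERT INTO region VALUES (1, 'East'), (2, 'West');
INSERT INTO region_fact VALUES (1, 100, 1, 500.0), (1, 101, 1, 200.0), (2, 100, 1, 300.0);
""")

def region_dollars(conn, region_desc):
    # Step 1: constrain the snowflaked attribute table to obtain keys.
    keys = [row[0] for row in conn.execute(
        "SELECT region_id FROM region WHERE region_desc = ?", (region_desc,))]
    if not keys:
        return 0.0
    # Step 2: select from the fact table that already holds region-level summaries.
    placeholders = ",".join("?" * len(keys))
    row = conn.execute(
        f"SELECT SUM(dollars) FROM region_fact WHERE region_id IN ({placeholders})",
        keys,
    ).fetchone()
    return row[0] or 0.0

east_total = region_dollars(conn, "East")
print(east_total)
```

The query never touches store-level rows: knowing that a region summary level exists lets it read the much smaller region fact table.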
Aggregation Contd …
Store Dimension
STORE KEY              District_ID       Region_ID
Store Description      District Desc.    Region Desc.
City                   Region_ID         Regional Mgr.
State
District ID
District Desc.
Region_ID
Region Desc.
Regional Mgr.

Store Fact Table       District Fact Table     Region Fact Table
STORE KEY              District_ID             Region_ID
PRODUCT KEY            PRODUCT_KEY             PRODUCT_KEY
PERIOD KEY             PERIOD_KEY              PERIOD_KEY
Dollars                Dollars                 Dollars
Units                  Units                   Units
Price                  Price                   Price




Advantage: Best performance when queries involve aggregation

Disadvantage: Complicated maintenance and metadata, explosion in the number
of tables in the database
Aggregates
• Add up amounts for day 1
• In SQL: SELECT sum(amt) FROM SALE
           WHERE date = 1

  sale   prodId storeId   date   amt
           p1     s1       1      12
           p2     s1       1      11
           p1     s3       1      50
           p2     s2       1       8
           p1     s1       2      44
           p1     s2       2       4

  ans    81
Aggregates
 • Add up amounts by day
 • In SQL: SELECT date, sum(amt) FROM SALE
            GROUP BY date

sale   prodId   storeId   date   amt
         p1       s1       1      12
         p2       s1       1      11
         p1       s3       1      50
         p2       s2       1       8
         p1       s1       2      44
         p1       s2       2       4

ans    date   sum
         1     81
         2     48
Another Example
   • Add up amounts by day, product
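   • In SQL: SELECT prodId, date, sum(amt) FROM SALE
              GROUP BY date, prodId
sale   prodId   storeId   date     amt
         p1       s1       1        12
         p2       s1       1        11
         p1       s3       1        50
         p2       s2       1         8
         p1       s1       2        44
         p1       s2       2         4

sale   prodId   date   amt
         p1      1      62
         p2      1      19
         p1      2      48

rollup: fewer grouping columns (e.g., date, prodId -> date)
drill-down: more grouping columns (e.g., date -> date, prodId)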
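The three aggregate queries can be checked against the sample SALE rows with Python's built-in sqlite3 module. This is a small sketch; the in-memory database and the ORDER BY clauses (added only to make the row order deterministic) are assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
])

# Add up amounts for day 1.
day1_total = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date").fetchall()

# Add up amounts by day and product (drill-down from the per-day totals).
by_day_product = conn.execute(
    "SELECT prodId, date, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId").fetchall()

print(day1_total, by_day, by_day_product)
```

Summing the per-product rows of one day gives that day's total again, which is the rollup direction.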
Points to be noticed about ROLAP
• Defines complex, multi-dimensional data with simple
  model
• Reduces the number of joins a query has to process
• Allows the data warehouse to evolve with relatively low
  maintenance
• Can contain both detailed and summarized data.
• ROLAP is based on familiar, proven, and already
  selected technologies.
BUT!!!
• SQL is awkward for multi-dimensional manipulation and
  calculations.
MOLAP: Dimensional Modeling Using
the Multi Dimensional Model

•   MDDB: a special-purpose data model
•   Facts stored in multi-dimensional arrays
•   Dimensions used to index array
•   Sometimes on top of relational DB
•   Products
    – Pilot, Arbor Essbase, Gentia
The MOLAP Cube

  Fact table view:
sale   prodId   storeId   amt
         p1       s1       12
         p2       s1       11
         p1       s3       50
         p2       s2        8

  Multi-dimensional cube (dimensions = 2):
              s1    s2    s3
         p1   12          50
         p2   11     8
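A minimal sketch of how the fact table view maps onto an array indexed by its dimensions. Plain Python lists stand in for the array storage of a MOLAP server; the names facts, p_index, s_index and cube are illustrative, and None marks an empty cell.

```python
# Fact table rows: (prodId, storeId, amt).
facts = [("p1", "s1", 12), ("p2", "s1", 11), ("p1", "s3", 50), ("p2", "s2", 8)]

# Each dimension value gets an array index.
products = sorted({p for p, _, _ in facts})
stores = sorted({s for _, s, _ in facts})
p_index = {p: i for i, p in enumerate(products)}
s_index = {s: j for j, s in enumerate(stores)}

# The cube is a dense products x stores array; None marks a cell with no fact.
cube = [[None] * len(stores) for _ in products]
for p, s, amt in facts:
    cube[p_index[p]][s_index[s]] = amt

print(cube)
# Direct access by dimension values, with no search through rows:
print(cube[p_index["p1"]][s_index["s3"]])
```

The two empty cells illustrate the cost of this layout: the array reserves space for every combination of dimension values, whether or not a fact exists for it.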
3-D Cube
 Fact table view:
sale   prodId   storeId   date   amt
         p1       s1       1      12
         p2       s1       1      11
         p1       s3       1      50
         p2       s2       1       8
         p1       s1       2      44
         p1       s2       2       4

 Multi-dimensional cube (dimensions = 3):
 day 1        s1    s2    s3
         p1   12          50
         p2   11     8
 day 2        s1    s2    s3
         p1   44     4
         p2
Example
[Figure: a 3-D cube with axes Store (NY, SF, LA), Product (Juice, Milk, Coke,
Cream, Soap, Bread) and Time (M, T, W, Th, F, S, S); arrows mark roll-up to
region, roll-up to brand and roll-up to week.]
Dimensions: Time, Product, Store
Attributes: Product (upc, price, …), Store …, …
Hierarchies:
   Product → Brand → …
   Day → Week → Quarter
   Store → Region → Country
56 units of bread sold in LA on M
Cube Aggregation: Roll-up
Example: computing sums

Cube:
 day 1        s1    s2    s3
         p1   12          50
         p2   11     8
 day 2        s1    s2    s3
         p1   44     4
         p2

Rollup over date (drill-down is the reverse):
              s1    s2    s3
         p1   56     4    50
         p2   11     8

Rollup over product:
              s1    s2    s3
        sum   67    12    50

Rollup over store:
              sum
         p1   110
         p2    19

Rollup over all dimensions: 129
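The roll-up sums can be reproduced with a small sketch. A dictionary keyed by (store, product, date) stands in for the cube; the helper name roll_up and its keep parameter are hypothetical.

```python
from collections import defaultdict

# Sparse 3-D cube: (store, product, date) -> amt, taken from the sample data.
sale = {
    ("s1", "p1", 1): 12, ("s1", "p2", 1): 11, ("s3", "p1", 1): 50,
    ("s2", "p2", 1): 8,  ("s1", "p1", 2): 44, ("s2", "p1", 2): 4,
}

def roll_up(cube, keep):
    """Sum out every dimension whose position is not listed in `keep`."""
    out = defaultdict(int)
    for coords, amt in cube.items():
        out[tuple(coords[i] for i in keep)] += amt
    return dict(out)

by_store_product = roll_up(sale, keep=(0, 1))  # roll up over date
by_store = roll_up(sale, keep=(0,))            # roll up over date and product
by_product = roll_up(sale, keep=(1,))          # roll up over date and store
grand_total = roll_up(sale, keep=())           # roll up over everything

print(by_store, by_product, grand_total)
```

Drill-down is the reverse direction: moving from by_store back to by_store_product requires the finer-grained cells, which is why they have to be kept or recomputed from the base data.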
Cube Operators for Roll-up

 day 1        s1    s2    s3
         p1   12          50
         p2   11     8
 day 2        s1    s2    s3
         p1   44     4
         p2

A "*" in an argument position sums over every value of that dimension:

sale(s1,*,*)  = 67     (store s1, all products, all dates)
sale(s2,p2,*) = 8      (store s2, product p2, all dates)
sale(*,*,*)   = 129    (the grand total)
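One way to read the sale(...) notation is as a lookup in which "*" matches every value of a dimension. The sketch below computes the sum on demand; a real cube server would more likely read a pre-computed cell. The Python function sale and the constant SALE are illustrative names.

```python
# Sparse 3-D cube: (store, product, date) -> amt, taken from the sample data.
SALE = {
    ("s1", "p1", 1): 12, ("s1", "p2", 1): 11, ("s3", "p1", 1): 50,
    ("s2", "p2", 1): 8,  ("s1", "p1", 2): 44, ("s2", "p1", 2): 4,
}

def sale(store="*", product="*", date="*"):
    """Sum the cells matching the pattern; "*" matches any value."""
    pattern = (store, product, date)
    return sum(
        amt for coords, amt in SALE.items()
        if all(p == "*" or p == c for p, c in zip(pattern, coords))
    )

print(sale("s1", "*", "*"))   # store s1, all products and dates
print(sale("s2", "p2", "*"))  # store s2, product p2, all dates
print(sale("*", "*", "*"))    # grand total
```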
Extended Cube

 day 1        s1    s2    s3     *
         p1   12          50    62
         p2   11     8          19
          *   23     8    50    81

 day 2        s1    s2    s3     *
         p1   44     4          48
         p2
          *   44     4          48

 *            s1    s2    s3     *
         p1   56     4    50   110
         p2   11     8          19
          *   67    12    50   129

 Example: sale(*,p2,*) = 19
Aggregation Using Hierarchies
 Hierarchy: store → region → country

 day 1        s1    s2    s3
         p1   12          50
         p2   11     8
 day 2        s1    s2    s3
         p1   44     4
         p2

 Rolled up over date and from store to region:
              region A    region B
         p1      56          54
         p2      11           8

 (store s1 in Region A;
 stores s2, s3 in Region B)
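A minimal sketch of rolling up along the store → region hierarchy. The mapping follows the slide (s1 in Region A; s2 and s3 in Region B); the names STORE_TO_REGION and roll_up_to_region are hypothetical.

```python
from collections import defaultdict

# Sparse 3-D cube: (store, product, date) -> amt, taken from the sample data.
sale = {
    ("s1", "p1", 1): 12, ("s1", "p2", 1): 11, ("s3", "p1", 1): 50,
    ("s2", "p2", 1): 8,  ("s1", "p1", 2): 44, ("s2", "p1", 2): 4,
}

# One level of the store hierarchy, as stated on the slide.
STORE_TO_REGION = {"s1": "region A", "s2": "region B", "s3": "region B"}

def roll_up_to_region(cube):
    """Sum over date and replace each store by its parent region."""
    out = defaultdict(int)
    for (store, product, _date), amt in cube.items():
        out[(product, STORE_TO_REGION[store])] += amt
    return dict(out)

by_region = roll_up_to_region(sale)
print(by_region)
```

Rolling up further to country would apply a second mapping (region to country) in the same way.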
Points to be noticed about MOLAP
• Pre-calculating or pre-consolidating transactional data
  improves speed.
BUT
  When fully pre-consolidating incoming data, MDDs require an
  enormous amount of overhead both in processing time and in
  storage. An input file of 200MB can easily expand to 5GB.

  MDDs are great candidates for the < 50GB department data
  marts.

• Rolling up and Drilling down through aggregate data.

• With MDDs, application design is essentially the definition of
  dimensions and calculation rules, while the RDBMS requires
  that the database schema be a star or snowflake.
Hybrid OLAP (HOLAP)

• HOLAP = Hybrid OLAP:

  – Best of both worlds

  – Storing detailed data in RDBMS

  – Storing aggregated data in MDBMS

  – User access via MOLAP tools
Data Flow in HOLAP
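RDBMS Server: user data, meta data, derived data
MDBMS Server: multi-dimensional data
Client: Multidimensional Viewer, Relational Viewer

Flows:
   Between MDBMS Server and RDBMS Server: SQL-Read, SQL-Reach Through
   Between Multidimensional Viewer and MDBMS Server: multi-dimensional access
   Between Relational Viewer and RDBMS Server: SQL-Read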
When deciding which technology to
go for, consider:

 1) Performance:

    • How fast will the system appear to the end-user?

    • MDD server vendors believe this is a key point in their favor.

 2) Data volume and scalability:

    • While MDD servers can handle up to 50GB of storage, RDBMS
      servers can handle hundreds of gigabytes and terabytes.
An experiment with Relational and the
 Multidimensional models on a data set
The analysis of the author’s example illustrates the following differences between
   the best Relational alternative and the Multidimensional approach.

                                         Relational   Multi-         Improvement
                                                      dimensional
 Disk space requirement (Gigabytes)      17           10             1.7
 Retrieve the corporate measures         240          1              240
 Actual Vs Budget, by month (I/O’s)
 Calculation of Variance                 237          2*             110*
 Budget/Actual for the whole
 database (I/O time in hours)

* This may include the calculation of many other derived data without any
   additional I/O.
Reference: http://dimlab.usc.edu/csci599/Fall2002/paper/I2_P064.pdf
What-if analysis
IF
       A. You require write access
       B. Your data is under 50 GB
       C. Your timetable to implement is 60-90 days
       D. Lowest level already aggregated
       E. Data access on aggregated level
       F. You’re developing a general-purpose application for inventory movement or assets
       management
THEN
       Consider an MDD/MOLAP solution for your data mart.
 
IF
       A. Your data is over 100 GB
       B. You have a "read-only" requirement
       C. Historical data at the lowest level of granularity
       D. Detailed access, long-running queries
       E. Data assigned to lowest level elements
THEN
       Consider an RDBMS/ROLAP solution for your data mart.

IF
       A. OLAP on aggregated and detailed data
       B. Different user groups
       C. Ease of use and detailed data
THEN
       Consider a HOLAP solution for your data mart.
Examples
• ROLAP
  –   Telecommunication startup: call data records (CDRs)
  –   ECommerce Site
  –   Credit Card Company
• MOLAP
  – Analysis and budgeting in a financial department
  – Sales analysis
• HOLAP
  – Sales department of a multi-national company
  – Banks and Financial Service Providers
Tools available
• ROLAP:
  –   ORACLE 8i
  –   ORACLE Reports; ORACLE Discoverer
  –   ORACLE Warehouse Builder
  –   Arbor Software’s Essbase


• MOLAP:
  –   ORACLE Express Server
  –   ORACLE Express Clients (C/S and Web)
  –   MicroStrategy’s DSS server
  –   Platinum Technologies’ Platinum InfoBeacon


• HOLAP:
  –   ORACLE 8i
  –   ORACLE Express Server
  –   ORACLE Relational Access Manager
  –   ORACLE Express Clients (C/S and Web)
Conclusion
• ROLAP: RDBMS -> star/snowflake schema

• MOLAP: MDD -> Cube structures

• ROLAP or MOLAP: Data models used play major role in performance
   differences

• MOLAP: for summarized and relatively smaller volumes of data (10-50GB)

• ROLAP: for detailed and larger volumes of data

• Both storage methods have strengths and weaknesses

• The choice is requirement specific, though currently data warehouses are
   predominantly built using RDBMSs/ROLAP.
References
• http://dimlab.usc.edu/csci599/Fall2002/paper/I2_P064.pdf
   – OLAP, Relational, and Multidimensional Database Systems,
     by George Colliat, Arbor Software Corporation

• http://www.donmeyer.com/art3.html
   – Data warehousing Services, Data Mining & Analysis, LLC

• http://www.cs.man.ac.uk/~franconi/teaching/2001/CS636/CS636-olap.pp
   – Data Warehouse Models and OLAP Operations, by Enrico
     Franconi

• http://www.promatis.com/mediacenter/papers
   – ROLAP, MOLAP, HOLAP: How to determine which technology
     is appropriate, by Holger Frietch, PROMATIS Corporation

 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 

Recently uploaded (20)

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 

Kishore jaladi-dw

  • 1. Data Warehousing: Data Models and OLAP operations By Kishore Jaladi kishorejaladi@yahoo.com
  • 2. Topics Covered 1. Understanding the term “Data Warehousing” 2. Three-tier Decision Support Systems 3. Approaches to OLAP servers 4. Multi-dimensional data model 5. ROLAP 6. MOLAP 7. HOLAP 8. Which to choose: Compare and Contrast 9. Conclusion
  • 3. Understanding the term Data Warehousing • Data Warehouse: The term Data Warehouse was coined by Bill Inmon in 1990. He defined it in the following way: "A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process". He defined the terms in the sentence as follows: • Subject Oriented: Data that gives information about a particular subject instead of about a company's ongoing operations. • Integrated: Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole. • Time-variant: All data in the data warehouse is identified with a particular time period. • Non-volatile: Data is stable in a data warehouse. More data is added but data is never removed. This enables management to gain a consistent picture of the business.
  • 5. Other important terminology • Enterprise Data Warehouse: collects all information about subjects (customers, products, sales, assets, personnel) that span the entire organization • Data Mart: departmental subsets that focus on selected subjects • Decision Support System (DSS): information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions • Online Analytical Processing (OLAP): an element of decision support systems (DSS)
  • 6. Three-Tier Decision Support Systems • Warehouse database server – Almost always a relational DBMS, rarely flat files • OLAP servers – Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operators – Multidimensional OLAP (MOLAP): special-purpose server that directly implements multidimensional data and operations • Clients – Query and reporting tools – Analysis tools – Data mining tools
  • 7. The Complete Decision Support System [Figure: information sources (operational DBs, semistructured sources) are extracted, transformed, loaded and refreshed into the data warehouse server (Tier 1), which also serves data marts; the warehouse serves the OLAP servers (Tier 2, e.g., MOLAP, ROLAP); clients (Tier 3) run OLAP, query/reporting and data mining tools.]
  • 8. Approaches to OLAP Servers Three possibilities for OLAP servers (1) Relational OLAP (ROLAP) – Relational and specialized relational DBMS to store and manage warehouse data – OLAP middleware to support missing pieces (2) Multidimensional OLAP (MOLAP) – Array-based storage structures – Direct access to array data structures (3) Hybrid OLAP (HOLAP) – Storing detailed data in RDBMS – Storing aggregated data in MDBMS – User access via MOLAP tools
  • 9. The Multi-Dimensional Data Model. Example queries: “Sales by product line over the past six months”, “Sales by store between 1990 and 1995”. [Figure: a fact table whose key columns (Prod Code, Time Code, Store Code) join it to the dimension tables (Product Info, Time Info, Store Info, ...) and whose numerical measures (Sales Qty) hold the facts.]
  • 10. ROLAP: Dimensional Modeling Using Relational DBMS • Special schema design: star, snowflake • Special indexes: bitmap, multi-table join • Proven technology (relational model, DBMS) that tends to outperform specialized MDDBs, especially on large data sets • Products – IBM DB2, Oracle, Sybase IQ, RedBrick, Informix
  • 11. Star Schema (in RDBMS)
  • 13. The “Classic” Star Schema. Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr., Level. Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price. Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Resolution, Sequence. Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer, Level. Properties: a single fact table, with detail and summary data; the fact table primary key has only one key column per dimension; each key is generated; each dimension is a single table, highly de-normalized. Benefits: easy to understand, easy to define hierarchies, reduces # of physical joins, low maintenance, very simple metadata
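A star schema of this shape can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the lower-case table and column names, the reduced column set, and the sample rows are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database; all names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, store_desc TEXT,
                          city TEXT, region_desc TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_desc TEXT,
                          brand TEXT);
CREATE TABLE period_dim  (period_key INTEGER PRIMARY KEY, year INTEGER,
                          quarter INTEGER);
-- Fact table: one key column per dimension, plus numerical measures.
CREATE TABLE sales_fact (
    store_key   INTEGER REFERENCES store_dim(store_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    period_key  INTEGER REFERENCES period_dim(period_key),
    dollars REAL,
    units   INTEGER,
    PRIMARY KEY (store_key, product_key, period_key)
);
""")

conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?, ?)",
                 [(1, "Store A", "Boston", "East"),
                  (2, "Store B", "Denver", "West")])
conn.execute("INSERT INTO product_dim VALUES (1, 'Cola', 'BrandX')")
conn.execute("INSERT INTO period_dim VALUES (1, 1995, 1)")
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 100.0, 10), (2, 1, 1, 250.0, 20)])

# One join from the fact table to a de-normalized dimension answers
# "dollars by region".
rows = conn.execute("""
    SELECT s.region_desc, SUM(f.dollars)
    FROM sales_fact AS f
    JOIN store_dim AS s ON f.store_key = s.store_key
    GROUP BY s.region_desc
    ORDER BY s.region_desc
""").fetchall()
print(rows)
```

Because each dimension is a single de-normalized table, the region attribute is reachable with one join; a snowflaked design would need an extra join through separate district and region tables.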
  • 15. The “Snowflake” Schema. Store Dimension: STORE KEY, Store Description, City, State, District ID, Region_ID, Regional Mgr. District table: District_ID, District Desc., Region_ID. Region table: Region_ID, Region Desc., Regional Mgr. Store Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
  • 16. Aggregation in a Single Fact Table. Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr., Level. Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price. Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Resolution, Sequence. Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer, Level. Drawbacks: summary data in the fact table yields poorer performance for summary levels; huge dimension tables are a problem
  • 17. The “Fact Constellation” Schema. Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr. Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price. Time Dimension: PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Sequence. Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer. District Fact Table: District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price. Region Fact Table: Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price
  • 18. Schema and Multiple Fact Tables. Store Dimension: STORE KEY, Store Description, City, State, District ID, Region_ID, Regional Mgr. District table: District_ID, District Desc., Region_ID. Region table: Region_ID, Region Desc., Regional Mgr. Store Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price. District Fact Table: District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price. Region Fact Table: Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price. • No LEVEL in dimension tables • Dimension tables are normalized by decomposing at the attribute level • Each dimension table has one key for each level of the dimension’s hierarchy • The lowest level key joins the dimension table to both the fact table and the lower level attribute table. How does it work? The best way is for the query to be built by understanding which summary levels exist, and finding the proper snowflaked attribute tables, constraining there for keys, then selecting from the fact table.
  • 19. Aggregation Contd … Store Dimension: STORE KEY, Store Description, City, State, District ID, Region_ID, Regional Mgr. District table: District_ID, District Desc., Region_ID. Region table: Region_ID, Region Desc., Regional Mgr. Store Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price. District Fact Table: District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price. Region Fact Table: Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price. Advantage: best performance when queries involve aggregation. Disadvantage: complicated maintenance and metadata, explosion in the number of tables in the database
  • 20. Aggregates • Add up amounts for day 1 • In SQL: SELECT sum(amt) FROM SALE WHERE date = 1 • Table sale (prodId, storeId, date, amt): (p1, s1, 1, 12), (p2, s1, 1, 11), (p1, s3, 1, 50), (p2, s2, 1, 8), (p1, s1, 2, 44), (p1, s2, 2, 4) • Result: 81
  • 21. Aggregates • Add up amounts by day • In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date • Table sale (prodId, storeId, date, amt): (p1, s1, 1, 12), (p2, s1, 1, 11), (p1, s3, 1, 50), (p2, s2, 1, 8), (p1, s1, 2, 44), (p1, s2, 2, 4) • Result ans (date, sum): (1, 81), (2, 48)
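  • 22. Another Example • Add up amounts by day, product • In SQL: SELECT prodId, date, sum(amt) FROM SALE GROUP BY date, prodId • Table sale (prodId, storeId, date, amt): (p1, s1, 1, 12), (p2, s1, 1, 11), (p1, s3, 1, 50), (p2, s2, 1, 8), (p1, s1, 2, 44), (p1, s2, 2, 4) • Result (prodId, date, amt): (p1, 1, 62), (p2, 1, 19), (p1, 2, 48) • Going from the detailed table to the summary is a rollup; going back is a drill-down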
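The three aggregate queries on the sale table can be run as written with Python's built-in `sqlite3` module. This is a minimal sketch: the `ORDER BY` clauses are an addition to make the output order deterministic, and the last query also selects `prodId` so that each result row is labelled with its product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12),
        ("p2", "s1", 1, 11),
        ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8),
        ("p1", "s1", 2, 44),
        ("p1", "s2", 2, 4),
    ],
)

# Add up amounts for day 1.
day1_total = conn.execute(
    "SELECT sum(amt) FROM sale WHERE date = 1"
).fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

# Add up amounts by day and product.
by_day_product = conn.execute(
    "SELECT prodId, date, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

print(day1_total)       # 81
print(by_day)           # [(1, 81), (2, 48)]
print(by_day_product)   # [('p1', 1, 62), ('p2', 1, 19), ('p1', 2, 48)]
```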
  • 23. Points to be noticed about ROLAP • Defines complex, multi-dimensional data with a simple model • Reduces the number of joins a query has to process • Allows the data warehouse to evolve with relatively low maintenance • Can contain both detailed and summarized data • ROLAP is based on familiar, proven, and already selected technologies. BUT: multi-dimensional manipulation and calculations have to be expressed in SQL.
  • 24. MOLAP: Dimensional Modeling Using the Multi Dimensional Model • MDDB: a special-purpose data model • Facts stored in multi-dimensional arrays • Dimensions used to index array • Sometimes on top of relational DB • Products – Pilot, Arbor Essbase, Gentia
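  • 25. The MOLAP Cube. Fact table view, sale (prodId, storeId, amt): (p1, s1, 12), (p2, s1, 11), (p1, s3, 50), (p2, s2, 8). Multi-dimensional cube (rows p1, p2; columns s1, s2, s3): p1: s1 = 12, s3 = 50; p2: s1 = 11, s2 = 8; all other cells are empty. dimensions = 2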
  • 26. 3-D Cube. Fact table view, sale (prodId, storeId, date, amt): (p1, s1, 1, 12), (p2, s1, 1, 11), (p1, s3, 1, 50), (p2, s2, 1, 8), (p1, s1, 2, 44), (p1, s2, 2, 4). Multi-dimensional cube (one product × store slice per day): day 1: p1: s1 = 12, s3 = 50; p2: s1 = 11, s2 = 8. day 2: p1: s1 = 44, s2 = 4; p2: empty. dimensions = 3
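The idea of storing facts in an array indexed by dimension positions can be sketched in plain Python with nested lists. The names `cube`, `cell` and the `*_idx` lookup tables are illustrative assumptions; a real MOLAP server would use its own, typically compressed, array storage.

```python
# Fact table rows: (prodId, storeId, date, amt), as in the sale table.
facts = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

prods = ["p1", "p2"]
stores = ["s1", "s2", "s3"]
days = [1, 2]

# Each dimension value maps to an array position.
p_idx = {p: i for i, p in enumerate(prods)}
s_idx = {s: i for i, s in enumerate(stores)}
d_idx = {d: i for i, d in enumerate(days)}

# cube[day][product][store]; None marks an empty cell.
cube = [[[None] * len(stores) for _ in prods] for _ in days]
for prod, store, day, amt in facts:
    cube[d_idx[day]][p_idx[prod]][s_idx[store]] = amt


def cell(prod, store, day):
    """Direct array access to one cell; no scan over the fact rows."""
    return cube[d_idx[day]][p_idx[prod]][s_idx[store]]


print(cube[0])               # day 1 slice
print(cell("p1", "s3", 1))   # 50
```

The dense layout also shows the cost: empty cells still occupy a slot, which is one reason multidimensional storage can grow much larger than its input.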
  • 27. Example. Dimensions: Time, Product, Store. Attributes: Product (upc, price, …), Store … Hierarchies: Product → Brand → …; Day → Week → Quarter; Store → Region → Country. [Figure: a cube with axes Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread) and Time (M, T, W, Th, F, S, S), with arrows for roll-up to region, roll-up to brand and roll-up to week; the highlighted cell reads: 56 units of bread sold in LA on M.]
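  • 28. Cube Aggregation: Roll-up. Example: computing sums. Cube: day 1: p1: s1 = 12, s3 = 50; p2: s1 = 11, s2 = 8. day 2: p1: s1 = 44, s2 = 4. Summed over days: p1: s1 = 56, s2 = 4, s3 = 50; p2: s1 = 11, s2 = 8. Summed over days and products (per store): s1 = 67, s2 = 12, s3 = 50. Summed over days and stores (per product): p1 = 110, p2 = 19. Grand total: 129. Moving toward the sums is a rollup; moving back to the detail is a drill-down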
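  • 29. Cube Operators for Roll-up. Cube: day 1: p1: s1 = 12, s3 = 50; p2: s1 = 11, s2 = 8. day 2: p1: s1 = 44, s2 = 4. Summed over days: p1: s1 = 56, s2 = 4, s3 = 50; p2: s1 = 11, s2 = 8. Per store: s1 = 67, s2 = 12, s3 = 50. Per product: p1 = 110, p2 = 19. Total: 129. Operators, where * means "summed over this dimension": sale(s1,*,*) = 67; sale(s2,p2,*) = 8; sale(*,*,*) = 129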
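  • 30. Extended Cube. Each dimension gets an extra * entry holding the sum over that dimension (columns s1, s2, s3, *). day 1: p1: 12, empty, 50, 62; p2: 11, 8, empty, 19; *: 23, 8, 50, 81. day 2: p1: 44, 4, empty, 48; p2: empty; *: 44, 4, empty, 48. day *: p1: 56, 4, 50, 110; p2: 11, 8, empty, 19; *: 67, 12, 50, 129. Example: sale(*,p2,*) = 19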
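The * entries of the extended cube can be reproduced with a small wildcard aggregation in plain Python. This is a sketch under assumptions: the function `sale` takes its arguments in the order (store, product, day), matching the operator examples, and combinations with no facts come out as 0 rather than as empty cells.

```python
from itertools import product

# Fact table rows: (prodId, storeId, date, amt), as in the sale table.
facts = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

stores = ["s1", "s2", "s3"]
prods = ["p1", "p2"]
days = [1, 2]


def sale(store="*", prod="*", day="*"):
    """Sum amt over all facts matching the arguments; "*" matches anything."""
    return sum(
        amt
        for p, s, d, amt in facts
        if store in ("*", s) and prod in ("*", p) and day in ("*", d)
    )


# Materialize the extended cube: every dimension gains a "*" member.
extended = {
    (s, p, d): sale(s, p, d)
    for s, p, d in product(stores + ["*"], prods + ["*"], days + ["*"])
}

print(sale("s1"))                  # 67
print(sale("s2", "p2"))            # 8
print(sale())                      # 129
print(extended[("*", "p2", "*")])  # 19
```

Pre-computing `extended` trades storage for lookup speed: the 3 × 2 × 2 base cube becomes 4 × 3 × 3 = 36 cells once the * members are added.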
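  • 31. Aggregation Using Hierarchies. Hierarchy: store → region → country. Cube: day 1: p1: s1 = 12, s3 = 50; p2: s1 = 11, s2 = 8. day 2: p1: s1 = 44, s2 = 4. Rolled up to region (summed over days): p1: region A = 56, region B = 54; p2: region A = 11, region B = 8. (Store s1 is in Region A; stores s2 and s3 are in Region B.)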
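Rolling up along the store → region hierarchy amounts to replacing each store by its parent region before summing. A minimal Python sketch, where the `store_region` mapping encodes the slide's assignment (s1 in Region A; s2 and s3 in Region B) and the variable names are illustrative:

```python
from collections import defaultdict

# Fact table rows: (prodId, storeId, date, amt), as in the sale table.
facts = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

# One level of the Store -> Region -> Country hierarchy.
store_region = {"s1": "A", "s2": "B", "s3": "B"}

# Sum over days while mapping each store to its region.
by_region = defaultdict(int)
for prod, store, day, amt in facts:
    by_region[(prod, store_region[store])] += amt

print(dict(by_region))
# {('p1', 'A'): 56, ('p2', 'A'): 11, ('p1', 'B'): 54, ('p2', 'B'): 8}
```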
  • 32. Points to be noticed about MOLAP • Pre-calculating or pre-consolidating transactional data improves speed. BUT when fully pre-consolidating incoming data, MDDs require an enormous amount of overhead both in processing time and in storage. An input file of 200MB can easily expand to 5GB. MDDs are great candidates for the < 50GB department data marts. • Rolling up and drilling down through aggregate data. • With MDDs, application design is essentially the definition of dimensions and calculation rules, while the RDBMS requires that the database schema be a star or snowflake.
  • 33. Hybrid OLAP (HOLAP) • HOLAP = Hybrid OLAP: – Best of both worlds – Storing detailed data in RDBMS – Storing aggregated data in MDBMS – User access via MOLAP tools
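  • 34. Data Flow in HOLAP [Figure: three components, RDBMS Server, MDBMS Server and Client. The client's multidimensional viewer accesses multidimensional data, meta data and derived data on the MDBMS server; the MDBMS server has SQL read access to the RDBMS server (SQL reach-through); the client's relational viewer reads the RDBMS server directly via SQL.]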
  • 35. When deciding which technology to go for, consider: 1) Performance: • How fast will the system appear to the end-user? • MDD server vendors believe this is a key point in their favor. 2) Data volume and scalability: • While MDD servers can handle up to 50GB of storage, RDBMS servers can handle hundreds of gigabytes and terabytes.
  • 36. An experiment with Relational and the Multidimensional models on a data set. The analysis of the author’s example illustrates the following differences between the best Relational alternative and the Multidimensional approach (columns: measure | relational | multidimensional | improvement): Disk space requirement (Gigabytes) | 17 | 10 | 1.7; Retrieve the corporate measures Actual vs Budget, by month (I/Os) | 240 | 1 | 240; Calculation of Variance Budget/Actual for the whole database (I/O time in hours) | 237 | 2* | 110*. * This may include the calculation of many other derived data without any additional I/O. Reference: http://dimlab.usc.edu/csci599/Fall2002/paper/I2_P064.pdf
  • 37. What-if analysis • IF (A) you require write access; (B) your data is under 50 GB; (C) your timetable to implement is 60-90 days; (D) the lowest level is already aggregated; (E) data access is on the aggregated level; (F) you’re developing a general-purpose application for inventory movement or assets management, THEN consider an MDD/MOLAP solution for your data mart. • IF (A) your data is over 100 GB; (B) you have a "read-only" requirement; (C) you keep historical data at the lowest level of granularity; (D) you need detailed access and long-running queries; (E) data is assigned to lowest level elements, THEN consider an RDBMS/ROLAP solution for your data mart. • IF you need (A) OLAP on aggregated and detailed data; (B) support for different user groups; (C) ease of use and detailed data, THEN consider an HOLAP solution for your data mart.
  • 38. Examples • ROLAP – Telecommunication startup: call data records (CDRs) – ECommerce Site – Credit Card Company • MOLAP – Analysis and budgeting in a financial department – Sales analysis • HOLAP – Sales department of a multi-national company – Banks and Financial Service Providers
  • 39. Tools available • ROLAP: – ORACLE 8i – ORACLE Reports; ORACLE Discoverer – ORACLE Warehouse Builder – Arbor Software’s Essbase • MOLAP: – ORACLE Express Server – ORACLE Express Clients (C/S and Web) – MicroStrategy’s DSS server – Platinum Technologies’ Platinum InfoBeacon • HOLAP: – ORACLE 8i – ORACLE Express Server – ORACLE Relational Access Manager – ORACLE Express Clients (C/S and Web)
  • 40. Conclusion • ROLAP: RDBMS -> star/snowflake schema • MOLAP: MDD -> Cube structures • ROLAP or MOLAP: the data models used play a major role in performance differences • MOLAP: for summarized and relatively smaller volumes of data (10-50GB) • ROLAP: for detailed and larger volumes of data • Both storage methods have strengths and weaknesses • The choice is requirement specific, though currently data warehouses are predominantly built using RDBMSs/ROLAP.
  • 41. References • http://dimlab.usc.edu/csci599/Fall2002/paper/I2_P064.pdf – OLAP, Relational, and Multidimensional Database Systems, by George Colliat, Arbor Software Corporation • http://www.donmeyer.com/art3.html – Data Warehousing Services, Data Mining & Analysis, LLC • http://www.cs.man.ac.uk/~franconi/teaching/2001/CS636/CS636-olap.pp – Data Warehouse Models and OLAP Operations, by Enrico Franconi • http://www.promatis.com/mediacenter/papers – ROLAP, MOLAP, HOLAP: How to determine which technology is appropriate, by Holger Frietch, PROMATIS Corporation