SlideShare una empresa de Scribd logo
1 de 18
Descargar para leer sin conexión
Data warehousing logical design

        Mirjana Mazuran
      mazuran@elet.polimi.it



        December 15, 2009




                                  1/18
Outline




      Data Warehouse logical design
          ROLAP model
               star schema
               snowflake schema
      Exercise 1: wine company
      Exercise 2: real estate agency




                                       2/18
Introduction
Logical design




          Starting from the conceptual design it is necessary to determin
          the logical schema of data
          We use ROLAP (Relational On-Line Analytical Processing)
          model to represent multidimensional data
          ROLAP uses the relational data model, which means that
          data is stored in relations
          Given the DFM representation of multidimensional data, two
          schemas are used:
                 star schema
                 snowflake schema




                                                                            3/18
ROLAP
Star schema


         Each dimension is represented by a relation such that:
              the primary key of the relation is the primary key of the
              dimension
              the attributes of the relation describe all aggregation levels of
              the dimension
         A fact is represented by a relation such that:
              the primary key of the relation is the set of primary keys
              imported from all the dimension tables
              the attributes of the relation are the measures of the fact

    Pros and Cons
         few joins are needed during query execution
         dimension tables are denormalized
         denormalization introduces redundancy

                                                                                  4/18
ROLAP
Star schema: example


                 MaxAmount                     Time                           Policy
       Amount                 Class            IdTime                         IdPolicy
                                               Day                            Class
      Year                                     Month                          Amount
                                               Year                           MaxAmount
                            IdPolicy
         Month                                              Accident
                 Accident         Motivation                IdTime
                                                            IdPolicy
             NrOfAccidents                                  IdCustomer
      Day    Cost                                           IdMotivation
                                                            NrOfAccidents
                         IDCustomer                         Cost

                                               Customer                     Motivation
                                               IdCustomer                   IdMotivation
        BYear                City Region       Birthday                     Motivation
                                               Gender
                   Sex
                                               City
                                               Region


                                                                                           5/18
ROLAP
Snowflake schema
        Each (primary) dimension is represented by a relation:
             the primary key of the relation is the primary key of the
             dimension
             the attributes of the relation directly depend by the primary key
             a set of foreign keys is used to access information at different
             levels of aggregation. Such information is part of the
             secondary dimensions and is stored in dedicated relations
        A fact is represented by a relation such that:
             the primary key of the relation is the set of primary keys
             imported from all and only the primary dimension tables
             the attributes of the relation are the measures of the fact

    Pros and Cons
        denormalization is reduced
        less memory space is required
        a lot of joins can be required if they involve attributes in
        secondary dimension tables
                                                                                 6/18
ROLAP
Snowflake schema: example



                Category                        Time                       Furniture
                                                IdTime                     IdFurniture
             Type         Material
                                                Day                        Material
     Year                                       Month                      Type
                                                Year                       Category
                          IdFurniture
       Month                                                 Sale
                   Sale                                      IdTime
                                                             IdFurniture
              Quantity
                                                             IdCustomer
              Income
     Day      Discount                                       Quantity
                                                             Income
                      IDCustomer                             Discount

                              Region            Customer                     Place
                                                IdCustomer                   IdPlace
            BYear          City         State   IdPlace                      City
                                                BirthYear                    Region
                    Sex                         Sex                          State



                                                                                         7/18
Exercise 1
Wine company

    An online order wine company requires the designing of a
    datawarehouse to record the quantity and sales of its wines to its
    customers. Part of the original database is composed by the
    following tables:
    CUSTOMER (Code, Name, Address, Phone, BDay, Gender)
          WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice,
               Class)
         CLASS (Code, Name, Region)
          TIME (TimeStamp, Date, Year)
        ORDER (Customer, Wine, Time, nrBottles, nrCases)
    Note that the tables represent the main entities of the ER schema,
    thus it is necessary to derive the significant relashionships among
    them in order to correctly design the data warehouse.

                                                                         8/18
Exercise 1: A possible solution
Snowflake schema

          FACT Sales
    MEASURES Quantity, Cost
    DIMENSIONS Customer, Area, Time, Wine → Class

          Customer                        Wine
         CustomerCode                     WineCode
         Birthday                         ClassCode
         Gender                           Type
                         SalesTable
                                          Vintage
                         WineCode
                         CustomerCode
                         OrderTimeCode
                         Quantity
         Time                             Class
                         Cost
         TimeCode                         ClassCode
         Date                             Region
         Year




                                                      9/18
Exercise 2
Real estate agency
    Let us consider the case of a real estate agency whose database is
    composed by the following tables:
        OWNER (IDOwner, Name, Surname, Address, City, Phone)
        ESTATE (IDEstate, IDOwner, Category, Area, City, Province,
                  Rooms, Bedrooms, Garage, Meters)
    CUSTOMER (IDCust, Name, Surname, Budget, Address, City,
                  Phone)
         AGENT (IDAgent, Name, Surname, Office, Address, City,
                  Phone)
       AGENDA (IDAgent, Data, Hour, IDEstate, ClientName)
           VISIT (IDEstate,IDAgent, IDCust, Date, Duration)
           SALE (IDEstate,IDAgent, IDCust, Date, AgreedPrice,
                  Status)
          RENT (IDEstate,IDAgent, IDCust, Date, Price, Status,
                  Time)
                                                                         10/18
Exercise 2
Real estate agency
          Goal:
               Provide a supervisor with an overview of the situation. The
               supervisor must have a global view of the business, in terms of
               the estates the agency deals with and of the agents’ work.
          Questions:
            1. Design a conceptual schema for the DW.
            2. What facts and dimensions do you consider?
            3. Design a Star Schema or Snowflake Schema for the DW.
          Write the following SQL queries:
               How many customers have visited properties of at least 3
               different categories?
               What is the average duration of visits per property category?
               Who has paid the highest price among the customers that
               have viewed properties of at least 3 different categories?
               Who has bought a flat for the highest price w.r.t. each month?
               What kind of property sold for the highest price w.r.t each city
               and month?
                                                                                  11/18
Exercise 2: A possible solution
Facts and dimensions
    Points 1 and 2 are left as homework. In particular it is required to
    discuss the facts of interest (with respect to your point of view)
    and then define, for each fact, the attribute tree, dimensions and
    measures with the corresponding glossary.
    The following ideas will be used during the solution of the exercise:
         supervisors should be able to control the sales of the agency
                FACT Sales
         MEASURES OfferPrice, AgreedPrice, Status
         DIMENSIONS EstateID, OwnerID, CustomerID, AgentID,
                       TimeID
         supervisors should be able to control the work of the agents by
         analyzing the visits to the estates, which the agents are in
         charge of
                FACT Viewing
         MEASURES Duration
         DIMENSIONS EstateID, CustomerID, AgentID, TimeID
                                                                            12/18
Exercise 2: A possible solution
Star schema
       Estate                           Owner
       EstateID          SalesTable     OwnerID
       Category          EstateID       Name
       Area              OwnerID        Surname
       City              CustomerID     Address
       Province          AgentID        City
       Rooms             TimeID         Phone
       Bedrooms          OfferPrice
                                         Agent
       Garage            AgreedPrice
                                        AgentID
       Meters            Status
                                        Name
       Sheet
                                        Surname
       Map
                                        Office
                                        Address
        Customer                        City
       CustomerID                       Phone
                         ViewingTable
       Name
                         EstateID
       Surname                           Time
                         CustomerID
       Budget                           TimeID
                         AgentID
       Address                          Day
                         TimeID
       City                             Month
                         Duration
       Phone                            Year
                                                  13/18
Exercise 2: A possible solution
Snowflake schema
       Estate            SalesTable      Time
      EstateID          EstateID        TimeID
      PlaceID           TimeID          Day
      Category          CustomerID      Month
      Rooms              OwnerID        Year
      Bedrooms          AgentID         Owner
      Garage            OfferPrice      OwnerID
      Meters            AgreedPrice     PlaceID
      Sheet             Status          Name
      Map                               Surname
       Customer          ViewingTable
                                        Address
      CustomerID        TimeID
                                        Phone
      PlaceID           EstateID
                                         Agent
      Name               CustomerID
                                        AgentID
      Surname           AgentID         PlaceID
      Budget            Duration        Name
      Address                           Surname
                         Place
      Phone                             Office
                        PlaceID
                                        Address
                        City
                                        Phone
                        Province
                        Area

                                                  14/18
Exercise 2: A possible solution
SQL queries wrt the star schema



          How many customers have visited properties of at least 3
          different categories?

          CREATE VIEW Cust3Cat AS
             SELECT V.CustomerID
             FROM ViewingTable V, Estate E
             WHERE V.EstateID = E.EstateID
             GROUP BY V.CustomerID
             HAVING COUNT(DISTINCT E.Category) >=3

          SELECT COUNT(*)
          FROM Cust3Cat



                                                                     15/18
Exercise 2: A possible solution
SQL queries wrt the star schema
          Who has paid the highest price among the customers that
          have viewed properties of at least 3 different categories?
          SELECT C.CustomerID
          FROM Cust3Cat C, SalesTable S
          WHERE C.CustID = S.CustID AND S.AgreedPrice IN
          (SELECT MAX(S.AgreedPrice)
                    FROM Cust3Cat C1, SalesTable S1
                    WHERE C1.CustomerID = S1.CustomerID)

          Average duration of visits per property category
          SELECT E.Category, AVG(V.Duration)
          FROM ViewingTable V, Estate E
          WHERE V.EstateID = E.EstateID
          GROUP BY E.Category
                                                                      16/18
Exercise 2: A possible solution
SQL queries wrt the star schema


          Who has bought a flat for the highest price w.r.t. each month?

          SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice
          FROM SalesTable S, Estate E, Time T
          WHERE S.EstateID = E.EstateID AND S.TimeID =
          T.TimeID AND E.Category = “flat” AND (T.Month,
          T.Year, S.AgreedPrice) IN (
               SELECT T1.Month, T1.Year,
               MAX(S1.AgreedPrice)
               FROM SalesTable S1, Estate E1, Time T1
               WHERE S1.EstateID = E1.EstateID AND
               S1.TimeID = T1.TimeID AND E1.Category =
               “flat”
               GROUP BY T1.Month, T1.Year

                                                                          17/18
Exercise 2: A possible solution
SQL queries wrt the star schema
          What kind of property sold for the highest price w.r.t each city
          and month?
          SELECT E.Category, E.City, T.Moth, T.Year,
                 E.AgreedPrice
          FROM SalesTable S, Time T, Estate E
          WHERE S.TimeID = T.TimeID AND E.EstateID =
          S.EstateID AND (P.AgreedPrice, P.City, T.month,
          T.year) IN (
                    SELECT MAX(E1.AgreedPrice), E1.City,
                    T1.Month, T1.Year)
                    FROM SalesTable S1, Time T1, Estate E1
                    WHERE S1.TimeID = T1.TimeID AND
                    E1.EstateID = S1.EstateID
                    GROUP BY T.Month, T.Year, E.City)

                                                                             18/18

Más contenido relacionado

La actualidad más candente

White Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementWhite Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementDavid Walker
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseShanthi Mukkavilli
 
Implementing Effective Data Governance
Implementing Effective Data GovernanceImplementing Effective Data Governance
Implementing Effective Data GovernanceChristopher Bradley
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Data science life cycle
Data science life cycleData science life cycle
Data science life cycleManoj Mishra
 
How We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachHow We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachTeradata
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guidethomasmary607
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
An introduction to Business intelligence
An introduction to Business intelligenceAn introduction to Business intelligence
An introduction to Business intelligenceHadi Fadlallah
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schemaSayed Ahmed
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.pptneelamoberoi1030
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data modelmoni sindhu
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture janani thirupathi
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process Shuvra Ghosh
 
DFD ภาษาอังกฤษ
DFD ภาษาอังกฤษDFD ภาษาอังกฤษ
DFD ภาษาอังกฤษskiats
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingDunn Solutions Group
 

La actualidad más candente (20)

White Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementWhite Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project Management
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Implementing Effective Data Governance
Implementing Effective Data GovernanceImplementing Effective Data Governance
Implementing Effective Data Governance
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Data science life cycle
Data science life cycleData science life cycle
Data science life cycle
 
How We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card BreachHow We Did It: The Case of the Credit Card Breach
How We Did It: The Case of the Credit Card Breach
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guide
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data Management
Data Management Data Management
Data Management
 
An introduction to Business intelligence
An introduction to Business intelligenceAn introduction to Business intelligence
An introduction to Business intelligence
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture
 
Knowledge discovery process
Knowledge discovery process Knowledge discovery process
Knowledge discovery process
 
DFD ภาษาอังกฤษ
DFD ภาษาอังกฤษDFD ภาษาอังกฤษ
DFD ภาษาอังกฤษ
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional Modeling
 
Data mining
Data mining Data mining
Data mining
 
Data Warehouse Designing: Dimensional Modelling and E-R Modelling
Data Warehouse Designing: Dimensional Modelling and E-R ModellingData Warehouse Designing: Dimensional Modelling and E-R Modelling
Data Warehouse Designing: Dimensional Modelling and E-R Modelling
 

Destacado

Wise real estate planning
Wise real estate planningWise real estate planning
Wise real estate planningPaul Cheema
 
ROLAP partitioning in MS SQL Server 2016
ROLAP partitioning in MS SQL Server 2016ROLAP partitioning in MS SQL Server 2016
ROLAP partitioning in MS SQL Server 2016Andrej Zafka
 
Difference between molap, rolap and holap in ssas
Difference between molap, rolap and holap  in ssasDifference between molap, rolap and holap  in ssas
Difference between molap, rolap and holap in ssasUmar Ali
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support SystemsShigem
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEZalpa Rathod
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Internet of Things and its applications
Internet of Things and its applicationsInternet of Things and its applications
Internet of Things and its applicationsPasquale Puzio
 

Destacado (7)

Wise real estate planning
Wise real estate planningWise real estate planning
Wise real estate planning
 
ROLAP partitioning in MS SQL Server 2016
ROLAP partitioning in MS SQL Server 2016ROLAP partitioning in MS SQL Server 2016
ROLAP partitioning in MS SQL Server 2016
 
Difference between molap, rolap and holap in ssas
Difference between molap, rolap and holap  in ssasDifference between molap, rolap and holap  in ssas
Difference between molap, rolap and holap in ssas
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support Systems
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Internet of Things and its applications
Internet of Things and its applicationsInternet of Things and its applications
Internet of Things and its applications
 

Similar a Dwlogical0910

Your favorite data modeling tool, your partner in crime for Data Warehouse Au...
Your favorite data modeling tool, your partner in crime for Data Warehouse Au...Your favorite data modeling tool, your partner in crime for Data Warehouse Au...
Your favorite data modeling tool, your partner in crime for Data Warehouse Au...FrederikN
 
Kishore jaladi-dw
Kishore jaladi-dwKishore jaladi-dw
Kishore jaladi-dwsam2sung2
 
Tirta ERP - Business Intelligence Layer
Tirta ERP - Business Intelligence LayerTirta ERP - Business Intelligence Layer
Tirta ERP - Business Intelligence LayerWildan Maulana
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modelingjamessnape
 
Marino mercedesportfolio201212
Marino mercedesportfolio201212Marino mercedesportfolio201212
Marino mercedesportfolio201212Marino Mercedes
 
Marino-mercedesportfolio201212
Marino-mercedesportfolio201212Marino-mercedesportfolio201212
Marino-mercedesportfolio201212Marino Mercedes
 
Greenfield Development with CQRS and Windows Azure
Greenfield Development with CQRS and Windows AzureGreenfield Development with CQRS and Windows Azure
Greenfield Development with CQRS and Windows AzureDavid Hoerster
 
Carnegie Presentation
Carnegie PresentationCarnegie Presentation
Carnegie Presentationgopack21
 
8 sony - innovating with tag management
8   sony - innovating with tag management8   sony - innovating with tag management
8 sony - innovating with tag managementEnsighten
 
Multidimentional data model
Multidimentional data modelMultidimentional data model
Multidimentional data modeljagdish_93
 
Dw design 2_conceptual_model
Dw design 2_conceptual_modelDw design 2_conceptual_model
Dw design 2_conceptual_modelClaudia Gomez
 
StrikeIron Address Verification for Salesforce.com Datasheet
StrikeIron Address Verification for Salesforce.com DatasheetStrikeIron Address Verification for Salesforce.com Datasheet
StrikeIron Address Verification for Salesforce.com Datasheetmflannigan
 
Micro strategy Reporting Suite
Micro strategy Reporting SuiteMicro strategy Reporting Suite
Micro strategy Reporting SuiteClassic Polo
 
retail marketing
retail marketingretail marketing
retail marketingsabasri
 
Document process v7
Document process v7Document process v7
Document process v7Bala Kris
 
ICA Incentives
ICA IncentivesICA Incentives
ICA IncentivesBirkhoff
 

Similar a Dwlogical0910 (20)

Your favorite data modeling tool, your partner in crime for Data Warehouse Au...
Your favorite data modeling tool, your partner in crime for Data Warehouse Au...Your favorite data modeling tool, your partner in crime for Data Warehouse Au...
Your favorite data modeling tool, your partner in crime for Data Warehouse Au...
 
Kishore jaladi-dw
Kishore jaladi-dwKishore jaladi-dw
Kishore jaladi-dw
 
Tirta ERP - Business Intelligence Layer
Tirta ERP - Business Intelligence LayerTirta ERP - Business Intelligence Layer
Tirta ERP - Business Intelligence Layer
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Marino mercedesportfolio201212
Marino mercedesportfolio201212Marino mercedesportfolio201212
Marino mercedesportfolio201212
 
Marino-mercedesportfolio201212
Marino-mercedesportfolio201212Marino-mercedesportfolio201212
Marino-mercedesportfolio201212
 
Greenfield Development with CQRS and Windows Azure
Greenfield Development with CQRS and Windows AzureGreenfield Development with CQRS and Windows Azure
Greenfield Development with CQRS and Windows Azure
 
SAP Power Designer
SAP Power DesignerSAP Power Designer
SAP Power Designer
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Carnegie Presentation
Carnegie PresentationCarnegie Presentation
Carnegie Presentation
 
8 sony - innovating with tag management
8   sony - innovating with tag management8   sony - innovating with tag management
8 sony - innovating with tag management
 
Multidimentional data model
Multidimentional data modelMultidimentional data model
Multidimentional data model
 
Dw design 2_conceptual_model
Dw design 2_conceptual_modelDw design 2_conceptual_model
Dw design 2_conceptual_model
 
Competitve Intelligence and Pricing
Competitve Intelligence and PricingCompetitve Intelligence and Pricing
Competitve Intelligence and Pricing
 
StrikeIron Address Verification for Salesforce.com Datasheet
StrikeIron Address Verification for Salesforce.com DatasheetStrikeIron Address Verification for Salesforce.com Datasheet
StrikeIron Address Verification for Salesforce.com Datasheet
 
Micro strategy Reporting Suite
Micro strategy Reporting SuiteMicro strategy Reporting Suite
Micro strategy Reporting Suite
 
retail marketing
retail marketingretail marketing
retail marketing
 
Document process v7
Document process v7Document process v7
Document process v7
 
Birdie Analysis
Birdie AnalysisBirdie Analysis
Birdie Analysis
 
ICA Incentives
ICA IncentivesICA Incentives
ICA Incentives
 

Último

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesShubhangi Sonawane
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIShubhangi Sonawane
 

Último (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 

Dwlogical0910

  • 1. Data warehousing logical design Mirjana Mazuran mazuran@elet.polimi.it December 15, 2009 1/18
  • 2. Outline Data Warehouse logical design ROLAP model star schema snowflake schema Exercise 1: wine company Exercise 2: real estate agency 2/18
  • 3. Introduction Logical design Starting from the conceptual design it is necessary to determin the logical schema of data We use ROLAP (Relational On-Line Analytical Processing) model to represent multidimensional data ROLAP uses the relational data model, which means that data is stored in relations Given the DFM representation of multidimensional data, two schemas are used: star schema snowflake schema 3/18
  • 4. ROLAP Star schema Each dimension is represented by a relation such that: the primary key of the relation is the primary key of the dimension the attributes of the relation describe all aggregation levels of the dimension A fact is represented by a relation such that: the primary key of the relation is the set of primary keys imported from all the dimension tables the attributes of the relation are the measures of the fact Pros and Cons few joins are needed during query execution dimension tables are denormalized denormalization introduces redundancy 4/18
  • 5. ROLAP Star schema: example MaxAmount Time Policy Amount Class IdTime IdPolicy Day Class Year Month Amount Year MaxAmount IdPolicy Month Accident Accident Motivation IdTime IdPolicy NrOfAccidents IdCustomer Day Cost IdMotivation NrOfAccidents IDCustomer Cost Customer Motivation IdCustomer IdMotivation BYear City Region Birthday Motivation Gender Sex City Region 5/18
  • 6. ROLAP Snowflake schema Each (primary) dimension is represented by a relation: the primary key of the relation is the primary key of the dimension the attributes of the relation directly depend by the primary key a set of foreign keys is used to access information at different levels of aggregation. Such information is part of the secondary dimensions and is stored in dedicated relations A fact is represented by a relation such that: the primary key of the relation is the set of primary keys imported from all and only the primary dimension tables the attributes of the relation are the measures of the fact Pros and Cons denormalization is reduced less memory space is required a lot of joins can be required if they involve attributes in secondary dimension tables 6/18
  • 7. ROLAP Snowflake schema: example Category Time Furniture IdTime IdFurniture Type Material Day Material Year Month Type Year Category IdFurniture Month Sale Sale IdTime IdFurniture Quantity IdCustomer Income Day Discount Quantity Income IDCustomer Discount Region Customer Place IdCustomer IdPlace BYear City State IdPlace City BirthYear Region Sex Sex State 7/18
  • 8. Exercise 1 Wine company An online order wine company requires the designing of a datawarehouse to record the quantity and sales of its wines to its customers. Part of the original database is composed by the following tables: CUSTOMER (Code, Name, Address, Phone, BDay, Gender) WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice, Class) CLASS (Code, Name, Region) TIME (TimeStamp, Date, Year) ORDER (Customer, Wine, Time, nrBottles, nrCases) Note that the tables represent the main entities of the ER schema, thus it is necessary to derive the significant relashionships among them in order to correctly design the data warehouse. 8/18
  • 9. Exercise 1: A possible solution Snowflake schema FACT Sales MEASURES Quantity, Cost DIMENSIONS Customer, Area, Time, Wine → Class Customer Wine CustomerCode WineCode Birthday ClassCode Gender Type SalesTable Vintage WineCode CustomerCode OrderTimeCode Quantity Time Class Cost TimeCode ClassCode Date Region Year 9/18
  • 10. Exercise 2 Real estate agency Let us consider the case of a real estate agency whose database is composed by the following tables: OWNER (IDOwner, Name, Surname, Address, City, Phone) ESTATE (IDEstate, IDOwner, Category, Area, City, Province, Rooms, Bedrooms, Garage, Meters) CUSTOMER (IDCust, Name, Surname, Budget, Address, City, Phone) AGENT (IDAgent, Name, Surname, Office, Address, City, Phone) AGENDA (IDAgent, Data, Hour, IDEstate, ClientName) VISIT (IDEstate,IDAgent, IDCust, Date, Duration) SALE (IDEstate,IDAgent, IDCust, Date, AgreedPrice, Status) RENT (IDEstate,IDAgent, IDCust, Date, Price, Status, Time) 10/18
  • 11. Exercise 2 Real estate agency Goal: Provide a supervisor with an overview of the situation. The supervisor must have a global view of the business, in terms of the estates the agency deals with and of the agents’ work. Questions: 1. Design a conceptual schema for the DW. 2. What facts and dimensions do you consider? 3. Design a Star Schema or Snowflake Schema for the DW. Write the following SQL queries: How many customers have visited properties of at least 3 different categories? What is the average duration of visits per property category? Who has paid the highest price among the customers that have viewed properties of at least 3 different categories? Who has bought a flat for the highest price w.r.t. each month? What kind of property sold for the highest price w.r.t each city and month? 11/18
  • 12. Exercise 2: A possible solution Facts and dimensions Points 1 and 2 are left as homework. In particular it is required to discuss the facts of interest (with respect to your point of view) and then define, for each fact, the attribute tree, dimensions and measures with the corresponding glossary. The following ideas will be used during the solution of the exercise: supervisors should be able to control the sales of the agency FACT Sales MEASURES OfferPrice, AgreedPrice, Status DIMENSIONS EstateID, OwnerID, CustomerID, AgentID, TimeID supervisors should be able to control the work of the agents by analyzing the visits to the estates, which the agents are in charge of FACT Viewing MEASURES Duration DIMENSIONS EstateID, CustomerID, AgentID, TimeID 12/18
  • 13. Exercise 2: A possible solution Star schema Estate Owner EstateID SalesTable OwnerID Category EstateID Name Area OwnerID Surname City CustomerID Address Province AgentID City Rooms TimeID Phone Bedrooms OfferPrice Agent Garage AgreedPrice AgentID Meters Status Name Sheet Surname Map Office Address Customer City CustomerID Phone ViewingTable Name EstateID Surname Time CustomerID Budget TimeID AgentID Address Day TimeID City Month Duration Phone Year 13/18
  • 14. Exercise 2: A possible solution Snowflake schema Estate SalesTable Time EstateID EstateID TimeID PlaceID TimeID Day Category CustomerID Month Rooms OwnerID Year Bedrooms AgentID Owner Garage OfferPrice OwnerID Meters AgreedPrice PlaceID Sheet Status Name Map Surname Customer ViewingTable Address CustomerID TimeID Phone PlaceID EstateID Agent Name CustomerID AgentID Surname AgentID PlaceID Budget Duration Name Address Surname Place Phone Office PlaceID Address City Phone Province Area 14/18
  • 15. Exercise 2: A possible solution SQL queries wrt the star schema How many customers have visited properties of at least 3 different categories? CREATE VIEW Cust3Cat AS SELECT V.CustomerID FROM ViewingTable V, Estate E WHERE V.EstateID = E.EstateID GROUP BY V.CustomerID HAVING COUNT(DISTINCT E.Category) >=3 SELECT COUNT(*) FROM Cust3Cat 15/18
  • 16. Exercise 2: A possible solution SQL queries wrt the star schema Who has paid the highest price among the customers that have viewed properties of at least 3 different categories? SELECT C.CustomerID FROM Cust3Cat C, SalesTable S WHERE C.CustID = S.CustID AND S.AgreedPrice IN (SELECT MAX(S.AgreedPrice) FROM Cust3Cat C1, SalesTable S1 WHERE C1.CustomerID = S1.CustomerID) Average duration of visits per property category SELECT E.Category, AVG(V.Duration) FROM ViewingTable V, Estate E WHERE V.EstateID = E.EstateID GROUP BY E.Category 16/18
  • 17. Exercise 2: A possible solution SQL queries wrt the star schema Who has bought a flat for the highest price w.r.t. each month? SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice FROM SalesTable S, Estate E, Time T WHERE S.EstateID = E.EstateID AND S.TimeID = T.TimeID AND E.Category = “flat” AND (T.Month, T.Year, S.AgreedPrice) IN ( SELECT T1.Month, T1.Year, MAX(S1.AgreedPrice) FROM SalesTable S1, Estate E1, Time T1 WHERE S1.EstateID = E1.EstateID AND S1.TimeID = T1.TimeID AND E1.Category = “flat” GROUP BY T1.Month, T1.Year 17/18
  • 18. Exercise 2: A possible solution SQL queries wrt the star schema What kind of property sold for the highest price w.r.t each city and month? SELECT E.Category, E.City, T.Moth, T.Year, E.AgreedPrice FROM SalesTable S, Time T, Estate E WHERE S.TimeID = T.TimeID AND E.EstateID = S.EstateID AND (P.AgreedPrice, P.City, T.month, T.year) IN ( SELECT MAX(E1.AgreedPrice), E1.City, T1.Month, T1.Year) FROM SalesTable S1, Time T1, Estate E1 WHERE S1.TimeID = T1.TimeID AND E1.EstateID = S1.EstateID GROUP BY T.Month, T.Year, E.City) 18/18