Business Information Systems

Dimensional Analysis
Prithwis Mukerjee, Ph.D.
Dimensional Models
A denormalized relational model
– Made up of tables with attributes
– Relationships defined by keys and foreign keys
Organized for understandability and ease of reporting rather than update
Queried and maintained by SQL or special-purpose management tools

2
From Relational to Dimensional
• Relational Model
– Designed from the perspective of process efficiency
– “Normalised” data structures (Entity Relationship Model)
– Used for transactional, or operational, systems (OLTP: OnLine Transaction Processing)
– Based on data that is current and non-redundant

• Dimensional Model
– Designed from the perspective of subject
– “De-normalised” data structures, in blatant violation of normalisation
– Used for analysis of aggregated data (OLAP: OnLine Analytical Processing)
– Based on data that is historical and may be redundant

Areas named on the slide: Sales, Marketing, Customers

3
ER vs. Dimensional Models
• ER model (the transaction processing model)
– One table per entity
– Minimize data redundancy
– Optimize update
• Dimensional model (the data warehousing model)
– One fact table for data organization
– Maximize understandability
– Optimized for retrieval

4
Strengths of the Dimensional Model
Predictable, standard framework
Responds well to changes in user reporting needs
Relatively easy to add data without reloading tables
Standard design approaches have been developed
A number of products support the dimensional model

“The Data Warehouse Toolkit” by Ralph Kimball & Margy Ross
“The Data Warehouse Lifecycle Toolkit” by Ralph Kimball, Margy Ross, et al.

5
A Transactional Database
Countries: CountryID, Description
States: StateID, CountryID, Desc
Addresses: AddressID, StateID, Street
Customers: CustomerID, AddressID, Name
OrderHeader: OrderHeaderID, CustomerID, OrderDate, FreightAmount
OrderDetails: OrderHeaderID, ProductID, Amount
Products: ProductID, Description, Size

6
A Dimensional Model
Customers: CustomerID, Name, Street, State, Country
Time: TimeID, Date, Month, Quarter, Year
Products: ProductID, Description, Size, Subcategory, Category
FactSales: CustomerID, ProductID, TimeID, SalesAmount
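
The four tables of this dimensional model can be sketched in SQLite through Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the column types, the sample rows, and the function names `build_star` and `sales_by_category_and_year` are made up for the example and are not part of the slides.

```python
import sqlite3

def build_star(conn):
    """Create the four tables from the slide and load a few made-up rows."""
    conn.executescript("""
        CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT,
                                Street TEXT, State TEXT, Country TEXT);
        CREATE TABLE "Time"    (TimeID INTEGER PRIMARY KEY, "Date" TEXT,
                                Month INTEGER, Quarter INTEGER, Year INTEGER);
        CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, Description TEXT,
                                Size TEXT, Subcategory TEXT, Category TEXT);
        CREATE TABLE FactSales (CustomerID INTEGER REFERENCES Customers(CustomerID),
                                ProductID  INTEGER REFERENCES Products(ProductID),
                                TimeID     INTEGER REFERENCES "Time"(TimeID),
                                SalesAmount REAL);
    """)
    # Sample rows are assumptions, chosen only so the query has data to group.
    conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?, ?)", [
        (1, "Asha", "1 Park St", "WB", "India"),
        (2, "Ben", "9 High St", "NSW", "Australia"),
    ])
    conn.executemany('INSERT INTO "Time" VALUES (?, ?, ?, ?, ?)', [
        (1, "2009-01-15", 1, 1, 2009),
        (2, "2009-04-02", 4, 2, 2009),
        (3, "2010-01-20", 1, 1, 2010),
    ])
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?, ?)", [
        (1, "LeapPad", "Small", "Learning", "Toy"),
        (2, "Notebook", "A4", "Paper", "Stationery"),
    ])
    conn.executemany("INSERT INTO FactSales VALUES (?, ?, ?, ?)", [
        (1, 1, 1, 100.0),
        (2, 1, 2, 50.0),
        (1, 2, 1, 20.0),
        (2, 2, 3, 30.0),
        (1, 1, 3, 70.0),
    ])

def sales_by_category_and_year(conn):
    """Analyse the SalesAmount fact 'by' product Category and 'by' Year."""
    return conn.execute("""
        SELECT p.Category, t.Year, SUM(f.SalesAmount)
        FROM FactSales AS f
        JOIN Products AS p ON p.ProductID = f.ProductID
        JOIN "Time"   AS t ON t.TimeID = f.TimeID
        GROUP BY p.Category, t.Year
        ORDER BY p.Category, t.Year
    """).fetchall()

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star(connection)
    for row in sales_by_category_and_year(connection):
        print(row)
```

Every report query has the same shape: join the fact table to whichever dimensions supply the "by" attributes, then sum the fact.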

7
Extract Transform Load

Relational => Dimensional Model
Process Oriented => Subject Oriented
Transactional => Aggregate
Current => Historic
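
The three stages can be sketched as plain Python functions. This is a toy illustration, not a real ETL tool: the rows are made up, the in-memory `warehouse` dict stands in for a database, and the function names `extract`, `transform` and `load` are assumptions. Table and column names are borrowed from the transactional and dimensional schemas shown earlier in the deck.

```python
from datetime import date

def extract():
    """Stand-in for reading the operational (OLTP) tables; rows are made up."""
    order_header = [
        {"OrderHeaderID": 1, "CustomerID": 10, "OrderDate": date(2009, 1, 15)},
        {"OrderHeaderID": 2, "CustomerID": 11, "OrderDate": date(2009, 5, 3)},
    ]
    order_details = [
        {"OrderHeaderID": 1, "ProductID": 100, "Amount": 40.0},
        {"OrderHeaderID": 1, "ProductID": 101, "Amount": 60.0},
        {"OrderHeaderID": 2, "ProductID": 100, "Amount": 25.0},
    ]
    return order_header, order_details

def transform(order_header, order_details):
    """Join details to headers and derive Time rows plus FactSales rows."""
    headers = {h["OrderHeaderID"]: h for h in order_header}
    time_rows = {}  # calendar date -> Time dimension row
    fact_rows = []
    for detail in order_details:
        header = headers[detail["OrderHeaderID"]]
        day = header["OrderDate"]
        if day not in time_rows:
            time_rows[day] = {
                "TimeID": len(time_rows) + 1,  # generated key for the date
                "Date": day,
                "Month": day.month,
                "Quarter": (day.month - 1) // 3 + 1,
                "Year": day.year,
            }
        fact_rows.append({
            "CustomerID": header["CustomerID"],
            "ProductID": detail["ProductID"],
            "TimeID": time_rows[day]["TimeID"],
            "SalesAmount": detail["Amount"],
        })
    return list(time_rows.values()), fact_rows

def load(warehouse, time_rows, fact_rows):
    """Append the transformed rows to an in-memory 'warehouse' dict."""
    warehouse.setdefault("Time", []).extend(time_rows)
    warehouse.setdefault("FactSales", []).extend(fact_rows)
    return warehouse

if __name__ == "__main__":
    result = load({}, *transform(*extract()))
    print(len(result["Time"]), "time rows,", len(result["FactSales"]), "fact rows")
```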

8
Facts & Dimensions
• There are two main types of objects in a dimensional
model
– Facts are quantitative measures that we wish to analyse and
report on.
– Dimensions contain textual descriptors of the business. They
provide context for the facts.

9
Fact & Dimension Tables
• FACTS
– Contain two or more foreign keys
– Tend to have huge numbers of records
– Useful facts tend to be numeric and additive
• DIMENSIONS
– Contain text and descriptive information
– 1 in a 1-M relationship
– Generally the source of interesting constraints
– Typically contain the attributes for the SQL answer set

10
Fact Table
Measurements associated with a specific business process
Grain: level of detail of the table
Process events produce fact records
Facts (attributes) are usually
– Numeric
– Additive
Derived facts included
Foreign (surrogate) keys refer to dimension tables (entities)
Classification values help define subsets
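
To show why numeric, additive facts are convenient, here is a small sketch that rolls the same fact rows up by different dimensions. The rows, the function name `roll_up` and the derived fact `gross_profit` are illustrative assumptions; the measure names `dollars_sold`, `units_sold` and `dollars_cost` are borrowed from the ORDER-LINE example later in the deck.

```python
from collections import defaultdict

MEASURES = ("dollars_sold", "units_sold", "dollars_cost")

# Made-up fact rows at the grain of one store/product combination.
FACT_ROWS = [
    {"store": "S1", "product": "P1", "dollars_sold": 100.0, "units_sold": 10, "dollars_cost": 60.0},
    {"store": "S1", "product": "P2", "dollars_sold": 50.0, "units_sold": 2, "dollars_cost": 30.0},
    {"store": "S2", "product": "P1", "dollars_sold": 200.0, "units_sold": 20, "dollars_cost": 120.0},
]

def roll_up(rows, by):
    """Sum every additive measure for each value of the chosen dimension."""
    totals = defaultdict(lambda: {m: 0 for m in MEASURES})
    for row in rows:
        for measure in MEASURES:
            totals[row[by]][measure] += row[measure]
    for total in totals.values():
        # A derived fact, computed from the additive facts.
        total["gross_profit"] = total["dollars_sold"] - total["dollars_cost"]
    return dict(totals)

if __name__ == "__main__":
    print(roll_up(FACT_ROWS, "store"))
    print(roll_up(FACT_ROWS, "product"))
```

Because the measures are additive, the grand total is the same whichever dimension is used for grouping.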

11
Dimension Tables
Entities describing the objects of the process
Conformed dimensions cross processes
Attributes are descriptive
– Text
– Numeric

Surrogate keys
Less volatile than facts (1:m with the fact table)
Null entries
Date dimensions
Produce “by” questions

12
The Bus Matrix

[Figure: bus matrix of business processes (rows) against dimensions (columns); an X marks each dimension that a process uses.]
Dimensions: Date, Product, Store, Promotion, Warehouse, Vendor, Contract, Shipper
Processes: Retail Sales, Retail Inventory, Retail Deliveries, Warehouse Inventory, Warehouse Deliveries, Purchase Orders

13
Business Model
As always in life, there are some disadvantages to 3NF:
Performance can be truly awful. Most of the work that is performed on denormalizing a data model is an attempt to reach performance objectives.
The structure can be overwhelmingly complex. We may wind up creating many small relations which the user might think of as a single relation or group of data.

14
The 4 Step Design Process
Choose the Data Mart
Declare the Grain
Choose the Dimensions
Choose the Facts

15
Structural Dimensions
The first step is the development of the structural dimensions. This step corresponds very closely to what we normally do in a relational database.
The star architecture that we will develop here depends upon taking the central intersection entities as the fact tables and building the foreign key => primary key relations as dimensions.

16
Steps in dimensional modeling
Select an associative entity for a fact table
Determine granularity
Replace operational keys with surrogate keys
Promote the keys from all hierarchies to the fact table
Add date dimension
Split all compound attributes
Add necessary categorical dimensions
Fact (varies with time) / Attribute (constant)
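
The step "Replace operational keys with surrogate keys" can be sketched as a lookup that hands out a generated integer for each operational key. The class name `SurrogateKeyMap`, the function `to_fact_rows` and the sample order lines are hypothetical, introduced only for this illustration; real warehouses usually generate these keys in the database or the ETL tool.

```python
class SurrogateKeyMap:
    """Hands out a generated integer key for each operational key it sees."""

    def __init__(self):
        self._keys = {}

    def key_for(self, operational_key):
        # The first sighting creates the next key; later sightings reuse it.
        if operational_key not in self._keys:
            self._keys[operational_key] = len(self._keys) + 1
        return self._keys[operational_key]

def to_fact_rows(order_lines, product_keys, store_keys):
    """Swap operational keys (SKU, store_ID) for surrogate keys in each row."""
    return [
        {
            "product_key": product_keys.key_for(line["SKU"]),
            "store_key": store_keys.key_for(line["store_ID"]),
            "units_sold": line["units_sold"],
        }
        for line in order_lines
    ]

if __name__ == "__main__":
    product_map, store_map = SurrogateKeyMap(), SurrogateKeyMap()
    sample_lines = [
        {"SKU": "LP2105", "store_ID": "S-07", "units_sold": 2},
        {"SKU": "AB1000", "store_ID": "S-07", "units_sold": 1},
    ]
    print(to_fact_rows(sample_lines, product_map, store_map))
```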

17
The Big Picture

OLTP
Customer: Customer ID, Cust Name, Cust Address
Order: Order ID, Customer ID (FK), Date
Order item: Order ID (FK), Item ID, Product ID (FK), Quantity, Value
Product: Product ID, Product Name, Product Desc, Unit Price

OLAP
Customer: Customer ID, Cust Name, Cust Address
Transaction (fact): Transaction ID, Product ID (FK), Client ID (FK), Date, Quantity, Value
Product: Product ID, Product Name, Product Desc, Unit Price

18
Converting an E-R Diagram
Determine the purpose of the mart
Identify an association table as the central fact table
Determine facts to be included
Replace all keys with surrogate keys
Promote foreign keys in related tables to the fact table
Add time dimension
Refine the dimension tables

19
Fact Tables
Represent a process or reporting environment that is of value to the organization
It is important to determine the identity of the fact table and specify exactly what it represents.
Typically correspond to an associative entity in the E-R model

20
Grain (unit of analysis)
The grain determines what each fact record represents: the level of detail.
For example
– Individual transactions
– Snapshots (points in time)
– Line items on a document
Generally better to focus on the smallest grain

21
Facts
Measurements associated with fact table records at fact table granularity
Normally numeric and additive
Non-key attributes in the fact table
Attributes in dimension tables are constants. Facts vary with the granularity of the fact table

22
Dimensions
A table (or hierarchy of tables) connected with the fact table with keys and foreign keys
Preferably single valued for each fact record (1:m)
Connected with surrogate (generated) keys, not operational keys
Dimension tables contain text or numeric attributes

23
ERD

CUSTOMER
customer_ID (PK)
customer_name
purchase_profile
credit_profile
address

STORE
store_ID (PK)
store_name
address
district
floor_type

CLERK
clerk_id (PK)
clerk_name
clerk_grade

ORDER
order_num (PK)
customer_ID (FK)
store_ID (FK)
clerk_ID (FK)
date

PRODUCT
SKU (PK)
description
brand
category

ORDER-LINE
order_num (PK) (FK)
SKU (PK) (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost

PROMOTION
promotion_NUM (PK)
promotion_name
price_type
ad_type

24
DIMENSIONAL MODEL

FACT
time_key (FK)
store_key (FK)
clerk_key (FK)
product_key (FK)
customer_key (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost

TIME
time_key (PK)
SQL_date
day_of_week
month

STORE
store_key (PK)
store_ID
store_name
address
district
floor_type

CLERK
clerk_key (PK)
clerk_id
clerk_name
clerk_grade

PRODUCT
product_key (PK)
SKU
description
brand
category

CUSTOMER
customer_key (PK)
customer_name
purchase_profile
credit_profile
address

PROMOTION
promotion_key (PK)
promotion_name
price_type
ad_type

25
Date Dimensions
Fiscal hierarchy: Fiscal Year, Fiscal Quarter, Fiscal Month, Fiscal Week, Day
Calendar hierarchy: Calendar Year, Calendar Quarter, Calendar Month, Calendar Week, Day
Day attributes: Type of Day, Day of Week, Holiday

26
Attribute Name | Attribute Description | Sample Values
Day | The specific day that an activity took place. | 06/04/1998; 06/05/1998
Day of Week | The specific name of the day. | Monday; Tuesday
Holiday | Identifies that this day is a holiday. | Easter; Thanksgiving
Type of Day | Indicates whether this day is a weekday or a weekend day. | Weekend; Weekday
Calendar Week | The week ending date, always a Saturday. Note that WE denotes … | WE 06/06/1998; WE 06/13/1998
Calendar Month | The calendar month. | January, 1998; February, 1998
Calendar Quarter | The calendar quarter. | 1998Q1; 1998Q4
Calendar Year | The calendar year. | 1998
Fiscal Week | The week that represents the corporate calendar. Note that the F … | F Week 1 1998; F Week 46 1998
Fiscal Month | The fiscal period comprised of 4 or 5 weeks. Note that the F in the data … | F January, 1998; F February, 1998
Fiscal Quarter | The grouping of 3 fiscal months. | F 1998Q1; F 1998Q2
Fiscal Year | The grouping of 52 fiscal weeks / 12 fiscal months that comprise the financial year. | F 1998; F 1999
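
The calendar attributes in this table can be derived from a date alone, as the following sketch suggests. The function name `date_dimension_row` and the `holidays` lookup are assumptions made for the example. The fiscal attributes are left out because they depend on a corporate calendar that the slides do not define.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

def date_dimension_row(day, holidays=None):
    """Build the calendar attributes of one date-dimension row.

    `holidays` is an assumed lookup of date -> holiday name.
    """
    holidays = holidays or {}
    # date.weekday() returns Monday=0 ... Saturday=5, Sunday=6.
    week_ending = day + timedelta(days=(5 - day.weekday()) % 7)
    return {
        "Day": day.strftime("%m/%d/%Y"),
        "Day of Week": DAY_NAMES[day.weekday()],
        "Holiday": holidays.get(day),
        "Type of Day": "Weekend" if day.weekday() >= 5 else "Weekday",
        "Calendar Week": "WE " + week_ending.strftime("%m/%d/%Y"),
        "Calendar Month": f"{MONTH_NAMES[day.month - 1]}, {day.year}",
        "Calendar Quarter": f"{day.year}Q{(day.month - 1) // 3 + 1}",
        "Calendar Year": day.year,
    }

if __name__ == "__main__":
    for offset in range(3):
        print(date_dimension_row(date(1998, 6, 4) + timedelta(days=offset)))
```

A date dimension is normally generated once, for every day in the range of interest, and then loaded like any other dimension table.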
27
Snowflaking & Hierarchies
Efficiency vs Space
Understandability
M:N relationships

28
Star Schema

dimTime: …
dimProduct: ProductID, ProductName, CategoryName, SubCategoryName
dimCustomer: …
factSales: ProductID, TimeID, CustomerID, SalesAmount

29
Snowflake Schema

dimSubCategory: SubCategoryID, Description
dimCategory: CategoryID, subCategoryID, Description
dimProduct: ProductID, CategoryID, Description
factSales: ProductID, TimeID, CustomerID, SalesAmount
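
The difference between the two layouts can be sketched with SQLite through Python's `sqlite3` module. The snowflake tables follow the key columns as they appear on the slide (dimProduct refers to dimCategory, which refers to dimSubCategory). The sample rows, the flattened table name `dimProductStar` and the function names are assumptions made for this illustration.

```python
import sqlite3

def build_snowflake(conn):
    """Create the snowflake tables from the slide with a few made-up rows."""
    conn.executescript("""
        CREATE TABLE dimSubCategory (SubCategoryID INTEGER PRIMARY KEY,
                                     Description TEXT);
        CREATE TABLE dimCategory    (CategoryID INTEGER PRIMARY KEY,
                                     subCategoryID INTEGER
                                         REFERENCES dimSubCategory(SubCategoryID),
                                     Description TEXT);
        CREATE TABLE dimProduct     (ProductID INTEGER PRIMARY KEY,
                                     CategoryID INTEGER
                                         REFERENCES dimCategory(CategoryID),
                                     Description TEXT);
        CREATE TABLE factSales      (ProductID INTEGER, TimeID INTEGER,
                                     CustomerID INTEGER, SalesAmount REAL);
        INSERT INTO dimSubCategory VALUES (1, 'Learning'), (2, 'Paper');
        INSERT INTO dimCategory    VALUES (10, 1, 'Toy'), (20, 2, 'Stationery');
        INSERT INTO dimProduct     VALUES (100, 10, 'LeapPad'),
                                          (101, 20, 'Notebook'),
                                          (102, 10, 'Blocks');
        INSERT INTO factSales      VALUES (100, 1, 1, 40.0), (101, 1, 1, 10.0),
                                          (102, 2, 2, 25.0), (100, 2, 2, 35.0);
    """)

def flatten_to_star(conn):
    """Denormalize the product hierarchy into one star-style dimension."""
    conn.execute("""
        CREATE TABLE dimProductStar AS
        SELECT p.ProductID,
               p.Description AS ProductName,
               c.Description AS CategoryName,
               s.Description AS SubCategoryName
        FROM dimProduct AS p
        JOIN dimCategory AS c ON c.CategoryID = p.CategoryID
        JOIN dimSubCategory AS s ON s.SubCategoryID = c.subCategoryID
    """)

def sales_by_category_snowflake(conn):
    """Snowflake query: two joins are needed to reach the category name."""
    return conn.execute("""
        SELECT c.Description, SUM(f.SalesAmount)
        FROM factSales AS f
        JOIN dimProduct AS p ON p.ProductID = f.ProductID
        JOIN dimCategory AS c ON c.CategoryID = p.CategoryID
        GROUP BY c.Description
        ORDER BY c.Description
    """).fetchall()

def sales_by_category_star(conn):
    """Star query: one join, because the name sits in the dimension itself."""
    return conn.execute("""
        SELECT p.CategoryName, SUM(f.SalesAmount)
        FROM factSales AS f
        JOIN dimProductStar AS p ON p.ProductID = f.ProductID
        GROUP BY p.CategoryName
        ORDER BY p.CategoryName
    """).fetchall()

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_snowflake(connection)
    flatten_to_star(connection)
    print(sales_by_category_snowflake(connection))
    print(sales_by_category_star(connection))
```

Both queries return the same totals. The star form repeats the category text on every product row in exchange for shorter, simpler queries.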

30
Slowly Changing Dimensions
(Addresses, Managers, etc.)
Type 1: Store only the current value, overwrite previous value
Type 2: Create a dimension record for each value (with or without date stamps)
Type 3: Create an attribute in the dimension record for previous value

31
Examples
Original
ProductKey | Description | Category | SKU
21553 | LeapPad | Education | LP2105

Type 1
ProductKey | Description | Category | SKU
21553 | LeapPad | Toy | LP2105

Type 2
ProductKey | Description | Category | SKU
21553 | LeapPad | Education | LP2105
44631 | LeapPad | Toy | LP2105

Type 3
ProductKey | Description | Category | OldCat | SKU
21553 | LeapPad | Toy | Education | LP2105

Hybrid
ProductKey | Description | Category | OldCat | SKU
21335 | LeapPad | Electronics | Education | LP2105
44631 | LeapPad | Electronics | Toy | LP2105
68122 | LeapPad | Education | Electronics | LP2105

32
Type 1 Slowly Changing Dimension
The simplest form
Only updates existing records
Overwrites history
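
A minimal sketch of a Type 1 update, using the customer row from this deck's example. The function name `apply_type1` is hypothetical, and the direction of the change (NSW to VIC) is an assumption taken from the Type 2 example.

```python
def apply_type1(dimension, business_key, changes):
    """Type 1: overwrite the attributes in place, so no history survives."""
    for row in dimension:
        if row["Code"] == business_key:
            row.update(changes)
    return dimension

customers = [
    {"CustomerID": 1, "Code": "K001", "Name": "Miranda Kerr",
     "State": "NSW", "Gender": "F"},
]

# Assumed change: the customer moves from NSW to VIC.
apply_type1(customers, "K001", {"State": "VIC"})
```

After the update there is still one row, and nothing in it records that the state was ever NSW.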

33
Type 1 Slowly Changing Dimension

CustomerID | Code | Name | State | Gender
1 | K001 | Miranda Kerr | VIC / NSW (overwritten in place) | F

34
Type 2 Slowly Changing Dimension
Allows the recording of changes of state over time
Generates a new record each time the state changes
Usually requires the use of effective dates when joining to facts.
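
A minimal sketch of a Type 2 change and of the effective-date lookup needed when loading facts, using the customer and dates from this deck's example. The function names `apply_type2` and `key_as_of` are hypothetical. Leaving the open row's End as `None` is one option; a far-future date is another.

```python
from datetime import date, timedelta

def apply_type2(dimension, business_key, changes, change_date):
    """Type 2: close the current row and append a new row with a new key."""
    current = next(r for r in dimension
                   if r["Code"] == business_key and r["End"] is None)
    current["End"] = change_date - timedelta(days=1)
    new_row = dict(current)
    new_row.update(changes)
    new_row.update({
        "CustomerID": max(r["CustomerID"] for r in dimension) + 1,
        "Start": change_date,
        "End": None,  # open-ended: this is now the current row
    })
    dimension.append(new_row)
    return new_row

def key_as_of(dimension, business_key, as_of):
    """Find the surrogate key valid on a date (business key + effective dates)."""
    for r in dimension:
        if (r["Code"] == business_key and r["Start"] <= as_of
                and (r["End"] is None or as_of <= r["End"])):
            return r["CustomerID"]
    return None

customers = [
    {"CustomerID": 1, "Code": "K001", "Name": "Miranda Kerr",
     "State": "NSW", "Gender": "F",
     "Start": date(2009, 1, 1), "End": None},
]
apply_type2(customers, "K001", {"State": "VIC"}, date(2009, 2, 24))
```

A fact dated 1 February 2009 would be loaded with `key_as_of(customers, "K001", date(2009, 2, 1))`, which picks the NSW row, while a fact dated in March picks the VIC row.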

35
Type 2 Slowly Changing Dimension

CustomerID | Code | Name | State | Gender | Start | End
1 | K001 | Miranda Kerr | NSW | F | 1/1/09 | 23/2/09 (was <NULL>)
2 | K001 | Miranda Kerr | VIC | F | 24/2/09 | <NULL>

36
Type 3 Slowly Changing Dimension
De-normalized change tracking
Only keeps a limited history
Stores changes in separate columns
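
A minimal sketch of a Type 3 change, using the column names from this deck's example. The function name `apply_type3` is hypothetical, and the sequence of state changes is an assumption made for illustration.

```python
def apply_type3(row, attribute, new_value):
    """Type 3: shift the current value into the 'Prev' column, then overwrite."""
    row["Prev " + attribute] = row["Current " + attribute]
    row["Current " + attribute] = new_value
    return row

customer = {"CustomerID": 1, "Code": "K001", "Name": "Miranda Kerr",
            "Current State": "NSW", "Gender": "F", "Prev State": None}

# Assumed first change: NSW to VIC. Both values are still visible.
apply_type3(customer, "State", "VIC")
after_first_change = dict(customer)

# Assumed second change: VIC to QLD. NSW is now lost, which is why
# Type 3 keeps only a limited history.
apply_type3(customer, "State", "QLD")
```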

37
Type 3 Slowly Changing Dimension

CustomerID | Code | Name | Current State | Gender | Prev State
1 | K001 | Miranda Kerr | NSW | F | <NULL>
New state value shown on the slide: VIC

38

04 Dimensional Analysis - v6

Editor's notes

  1. A simplistic transactional schema showing 7 tables relating to sales orders.
  2. This is a star schema (snowflake schemas are discussed later) showing 4 tables that relate to the previous transactional schema. State and Country have been denormalized under Customer. Dimensions are in blue; these are the things that we analyse “by” (e.g. by Time, by Customer, by Region). The fact is in yellow; facts are usually quantitative things that we are interested in.
  3. We already have the data in a data model, so why create another data model? What is currently called “Data Warehousing” or “Business Intelligence” was originally often called “Decision Support Systems”. We already have all the data in the OLTP system, so why replicate it in a dimensional model? The two differ: atomic vs. summary; supports transaction throughput vs. supports aggregate queries; current vs. historic.
  4. Facts work best if they are additive. Dimensions allow us to “slice & dice” the facts into meaningful groups. They provide context.
  5. There are some changes where it is valid to overwrite history. When someone gets married and changes their name, they may want to carry the history of their previous purchases over to their new name rather than see a split history.
  6. This makes inserts into your fact table more expensive, as you always need to match on the effective dates as well as the business key. Sometimes people keep a “Current” flag. Another approach, rather than putting nulls in the End date, is to put an arbitrary date well in the future; this can make the join logic a bit simpler.
  7. This type of change tracking is more useful when there is a one-off change, such as a change in sales regions, where you want to see history re-cast into the new regions but may also want to compare the old and new regions.