3. The need for ER modeling?
Problems with early COBOLian data processing
systems.
Data redundancies
From flat file to Table, each entity ultimately
becomes a Table in the physical schema.
Simple O(n^2) join to work with tables.
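The O(n^2) join mentioned above can be sketched as a nested-loop join, where every row of one table is compared with every row of the other. This is only an illustrative sketch: the function name `nested_loop_join` and the `customers` and `orders` sample rows are assumptions, not taken from the slides.

```python
def nested_loop_join(left, right, key):
    """Join two lists of dict rows by comparing every pair of rows.

    With n rows on each side this performs n * n comparisons.
    """
    joined = []
    for l_row in left:
        for r_row in right:
            if l_row[key] == r_row[key]:
                # Merge the two matching rows into one output row.
                joined.append({**l_row, **r_row})
    return joined


# Hypothetical sample tables, used only for illustration.
customers = [
    {"cust_id": 1, "name": "Ali"},
    {"cust_id": 2, "name": "Sara"},
]
orders = [
    {"order_id": 10, "cust_id": 1, "amount": 250},
    {"order_id": 11, "cust_id": 2, "amount": 400},
    {"order_id": 12, "cust_id": 1, "amount": 125},
]

rows = nested_loop_join(customers, orders, "cust_id")
for row in rows:
    print(row)
```

Real database engines usually avoid the full pairwise comparison by using indexes or hash-based joins; the nested loop is shown only because it is the simplest form.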
4. Why has ER modeling been so successful?
Coupled with normalization, it drives all the
redundancy out of the database.
Change (or add or delete) the data at just one
point.
Can be used with indexing for very fast access.
Resulted in success of OLTP systems.
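A minimal sketch of the "change at just one point" idea, using Python's built-in `sqlite3` module. The `customer` and `orders` tables, their columns, and the sample values are assumptions made for illustration; the slides do not define them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design: the city is stored once per customer,
    -- not repeated on every order row.
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER REFERENCES customer(cust_id),
        amount   REAL
    );
    -- Index on the join column to support fast lookups by customer.
    CREATE INDEX idx_orders_cust ON orders(cust_id);

    INSERT INTO customer VALUES (1, 'Lahore');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 400.0);
""")

# One UPDATE at one point changes the city seen by every order.
conn.execute("UPDATE customer SET city = 'Karachi' WHERE cust_id = 1")

cities = conn.execute("""
    SELECT o.order_id, c.city
    FROM orders o JOIN customer c ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""").fetchall()
print(cities)
```

Had the city been copied onto each order row, the same change would have required touching every copy, which is the redundancy that normalization removes.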
5. Need for DM: Un-answered Qs
Let's have a look at a typical ER data model first.
Some observations:
All tables look alike; as a consequence, it is difficult to identify:
Which table is more important?
Which is the largest?
Which tables contain numerical measurements of the
business?
Which tables contain nearly static descriptive attributes?
6. Need for DM: Complexity of Representation
Many topologies for the same ER diagram, all appearing
different.
Very hard to visualize and remember.
A large number of possible connections between any two (or
more) tables.
[Figure: two different layouts of the same ER diagram, each with tables numbered 1 to 12.]
7. Need for DM: The Paradox
The Paradox: Trying to make information accessible using tables
resulted in an inability to query them!
ER and normalization result in a large number of tables which are:
Hard to understand by the users (DB programmers)
Hard to navigate optimally by DBMS software
Real value of ER is in using tables individually or in pairs
Too complex for queries that span multiple tables with a large
number of records
8. ER vs. DM
ER
Constituted to optimize
OLTP performance.
Models the micro
relationships among data
elements.
A wild variability of the
structure of ER models.
Very vulnerable to changes in
the user's querying habits,
because such schemas are
asymmetrical.
DM
Constituted to optimize DSS
query performance.
Models the macro
relationships among data
elements with an overall
deterministic strategy.
All dimensions serve as equal
entry points to the fact table.
Changes in users' querying
habits can be accommodated
by automatic SQL generators.
9. How to simplify an ER data model?
Two general methods:
De-Normalization
Dimensional Modeling (DM)
10. What is DM?
A simpler logical model optimized for decision
support.
Inherently dimensional in nature, with a single
central fact table and a set of smaller
dimensional tables.
Multi-part key for the fact table
Dimensional tables with a single-part PK.
Keys are usually system generated
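The structure described above can be sketched with Python's built-in `sqlite3` module: each dimension table gets a single-part, system-generated key, and the fact table's key is made up of those dimension keys. All table and column names here (`product_dim`, `store_dim`, `date_dim`, `sales_fact`, and so on) are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: single-part, system-generated primary keys.
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY AUTOINCREMENT,
        item_no TEXT, category TEXT
    );
    CREATE TABLE store_dim (
        store_key INTEGER PRIMARY KEY AUTOINCREMENT,
        store_no TEXT, city TEXT
    );
    CREATE TABLE date_dim (
        date_key INTEGER PRIMARY KEY AUTOINCREMENT,
        cal_date TEXT, month TEXT
    );
    -- Fact table: multi-part key built from the dimension keys.
    CREATE TABLE sales_fact (
        product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
        store_key   INTEGER NOT NULL REFERENCES store_dim(store_key),
        date_key    INTEGER NOT NULL REFERENCES date_dim(date_key),
        sale_amount REAL,
        PRIMARY KEY (product_key, store_key, date_key)
    );
""")

# The database generates each dimension key; we only read it back.
product_key = conn.execute(
    "INSERT INTO product_dim (item_no, category) VALUES (?, ?)",
    ("I-100", "Books"),
).lastrowid
store_key = conn.execute(
    "INSERT INTO store_dim (store_no, city) VALUES (?, ?)",
    ("S-7", "Lahore"),
).lastrowid
date_key = conn.execute(
    "INSERT INTO date_dim (cal_date, month) VALUES (?, ?)",
    ("2024-01-15", "Jan"),
).lastrowid

fact_row = (product_key, store_key, date_key, 450.0)
conn.execute("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", fact_row)

# A second row with the same multi-part key violates the primary key.
try:
    conn.execute("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", fact_row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(product_key, store_key, date_key, duplicate_rejected)
```

The grain chosen here (one fact row per product, store, and date) is an assumption of this sketch; a real design would pick the grain from the business process being measured.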
11. What is DM?
Results in a star like structure, called star schema
or star join.
All relationships mandatory M-1.
Single path between any two levels.
Supports ROLAP operations.
12. Dimensions have Hierarchies
Items
  Books
    Fiction
    Text
      Medical
      Engg
  Clothes
    Men
    Women
Analysts tend to look at the data through a
dimension at a particular “level” in the
hierarchy.
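Looking at data at a particular level of a hierarchy amounts to rolling leaf-level figures up to that level. The sketch below assumes the Items hierarchy from this slide, with Medical and Engg placed under Text. The sales figures and the `roll_up` function are hypothetical.

```python
from collections import defaultdict

# Each leaf category mapped to its path from the root of the hierarchy.
# Placing Medical and Engg under Text is an assumption of this sketch.
HIERARCHY = {
    "Fiction": ("Items", "Books", "Fiction"),
    "Medical": ("Items", "Books", "Text", "Medical"),
    "Engg": ("Items", "Books", "Text", "Engg"),
    "Men": ("Items", "Clothes", "Men"),
    "Women": ("Items", "Clothes", "Women"),
}

# Hypothetical sales per leaf category.
sales = {"Fiction": 120, "Medical": 80, "Engg": 60, "Men": 200, "Women": 150}


def roll_up(leaf_sales, level):
    """Sum leaf-level sales at the given depth (0 = root).

    Leaves that sit above the requested depth are reported as themselves.
    """
    totals = defaultdict(int)
    for leaf, amount in leaf_sales.items():
        path = HIERARCHY[leaf]
        node = path[min(level, len(path) - 1)]
        totals[node] += amount
    return dict(totals)


print(roll_up(sales, 0))  # whole business
print(roll_up(sales, 1))  # Books vs. Clothes
print(roll_up(sales, 2))  # one level further down
```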
14. “Simplified” 3NF (Retail)
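[Figure: ER diagram of the "simplified" 3NF retail schema. Each parenthesized group is one table in the diagram; the tables are connected by M-1 relationships.]
Sales: (RECEIPT #, STORE #, DATE, ...), (ITEM #, RECEIPT #, ..., $)
Product: (ITEM #, CATEGORY), (ITEM #, SUPPLIER), (CATEGORY, DEPT)
Geography: (STORE #, STREET, ZONE, ...), (ZONE, CITY), (CITY, DISTRICT), (DISTRICT, DIVISION), (DIVISION, PROVINCE)
Time: (DATE, WEEK), (WEEK, MONTH), (MONTH, QTR), (QTR, YEAR)
Table labels in the figure: sale_header, sale_detail, store, zone, district, division, item_x_cat, item_x_splir, cat_x_dept, week, month, quarter, year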
15. Vastly Simplified Star Schema
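[Figure: star schema with one central fact table joined to three dimension tables. Each dimension row (1) relates to many fact rows (M).]
Fact Table: RECEIPT#, STORE#, DATE, ITEM#, Sale Rs., ... (facts)
Product Dim: ITEM#, CATEGORY, DEPT, SUPPLIER
Geography Dim: STORE#, ZONE, CITY, DISTRICT, DIVISION, PROVINCE
Time Dim: DATE, WEEK, MONTH, QUARTER, YEAR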
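A query against a star schema of this shape joins the fact table once to each dimension it needs, then filters and groups on dimension attributes. The following is a hedged sketch using Python's built-in `sqlite3` module. The snake_case table and column names are adaptations of the labels in the figure, only a subset of the columns is modeled, and the data rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_dim (
        item_no INTEGER PRIMARY KEY, category TEXT, dept TEXT, supplier TEXT
    );
    CREATE TABLE geography_dim (
        store_no INTEGER PRIMARY KEY, zone TEXT, city TEXT, province TEXT
    );
    CREATE TABLE time_dim (
        sale_date TEXT PRIMARY KEY, month TEXT, quarter TEXT, year INTEGER
    );
    CREATE TABLE sales_fact (
        receipt_no INTEGER, store_no INTEGER, sale_date TEXT,
        item_no INTEGER, sale_rs REAL
    );

    -- Hypothetical sample data.
    INSERT INTO product_dim VALUES
        (1, 'Fiction', 'Books', 'S1'), (2, 'Men', 'Clothes', 'S2');
    INSERT INTO geography_dim VALUES
        (100, 'Z1', 'Lahore', 'Punjab'), (200, 'Z2', 'Karachi', 'Sindh');
    INSERT INTO time_dim VALUES
        ('2024-01-15', 'Jan', 'Q1', 2024), ('2024-04-10', 'Apr', 'Q2', 2024);
    INSERT INTO sales_fact VALUES
        (1, 100, '2024-01-15', 1, 500.0),
        (2, 100, '2024-01-15', 2, 300.0),
        (3, 200, '2024-04-10', 1, 200.0),
        (4, 200, '2024-04-10', 2, 700.0);
""")

# Star join: the fact table is joined directly to each dimension.
report = conn.execute("""
    SELECT g.province, t.quarter, SUM(f.sale_rs)
    FROM sales_fact f
    JOIN product_dim   p ON p.item_no   = f.item_no
    JOIN geography_dim g ON g.store_no  = f.store_no
    JOIN time_dim      t ON t.sale_date = f.sale_date
    WHERE p.dept = 'Books'
    GROUP BY g.province, t.quarter
    ORDER BY g.province
""").fetchall()
print(report)
```

Every dimension is one join away from the fact table, so changing the question (for example, grouping by city and month instead) only changes the selected dimension columns, not the join structure.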
16. The Benefit of Simplicity
Beauty lies in close
correspondence with the
business, evident even to
business users.
17. Features of Star Schema
Dimensional hierarchies are collapsed into a single table
for each dimension. Loss of Information?
A single fact table created with a single header from the
detail records, resulting in:
A vastly simplified physical data model!
Fewer tables (thousands of tables in some ERP systems).
Fewer joins resulting in high performance.
Some requirement of additional space.
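The collapsing of a dimensional hierarchy into a single table, and the extra space it costs, can be illustrated with a small sketch. The `item_x_cat` and `cat_x_dept` names echo the 3NF diagram; the sample values and the way space is counted (number of stored department values) are assumptions made for illustration only.

```python
# Hypothetical normalized lookups: item -> category, category -> department.
item_x_cat = {1: "Fiction", 2: "Fiction", 3: "Fiction",
              4: "Men", 5: "Men", 6: "Women"}
cat_x_dept = {"Fiction": "Books", "Men": "Clothes", "Women": "Clothes"}

# Collapse both levels into one flat product dimension row per item.
product_dim = [
    {"item_no": item_no, "category": category, "dept": cat_x_dept[category]}
    for item_no, category in item_x_cat.items()
]

# Normalized: one department value per category.
# Flat: one department value per item, hence the additional space.
dept_values_normalized = len(cat_x_dept)
dept_values_flat = len(product_dim)

# The category -> department mapping can still be read back from the
# flat table, at least for categories that have one or more items.
recovered = {row["category"]: row["dept"] for row in product_dim}

print(dept_values_normalized, dept_values_flat, recovered == cat_x_dept)
```

In this sketch the hierarchy survives as repeated attribute values in the flat table, which is one way to approach the "Loss of Information?" question raised on the slide.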