Business Information Systems Data Modeling - Normalisation Prithwis Mukerjee, Ph.D.
Normalization: a formal data modeling approach to examining and validating the model.
Foundations: Revisited
Normal Forms
How to Normalize an Entity
Order Management System
Identify the Primary Key (entity: ORDER)
First Normal Form - 1NF
1NF Violation & Solution. Violation: repeating groups. Solution: split into two entities, ORDER and ORDER ITEM, linked by a foreign key (FK).
Cleaner 1NF Solution: ORDER and ORDER ITEM, linked by a foreign key (FK).
Second Normal Form - 2NF
2NF Violation and Solution. Violation: Description and Unit Price do not depend on the full primary key of ORDER ITEM. Solution: split into two entities, ORDER ITEM and PRODUCT, linked by a foreign key (FK).
Now the solution looks like ...: ORDER, ORDER ITEM, PRODUCT
Third Normal Form - 3NF
3NF Violation & Solution. Violation: Address and Credit Limit do not depend on OrderID, but on Customer Name. Solution: create a separate entity, CUSTOMER, linked to ORDER by a foreign key (FK).
Final Solution. The unnormalised ORDER entity becomes four normalised entities: ORDER, CUSTOMER, ORDER ITEM and PRODUCT, linked by foreign keys (FK). Information about customers and products can be recorded even when there are no orders.
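The four normalised entities can be sketched as tables. The following is a minimal illustration using Python's built-in sqlite3 module, not the schema from the slides: the table name `orders`, the columns `order_date` and `quantity`, and the choice of `customer_name` as the CUSTOMER key are assumptions made for the example (the slides name only Description, Unit Price, Address, Credit Limit, OrderID and Customer Name).

```python
import sqlite3

# Hypothetical schema: column names beyond those on the slides are assumptions.
SCHEMA = """
CREATE TABLE customer (
    customer_name TEXT PRIMARY KEY,
    address       TEXT,
    credit_limit  REAL
);
CREATE TABLE product (
    product_id    INTEGER PRIMARY KEY,
    description   TEXT NOT NULL,
    unit_price    REAL NOT NULL
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    order_date    TEXT,
    customer_name TEXT NOT NULL REFERENCES customer(customer_name)
);
CREATE TABLE order_item (
    order_id      INTEGER NOT NULL REFERENCES orders(order_id),
    product_id    INTEGER NOT NULL REFERENCES product(product_id),
    quantity      INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
"""


def build_db():
    """Create an in-memory database holding the four normalised tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


conn = build_db()
# Customers and products can exist before any order is placed.
conn.execute("INSERT INTO customer VALUES ('Acme', '12 Park St', 5000.0)")
conn.execute("INSERT INTO product VALUES (10, 'Widget', 2.5)")
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

With foreign keys enabled, an order that refers to a customer not present in `customer` is rejected, which is the integrity benefit the FK arrows on the slide stand for.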
Entities can proliferate!
Rationale for Normalisation
Denormalisation? (Callout on slide: "This will cause a performance problem")
The Managerial Perspective

BIS04 Data Modelling - II


Editor's Notes

  1. Normalized data models are often referred to as relational models. However, star schemas and snowflake schemas may also be implemented on top of relational database management systems. Normalization is the process of removing redundancy in data by separating the data into multiple tables, thus designing for efficient and reliable single-record access. Relational database theorists have created rules by which the degree of normalization is measured. These degrees are called normal forms, with 3rd Normal Form commonly accepted as the minimum degree of normalization. Normalization beyond 3rd Normal Form is often sacrificed because of hardware limitations.
     A properly normalized relational data model allows efficient use of storage space, elimination of redundant data, reduction or elimination of inconsistent data, and minimization of the data maintenance burden. However, an “over-normalized” data model may cause performance concerns: accessing the data can require large table joins, which slow response time.
     A data model is in 3rd Normal Form when the following are true:
       • Repeating groups of data are removed (1st Normal Form).
       • Redundant data is removed (2nd Normal Form).
       • Attributes of an entity depend upon the key, the whole key, and nothing but the key.
     Once the model has been normalized to at least 3rd Normal Form, the following are true:
       • The structure is remarkably insensitive to change.
       • Structural paths for accessing information are very clear.
       • Create, Report, Update and Delete anomalies are eliminated.
       • Performance can be an issue.
       • The structure can be very complex.
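To make the point about inconsistent data concrete, here is a small sketch of an update anomaly. The rows and the department names are made up for illustration. When a department name is repeated on every employee row, updating one copy leaves the data contradicting itself; when the name is stored once, there is only one place to change.

```python
# Denormalized: the department name is repeated on every employee row.
flat = [
    {"empl_no": 1, "dept_no": "D1", "dept_name": "Finance"},
    {"empl_no": 2, "dept_no": "D1", "dept_name": "Finance"},
]
flat[0]["dept_name"] = "Accounts"  # one copy is updated, the other is forgotten
names_for_d1 = {row["dept_name"] for row in flat if row["dept_no"] == "D1"}
# names_for_d1 now holds two conflicting names for the same department.

# Normalized: the name is stored once and referenced by dept_no.
departments = {"D1": "Finance"}
employees = [{"empl_no": 1, "dept_no": "D1"}, {"empl_no": 2, "dept_no": "D1"}]
departments["D1"] = "Accounts"  # a single update reaches every employee
normalized_names = {departments[e["dept_no"]] for e in employees}
```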
  2. Normalized relational data modeling is the classic modeling technique used for organizing entities defined by unique identifiers and attributes that are wholly dependent upon those identifiers. This is the modeling technique that database administrators and modelers are most familiar with, and it is most commonly associated with transaction systems development. Normalization is the process of removing redundancy in data by separating the data into multiple tables. There are well-established rules of normalization:
       • Eliminate Repeating Groups. Make a separate table for each set of related attributes, and give each table a primary key. (1st Normal Form)
       • Eliminate Redundant Data. If an attribute depends on only part of a multi-part (composite) key, remove it to a separate table. (2nd Normal Form)
       • Eliminate Columns Not Dependent on Key. If attributes do not contribute to a description of the key, remove them to a separate table. (3rd Normal Form)
       • Isolate Independent Multiple Relationships. No table may contain two or more 1:n or n:m relationships that are not directly related. (4th Normal Form)
       • Isolate Semantically Related Multiple Relationships. There may be practical constraints on information that justify separating logically related many-to-many relationships. (5th Normal Form)
     The last two rules, 4th and 5th Normal Forms, are not often attained. It is not uncommon, in fact, to denormalize from 3rd Normal Form in the physical model to address performance concerns. Consequently, the rest of this section will not cover these two forms.
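The 2nd and 3rd Normal Form rules both turn on whether one set of attributes determines another. As a rough aid, the sketch below checks whether a dependency holds in a sample of rows. The helper `holds_fd` and the sample data are hypothetical, and a dependency that holds in sample data is only evidence: the real rule comes from the business meaning of the attributes.

```python
def holds_fd(rows, determinant, dependent):
    """Return True if every determinant value maps to one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows whose composite key is (empl_no, course_no).
rows = [
    {"empl_no": 1, "course_no": "C1", "course_name": "SQL", "assessment": "A"},
    {"empl_no": 1, "course_no": "C2", "course_name": "Modeling", "assessment": "B"},
    {"empl_no": 2, "course_no": "C1", "course_name": "SQL", "assessment": "C"},
]

# course_name is determined by course_no alone: a partial dependency (2NF issue).
partial = holds_fd(rows, ["course_no"], ["course_name"])
# assessment is not determined by course_no alone; it needs the whole key.
needs_full_key = not holds_fd(rows, ["course_no"], ["assessment"])
```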
  3. Step 1: Source material can be in many different forms. In order to begin the normalization process, this example assumes that the sources were combined in an un-normalized form. All attributes of the relation must be identified, along with the key and any repeating groups. For example, in the un-normalized table above, EMPL NO is underlined to indicate that it is the key. EMPL NO, EMPL NAME, DEPT NO, DEPT NAME, EMPL SEX, COURSE NO, COURSE NAME, and ASSESSMENT are all attributes of the relation. COURSE DATA is recognized as a repeating group, and this is notated with an asterisk.
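Step 1 can be pictured with an un-normalized record held as a nested structure. The sketch below uses the attribute names from the note in lower case; the values are invented, and `repeating_groups` is a hypothetical helper that treats any list-valued attribute as a repeating group.

```python
# One un-normalized employee record; COURSE DATA is the repeating group.
unnormalized = [
    {
        "empl_no": 101,  # the key
        "empl_name": "A. Rao",
        "dept_no": "D1",
        "dept_name": "Finance",
        "empl_sex": "F",
        "course_data": [
            {"course_no": "C1", "course_name": "SQL", "assessment": "A"},
            {"course_no": "C2", "course_name": "Modeling", "assessment": "B"},
        ],
    },
]


def repeating_groups(record):
    """List the attributes that hold more than one value per record."""
    return [name for name, value in record.items() if isinstance(value, list)]


groups = repeating_groups(unnormalized[0])
```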
  4. Step 2: In order to achieve First Normal Form (1NF), all repeating groups must be removed. The repeating groups were identified in step one, and in order to remove them a new key is created. The relation now has a two-part key, also referred to as a concatenated key. For example, in the 1NF table above, the COURSE DATA repeating group has been removed: the relation now lists only the attributes of the repeating group, and EMPL NO and COURSE NO together comprise the concatenated key for the relation.
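A minimal sketch of Step 2 follows. It assumes the flattened reading, in which every employee attribute is carried on each course row so that (empl_no, course_no) identifies a row; the note can also be read as moving the repeating group into a relation of its own. The function `to_1nf` and the sample data are hypothetical.

```python
unnormalized = [
    {"empl_no": 101, "empl_name": "A. Rao", "dept_no": "D1",
     "dept_name": "Finance", "empl_sex": "F",
     "course_data": [
         {"course_no": "C1", "course_name": "SQL", "assessment": "A"},
         {"course_no": "C2", "course_name": "Modeling", "assessment": "B"},
     ]},
    {"empl_no": 102, "empl_name": "B. Sen", "dept_no": "D2",
     "dept_name": "Sales", "empl_sex": "M",
     "course_data": [
         {"course_no": "C1", "course_name": "SQL", "assessment": "C"},
     ]},
]


def to_1nf(records, group="course_data"):
    """Emit one flat row per entry of the repeating group."""
    rows = []
    for record in records:
        parent = {k: v for k, v in record.items() if k != group}
        for item in record[group]:
            rows.append({**parent, **item})
    return rows


rows_1nf = to_1nf(unnormalized)
# The concatenated key: one (empl_no, course_no) pair per row.
keys = [(row["empl_no"], row["course_no"]) for row in rows_1nf]
```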
  5. Step 3: Removing partial dependencies from the relation will result in the model being in Second Normal Form (2NF). It should be noted that if a relation is in 1NF and has a single-attribute key, then it is already in 2NF. If an attribute is not fully functionally dependent upon the entire key, then this attribute must be removed and a new relation must be created. A foreign key will indicate the relationship between the relations. For example, in the 2NF table above, EMPL NAME, DEPT NO, DEPT NAME, and EMPL SEX are dependent only upon EMPL NO, not COURSE NO. These items are separated into a relation of their own. COURSE NAME is dependent only upon COURSE NO, so it is likewise separated into its own relation. COURSE NO and EMPL NO now become foreign keys that indicate the relationships among the relations. ASSESSMENT is the only attribute dependent upon the whole key, hence the creation of another relation.
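Step 3 can be sketched as a split of the 1NF rows into three relations, each keyed by the attributes its columns actually depend on. This assumes a single 1NF relation keyed by (empl_no, course_no) that still carries every attribute; `to_2nf` and the sample rows are hypothetical.

```python
rows_1nf = [
    {"empl_no": 101, "empl_name": "A. Rao", "dept_no": "D1", "dept_name": "Finance",
     "empl_sex": "F", "course_no": "C1", "course_name": "SQL", "assessment": "A"},
    {"empl_no": 101, "empl_name": "A. Rao", "dept_no": "D1", "dept_name": "Finance",
     "empl_sex": "F", "course_no": "C2", "course_name": "Modeling", "assessment": "B"},
    {"empl_no": 102, "empl_name": "B. Sen", "dept_no": "D2", "dept_name": "Sales",
     "empl_sex": "M", "course_no": "C1", "course_name": "SQL", "assessment": "C"},
]

EMPLOYEE_COLS = ("empl_name", "dept_no", "dept_name", "empl_sex")


def to_2nf(rows):
    """Split rows so each attribute sits with the key part it depends on."""
    employees, courses, assessments = {}, {}, {}
    for row in rows:
        # Depends on empl_no only.
        employees[row["empl_no"]] = {col: row[col] for col in EMPLOYEE_COLS}
        # Depends on course_no only.
        courses[row["course_no"]] = {"course_name": row["course_name"]}
        # Depends on the whole key; empl_no and course_no act as foreign keys.
        assessments[(row["empl_no"], row["course_no"])] = row["assessment"]
    return employees, courses, assessments


employees, courses, assessments = to_2nf(rows_1nf)
```

Keying each dictionary by its determinant removes the repeated copies: employee 101 and course C1 are now each stored once.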
  6. Step 4: Removing mutual (transitive) dependencies from the relation will result in Third Normal Form (3NF). If an attribute of a relation depends upon another non-key attribute rather than directly upon the key, then these attributes must be moved into another relation. A foreign key will indicate the relationship between the two relations. For example, in the 3NF table above, DEPT NAME is dependent upon DEPT NO. DEPT NO will remain in the employee relation, and a new relation will be created for DEPT NO and DEPT NAME. DEPT NO in the employee relation becomes the foreign key.
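Step 4 can be sketched the same way: DEPT NAME moves to a department relation keyed by DEPT NO, and the employee relation keeps DEPT NO as a foreign key. The function `to_3nf` and the sample data are hypothetical.

```python
employees_2nf = {
    101: {"empl_name": "A. Rao", "dept_no": "D1", "dept_name": "Finance", "empl_sex": "F"},
    102: {"empl_name": "B. Sen", "dept_no": "D2", "dept_name": "Sales", "empl_sex": "M"},
    103: {"empl_name": "C. Das", "dept_no": "D1", "dept_name": "Finance", "empl_sex": "F"},
}


def to_3nf(employees):
    """Move dept_name into its own relation, keyed by dept_no."""
    departments = {}
    slim_employees = {}
    for empl_no, attrs in employees.items():
        departments[attrs["dept_no"]] = {"dept_name": attrs["dept_name"]}
        # dept_no stays behind as the foreign key to departments.
        slim_employees[empl_no] = {k: v for k, v in attrs.items() if k != "dept_name"}
    return slim_employees, departments


employees_3nf, departments = to_3nf(employees_2nf)
# Looking a name up now goes through the foreign key.
dept_of_103 = departments[employees_3nf[103]["dept_no"]]["dept_name"]
```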