Normalization and Codd’s
Rules
•   Normalization
•   Normal Forms
    •   1NF
    •   2NF
    •   3NF
•   Codd’s Rules
Data Normalization
•   The purpose of normalization is to
    produce a stable set of relations that is
    a faithful model of the operations of the
    enterprise.
    •   Achieve a design that is highly flexible
    •   Reduce redundancy
    •   Ensure that the design is free of certain
        update, insertion and deletion anomalies
Normalization
    1NF     Flat file
    2NF     Partial dependencies removed
    3NF     Transitive dependencies removed
    BCNF    Every determinant is a candidate key
    4NF     Non-trivial multi-valued dependencies removed
Stereos To Go: Invoice

Order No.:      10001
Date:           6/15/99
Account No.:    0000-000-0000-0
Customer:       John Smith
Address:        2036-26 Street
                Sacramento, CA 95819
Date Shipped:   6/18/99

Item     Product
Number   Code      Product Description/Manufacturer   Qty      Price
1        SAGX730   Pioneer Remote A/V Receiver        1       569.95
2        AT10      Cerwin Vega Loudspeakers           1       359.95
3        CDPC725   Sony Disc-Jockey CD Changer        1       399.95
4
5

                                   Subtotal                  1329.85
                                   Shipping & Handling        100.00
                                   Sales Tax                  103.06
                                   Total                     1532.91
Unnormalized Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,
Item1, Item1_descrip, Item1_qty, Item1_price,
Item2, Item2_descrip, Item2_qty, Item2_price, . . . ,
Item7, Item7_descrip, Item7_qty, Item7_price)




How would a program process the data to recreate the invoice?
Unnormalized to 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,
Item1, Item1_descrip, Item1_qty, Item1_price,
Item2, Item2_descrip, Item2_qty, Item2_price, . . . ,
Item7, Item7_descrip, Item7_qty, Item7_price)

Repeating groups: Item1 through Item7, each with its own
description, quantity and price.

A flat file places all the data of a transaction into a single record.

This is reminiscent of a COBOL or BASIC program
processing a single transaction with one read statement.
Unnormalized to 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,
Item, Item_descrip, Item_qty, Item_price)


Nominated group of attributes to serve as the key
(form a unique combination)

• Eliminate the repeating groups.
• Each row retains data for one item.
• If a person bought 5 items, we
  would have five tuples
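
The step from the unnormalized record to 1NF can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the dictionary layout, the `HEADER_FIELDS` tuple and the `to_1nf` helper are assumptions, and only three header attributes are carried to keep the example short.

```python
# Hypothetical wide record modelled on the invoice slide; only three of
# the seven reserved item slots are filled.
wide_record = {
    "Invoice_number": 10001,
    "Cust_account": "123456",
    "Cust_name": "John Smith",
    "Item1": "SAGX730", "Item1_descrip": "Pioneer Remote A/V Receiver",
    "Item1_qty": 1, "Item1_price": 569.95,
    "Item2": "AT10", "Item2_descrip": "Cerwin Vega Loudspeakers",
    "Item2_qty": 1, "Item2_price": 359.95,
    "Item3": "CDPC725", "Item3_descrip": "Sony Disc-Jockey CD Changer",
    "Item3_qty": 1, "Item3_price": 399.95,
}

HEADER_FIELDS = ("Invoice_number", "Cust_account", "Cust_name")
MAX_ITEMS = 7  # the unnormalized relation reserves Item1 .. Item7


def to_1nf(record):
    """Return one flat row per filled item slot, repeating the header fields."""
    rows = []
    for n in range(1, MAX_ITEMS + 1):
        item = record.get(f"Item{n}")
        if item is None:
            continue  # empty slot: no tuple is produced for it
        row = {field: record[field] for field in HEADER_FIELDS}
        row.update(
            Item=item,
            Item_descrip=record[f"Item{n}_descrip"],
            Item_qty=record[f"Item{n}_qty"],
            Item_price=record[f"Item{n}_price"],
        )
        rows.append(row)
    return rows


rows_1nf = to_1nf(wide_record)
for row in rows_1nf:
    print(row["Invoice_number"], row["Item"], row["Item_qty"], row["Item_price"])
```

Each output row repeats the invoice and customer data, which is exactly the redundancy the later normal forms remove.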
1NF
Flat File

Invoice   Account   Customer                                               Item       Item
number    number    name         •••   Item      Description               Quantity   Price
10001     123456    John Smith   •••   SAGX730   Pioneer Remote A/V Rec    1          569.95
10001     123456    John Smith   •••   AT10      Cerwin Vega Loudspeakers  1          359.95
10001     123456    John Smith   •••   CDPC725   Sony Disc Jockey CD       1          399.95
10001     123456    John Smith   •••   S/H       Shipping                  1          100.00
10001     123456    John Smith   •••   Tax       Sales Tax                 1          103.06
From 1NF

(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code,
Item, Item_descrip, Item_qty, Item_price)



        Functional dependencies and determinants

  Example: Item_descrip is functionally dependent on Item,
  such that Item is the determinant of Item_descrip.
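
A functional dependency can be tested against sample data with a small helper. The sketch below is an illustration under assumptions: the `holds` function and the second invoice (10002) are hypothetical, and a sample can only refute a dependency, never prove that it holds for all future data.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first value seen for a key is remembered; any later row with
        # the same key but a different value violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invoice 10002 is a made-up second purchase of the same loudspeakers.
rows = [
    {"Invoice_number": 10001, "Item": "SAGX730",
     "Item_descrip": "Pioneer Remote A/V Receiver", "Item_qty": 1},
    {"Invoice_number": 10001, "Item": "AT10",
     "Item_descrip": "Cerwin Vega Loudspeakers", "Item_qty": 1},
    {"Invoice_number": 10002, "Item": "AT10",
     "Item_descrip": "Cerwin Vega Loudspeakers", "Item_qty": 2},
]

item_determines_descrip = holds(rows, ["Item"], ["Item_descrip"])
item_determines_qty = holds(rows, ["Item"], ["Item_qty"])
key_determines_qty = holds(rows, ["Invoice_number", "Item"], ["Item_qty"])
print(item_determines_descrip, item_determines_qty, key_determines_qty)
```

In this sample, Item determines Item_descrip, but Item alone does not determine Item_qty; the quantity needs the combination of Invoice_number and Item.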
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)

(Item, Item_descrip, Item_qty, Item_price)


   Is this unique by itself?
   What happens if the item is purchased more than once?
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)
(Invoice_number, Item, Item_descrip, Item_qty, Item_price)

   Composite key (Invoice_number, Item): forms a unique combination
   Partial dependency: Item_descrip depends on Item alone, which is only part of the key
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

(Invoice_number, Item, Item_qty, Item_price)

(Item, Item_descrip)
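
The decomposition into 2NF amounts to projecting the flat rows onto three attribute sets and dropping duplicate tuples. The following is a minimal sketch, assuming a hypothetical `project` helper and a made-up second invoice (10002, account 654321) so that the effect of removing the partial dependency is visible; only a subset of the header attributes is included.

```python
def project(rows, attributes):
    """Relational projection: keep the named attributes, drop duplicate tuples."""
    result = []
    for row in rows:
        projected = tuple(row[a] for a in attributes)
        if projected not in result:
            result.append(projected)
    return result


def flat(invoice, account, name, item, descrip, qty, price):
    """Build one 1NF row as a dictionary."""
    return {"Invoice_number": invoice, "Cust_account": account,
            "Cust_name": name, "Item": item, "Item_descrip": descrip,
            "Item_qty": qty, "Item_price": price}


flat_rows = [
    flat(10001, "123456", "John Smith", "SAGX730", "Pioneer Remote A/V Receiver", 1, 569.95),
    flat(10001, "123456", "John Smith", "AT10", "Cerwin Vega Loudspeakers", 1, 359.95),
    flat(10001, "123456", "John Smith", "CDPC725", "Sony Disc-Jockey CD Changer", 1, 399.95),
    # Hypothetical second invoice that buys the loudspeakers again.
    flat(10002, "654321", "Lisa Carr", "AT10", "Cerwin Vega Loudspeakers", 2, 359.95),
]

invoices = project(flat_rows, ("Invoice_number", "Cust_account", "Cust_name"))
invoice_items = project(flat_rows, ("Invoice_number", "Item", "Item_qty", "Item_price"))
items = project(flat_rows, ("Item", "Item_descrip"))

print(len(invoices), len(invoice_items), len(items))
```

The description of AT10 is now stored once in the items projection, even though the item appears on two invoices.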
From 2NF to 3NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

(Invoice_number, Item, Item_qty, Item_price)

(Item, Item_descrip)


       Which attributes are dependent on others?
                  Is there a problem?
Transitive Dependencies and
Anomalies
•   Insertion anomalies
    •   To add a new row, all customer data (name,
        address, city, state, zip code, phone) and
        product data (description) must be consistent
        with previous entries
•   Deletion anomalies
    •   By deleting a row, a customer or product
        may cease to exist
•   Modification anomalies
    •   To modify a customer’s or product’s data in
        one row, all modifications must be carried
        out in every other row that repeats that data
Insertion and Modification Anomalies
      For example…

Insert a new Panasonic product:

   Product_code   Manufacturer_name
   DVD-A110       Panasonic
   PV-4210        Panasonic
   PV-4250        Panasonic
   CT-32S35       PAN                 (Inconsistency)

Change all Panasonic products’ manufacturer name to “Panasonic USA”:

   Product_code   Manufacturer_name
   DVD-A110       Panasonic
   PV-4210        PanaSonic
   PV-4250        Pana Sonic
   CT-32S35       PAN
Deletion Anomaly
 For Example…
4377182   John Smith   •••   Sacramento   CA      95831
4398711   Arnold S     •••   Davis        CA      95691
4578461   Gray Davis   •••   Sacramento   CA      95831
4873179   Lisa Carr    •••   Reno         NV      89557

By deleting customer Arnold S, we would also be deleting
Davis, California.
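
The deletion anomaly can be reproduced with the four rows from the slide. This is a sketch under assumptions: the tuple layout, the `known_cities` helper and the split into `zip_codes` and `customers_3nf` are illustrative names, not something the slides define.

```python
# (account, name, city, state, zip) rows copied from the slide.
customers = [
    (4377182, "John Smith", "Sacramento", "CA", "95831"),
    (4398711, "Arnold S", "Davis", "CA", "95691"),
    (4578461, "Gray Davis", "Sacramento", "CA", "95831"),
    (4873179, "Lisa Carr", "Reno", "NV", "89557"),
]


def known_cities(rows):
    """Every (city, state) pair the database still knows about."""
    return {(city, state) for _, _, city, state, _ in rows}


# Single relation: deleting the customer also deletes the city.
after_delete = [row for row in customers if row[0] != 4398711]
lost = known_cities(customers) - known_cities(after_delete)
print("lost with the customer:", lost)

# Decomposed: the zip code relation keeps the city after the delete.
zip_codes = {zip_code: (city, state) for _, _, city, state, zip_code in customers}
customers_3nf = [(acct, name, zip_code) for acct, name, _, _, zip_code in customers]
customers_3nf = [row for row in customers_3nf if row[0] != 4398711]
print("still known:", zip_codes["95691"])
```

With the city and state held in their own relation, removing Arnold S no longer removes the only record of Davis, CA.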
Transitive Dependencies

• A condition where A, B, C are attributes of a relation
  such that if A → B and B → C, then C is transitively
  dependent on A via B (provided that A is not
  functionally dependent on B or C).

Attributes shown in the dependency diagram:
  Invoice_number
  Invoice_date
  Date_delivered
  Cust_account
  Cust_name
  Cust_addr
  Cust_city
  Cust_state
  Zip_code
  Item
  Item_descrip
  Invoice_number+Item
  Item_qty
  Item_price
Why Should City and State Be
Separated from Customer Relation?

 •   City and state are dependent on zip
     code for their values and not the
     customer’s identifier (i.e., key).

             Zip_code → City, State

 •   Otherwise,

        Cust_account → Cust_addr, Zip_code → City, State
3NF
Invoice Relation
(Invoice_number, Invoice_date, Date_delivered, Cust_account)
Customer Relation
(Cust_account, Cust_name, Cust_addr, Zip_code)
Zip_code Relation
(Zip_code, City, State)
Invoice_items Relation
(Invoice_number, Item, Item_qty, Item_price)
Items Relation
(Item, Item_descrip)
3NF
Invoice Relation
(Invoice_number, Invoice_date, Date_delivered, Cust_account)
Customer Relation
(Cust_account, Cust_name, Cust_addr, Zip_code)
Zip_code Relation
(Zip_code, City, State)
Invoice_items Relation
(Invoice_number, Item, Item_qty, Item_price)
Items Relation
(Item, Item_descrip)
Manufacturers Relation
(Manuf_code, Manuf_name)
  Since the Items relation contains the manufacturer’s name in the
  description, a separate Manufacturers relation can be created.
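
One way to see that the 3NF design still answers the earlier question of how a program would recreate the invoice is to build the relations in SQLite and join them back together. This is a minimal sketch, not the author's implementation: the column types, the ISO date strings and the lower-case table names are assumptions, and the Manufacturers relation is left out because the slides do not say which Items attribute would reference it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zip_code (
    zip_code TEXT PRIMARY KEY, city TEXT NOT NULL, state TEXT NOT NULL);
CREATE TABLE customer (
    cust_account TEXT PRIMARY KEY, cust_name TEXT NOT NULL, cust_addr TEXT,
    zip_code TEXT REFERENCES zip_code(zip_code));
CREATE TABLE invoice (
    invoice_number INTEGER PRIMARY KEY, invoice_date TEXT, date_delivered TEXT,
    cust_account TEXT REFERENCES customer(cust_account));
CREATE TABLE items (
    item TEXT PRIMARY KEY, item_descrip TEXT);
CREATE TABLE invoice_items (
    invoice_number INTEGER REFERENCES invoice(invoice_number),
    item TEXT REFERENCES items(item),
    item_qty INTEGER, item_price REAL,
    PRIMARY KEY (invoice_number, item));
""")

# Data taken from the invoice slide; dates rewritten as ISO strings (assumption).
conn.execute("INSERT INTO zip_code VALUES ('95819', 'Sacramento', 'CA')")
conn.execute("INSERT INTO customer VALUES ('123456', 'John Smith', '2036-26 Street', '95819')")
conn.execute("INSERT INTO invoice VALUES (10001, '1999-06-15', '1999-06-18', '123456')")
conn.executemany("INSERT INTO items VALUES (?, ?)", [
    ("SAGX730", "Pioneer Remote A/V Receiver"),
    ("AT10", "Cerwin Vega Loudspeakers"),
    ("CDPC725", "Sony Disc-Jockey CD Changer"),
])
conn.executemany("INSERT INTO invoice_items VALUES (?, ?, ?, ?)", [
    (10001, "SAGX730", 1, 569.95),
    (10001, "AT10", 1, 359.95),
    (10001, "CDPC725", 1, 399.95),
])

query = """
SELECT i.invoice_number, c.cust_name, z.city, ii.item, it.item_descrip,
       ii.item_qty, ii.item_price
FROM invoice i
JOIN customer c       ON c.cust_account = i.cust_account
JOIN zip_code z       ON z.zip_code = c.zip_code
JOIN invoice_items ii ON ii.invoice_number = i.invoice_number
JOIN items it         ON it.item = ii.item
WHERE i.invoice_number = ?
ORDER BY ii.item
"""
lines = conn.execute(query, (10001,)).fetchall()
subtotal = round(sum(qty * price for *_, qty, price in lines), 2)
for line in lines:
    print(line)
print("Subtotal:", subtotal)
```

The join reproduces the flat 1NF rows on demand, so the redundancy exists only in the query result and not in the stored data.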
First to Third Normal Form
      (1NF - 3NF)
•   1NF: A relation is in first normal form if and only
    if every attribute is single-valued for each tuple
    (remove the repeating or multi-value attributes
    and create a flat file)
•   2NF: A relation is in second normal form if and
    only if it is in first normal form and the nonkey
    attributes are fully functionally dependent on the
    key (remove partial dependencies)
•   3NF: A relation is in third normal form if it is in
    second normal form and no nonkey attribute is
    transitively dependent on the key (remove
    transitive dependencies)
Codd's Rules
  E. F. Codd presented these rules as a
  basis for determining whether a DBMS
     could be classified as relational
Codd's Rules
•   Codd's Rules can be divided into 5
    functional areas –
    •   Foundation Rules
    •   Structural Rules
    •   Integrity Rules
    •   Data Manipulation Rules
    •   Data Independence Rules
Foundation Rules
•   Rule 0 –
•   Any system claimed to be an RDBMS
    must be able to manage databases
    entirely through its relational
    capabilities.
    •   All data definition & manipulation must be
        able to be done through relational operations.
Foundation Rules
•   Rule 12 - Nonsubversion Rule -
•   If an RDBMS has a low-level (record-at-a-time)
    language, that low-level language cannot be
    used to subvert or bypass the integrity rules
    & constraints expressed in the higher-level
    relational language.
    •   All database access must be controlled through the
        DBMS so that the integrity of the database cannot be
        compromised without the knowledge of the user or
        the DBA.
         •   This does not prohibit use of record-at-a-time languages, e.g.
             PL/SQL
Codd's Rules
•   Structural Rules (Rules 1 & 6)
    •   The fundamental structural construct is the
        table.
    •   Codd states that an RDBMS must support
        tables, domains, primary & foreign keys.
    •   Each table should have a primary key.
Structural Rules
•   Rule 1 -
•   All information in an RDB is represented
    explicitly at the logical level in exactly
    one way - by values in a table.
    •   ALL information, even the metadata held in the
        system catalogue, MUST be stored as
        relations (tables) & manipulated in the
        same way as data.
Structural Rules
•   Rule 6 - View Updating –
•   All views that are theoretically
    updatable are updatable by the system.

    •   Not really implemented yet by any
        available system.
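
As one concrete data point, SQLite treats views as read-only and relies on INSTEAD OF triggers to make a view writable. The sketch below is an illustration of that behaviour only; the `item_codes` view and the trigger name are hypothetical, and other systems handle view updates differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item TEXT PRIMARY KEY, item_descrip TEXT);
CREATE VIEW item_codes AS SELECT item FROM items;
""")

# A direct insert through the view is refused, even though this
# single-table view is updatable in theory.
try:
    conn.execute("INSERT INTO item_codes (item) VALUES ('AT10')")
    direct_insert = "accepted"
except sqlite3.Error as exc:
    direct_insert = f"rejected: {exc}"
print(direct_insert)

# An INSTEAD OF trigger tells SQLite how to translate the insert
# into an operation on the base table.
conn.execute("""
CREATE TRIGGER item_codes_insert INSTEAD OF INSERT ON item_codes
BEGIN
    INSERT INTO items (item) VALUES (NEW.item);
END
""")
conn.execute("INSERT INTO item_codes (item) VALUES ('AT10')")
stored = conn.execute("SELECT item, item_descrip FROM items").fetchall()
print(stored)
```

The translation has to be written by hand here, which is the gap Rule 6 points at: the system itself does not work out which views are updatable.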
Codd's Rules
•   Integrity Rules (Rules 3 & 10)
    •   Integrity should be maintained by the DBMS, not
        the application.
•   Rule 3 - Systematic treatment of null
    values -
•   Null values are supported for representation
    of 'missing' & inapplicable information in a
    systematic way & independent of data type.
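
A short SQLite session shows what systematic null handling looks like in practice. It is a sketch under assumptions: the `invoice` table layout, the `shipping` column and the undelivered invoice 10002 are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    "invoice_number INTEGER PRIMARY KEY, date_delivered TEXT, shipping REAL)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [
        (10001, "1999-06-18", 100.00),
        (10002, None, None),  # hypothetical invoice: not delivered, no charge known yet
    ],
)

# Comparing with NULL yields "unknown", so no row qualifies.
equals_null = conn.execute(
    "SELECT invoice_number FROM invoice WHERE date_delivered = NULL"
).fetchall()

# IS NULL is the dedicated test for a missing value.
is_null = conn.execute(
    "SELECT invoice_number FROM invoice WHERE date_delivered IS NULL"
).fetchall()

# Aggregates skip NULLs the same way for a text column and a numeric column.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(date_delivered), COUNT(shipping) FROM invoice"
).fetchone()

null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
print(equals_null, is_null, counts, null_equals_null)
```

The same marker and the same rules apply to the text column and the numeric column, which is the "independent of data type" part of the rule.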
Integrity Rules
•   Rule 10 - Integrity independence -
•   Integrity constraints specific to a
    particular RDB MUST be definable in
    the relational data sublanguage &
    storable in the DB, NOT the application
    program.
    •   This gives the advantage of centralised
        control & enforcement
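
Integrity independence can be illustrated by declaring the constraints in the schema and letting the DBMS reject bad rows. The example below is a sketch using SQLite; the particular constraints, the `try_insert` helper and the sample values are assumptions, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE items (item TEXT PRIMARY KEY, item_descrip TEXT);
CREATE TABLE invoice_items (
    invoice_number INTEGER NOT NULL,
    item           TEXT NOT NULL REFERENCES items(item),
    item_qty       INTEGER NOT NULL CHECK (item_qty > 0),
    PRIMARY KEY (invoice_number, item)
);
INSERT INTO items VALUES ('AT10', 'Cerwin Vega Loudspeakers');
""")


def try_insert(invoice_number, item, qty):
    """Attempt an insert; the application contains no validation of its own."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO invoice_items VALUES (?, ?, ?)",
                (invoice_number, item, qty),
            )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


valid_row = try_insert(10001, "AT10", 1)
zero_quantity = try_insert(10002, "AT10", 0)      # violates the CHECK constraint
unknown_item = try_insert(10001, "NO-SUCH", 1)    # violates the foreign key
print(valid_row, zero_quantity, unknown_item, sep="\n")
```

Because the rules live in the database, every program that writes to these tables is held to them without repeating the checks.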
Codd's Rules
•   Data Manipulation Rules (Rules 2, 4, 5 & 7)
•   Users should be able to manipulate the 'logical
    view' of the data with no need for knowledge of
    how it is physically stored or accessed.

•   Rule 2 - Guaranteed Access -
•   Each & every datum in an RDB is guaranteed to be
    logically accessible by a combination of table
    name, primary key value & column name.
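
The guaranteed access rule maps directly onto a three-part lookup. The helper below is a hypothetical sketch on SQLite: `datum` is not a library function, and because SQL parameters cannot stand in for table or column names, the identifiers are checked before being placed in the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT PRIMARY KEY, item_descrip TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("SAGX730", "Pioneer Remote A/V Receiver"),
        ("AT10", "Cerwin Vega Loudspeakers"),
    ],
)


def datum(conn, table, key_column, key_value, column):
    """Fetch one value addressed by table name, primary key value and column name."""
    for name in (table, key_column, column):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    sql = f'SELECT "{column}" FROM "{table}" WHERE "{key_column}" = ?'
    row = conn.execute(sql, (key_value,)).fetchone()
    return None if row is None else row[0]


description = datum(conn, "items", "item", "AT10", "item_descrip")
missing = datum(conn, "items", "item", "NO-SUCH", "item_descrip")
print(description, missing)
```

No pointer, record position or file offset is involved; the three logical names are enough to reach the value.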
Data Manipulation Rules
•   Rule 4 - Dynamic on-line Catalog based
    on relational model
•   The DB description (metadata) is represented
    at logical level in the same way as ordinary
    data, so that the same relational language can be
    used to interrogate the metadata as regular
    data.
    •   System & other data stored & manipulated in the
        same way.
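
SQLite offers a small example of a relational catalog: its schema is exposed through the `sqlite_master` table, which is queried with ordinary SELECT statements. The sketch below assumes a made-up `items` table and shows only this one system's catalog; other products use different catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT PRIMARY KEY, item_descrip TEXT)")
conn.execute("CREATE TABLE invoice (invoice_number INTEGER PRIMARY KEY)")

# The catalog is itself a table, so the usual WHERE and ORDER BY apply.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
print(catalog_rows)

# The stored definition of a table is just another column value.
definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = ?", ("items",)
).fetchone()[0]
print(definition)
```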
Data Manipulation Rules
•   Rule 5 - Comprehensive Data Sublanguage -
•   An RDBMS may support many languages & modes of
    use, but there must be at least ONE language
    whose statements can express ALL of the
    following -
    •   Data Definition
    •   View Definition
    •   Data manipulation (interactive & via program)
    •   Integrity constraints
    •   Authorization
    •   Transaction boundaries (begin, commit & rollback)
         •   1992 - ISO standard for SQL provides all these functions
Data Manipulation Rules
•   Rule 7 - High-level insert, update &
    delete -
•   Capability of handling a base table or
    view as a single operand applies not
    only to data retrieval but also to insert,
    update & delete operations.
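
A set-level update treats the whole table as the operand, so one statement changes every qualifying row. The sketch below reuses the Panasonic product codes from the anomaly slide in a hypothetical `products` table on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_code TEXT PRIMARY KEY, manufacturer_name TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("DVD-A110", "Panasonic"),
        ("PV-4210", "Panasonic"),
        ("PV-4250", "Panasonic"),
    ],
)

# One statement, no row-at-a-time loop in the application.
cursor = conn.execute(
    "UPDATE products SET manufacturer_name = ? WHERE manufacturer_name = ?",
    ("Panasonic USA", "Panasonic"),
)
changed = cursor.rowcount
names = {row[0] for row in conn.execute("SELECT manufacturer_name FROM products")}
print(changed, names)
```

The statement only catches rows whose name is spelled consistently, which is why the normalized design with a separate Manufacturers relation is still the better fix for the modification anomaly.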
Codd's Rules
•   Data Independence Rules (Rules 8, 9 &
    11)

•   These rules protect users & application
    developers from having to change the
    applications following any low-level
    reorganisation of the DB.
Data Independence Rules

•   Rule 8 - Physical Data Independence -
•   Application Programs & Terminal Activities
    remain logically unimpaired whenever any
    changes are made either to the storage
    organisation or access methods.
•   Rule 9 - Logical Data Independence -
•   Application Programs & Terminal Activities remain logically
    unimpaired when information-preserving
    changes of any kind that theoretically permit
    unimpairment are made to the base tables.
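
Physical data independence (Rule 8) can be illustrated by changing an access method and re-running an unchanged query. This is a sketch on SQLite with a hypothetical `invoice_items` table and index name; it does not cover Rule 9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items ("
    "invoice_number INTEGER, item TEXT, item_qty INTEGER, "
    "PRIMARY KEY (invoice_number, item))"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (10001, "SAGX730", 1),
        (10001, "AT10", 1),
        (10002, "AT10", 2),  # hypothetical second invoice
    ],
)

# The application's query text never changes.
query = (
    "SELECT invoice_number, item_qty FROM invoice_items "
    "WHERE item = ? ORDER BY invoice_number"
)
before = conn.execute(query, ("AT10",)).fetchall()

# Change the access method: add an index on the searched column.
conn.execute("CREATE INDEX idx_invoice_items_item ON invoice_items(item)")
after = conn.execute(query, ("AT10",)).fetchall()

print(before == after, after)
```

The index may change how the rows are found, but the query and its result stay the same.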
Data Independence Rules
•   Rule 11 - Distribution Independence -
•   The data manipulation sublanguage of an
    RDBMS must enable application programs
    & queries to remain logically unchanged
    whether & whenever data is physically
    centralised or distributed.
Data Independence Rules
•   Rule 11 - Distribution Independence -
    •   This means that an Application Program that
        accesses the DBMS on a single computer
        should also work, without modification, even if
        the data is moved from one computer to
        another in a network environment.
         •   The user should 'see' one centralised DB whether
             data is located on one or more computers.
Data Independence Rules
•   Rule 11 - Distribution Independence –

    •   This rule does not say that to be fully
        relational the DBMS must support distributed
        DBs, but that if it does, the query must remain
        the same.
Summary
•   Codd's Rules can be divided into 5
    functional areas –
    •   Foundation Rules
    •   Structural Rules
    •   Integrity Rules
    •   Data Manipulation Rules
    •   Data Independence Rules

More Related Content

What's hot

Decision Trees
Decision TreesDecision Trees
Decision TreesStudent
 
First order logic in knowledge representation
First order logic in knowledge representationFirst order logic in knowledge representation
First order logic in knowledge representationSabaragamuwa University
 
Tourism recommendation-system
Tourism recommendation-systemTourism recommendation-system
Tourism recommendation-systemkhatrisagar
 
Decision tree induction
Decision tree inductionDecision tree induction
Decision tree inductionthamizh arasi
 
Movie Recommender system
Movie Recommender systemMovie Recommender system
Movie Recommender systemPalakNath
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language ProcessingPranav Gupta
 
Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm Kedar Damkondwar
 
Data Redundancy & Update Anomalies
Data Redundancy & Update AnomaliesData Redundancy & Update Anomalies
Data Redundancy & Update AnomaliesJens Patel
 
Disease Prediction And Doctor Appointment system
Disease Prediction And Doctor Appointment  systemDisease Prediction And Doctor Appointment  system
Disease Prediction And Doctor Appointment systemKOYELMAJUMDAR1
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionkalpesh1908
 
Fake Currency detction Using Image Processing
Fake Currency detction Using Image ProcessingFake Currency detction Using Image Processing
Fake Currency detction Using Image ProcessingSavitaHanchinal
 
Smart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesSmart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesDATAVERSITY
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introductionRobert Lujo
 
OCR 's Functions
OCR 's FunctionsOCR 's Functions
OCR 's Functionsprithvi764
 

What's hot (20)

Decision Trees
Decision TreesDecision Trees
Decision Trees
 
First order logic in knowledge representation
First order logic in knowledge representationFirst order logic in knowledge representation
First order logic in knowledge representation
 
Pattern recognition
Pattern recognitionPattern recognition
Pattern recognition
 
Tourism recommendation-system
Tourism recommendation-systemTourism recommendation-system
Tourism recommendation-system
 
Decision tree induction
Decision tree inductionDecision tree induction
Decision tree induction
 
Movie Recommender system
Movie Recommender systemMovie Recommender system
Movie Recommender system
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
face detection
face detectionface detection
face detection
 
Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm
 
Data Redundancy & Update Anomalies
Data Redundancy & Update AnomaliesData Redundancy & Update Anomalies
Data Redundancy & Update Anomalies
 
Disease Prediction And Doctor Appointment system
Disease Prediction And Doctor Appointment  systemDisease Prediction And Doctor Appointment  system
Disease Prediction And Doctor Appointment system
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
Fake Currency detction Using Image Processing
Fake Currency detction Using Image ProcessingFake Currency detction Using Image Processing
Fake Currency detction Using Image Processing
 
Wine quality Analysis
Wine quality AnalysisWine quality Analysis
Wine quality Analysis
 
Decision tree
Decision treeDecision tree
Decision tree
 
Smart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case StudiesSmart Data Slides: Machine Learning - Case Studies
Smart Data Slides: Machine Learning - Case Studies
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
OCR 's Functions
OCR 's FunctionsOCR 's Functions
OCR 's Functions
 
Classification
ClassificationClassification
Classification
 
Machine learning meetup
Machine learning meetupMachine learning meetup
Machine learning meetup
 

Viewers also liked (12)

Codd's rules
Codd's rulesCodd's rules
Codd's rules
 
Normalization
NormalizationNormalization
Normalization
 
Database normalisation by D.Lukachuk
Database normalisation by D.LukachukDatabase normalisation by D.Lukachuk
Database normalisation by D.Lukachuk
 
Data normailazation
Data normailazationData normailazation
Data normailazation
 
Normalization form tutorial
Normalization form tutorialNormalization form tutorial
Normalization form tutorial
 
Codd's 12 rules
Codd's 12 rulesCodd's 12 rules
Codd's 12 rules
 
Database anomalies
Database anomaliesDatabase anomalies
Database anomalies
 
Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf
 
Anomalies in database
Anomalies in databaseAnomalies in database
Anomalies in database
 
DBMS - Normalization
DBMS - NormalizationDBMS - Normalization
DBMS - Normalization
 
Ef code first
Ef code firstEf code first
Ef code first
 
Slideshare ppt
Slideshare pptSlideshare ppt
Slideshare ppt
 

More from lubna19

Concurrency Conrol
Concurrency ConrolConcurrency Conrol
Concurrency Conrollubna19
 
Programming in Oracle with PL/SQL
Programming in Oracle with PL/SQLProgramming in Oracle with PL/SQL
Programming in Oracle with PL/SQLlubna19
 
ER Modelling
ER ModellingER Modelling
ER Modellinglubna19
 
Introduction to database
Introduction to databaseIntroduction to database
Introduction to databaselubna19
 
Security and Integrity
Security and IntegritySecurity and Integrity
Security and Integritylubna19
 

More from lubna19 (6)

Concurrency Conrol
Concurrency ConrolConcurrency Conrol
Concurrency Conrol
 
9
99
9
 
Programming in Oracle with PL/SQL
Programming in Oracle with PL/SQLProgramming in Oracle with PL/SQL
Programming in Oracle with PL/SQL
 
ER Modelling
ER ModellingER Modelling
ER Modelling
 
Introduction to database
Introduction to databaseIntroduction to database
Introduction to database
 
Security and Integrity
Security and IntegritySecurity and Integrity
Security and Integrity
 

Normalization and Codd's Rule

  • 2. n Normalization n Normal Forms n 1 NF n 2 NF n 3 NF n Codd’s Rules
  • 3. Data Normalization n The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise. n Achieve a design that is highly flexible n Reduce redundancy n Ensure that the design is free of certain update, insertion and deletion anomalies
  • 4. Normalization 1NF 1NF Flat file 2NF 2NF Partial dependencies removed 3NF 3NF Transitive dependencies removed BCNF BCNF Every determinant is a candidate key 4NF 4NF Non-tivial multi-valued dependencies removed
  • 5. Order No. 10001 Stereos To Go Date: 6 / 15 / 99 Invoice Stereos To Go Go, Hogs Account No. 0000-000-0000-0 Customer: John Smith 0000 000 0000 0 Address: 2036-26 Street John Smith 1/05 Sacramento CA 95819 City State Zip Code Date Shipped: 6 / 18 / 99 Item Product Number Code Product Description/Manufacturer Qty Price 1 SAGX730 Pioneer Remote A/V Receiver 1 56995 2 AT10 Cervwin Vega Loudspeakers 35995 1 3 CDPC725 Sony Disc-Jockey CD Changer 1 39995 4 5 Subtotal 132985 Shipping & Handling 10000 Sales Tax 10306 Total 153291
  • 6. Unnormalized Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code, Item1 Item1_descrip Item1_qty Item1_price, Item2 Item2_descrip Item2_qty Item2_price, . . . , Item7 Item7_descrip Item7_qty Item7_price) How would a program process the data to recreate the invoice?
  • 7. Unnormalized to 1NF (Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code, Item1, Item1_descrip, Item1_qty, Item1_price, Item2, Item2_descrip, Item2_qty, Item2_price, . . . , Repeating groups Item7, Item7_descrip, Item7_qty, Item7_price) A flat file places all the data of a transaction into a single record. record. This is reminiscent of a COBOL or BASIC program processing a single transaction with one read statement.
  • 8. Unnormalized to 1NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price) Nominated group of attributes to serve as the key (form a unique combination) • Eliminate the repeating groups. • Each row retains data for one item. • If a person bought 5 items, we would have five tuples
  • 9. 1NF r er e b e b m um num r na Flat File n t e i ce un m vo co sto Item Item In Ac Cu Item Description Quantity Price 10001 123456 John Smith ••• SAGX730 Pioneer Remote A/V Rec 10001 123456 John Smith ••• SAGX730 Pioneer Remote A/V Rec 1 1 569.95 569.95 10001 123456 John Smith ••• 10001 123456 John Smith ••• AT10 AT10 Cerwin Vega Loudspeakers 1 Cerwin Vega Loudspeakers 1 359.95 359.95 10001 123456 John Smith ••• CDPC725 Sony Disc Jockey CD 10001 123456 John Smith ••• CDPC725 Sony Disc Jockey CD 1 1 399.95 399.95 10001 123456 John Smith ••• S/H 10001 123456 John Smith ••• S/H Shipping Shipping 1 1 100.00 100.00 10001 123456 John Smith ••• Tax 10001 123456 John Smith ••• Tax Sales Tax Sales Tax 1 1 103.06 103.06
  • 10. From 1NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price) Functional dependencies and determinants Example: item_descrip is functionally dependent on item, such that item is the determinant of item_descript.
  • 11. From 1NF to 2NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Item, Item_descrip, Item_qty, Item_price) Is this unique by itself? What happens if the item is purchased more than once?
  • 12. From 1NF to 2NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) Partial dependency (Invoice_number, Item, Item_descrip, Item_qty, Item_price) Composite key (forms a unique combination)
  • 13. From 1NF to 2NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Invoice_number, Item, Item_qty, Item_price) (Item, Item_descrip)
  • 14. From 2NF to 3NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Invoice_number, Item, Item_qty, Item_price) (Item, Item_descrip) Which attributes are dependent on others? Is there a problem?
  • 15. Transitive Dependencies and Anomalies n Insertion anomalies n To add a new row, all customer (name, address, city, state, zip code, phone) and products (description) must be consistent with previous entries n Deletion anomalies n By deleting a row, a customer or product may cease to exist n Modification anomalies n To modify a customer’s or product’s data in one row, all modifications must be carried
  • 16. Insertion and Modification Anomalies For example… Insert a new Panasonic product Product_code Manufacturer_name DVD-A110 DVD-A110 Panasonic Panasonic PV-4210 PV-4210 Panasonic Panasonic CT-32S35 CT-32S35 PAN PAN PV-4250 PV-4250 Panasonic Panasonic Inconsistency DVD-A110 DVD-A110 Panasonic Panasonic Change all Panasonic PV-4210 PV-4210 PanaSonic PanaSonic PV-4250 Pana Sonic products’ manufacturer PV-4250 Pana Sonic CT-32S35 CT-32S35 PAN PAN name to “Panasonic USA”
  • 17. Deletion Anomaly For Example… 4377182 John Smith lll Sacramento CA 95831 4398711 Arnold S lll Davis CA 95691 4578461 Gray Davis lll Sacramento CA 95831 4873179 Lisa Carr lll Reno NV 89557 By deleting customer Arnold S, we would also be deleting Davis, California.
  • 18. Invoice_number Transitive Invoice_date Dependencies Date_delivered Cust_account Cust_name Ÿ A condition where A, B, C are attributes of a relation Cust_addr such that if A à B and Cust_city B à C, then C is transitively Cust_state dependent on A via B Zip_code (provided that A is not functionally dependent on B Item or C). Item_descrip Invoice_number+Item Item_qty Item_price
  • 19. Why Should City and State Be Separated from Customer Relation? n City and state are dependent on zip code for their values and not the customer’s identifier (i.e., key). Zip_code à City, State n Otherwise, Cust_account à Cust_addr, Zip_code à City, State
  • 20. 3NF Invoice Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account) Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation (Item, Item_descrip)
  • 21. 3NF Invoice Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account) Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation Manufacturers Relation (Item, Item_descrip) (Manuf_code, Manuf_name) Since the Items relation contains the manufacturer’s name in the description, a separate Manufacturers relation can be created
  • 22.
  • 23. First to Third Normal Form (1NF - 3NF) n 1NF: A relation is in first normal form if and only if every attribute is single-valued for each tuple (remove the repeating or multi-value attributes and create a flat file) n 2NF: A relation is in second normal form if and only if it is in first normal form and the nonkey attributes are fully functionally dependent on the key (remove partial dependencies) n 3NF: A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key (remove transitive dependencies)
  • 24. Codd's Rules E. F. Codd presented these rules as a basis of determining whether a DBMS could be classified as Relational
  • 25. Codd's Rules n Codd's Rules can be divided into 5 functional areas – n Foundation Rules n Structural Rules n Integrity Rules n Data Manipulation Rules n Data Independence Rules
  • 26. Foundation Rules
      • Rule 0
          • Any system claimed to be an RDBMS must be able to manage databases entirely through its relational capabilities.
          • All data definition & manipulation must be possible through relational operations.
  • 27. Foundation Rules
      • Rule 12 - Nonsubversion Rule
          • If an RDBMS has a low-level (record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules & constraints expressed in the higher-level relational language.
          • All database access must be controlled through the DBMS so that the integrity of the database cannot be compromised without the knowledge of the user or the DBA.
          • This does not prohibit the use of record-at-a-time languages, e.g. PL/SQL.
  • 28. Codd's Rules
      • Structural Rules (Rules 1 & 6)
          • The fundamental structural construct is the table.
          • Codd states that an RDBMS must support tables, domains, primary & foreign keys.
          • Each table should have a primary key.
  • 29. Structural Rules
      • Rule 1
          • All information in an RDB is represented explicitly at the logical level in exactly one way: by values in a table.
          • ALL information, even the metadata held in the system catalogue, MUST be stored as relations (tables) & manipulated in the same way as data.
  • 30. Structural Rules
      • Rule 6 - View Updating
          • All views that are theoretically updatable are updatable by the system.
          • Not really implemented yet by any available system.
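SQLite is one example of a system that falls short of Rule 6: its views are read-only, even when the update is theoretically unambiguous. The sketch below, using a hypothetical `Customer_names` view and account value, shows the direct update being refused and an `INSTEAD OF` trigger supplying the mapping by hand. It illustrates the gap rather than how Codd intended the rule to be met.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Cust_account TEXT PRIMARY KEY, Cust_name TEXT, Zip_code TEXT);
INSERT INTO Customer VALUES ('A1', 'John Smith', '95819');
-- A simple projection that keeps the key, so updates map to exactly one base row.
CREATE VIEW Customer_names AS SELECT Cust_account, Cust_name FROM Customer;
""")

UPDATE_VIEW = "UPDATE Customer_names SET Cust_name = 'J. Smith' WHERE Cust_account = 'A1'"

try:
    conn.execute(UPDATE_VIEW)
    direct_update = "accepted"
except sqlite3.OperationalError:
    # SQLite refuses to modify a view directly.
    direct_update = "rejected"

# The designer has to spell out how a view update maps onto the base table.
conn.execute("""
CREATE TRIGGER Customer_names_update INSTEAD OF UPDATE ON Customer_names
BEGIN
    UPDATE Customer SET Cust_name = NEW.Cust_name
    WHERE Cust_account = OLD.Cust_account;
END
""")
conn.execute(UPDATE_VIEW)
name = conn.execute(
    "SELECT Cust_name FROM Customer WHERE Cust_account = 'A1'").fetchone()[0]
```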
  • 31. Codd's Rules
      • Integrity Rules (Rules 3 & 10)
          • Integrity should be maintained by the DBMS, not the application.
      • Rule 3 - Systematic treatment of null values
          • Null values are supported for representation of 'missing' & inapplicable information in a systematic way & independent of data type.
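A small sketch of what "systematic & independent of data type" looks like in SQL, here using SQLite. The `Discount` column and invoice 10002 are hypothetical additions for the example. The same NULL marker and the same `IS NULL` test work for a text column and a numeric column alike, and comparing with `= NULL` yields unknown rather than true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoice ("
             "Invoice_number INTEGER PRIMARY KEY, Date_delivered TEXT, Discount REAL)")
conn.executemany("INSERT INTO Invoice VALUES (?, ?, ?)", [
    (10001, "1999-06-18", 0.0),
    (10002, None, None),       # not yet delivered, discount not applicable
])

# Comparing NULL with anything yields NULL (unknown), which Python sees as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# "= NULL" therefore never matches a row; IS NULL is the systematic test.
eq_rows = conn.execute(
    "SELECT Invoice_number FROM Invoice WHERE Date_delivered = NULL").fetchall()
is_rows = conn.execute(
    "SELECT Invoice_number FROM Invoice WHERE Date_delivered IS NULL").fetchall()

# The same test works unchanged on a numeric column.
missing_discount = conn.execute(
    "SELECT Invoice_number FROM Invoice WHERE Discount IS NULL").fetchall()
```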
  • 32. Integrity Rules
      • Rule 10 - Integrity independence
          • Integrity constraints specific to a particular RDB MUST be definable in the relational data sublanguage & storable in the DB, NOT the application program.
          • This gives the advantage of centralised control & enforcement.
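To illustrate integrity independence, the sketch below declares the constraints in the schema and lets the DBMS reject bad rows; the application code contains no validation logic at all. The specific CHECK conditions and the helper `try_insert` are assumptions for the example, and the `PRAGMA foreign_keys` line is needed because SQLite leaves foreign-key enforcement off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Items (Item TEXT PRIMARY KEY, Item_descrip TEXT NOT NULL);
CREATE TABLE Invoice_items (
    Invoice_number INTEGER NOT NULL,
    Item TEXT NOT NULL REFERENCES Items(Item),
    Item_qty INTEGER NOT NULL CHECK (Item_qty > 0),
    Item_price INTEGER NOT NULL CHECK (Item_price >= 0),
    PRIMARY KEY (Invoice_number, Item));
INSERT INTO Items VALUES ('AT10', 'Cerwin Vega Loudspeakers');
""")

def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Invoice_items VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert((10001, "AT10", 1, 35995)),   # valid row
    try_insert((10001, "AT10", 1, 35995)),   # duplicate primary key
    try_insert((10001, "NOPE", 1, 100)),     # unknown item (foreign key)
    try_insert((10002, "AT10", 0, 35995)),   # quantity fails the CHECK
]
count = conn.execute("SELECT COUNT(*) FROM Invoice_items").fetchone()[0]
```

Because the rules live in the catalog, every program that touches `Invoice_items` is held to them, which is the centralised control the slide refers to.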
  • 33. Codd's Rules
      • Data Manipulation Rules (Rules 2, 4, 5 & 7)
          • The user should be able to manipulate the 'logical view' of the data with no need for knowledge of how it is physically stored or accessed.
      • Rule 2 - Guaranteed Access
          • Each & every datum in an RDB is guaranteed to be logically accessible by a combination of table name, primary key value & column name.
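Guaranteed access can be pictured as a three-part address. The helper `datum` below is hypothetical and assumes a single-column primary key; it looks identifiers up in SQLite's catalog before using them, since identifiers cannot be passed as bound parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Cust_account TEXT PRIMARY KEY, Cust_name TEXT, Cust_addr TEXT, Zip_code TEXT);
INSERT INTO Customer VALUES
    ('0000-000-0000-0', 'John Smith', '2036-26 Street', '95819');
""")

def datum(conn, table, pk_value, column):
    """Fetch one value addressed by table name, primary key value and column name."""
    tables = {r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in tables:
        raise ValueError(f"unknown table: {table}")
    # table_info rows are (cid, name, type, notnull, dflt_value, pk).
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    columns = {r[1] for r in info}
    pk_cols = [r[1] for r in info if r[5] > 0]
    if column not in columns:
        raise ValueError(f"unknown column: {column}")
    if len(pk_cols) != 1:
        raise ValueError("this sketch handles single-column primary keys only")
    row = conn.execute(
        f'SELECT "{column}" FROM "{table}" WHERE "{pk_cols[0]}" = ?',
        (pk_value,)).fetchone()
    return None if row is None else row[0]

name = datum(conn, "Customer", "0000-000-0000-0", "Cust_name")
zip_code = datum(conn, "Customer", "0000-000-0000-0", "Zip_code")
```

No pointer, row offset, or file position is needed: the table name, key value, and column name are enough to reach any single value.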
  • 34. Data Manipulation Rules
      • Rule 4 - Dynamic on-line catalog based on the relational model
          • The DB description (metadata) is represented at the logical level in the same way as ordinary data, so that the same relational language can be used to interrogate the metadata as the regular data.
          • System & other data are stored & manipulated in the same way.
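As one concrete case, SQLite exposes its catalog as the table `sqlite_master`, so ordinary `SELECT` statements can interrogate the schema. The sketch assumes SQLite 3.16 or later for the `pragma_table_info` table-valued function; other systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (Item TEXT PRIMARY KEY, Item_descrip TEXT);
CREATE TABLE Manufacturers (Manuf_code TEXT PRIMARY KEY, Manuf_name TEXT);
""")

# The catalog is queried with the same SELECT syntax as user data.
table_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Column metadata is also available as rows and columns.
item_columns = [r[0] for r in conn.execute(
    "SELECT name FROM pragma_table_info('Items') ORDER BY cid")]
```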
  • 35. Data Manipulation Rules
      • Rule 5 - Comprehensive Data Sublanguage
          • An RDBMS may support many languages & modes of use, but there must be at least ONE language whose statements can express ALL of the following:
              • Data definition
              • View definition
              • Data manipulation (interactive & via program)
              • Integrity constraints
              • Authorization
              • Transaction boundaries (begin, commit & rollback)
          • The 1992 ISO standard for SQL provides all these functions.
  • 36. Data Manipulation Rules
      • Rule 7 - High-level insert, update & delete
          • The capability of handling a base table or view as a single operand applies not only to data retrieval but also to insert, update & delete operations.
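Set-at-a-time modification means one statement acts on every qualifying row, with no application loop over records. In this sketch the price thresholds and the 1000-unit reduction are arbitrary values chosen for illustration; the rows come from the sample invoice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoice_items ("
             "Invoice_number INTEGER, Item TEXT, Item_qty INTEGER, Item_price INTEGER, "
             "PRIMARY KEY (Invoice_number, Item))")
conn.executemany("INSERT INTO Invoice_items VALUES (?, ?, ?, ?)", [
    (10001, "SAGX730", 1, 56995),
    (10001, "AT10", 1, 35995),
    (10001, "CDPC725", 1, 39995),
])

# One UPDATE statement changes every qualifying row.
cur = conn.execute(
    "UPDATE Invoice_items SET Item_price = Item_price - 1000 WHERE Item_price > 36000")
updated = cur.rowcount

# One DELETE statement likewise removes a whole set of rows.
cur = conn.execute("DELETE FROM Invoice_items WHERE Item_price < 36000")
deleted = cur.rowcount

remaining = conn.execute(
    "SELECT Item, Item_price FROM Invoice_items ORDER BY Item").fetchall()
```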
  • 37. Codd's Rules
      • Data Independence Rules (Rules 8, 9 & 11)
          • These rules protect users & application developers from having to change their applications following any low-level reorganisation of the DB.
  • 38. Data Independence Rules
      • Rule 8 - Physical Data Independence
          • Application programs & terminal activities remain logically unimpaired whenever any changes are made either to the storage organisation or to the access methods.
      • Rule 9 - Logical Data Independence
          • Application programs & terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
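Both rules can be sketched with one unchanged application query. First an index is added (a change to the access method), then the base table is split in two and a view with the original name is put in its place (an information-preserving change to the base tables). The second customer row and the index and table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Cust_account TEXT PRIMARY KEY, Cust_name TEXT, Zip_code TEXT, City TEXT);
INSERT INTO Customer VALUES ('A1', 'John Smith', '95819', 'Sacramento');
INSERT INTO Customer VALUES ('A2', 'Jane Doe', '95819', 'Sacramento');
""")

# The application's query; it is never edited below.
APP_QUERY = "SELECT Cust_name, City FROM Customer ORDER BY Cust_account"
before = conn.execute(APP_QUERY).fetchall()

# Rule 8: change the access method by adding an index.
conn.execute("CREATE INDEX idx_customer_zip ON Customer (Zip_code)")
after_index = conn.execute(APP_QUERY).fetchall()

# Rule 9: split the base table, then present the old shape as a view.
conn.executescript("""
CREATE TABLE Zip_code (Zip_code TEXT PRIMARY KEY, City TEXT);
INSERT INTO Zip_code SELECT DISTINCT Zip_code, City FROM Customer;
CREATE TABLE Customer_base AS
    SELECT Cust_account, Cust_name, Zip_code FROM Customer;
DROP TABLE Customer;
CREATE VIEW Customer AS
    SELECT b.Cust_account, b.Cust_name, b.Zip_code, z.City
    FROM Customer_base b JOIN Zip_code z ON z.Zip_code = b.Zip_code;
""")
after_split = conn.execute(APP_QUERY).fetchall()
```

The three result lists are identical. Note that this covers retrieval only; updates through the replacement view would additionally depend on view updating (Rule 6).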
  • 39. Data Independence Rules
      • Rule 11 - Distribution Independence
          • The data manipulation sublanguage of an RDBMS must enable application programs & queries to remain logically unchanged whether & whenever data is physically centralised or distributed.
  • 40. Data Independence Rules
      • Rule 11 - Distribution Independence
          • This means that an application program that accesses the DBMS on a single computer should also work, without modification, even if the data is moved from one computer to another in a network environment.
          • The user should 'see' one centralised DB whether the data is located on one or more computers.
  • 41. Data Independence Rules
      • Rule 11 - Distribution Independence
          • This rule does not say that to be fully relational the DBMS must support distributed DBs, but that if it does, the query must remain the same.
  • 42. Summary
      • Codd's Rules can be divided into 5 functional areas:
          • Foundation Rules
          • Structural Rules
          • Integrity Rules
          • Data Manipulation Rules
          • Data Independence Rules