The document provides an overview of functional dependencies and database normalization. It discusses four informal design guidelines for relational databases: 1) design relations so their meaning is clear, 2) avoid anomalies, 3) avoid null values, and 4) avoid spurious tuples. It then covers functional dependencies, inference rules, equivalence, and normal forms including 1NF, 2NF, 3NF and BCNF. The goals of normalization are summarized as reducing redundancy, reducing anomalies, and producing high-quality schemas. Examples are provided to illustrate each concept.
3. GUIDELINE 1: Design a relation schema so that it is easy to
interpret its meaning relation by relation.
• Do not combine attributes from multiple entity types and
relationship types into a single relation.
• Only foreign keys should be used to refer to other entities.
INFORMAL DESIGN GUIDELINES FOR
RELATIONAL DATABASES
Semantics of the Relation Attributes
4. One goal of schema design is to minimize the storage space that the
base relations (files) occupy. Redundant information in tuples works
against this goal.
Another serious problem with redundancy is the problem of update
anomalies. These can be classified into:
• Insertion Anomalies
• Deletion Anomalies
• Modification Anomalies
GUIDELINE 2: Design the base relation schemas so that no insertion, deletion,
or modification anomalies occur in the relations. If any anomalies are present,
note them clearly so that the programs that update the database will operate
correctly.
Redundant Information in Tuples
5. ID   Name      Address   Subject ID  Subject Name  Teacher
   401  Adam      London    01          Math          M. Smith
   402  Alex      Austria   03          English       L. Brown
   403  Alice     Germany   02          Physics       P. James
   404  George    London    02          Physics       P. James
   405  Jennifer  New York  05          History       I. John
6. ID   Name      Address   Subject ID
   401  Adam      London    01
   402  Alex      Austria   03
   403  Alice     Germany   02
   404  George    London    02
   405  Jennifer  New York  05
   Subject ID  Subject Name  Teacher
   01          Math          M. Smith
   02          Physics       P. James
   03          English       L. Brown
   05          History       I. John
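The split of the single table into a student table and a subject table can be sketched in Python. This is only an illustration: the variable names (`flat`, `students`, `subjects`) are assumptions, and the rows are copied from the example tables.

```python
# Flat rows from the single-table example: subject details repeat per student.
flat = [
    (401, "Adam", "London", "01", "Math", "M. Smith"),
    (402, "Alex", "Austria", "03", "English", "L. Brown"),
    (403, "Alice", "Germany", "02", "Physics", "P. James"),
    (404, "George", "London", "02", "Physics", "P. James"),
    (405, "Jennifer", "New York", "05", "History", "I. John"),
]

# Project onto the student attributes plus the foreign key (Subject ID).
students = sorted({(sid, name, addr, subj) for sid, name, addr, subj, _, _ in flat})

# Project onto the subject attributes; the set drops the repeated Physics row.
subjects = sorted({(subj, sname, teacher) for _, _, _, subj, sname, teacher in flat})

# Rejoin on Subject ID to confirm that the split lost no information.
subject_by_id = {subj: (sname, teacher) for subj, sname, teacher in subjects}
rejoined = sorted(
    (sid, name, addr, subj, *subject_by_id[subj]) for sid, name, addr, subj in students
)

print(len(subjects))             # 4 distinct subjects instead of 5 repeated rows
print(rejoined == sorted(flat))  # True
```

After the split, a change of teacher for Physics touches one row in `subjects` rather than one row per enrolled student, which is the modification anomaly the guideline warns about.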
7. GUIDELINE 3: Avoid placing attributes in a relation whose values
may be null.
Reasons for nulls:
• Attribute not applicable or invalid
• Attribute value unknown (may exist)
• Value known to exist, but unavailable
Null Values In Tuples
8. A spurious tuple is a record that appears in the result of a join but
does not correspond to any tuple of the original relation. Spurious
tuples are created when two tables are joined on attributes that are
neither primary keys nor foreign keys.
GUIDELINE 4: Design relation schemas so that they satisfy the
lossless join condition and no spurious tuples are generated.
ID Name Address
401 Adam London
402 Alex Austria
403 Alice Germany
404 George London
Spurious Tuples
9. ID Address
401 London
402 Austria
403 Germany
404 London
Address Name
London Adam
Austria Alex
Germany Alice
London George
Natural Join
ID Address Name
401 London Adam
401 London George
402 Austria Alex
403 Germany Alice
404 London Adam
404 London George
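The natural join above can be reproduced with a short Python sketch. The set names (`original`, `id_address`, `address_name`) are illustrative assumptions; the rows come from the example tables.

```python
# Original relation (ID, Name, Address).
original = {
    (401, "Adam", "London"),
    (402, "Alex", "Austria"),
    (403, "Alice", "Germany"),
    (404, "George", "London"),
}

# Decompose on Address, which is neither a primary key nor a foreign key.
id_address = {(i, a) for i, _, a in original}
address_name = {(a, n) for _, n, a in original}

# Natural join of the two projections on the shared Address attribute.
joined = {(i, a, n) for i, a in id_address for a2, n in address_name if a == a2}

# Tuples in the join that were never in the original relation are spurious.
spurious = {(i, n, a) for i, a, n in joined} - original

print(len(joined))       # 6 rows, although the original had 4
print(sorted(spurious))  # [(401, 'George', 'London'), (404, 'Adam', 'London')]
```

Both spurious rows come from London, the only Address value shared by two people, so the join cannot tell which name belongs to which ID.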
11. A functional dependency (FD) is a constraint between two sets of
attributes of a relation.
Given a relation R, a set of attributes X in R is said to functionally
determine another set of attributes Y, also in R, (written X → Y) if,
and only if, each X value is associated with precisely one Y value; R is
then said to satisfy the functional dependency X → Y.
For any two tuples s1 and s2 in any relation instance r(R): If
s1[X]=s2[X], then s1[Y]=s2[Y].
FUNCTIONAL DEPENDENCIES
12. EXAMPLE OF FUNCTIONAL DEPENDENCIES
Name Class Subject Age
Pooja 5th English 10
Priya 4th Hindi 9
Pooja 5th Maths 10
Pooja 6th Science 10
Sneha 7th Computer 11
2. Augmentation: Class->Name gives Class+Subject->Name+Subject
3. Transitivity: Class->Name and Name->Age give Class->Age
4. Union: Class->Name and Class->Age give Class->Name+Age
5. Decomposition: Class->Name+Age gives Class->Name and Class->Age
6. Pseudo transitivity: Class->Name and Name+Subject->Age give Class+Subject->Age
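The tuple-based definition of an FD (if s1[X] = s2[X], then s1[Y] = s2[Y]) can be checked mechanically against the example table. The helper name `fd_holds` is an assumption made for this sketch. Note that a relation instance can only refute an FD: passing the check on one instance does not prove that the FD holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

columns = ("Name", "Class", "Subject", "Age")
data = [
    ("Pooja", "5th", "English", 10),
    ("Priya", "4th", "Hindi", 9),
    ("Pooja", "5th", "Maths", 10),
    ("Pooja", "6th", "Science", 10),
    ("Sneha", "7th", "Computer", 11),
]
rows = [dict(zip(columns, r)) for r in data]

print(fd_holds(rows, ["Class"], ["Name"]))  # True
print(fd_holds(rows, ["Name"], ["Age"]))    # True
print(fd_holds(rows, ["Name"], ["Class"]))  # False: Pooja appears in 5th and 6th
```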
13. Rules of inference for functional dependencies can be used to find
all the FDs logically implied by a set of FDs. The first three rules are
known as Armstrong's axioms; the remaining rules can be derived from them.
Let A, B, C and D be subsets of attributes of a relation R. The
inference rules are:
1. Reflexivity:
If B is a subset of A, then A->B. This also implies that A->A always holds.
2. Augmentation:
If A->B, then AC->BC.
3. Transitivity:
If A->B and B->C, then A->C.
4. Additivity or Union:
If A->B and A->C, then A->BC.
5. Projectivity or Decomposition:
If A->BC, then A->B and A->C.
6. Pseudo transitivity:
If A->B and CB->D, then AC->D.
INFERENCE RULES OF FDs
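As a worked example, the union rule can be derived from augmentation and transitivity alone. The derivation below is a standard one, written with the same attribute-set names A, B, C used in the rules.

```latex
\begin{align*}
A \to B &\implies A \to AB && \text{(augmentation with } A\text{)}\\
A \to C &\implies AB \to BC && \text{(augmentation with } B\text{)}\\
A \to AB,\ AB \to BC &\implies A \to BC && \text{(transitivity)}
\end{align*}
```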
14. • The closure of a set F of FDs, written F+, is the set of all FDs
logically implied by F.
Example: Suppose we are given a relation scheme R = (A, B, C, G, H, I)
and the set of FDs:
F = {A->B, CG->H, CG->I, B->H}
Then F+ contains, among others, A->H (by Transitivity) and CG->HI (by Additivity or Union).
INFERENCE RULES OF FDs
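In practice, membership in F+ is usually tested through the closure of an attribute set rather than by listing every implied FD. The sketch below uses the standard attribute-closure loop; the function names `closure` and `implies` are assumptions, and each attribute is a single letter so that a string can stand for an attribute set.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs lies inside the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

# F from the example: A->B, CG->H, CG->I, B->H
F = [("A", "B"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("A", F)))  # ['A', 'B', 'H']
print(implies(F, "A", "H"))     # True  (transitivity)
print(implies(F, "CG", "HI"))   # True  (union)
print(implies(F, "A", "I"))     # False
```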
15. EQUIVALENCE OF SETS OF FDs
Two sets of FDs, F1 and F2 are equivalent if and only if-
- every FD in F1 can be inferred from F2
- every FD in F2 can be inferred from F1
Example : Consider F1 = {A->B, B->H, A->H} & F2 = {A->B, B->H}
From F2: A->A; A->B; A->H (by transitivity)
From F2: B->B; B->H
From F1: A->A; A->B; A->H
From F1: B->B; B->H
Since all FDs in F1 can be obtained from F2 and vice versa, we
conclude that F1 and F2 are equivalent.
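The two-way test for equivalence can be sketched with attribute closures: F1 covers F2 when every FD of F2 follows from F1, and the sets are equivalent when each covers the other. The names `covers` and `equivalent` are assumptions for this sketch, and attributes are single letters.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f1, f2):
    return covers(f1, f2) and covers(f2, f1)

F1 = [("A", "B"), ("B", "H"), ("A", "H")]
F2 = [("A", "B"), ("B", "H")]

print(equivalent(F1, F2))            # True
print(equivalent(F1, [("A", "B")]))  # False: B->H cannot be inferred
```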
16. MINIMAL SETS OF FDs
A set of FDs is minimal if it satisfies the following conditions:
1). Every dependency in F has a single attribute for its RHS.
2). We cannot remove any dependency from F and still have a set of
dependencies that is equivalent to F.
Example: F = {A->B, B->H, A->H} is not minimal because A->H is redundant;
F new = {A->B, B->H} is minimal.
3). We cannot replace any dependency X -> A in F with a
dependency Y -> A, where Y is a proper subset of X,
and still have a set of dependencies that is equivalent to F.
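The three conditions suggest a procedure for reducing a set of FDs to a minimal one: split right-hand sides, shrink left-hand sides, then drop redundant FDs. The sketch below is one common way to do this, not the only one; `minimal_cover` is a hypothetical name, attributes are single letters, and a set of FDs may have several different minimal covers depending on processing order.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Condition 1: give every FD a single attribute on its right-hand side.
    g = list(dict.fromkeys((frozenset(lhs), a) for lhs, rhs in fds for a in rhs))
    # Condition 3: drop a left-hand attribute when the rest still determines a.
    for i, (lhs, a) in enumerate(g):
        for b in sorted(lhs):
            reduced = g[i][0] - {b}
            if reduced and a in closure(reduced, g):
                g[i] = (reduced, a)
    g = list(dict.fromkeys(g))  # shrinking may have produced duplicates
    # Condition 2: drop an FD when the remaining FDs already imply it.
    for fd in list(g):
        rest = [x for x in g if x != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g

F = [("A", "B"), ("B", "H"), ("A", "H")]
for lhs, a in minimal_cover(F):
    print("".join(sorted(lhs)) + "->" + a)  # prints A->B, then B->H
```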
17. FULL AND PARTIAL FUNCTIONAL DEPENDENCY
• Grade is fully functionally dependent on the primary key (ID,
Course-ID) because both parts of the primary key are needed to
determine Grade.
• On the other hand, both Name and Phone are partially dependent on
the primary key, because only a part of the primary key, namely ID,
is needed to determine them. Similarly, Credit-Hours and Course-Name
can be determined using Course-ID alone.
ID  Name  Phone  Course-ID  Course-Name  Credit-Hours  Grade
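Whether an attribute depends fully or only partially on a composite key can be tested by asking if some proper subset of the key already determines it. In the sketch below the FD list is an assumption read off the description above, and `dependency_kind` is a hypothetical helper name.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def dependency_kind(attr, key, fds):
    """Return 'partial' if a proper subset of key determines attr, else 'full'."""
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            if attr in closure(subset, fds):
                return "partial"
    return "full"

# FDs assumed from the description of the example relation.
key = ("ID", "Course-ID")
fds = [
    (("ID", "Course-ID"), ("Grade",)),
    (("ID",), ("Name", "Phone")),
    (("Course-ID",), ("Course-Name", "Credit-Hours")),
]

for attr in ("Grade", "Name", "Phone", "Course-Name", "Credit-Hours"):
    print(attr, dependency_kind(attr, key, fds))
```

Only Grade comes out as `full`; the other four attributes are reported as `partial`.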
19. Normalization works through a series of stages called Normal
Forms. The various Normal Forms are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
Normalization is a step-by-step refinement process during which
unsatisfactory or bad relations are decomposed by breaking up their
attributes into smaller relations that possess desirable properties.
NORMALIZATION
NORMAL FORM
20. Reduces Data Redundancy.
Reduces the chances of Data Anomalies occurring.
Provides a robust architecture for retrieving and maintaining
data.
Produces high quality relational schema designs.
NORMALIZATION BENEFITS
21. A relation schema R is said to be in 1NF when:
• The relation has a primary key.
• The relation does not have composite attributes, multivalued
attributes, or nested relations.
• The non-key attributes depend on the primary key.
FIRST NORMAL FORM (1NF)
22. EXAMPLE OF 1st NORMAL FORM
Name Age Subject
Alex 17 Math, Biology
Adam 15 Physics
Alice 16 Chemistry, Math
Name Age Subject
Alex 17 Math
Alex 17 Biology
Adam 15 Physics
Alice 16 Chemistry
Alice 16 Math
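The step from the first table to the second removes the multivalued Subject attribute by emitting one row per subject. A minimal Python sketch, assuming the subjects are stored as a comma-separated string as shown in the first table:

```python
# Unnormalized rows: Subject holds several values in one field.
unnormalized = [
    ("Alex", 17, "Math, Biology"),
    ("Adam", 15, "Physics"),
    ("Alice", 16, "Chemistry, Math"),
]

# One output row per (student, subject) pair; strip() removes the space after commas.
first_nf = [
    (name, age, subject.strip())
    for name, age, subjects in unnormalized
    for subject in subjects.split(",")
]

for row in first_nf:
    print(row)
```

The result has five rows, and (Name, Subject) can serve as the primary key of the flattened relation.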
23. A relation schema R is said to be in 2NF when:
• The relation meets the criteria for first normal form.
• Every non-prime attribute A in R is fully functionally dependent
on the primary key.
SECOND NORMAL FORM (2NF)
24. EXAMPLE OF 2nd NORMAL FORM
Name Age Subject
Alex 17 Math
Alex 17 Biology
Adam 15 Physics
Alice 16 Chemistry
Alice 16 Math
Name Subject
Alex Math
Alex Biology
Adam Physics
Alice Chemistry
Alice Math
Name Age
Alex 17
Adam 15
Alice 16
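In this example the key is (Name, Subject) and Age depends on Name alone, so Age is moved into its own relation. The sketch below performs the two projections and rejoins them on Name to confirm that nothing is lost; the variable names are illustrative assumptions.

```python
# 1NF relation with key (Name, Subject); Age depends only on Name.
rows = [
    ("Alex", 17, "Math"),
    ("Alex", 17, "Biology"),
    ("Adam", 15, "Physics"),
    ("Alice", 16, "Chemistry"),
    ("Alice", 16, "Math"),
]

# Projection that keeps the full key.
name_subject = sorted({(name, subject) for name, _, subject in rows})

# Projection that holds the partially dependent attribute once per student.
name_age = sorted({(name, age) for name, age, _ in rows})

# Rejoining on Name reproduces the original rows exactly.
age_of = dict(name_age)
rejoined = sorted((name, age_of[name], subject) for name, subject in name_subject)

print(name_age)                  # [('Adam', 15), ('Alex', 17), ('Alice', 16)]
print(rejoined == sorted(rows))  # True
```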
25. A relation schema R is said to be in 3NF when:
• The relation meets the criteria for second normal form.
• No non-prime attribute A in R is transitively dependent on the
primary key.
According to the general definition of 3NF for multiple candidate
keys, a relation schema R is in 3NF if, whenever a nontrivial FD X -> A
holds in R, either:
• X is a superkey of R, or
• A is a prime attribute of R
THIRD NORMAL FORM (3NF)
26. EXAMPLE OF 3rd NORMAL FORM
ID  Name  DOB  Street  City  State  Zip
is decomposed into:
Zip  Street  City  State
ID  Name  DOB  Zip
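The general 3NF condition can be tested by computing the candidate keys and then examining each FD. The sketch below is a brute-force illustration for small schemas. The FDs (ID determines every attribute, Zip determines Street, City and State) are assumptions inferred from the decomposition shown, the function names are hypothetical, and only the listed FDs are examined.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            is_superkey = closure(candidate, fds) == set(attrs)
            if is_superkey and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys

def third_nf_violations(attrs, fds):
    """List (X, A) where X is not a superkey and A is not a prime attribute."""
    prime = set().union(*candidate_keys(attrs, fds))
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attrs):
            continue  # X is a superkey
        for a in sorted(set(rhs) - set(lhs)):  # skip trivial parts
            if a not in prime:
                violations.append((tuple(lhs), a))
    return violations

person = ("ID", "Name", "DOB", "Street", "City", "State", "Zip")
person_fds = [
    (("ID",), ("Name", "DOB", "Street", "City", "State", "Zip")),
    (("Zip",), ("Street", "City", "State")),
]
print(third_nf_violations(person, person_fds))

# The two relations produced by the decomposition, with their projected FDs.
r1 = third_nf_violations(("ID", "Name", "DOB", "Zip"), [(("ID",), ("Name", "DOB", "Zip"))])
r2 = third_nf_violations(("Zip", "Street", "City", "State"), [(("Zip",), ("Street", "City", "State"))])
print(r1, r2)  # [] []
```

The original relation reports Zip -> City, Zip -> State and Zip -> Street as violations, while both relations of the decomposition report none.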
27. A relation schema R is said to be in BCNF if, whenever a
nontrivial FD X -> A holds in R, X is a superkey of R.
BCNF is simply a stronger definition of 3NF. Every BCNF relation is
in 3NF, but not every 3NF relation is in BCNF.
BOYCE-CODD NORMAL FORM
28. ID  Name  Address  DOB  City
is decomposed into:
ID  Name  DOB  City
City  Address
EXAMPLE OF BOYCE CODD NORMAL FORM
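The BCNF condition is simpler to test than 3NF because it needs no notion of prime attributes: every nontrivial FD must have a superkey on its left side. In the sketch below the FDs (ID determines every attribute, City determines Address) are assumptions inferred from the decomposition shown, `bcnf_violations` is a hypothetical name, and only the listed FDs are examined.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attrs)
    ]

relation = ("ID", "Name", "Address", "DOB", "City")
relation_fds = [
    (("ID",), ("Name", "Address", "DOB", "City")),
    (("City",), ("Address",)),
]
print(bcnf_violations(relation, relation_fds))  # [(('City',), ('Address',))]

# The two relations produced by the decomposition, with their projected FDs.
r1 = bcnf_violations(("ID", "Name", "DOB", "City"), [(("ID",), ("Name", "DOB", "City"))])
r2 = bcnf_violations(("City", "Address"), [(("City",), ("Address",))])
print(r1, r2)  # [] []
```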