1. Database Design
The database design process can be
broken down into five phases:
• Planning
• Analysis
• Design
• Implementation
• Maintenance
2. Planning Phase
In the planning phase, the overall database
structure is defined:
• The purpose of the database is determined
– What information will be stored in the database
– How the information is to be used
– What questions will be answered
• Feasibility studies are conducted.
• Requirements are gathered.
3. Analysis phase
A database can be modeled at three levels:
• Conceptual model
– High-level description of facts
– Not system specific
• Logical model
– Organization of data with some implementation information
• Physical model
– Actual storage of information (clustering, partitioning,
indexing etc.)
4. Conceptual model
• Provides a framework for developing a
database structure.
• The three database components (entities,
attributes and relationships) are
described in detail.
5. Entities
• An entity is a thing that exists and is
distinguishable, e.g. a person, place, object or
concept.
• Entities are the basic building blocks of the
database design.
• A particular occurrence of an entity is known as
an entity instance.
• A group of similar entities is called an entity set or
entity class.
6. Attributes
Attributes describe properties of entities and
relationships
• Simple (scalar) - the smallest semantic unit of data,
atomic (no internal structure), single-valued, e.g. city
• Composite - a group of attributes, e.g. address (street,
city, state, zip)
• Multivalued (list) - multiple values, e.g. degrees,
courses, skills (not allowed in first normal form)
• Domain - the conceptual definition of an attribute
– a named set of scalar values, all of the same type (e.g. integer);
a pool of possible values
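As a minimal sketch of the attribute kinds above, the following Python models a hypothetical PERSON entity. The class names, field names and the sample state domain are illustrative assumptions, not part of the original slides.

```python
from dataclasses import dataclass, field

# Domain: a named pool of allowed values (hypothetical sample set).
STATE_DOMAIN = {"CA", "NY", "TX"}


@dataclass
class Address:
    """Composite attribute: a group of simple attributes."""
    street: str
    city: str       # simple (atomic) attribute
    state: str      # value expected to come from STATE_DOMAIN
    zip_code: str


@dataclass
class Person:
    name: str                                     # simple attribute
    address: Address                              # composite attribute
    degrees: list = field(default_factory=list)   # multivalued attribute


def in_domain(value, domain):
    """Check that a value belongs to the attribute's domain."""
    return value in domain


person = Person(
    "Ada",
    Address("1 Main St", "Springfield", "CA", "90000"),
    ["BSc", "MSc"],
)
```

The multivalued `degrees` list is convenient in an object model, but it would have to be moved to its own table to satisfy first normal form.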
7. Relationships
A relationship is a connection between entity classes.
For example, a relationship between PERSONS and
AUTOMOBILES could be an "OWNS" relationship.
That is to say, people own automobiles.
• The degree of a relationship indicates the number of entity
classes involved.
• The cardinality of a relationship indicates the number of
instances in entity class E1 that can or must be associated
with instances in entity class E2
8. Types of Relationship
Based on the cardinality of a relationship, there are
three types:
• One-One Relationship - For each entity in one class there is at most one
associated entity in the other class. For example, for each husband there is
at most one current legal wife (in this country at least). A wife has at most
one current legal husband.
• Many-One Relationships - One entity in class E2 is associated with zero or
more entities in class E1, but each entity in E1 is associated with at most
one entity in E2. For example, a woman may have many children but a
child has only one birth mother.
• Many-Many Relationships - There are no restrictions on how many entities
in either class are associated with a single entity in the other. An example
of a many-to-many relationship would be students taking classes. Each
student takes many classes. Each class has many students.
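The three types can be told apart by counting how many partners each instance has. The sketch below is an illustration only: the function name `classify` and the sample pairs are assumptions, and a sample of observed pairs can only show which type the data is consistent with, not what the business rule actually is.

```python
from collections import defaultdict


def classify(pairs):
    """Classify a binary relationship from observed (e1, e2) instance pairs."""
    e1_to_e2 = defaultdict(set)
    e2_to_e1 = defaultdict(set)
    for a, b in pairs:
        e1_to_e2[a].add(b)
        e2_to_e1[b].add(a)
    # True when no instance on that side has more than one partner.
    e1_single = all(len(partners) <= 1 for partners in e1_to_e2.values())
    e2_single = all(len(partners) <= 1 for partners in e2_to_e1.values())
    if e1_single and e2_single:
        return "one-to-one"
    if e1_single or e2_single:
        # Exactly one side is restricted to a single partner.
        return "many-to-one"
    return "many-to-many"


children = [("child1", "mother1"), ("child2", "mother1")]
enrollment = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
```

Here `classify(children)` reports a many-to-one relationship and `classify(enrollment)` a many-to-many one.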
9. Logical model
• After validating your conceptual model,
you can generate a logical model
– Entity classes are modeled as tables
– Attributes are modeled as fields
– Each instance of an entity is called a record
– Domains are modeled as data types
– A primary key is defined for each table
– Foreign keys represent relationships
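The mapping above can be sketched with Python's built-in sqlite3 module, reusing the PERSONS "OWNS" AUTOMOBILES example from the Relationships slide. Table names, column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE person (                 -- entity class -> table
    person_id INTEGER PRIMARY KEY,    -- primary key
    name      TEXT NOT NULL           -- attribute -> field, domain -> data type
);
CREATE TABLE automobile (
    vin      TEXT PRIMARY KEY,
    model    TEXT NOT NULL,
    owner_id INTEGER REFERENCES person(person_id)  -- foreign key for OWNS
);
""")

# Each entity instance becomes a record.
conn.execute("INSERT INTO person VALUES (1, 'Ada')")
conn.execute("INSERT INTO automobile VALUES ('VIN001', 'Sedan', 1)")

rows = conn.execute("""
    SELECT p.name, a.model
    FROM person p JOIN automobile a ON a.owner_id = p.person_id
""").fetchall()
```

With the foreign key in place, inserting an automobile whose `owner_id` does not match any person raises `sqlite3.IntegrityError`.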
10. Physical model
• How data will be stored and accessed in a
computer system.
• Where data will be stored
• Estimate the amount of disk space that will be
required by the database.
• How data will be distributed across an
organization or across disks
• Types of indexes to be used (for efficient
retrieval and manipulation).
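Index choice is one physical-model decision that is easy to try out. The sketch below assumes SQLite as the DBMS and uses a hypothetical `person` table and index name; it creates an index and asks SQLite for its query plan, whose exact wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO person (name, city) VALUES (?, ?)",
    [("Ada", "Springfield"), ("Grace", "Riverton"), ("Alan", "Springfield")],
)

# Physical design decision: index the column used for lookups.
conn.execute("CREATE INDEX idx_person_city ON person(city)")

# Ask SQLite how it would run the query; the last column is a text description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM person WHERE city = ?",
    ("Springfield",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)
```

If the index is used, `plan_text` mentions `idx_person_city`; without the index the plan would describe a full table scan instead.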
11. Design Phase
Determine how best to represent the
information system that was identified in the
previous phase.
Map the logical model and the physical model
onto a real system:
– The database management system (DBMS) to
be used
– User views (input forms, output reports)
– Security mechanisms, etc.
12. Implementation phase
Actual implementation of the database and
associated programming.
• The database is checked for possible errors
• Tables are created with a few sample records
to see whether the desired results are
achieved
• Fine adjustments are made as needed
13. Entity Relationship Model
• A conceptual data model that views the real
world as entities and relationships.
• A basic component of the model is the
entity-relationship diagram (ERD).
• ERDs provide a convenient method for
visualizing the interrelationships among
entities in a given application.
14. The utility of the ER model is:
• It maps well to the relational database
model.
• It is simple and easy to understand with
a minimum of training.
• The model can be used as a design plan
by the database developer to implement
a data model in specific database
management software.
15. Basic Elements in E-R Modeling
The basic elements in the ER model are
• Entities
• Attributes
• Relationships
16. Entities
• Data object about which information is to be
collected.
• Some specific examples of entities are
EMPLOYEE, PROJECT, INVOICE.
• An entity occurrence (also called an instance)
is an individual occurrence of an entity.
• Entity set: a collection of similar
entities (employees, projects,
departments)
17. Attributes
• Describe the entity with which they are associated.
• A particular instance of an attribute is a value.
• Attributes can be classified as identifiers or
descriptors.
• Identifiers, more commonly called keys,
uniquely identify an instance of an entity.
• A descriptor describes a non-unique
characteristic of an entity instance.
18. Relationships
• A relationship represents an association between two or
more entities. Examples of relationships:
– employees are assigned to projects
– projects have subtasks
– departments manage one or more projects
• Relationships are classified in terms of
– degree,
– connectivity,
– cardinality,
– and existence.
19. Classifying Relationships
Degree of a Relationship
• The number of entities associated with the relationship.
A UNARY RELATIONSHIP exists when an
association exists within a single entity.
A BINARY RELATIONSHIP exists when two
entities (participants) are in the relationship.
A TERNARY RELATIONSHIP exists when three
entities (participants) are in the relationship.
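A unary relationship is commonly implemented as a foreign key that points back to the same table. The sketch below assumes a hypothetical "employee is managed by employee" relationship; the column `manager_id` and the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    -- unary relationship: the foreign key references the same table
    manager_id  INTEGER REFERENCES employee(employee_id)
)
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ada", None), (2, "Grace", 1), (3, "Alan", 1)],
)

# Join the table to itself to pair each employee with their manager.
reports = conn.execute("""
    SELECT e.name, m.name
    FROM employee e JOIN employee m ON e.manager_id = m.employee_id
    ORDER BY e.employee_id
""").fetchall()
```

A binary relationship would instead use a foreign key into a second table, and a ternary one an association table with three foreign keys.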
20. Classifying Relationships
Connectivity
– Describes the mapping of associated entity
instances in the relationship.
– The values of connectivity are "one" or "many".
– The basic types of connectivity for relationships are
one-to-one, one-to-many, and many-to-many.
Cardinality
– The actual number of related occurrences for each of
the two entities.
21. Classifying Relationships
Existence
• Denotes whether the existence of an entity instance is
dependent upon the existence of another, related
entity instance.
• Defined as either mandatory or optional.
– For mandatory existence, an instance of the related entity must
always occur, e.g. "every project must be managed by a single
department".
– For optional existence, an instance of the related entity is not
required but may occur.
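In a relational schema, mandatory existence is typically expressed as a NOT NULL foreign key and optional existence as a nullable one. The sketch below uses the "every project must be managed by a single department" rule from the slide; the optional employee-to-department link and all names are assumptions added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE project (
    project_id    INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    -- mandatory: every project must be managed by a department
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- optional (assumed for illustration): the department may be missing
    department_id INTEGER REFERENCES department(department_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO project VALUES (1, 'Survey', 1)")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', NULL)")  # allowed: optional


def can_insert_project_without_department():
    """Try to violate the mandatory rule; report whether the DBMS allowed it."""
    try:
        conn.execute("INSERT INTO project VALUES (2, 'Orphan', NULL)")
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling `can_insert_project_without_department()` returns `False`, because the NOT NULL constraint rejects a project with no managing department.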
22. ER Notation
• There is no universal standard for
representing data objects in ER
diagrams.
• A number of notation styles are in use
today; among the more common are
Information Engineering, Bachman,
Chen and Martin.
23. ER Notation
Martin Style.
• Entities are represented by labeled rectangles. The label is the
name of the entity. Entity names should be singular nouns.
• Relationships are represented by a solid line connecting two
entities. The name of the relationship is written above the line.
Relationship names should be verbs.
• Attributes, when included, are listed inside the entity
rectangle. Identifier Attributes are underlined. Attribute names
should be singular nouns.
• Cardinality of many is represented by a line ending in a crow's
foot. If the crow's foot is omitted, the cardinality is one.
• Existence is represented by placing a circle or a perpendicular
bar on the line. Placing a bar next to the entity shows
mandatory existence. Placing a circle next to the entity shows
optional existence.
25. ER Notation
Chen Style
• Rectangles represent ENTITY CLASSES
• Circles represent ATTRIBUTES
• Diamonds represent RELATIONSHIPS
• Lines - lines connect entities to relationships. Lines are
also used to connect attributes to entities.
• Underline - Key attributes of entities are underlined.
• Number notations represent cardinality.
• The name of the entity (class) or attribute or relationship is
usually placed inside the symbol used for that object.
(Sometimes, the name is placed adjacent.)
27. Refining The Entity-Relationship Diagram
This section discusses four basic rules for
modeling relationships
1. Entities Must Participate In
Relationships
– Entities cannot be modeled unrelated to
any other entity.
– The exception to this rule is a database
with a single table.
28. Refining The Entity-Relationship Diagram
2. Resolve Many-To-Many Relationships
– Many-to-many relationships cannot be used as-is in
the data model because they cannot be
implemented directly in the relational model.
– They must be resolved early in the modeling process.
– Replace the relationship with an association
entity, then relate the two original entities to
the association entity.
29. This strategy is demonstrated in the figure below:
Here,
Employees may be assigned to many projects.
Each project must have more than one employee assigned to it.
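The EMPLOYEE and PROJECT example can be sketched with an association table. The table name `assignment`, the column names and the sample rows are assumptions; the rule that a project needs more than one employee is not enforced by this schema and would need an additional check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE project  (project_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- association entity: one row per (employee, project) pairing
CREATE TABLE assignment (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    PRIMARY KEY (employee_id, project_id)
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO project VALUES (?, ?)", [(10, "Payroll"), (20, "Inventory")])
conn.executemany("INSERT INTO assignment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

# One employee, many projects.
projects_for_ada = [title for (title,) in conn.execute("""
    SELECT p.title FROM assignment a
    JOIN project p ON p.project_id = a.project_id
    WHERE a.employee_id = 1 ORDER BY p.title
""")]

# One project, many employees.
staff_on_payroll = [name for (name,) in conn.execute("""
    SELECT e.name FROM assignment a
    JOIN employee e ON e.employee_id = a.employee_id
    WHERE a.project_id = 10 ORDER BY e.name
""")]
```

The many-to-many relationship is thus replaced by two one-to-many relationships, each ending at the association table.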
30. Refining The Entity-Relationship Diagram
3. Transform Complex Relationships into Binary
Relationships
• Complex relationships are classified as ternary (an association
among three entities) or n-ary (an association among more than
three, where n is the number of entities involved).
• They cannot be directly implemented in the relational model,
so they should be resolved early in the modeling process.
• The strategy for resolving complex relationships is similar to
resolving many-to-many relationships.
• Replace the complex relationship with an association entity and
then relate each of the original entities to the association entity.
31. Here is an example
Employees can use different skills on any one or more projects.
Each project uses many employees with various skills.
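The EMPLOYEE, SKILL and PROJECT example can be sketched the same way, with an association table that carries three foreign keys. The table name `skill_use` and all sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE skill    (skill_id    INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE project  (project_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- association entity replacing the ternary relationship
CREATE TABLE skill_use (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    skill_id    INTEGER NOT NULL REFERENCES skill(skill_id),
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    PRIMARY KEY (employee_id, skill_id, project_id)
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO skill VALUES (?, ?)", [(100, "Design"), (101, "Testing")])
conn.executemany("INSERT INTO project VALUES (?, ?)", [(10, "Payroll"), (20, "Inventory")])
conn.executemany(
    "INSERT INTO skill_use VALUES (?, ?, ?)",
    [(1, 100, 10), (1, 101, 10), (2, 100, 20)],
)

# Which skills does employee 1 use on project 10?
skills_used = [name for (name,) in conn.execute("""
    SELECT s.name FROM skill_use u
    JOIN skill s ON s.skill_id = u.skill_id
    WHERE u.employee_id = 1 AND u.project_id = 10
    ORDER BY s.name
""")]
```

Each row of `skill_use` records one fact involving all three entities, which three separate binary relationships could not express without losing information.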
32. Refining The Entity-Relationship Diagram
4. Eliminate redundant relationships
– A redundant relationship is a relationship
between two entities that is equivalent in
meaning to another relationship between
those same two entities.
33. For example,
Figure A shows a redundant relationship between DEPARTMENT and
WORKSTATION.
This relationship provides the same information as the relationships DEPARTMENT
has EMPLOYEES and EMPLOYEES assigned WORKSTATIONS.
Figure B shows the solution, which is to remove the redundant relationship
DEPARTMENT assigned WORKSTATIONS.
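Once the redundant relationship is removed, the same information can still be derived by following the two remaining relationships. The sketch below assumes hypothetical table and column names and shows a department's workstations being found through its employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
-- no direct department column here: that link would be redundant
CREATE TABLE workstation (
    tag         TEXT PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [(1, "Ada", 1), (2, "Grace", 1)])
conn.executemany("INSERT INTO workstation VALUES (?, ?)", [("WS-1", 1), ("WS-2", 2)])

# DEPARTMENT -> EMPLOYEE -> WORKSTATION, with no direct relationship stored.
sales_workstations = [tag for (tag,) in conn.execute("""
    SELECT w.tag
    FROM department d
    JOIN employee e    ON e.department_id = d.department_id
    JOIN workstation w ON w.employee_id = e.employee_id
    WHERE d.name = 'Sales'
    ORDER BY w.tag
""")]
```

Storing the department on the workstation as well would allow the two paths to disagree after an update, which is the practical cost of a redundant relationship.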
34. Tips for Effective ER Diagrams
• Make sure that each entity only appears once per
diagram.
• Name every entity, relationship, and attribute on your
diagram.
• Examine relationships between entities closely. Are
they necessary? Are there any relationships missing?
Eliminate any redundant relationships. Don't connect
relationships to each other.
• Use colors to highlight important portions of your
diagram
35. Normalization
• Normalization is the process of refining a database
design to produce table schemes in normal form.
• A normal form refers to a class of relational schemas that
obey some set of rules.
• Schemas that obey the rules are said to be in the normal
form.
• In a non-normal form, the same data may be stored repeatedly.
• Normalization aims to minimize redundancy in the
database.
36. Classifying normal forms
• There are six commonly recognized
normal forms, with the inspired names:
– First normal form (or 1NF)
– Second normal form (or 2NF)
– Third normal form (or 3NF)
– Boyce-Codd normal form (or BCNF)
– Fourth normal form (or 4NF)
– Fifth normal form (or 5NF)
• We will mainly consider the first three of these
normal forms, with a brief note on 4NF
37. First normal form (or 1NF)
A relation is in First Normal Form (1NF) if
every attribute value is indivisible (atomic)
and every column is unique.
• First normal form (1NF) sets the very basic
rules for an organized database:
– Eliminate duplicate (repeating) columns from the same table.
– Create separate tables for each group of related data
and identify each row with a unique column or set of
columns (the primary key).
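A small worked example may help here. The sketch below starts from hypothetical student records whose `courses` field holds several values in one string, which violates atomicity, and splits them into two 1NF tables. All names and data are assumptions.

```python
# Unnormalized: the "courses" value is not atomic.
unnormalized = [
    {"student_id": 1, "name": "Ada", "courses": "Math, Physics"},
    {"student_id": 2, "name": "Grace", "courses": "Math"},
]


def to_1nf(rows):
    """Split a multivalued column into one row per value in a separate table."""
    students = [
        {"student_id": r["student_id"], "name": r["name"]} for r in rows
    ]
    enrollments = [
        {"student_id": r["student_id"], "course": course.strip()}
        for r in rows
        for course in r["courses"].split(",")
    ]
    return students, enrollments


students, enrollments = to_1nf(unnormalized)
```

In the result, `students` is keyed by `student_id` and `enrollments` by the pair (`student_id`, `course`), so every value is atomic and every row is identifiable.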
38. Second normal form (or 2NF)
A relation is in Second Normal Form (2NF) if it is in
1NF and if all of its attributes are dependent on the
whole key (i.e. none of the non-key attributes are
related only to a part of the key).
• Second normal form (2NF) further addresses
the concept of removing duplicative data:
– Remove subsets of data that apply to multiple rows
of a table and place them in separate tables.
– Create relationships between these new tables and
their predecessors through the use of foreign keys.
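As an illustration, assume a hypothetical order-line table keyed by (`order_id`, `product_id`) in which the product name depends only on `product_id`, that is, on part of the key. The sketch below moves that partial dependency into its own table.

```python
# Key is (order_id, product_id); product_name depends on product_id alone.
order_lines = [
    (1, "P1", "Widget", 2),
    (1, "P2", "Gadget", 1),
    (2, "P1", "Widget", 5),
]

# New table for the partially dependent attribute, keyed by product_id.
products = {
    product_id: product_name
    for _, product_id, product_name, _ in order_lines
}

# Remaining table: every non-key attribute depends on the whole key.
# product_id now acts as a foreign key into products.
order_items = [
    (order_id, product_id, quantity)
    for order_id, product_id, _, quantity in order_lines
]
```

The name "Widget" is now stored once instead of once per order line, so renaming a product touches a single row.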
39. Third normal form (or 3NF)
A relation is in Third Normal Form (3NF) if it is in 2NF
and there are no transitive dependencies (i.e. none of
the non-key attributes are dependent upon another
attribute which in turn is dependent on the relation
key).
• Third normal form (3NF) goes one large step
further:
– Remove columns that are not directly dependent upon the
primary key.
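A transitive dependency can be spotted by testing functional dependencies on sample data. The sketch below assumes a hypothetical employee table in which `dept_name` depends on `dept_id`, which in turn depends on the key `emp_id`; the helper `fd_holds` is an illustrative name, and a check on sample rows can refute a dependency but cannot prove it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"emp_id": 1, "name": "Ada", "dept_id": 10, "dept_name": "Research"},
    {"emp_id": 2, "name": "Grace", "dept_id": 10, "dept_name": "Research"},
    {"emp_id": 3, "name": "Alan", "dept_id": 20, "dept_name": "Sales"},
]

# dept_name is determined by the non-key attribute dept_id (transitive).
transitive = fd_holds(employees, ["dept_id"], ["dept_name"])

# 3NF decomposition: move the transitively dependent column out.
departments = {r["dept_id"]: r["dept_name"] for r in employees}
employees_3nf = [
    {"emp_id": r["emp_id"], "name": r["name"], "dept_id": r["dept_id"]}
    for r in employees
]
```

After the split, `dept_id` in `employees_3nf` serves as a foreign key into `departments`, and each department name is stored only once.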
40. Fourth normal form (or 4NF)
• A relation is in 4NF if it is in BCNF and every
non-trivial multi-valued dependency is determined
by a superkey.