2. Database Design process
• Database design can be broadly defined as the collection of tasks and processes that
support the design, development, implementation, and maintenance of an enterprise
data management system.
• A properly designed database reduces maintenance cost, improves data
consistency, and makes cost-effective use of disk storage space.
• The main objective of database design is to produce the logical and physical
design models of the proposed database system.
• The logical model concentrates on the data requirements. It is designed as a whole,
independently of physical considerations, so the data it describes does not depend on
how or where it is physically stored.
• The physical design model, on the other hand, translates the logical design model
into a concrete implementation, taking control of the physical media using hardware
resources and software systems such as a Database Management System (DBMS).
3. Importance of Database Design
• Database designs provide the blueprints of how the data is going to
be stored in a system. A proper design of a database highly affects
the overall performance of any application.
• The designing principles defined for a database give a clear idea of
the behavior of any application and how the requests are processed.
• Another reason database design matters is that a proper
database design meets all the requirements of users.
• Lastly, the processing time of an application is greatly reduced when the
principles for designing a highly efficient database are properly
applied.
4. Overall workflow and life-cycle of the database
Requirement Analysis
• First, the basic requirements of the project that the database design must serve have to be planned.
This phase consists of:
• Planning - This stage is concerned with planning the entire DDLC (Database Development Life Cycle).
Strategic considerations are taken into account before proceeding.
• System definition - This stage defines the boundaries and scope of the proposed database after planning.
5. Cont…
Database Designing
• The next step is to design the database around the user requirements, splitting them into separate
models so that no single aspect carries a heavy load or too many dependencies. This model-centric
approach is where the logical and physical models play a crucial role.
• Physical Model - The physical model is concerned with the practices and implementation of the logical model.
• Logical Model - This stage is concerned with developing a model based on the proposed requirements. The
entire model is designed on paper, without any implementation and without DBMS-specific considerations.
Implementation
• The last step covers the implementation methods and checks that the behavior matches the requirements.
• This is ensured by continuous integration testing of the database with different data sets and by converting the data
into a machine-understandable form.
• Data manipulation is the main focus of this step: queries are run to check whether the application is
designed satisfactorily.
• Data conversion and loading - This stage imports and converts data from the old system to the new one.
• Testing - This stage identifies errors in the newly implemented system. Testing is a crucial step
because it checks the database directly against the requirement specifications.
6. Database Design Process
• Designing a database involves several conceptual approaches that need to be kept in
mind.
An ideal and well-structured database design must be able to:
• Save disk space by eliminating redundant data.
• Maintain data integrity and accuracy.
• Provide data access in useful ways.
Comparing logical and physical data models:
Logical
• A logical data model describes the data in as much detail as possible, without being
concerned with the physical implementation in the database. Features of a logical data model might
include:
• All the entities and the relationships among them.
• Well-specified attributes for each entity.
• A primary key specified for each entity.
• Foreign keys, which identify relationships between different entities.
• Normalization, which occurs at this level.
7. Cont…
A logical model can be designed using the following approach:
• Specify all the entities with primary keys.
• Specify the relationships between different entities.
• Identify the attributes of each entity.
• Resolve many-to-many relationships.
• Carry out the process of normalization.
8. Cont…
Physical
• A physical data model represents how the design is actually built in the database.
• Its main purpose is to show the full structure of each table, including
the column names, column data types, constraints, keys (primary and foreign), and the
relationships among tables.
The following are the features of a physical data model:
• Specifies all the columns and tables.
• Specifies foreign keys that usually define the relationship between tables.
• Based on user requirements, de-normalization might occur.
• Because physical considerations are taken into account, the physical model can differ from
the logical model for straightforward reasons.
• Physical models might be different for different RDBMSs. For example, the data type of a column may be
different in MySQL and SQL Server.
9. Cont…
While designing a physical data model, the following points should be taken into consideration:
• Convert the entities into tables.
• Convert the defined relationships into foreign keys.
• Convert the data attributes into columns.
• Modify the data model constraints based on physical requirements.
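The conversion steps above can be sketched in code. This is a minimal illustration, not a prescribed method: it assumes SQLite through Python's standard sqlite3 module (rather than MySQL or SQL Server), and it borrows the Student and College entities that appear later in the deck, with hypothetical column names and sample rows.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Each entity becomes a table; each attribute becomes a column.
CREATE TABLE College (
    col_id   INTEGER PRIMARY KEY,
    col_name TEXT NOT NULL
);
CREATE TABLE Student (
    stu_id   INTEGER PRIMARY KEY,
    stu_name TEXT NOT NULL,
    stu_addr TEXT,
    -- The many-to-one relationship becomes a foreign key.
    col_id   INTEGER NOT NULL REFERENCES College(col_id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO College VALUES (1, 'City College')")
conn.execute("INSERT INTO Student VALUES (10, 'Asha', '12 Lake Road', 1)")

# Column names of the physical Student table, read back from the catalog.
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]
print(student_columns)
```

The last step in the list, adjusting constraints to physical requirements, would show up here as choices such as NOT NULL on col_id or the concrete column types, which may differ between database products.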
10. Entity Sets
• A database can be modeled as:
– a collection of entities,
– relationship among entities.
• An entity is an object that exists and is distinguishable from other objects.
Example: specific person, company, event, plant
• Entities have attributes
Example: people have names and addresses
• An entity set is a set of entities of the same type that share the same properties.
Example: set of all persons, companies, trees, holidays
• An entity set is a group of similar entities and these entities can have attributes.
• In DBMS terms, an entity set corresponds to a table and an attribute to a column of that table, so by showing the relationships among tables and their attributes, an ER
diagram shows the complete logical structure of a database.
• An entity refers to any object having:
• Either a physical existence such as a particular person, office, house or car.
• Or a conceptual existence such as a school, a university, a company or a job.
11. Entity Sets customer and loan
[Figure: the entity sets customer (customer-id, customer-name, customer-street, customer-city) and loan (loan-number, amount)]
12. Cont…
In ER diagram,
• Attributes are associated with an entity set.
• Attributes describe the properties of entities in the entity set.
• Based on the values of certain attributes, an entity can be identified uniquely.
Types of Entity Sets:
An entity set may be one of two types: a strong entity set or a weak entity set.
13. Strong Entity Set:
• A strong entity set is an entity set that contains sufficient attributes to uniquely identify all its entities.
• In other words, a primary key exists for a strong entity set.
• Primary key of a strong entity set is represented by underlining it.
Symbols Used:
• A single rectangle is used for representing a strong entity set.
• A diamond symbol is used for representing the relationship that exists between two strong entity sets.
• A single line is used for representing the connection of the strong entity set with the relationship set.
• A double line is used for representing the total participation of an entity set with the relationship set.
• Total participation may or may not exist in the relationship.
14. Example
In this ER diagram,
• Two strong entity sets “Student” and “Course” are related to each other.
• Student ID and Student name are the attributes of entity set “Student”.
• Student ID is the primary key using which any student can be identified uniquely.
• Course ID and Course name are the attributes of entity set “Course”.
• Course ID is the primary key using which any course can be identified uniquely.
• Double line between Student and relationship set signifies total participation.
• It suggests that each student must be enrolled in at least one course.
• Single line between Course and relationship set signifies partial participation.
• It suggests that there might exist some courses for which no enrollments are made.
15. Weak Entity Set
• A weak entity set is an entity set that does not contain sufficient attributes to uniquely identify its entities.
• In other words, a primary key does not exist for a weak entity set.
• However, it contains a partial key called a discriminator.
• The discriminator distinguishes among the entities of the weak entity set that depend on the same strong entity.
• The discriminator is represented by underlining it with a dashed line.
NOTE:
• The combination of discriminator and primary key of the strong entity set makes it possible to uniquely identify all
entities of the weak entity set.
• Thus, this combination serves as a primary key for the weak entity set.
• Clearly, this primary key is not formed by the weak entity set completely.
Symbols Used:
• A double rectangle is used for representing a weak entity set.
• A double diamond symbol is used for representing the relationship that exists between the strong and weak entity sets
and this relationship is known as identifying relationship.
• A double line is used for representing the connection of the weak entity set with the relationship set.
• Total participation always exists in the identifying relationship.
16. Example
In this ER diagram,
• One strong entity set “Building” and one weak entity set “Apartment” are related to each other.
• Strong entity set “Building” has building number as its primary key.
• Door number is the discriminator of the weak entity set “Apartment”.
• This is because the door number alone cannot identify an apartment uniquely, as apartments in several other buildings may have the same door number.
• Double line between Apartment and relationship set signifies total participation.
• It suggests that each apartment must be present in at least one building.
• Single line between Building and relationship set signifies partial participation.
• It suggests that there might exist some buildings that have no apartments.
To uniquely identify any apartment,
• First, building number is required to identify the particular building.
• Secondly, door number of the apartment is required to uniquely identify the apartment.
Thus,
Primary key of Apartment = Primary key of Building + Its own discriminator = Building number + Door number
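The composite key above can be sketched as a table definition. This is an illustrative sketch assuming SQLite via Python's sqlite3 module; the column names building_no and door_no and the helper insert_apartment are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the foreign key
conn.executescript("""
CREATE TABLE Building (
    building_no INTEGER PRIMARY KEY            -- strong entity set: has its own key
);
CREATE TABLE Apartment (
    building_no INTEGER NOT NULL REFERENCES Building(building_no),
    door_no     INTEGER NOT NULL,              -- discriminator (partial key)
    PRIMARY KEY (building_no, door_no)         -- owner's key + discriminator
);
""")
conn.executemany("INSERT INTO Building VALUES (?)", [(1,), (2,)])
# The same door number may appear in different buildings.
conn.executemany("INSERT INTO Apartment VALUES (?, ?)", [(1, 101), (2, 101)])


def insert_apartment(building_no, door_no):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Apartment VALUES (?, ?)", (building_no, door_no))
        return True
    except sqlite3.IntegrityError:
        return False
```

The NOT NULL foreign key reflects the total participation of Apartment: an apartment row cannot exist without a building row to belong to.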
18. Attributes
• An entity is represented by a set of attributes, that is, descriptive properties possessed by all
members of an entity set.
• In general, an attribute is a characteristic.
• In a database management system (DBMS), an attribute corresponds to a column (field) of a
table, not to the table itself.
• Each attribute describes one property of the instances (rows) stored in the table.
Example:
customer = (customer-id, customer-name, customer-street, customer-city)
loan = (loan-number, amount)
• Domain – the set of permitted values for each attribute
• Attribute types:
– Simple and composite attributes.
– Single-valued and multi-valued attributes
• E.g. multivalued attribute: phone-numbers
– Derived attributes
• Can be computed from other attributes
• E.g. age, given date of birth
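The attribute types above can be illustrated with a small sketch. The customer record, its field names, and the helper age_on are hypothetical; the point is only that a derived attribute is computed rather than stored, and a multivalued attribute holds several values.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive age in whole years from date of birth (a derived attribute)."""
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical customer entity.
customer = {
    "customer_id": "C-101",                       # simple, single-valued attribute
    "name": {"first": "Ann", "last": "Hayes"},    # composite attribute
    "date_of_birth": date(1990, 6, 15),           # stored; age is derived from it
    "phone_numbers": ["555-0100", "555-0101"],    # multivalued attribute
}

print(age_on(customer["date_of_birth"], date(2024, 6, 15)))
```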
20. Relationship Sets
• A relationship is an association among several entities
Example:
Hayes              depositor           A-102
customer entity    relationship set    account entity
• A relationship set is a mathematical relation among $n \ge 2$ entities, each taken from entity sets
$\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$
where $(e_1, e_2, \ldots, e_n)$ is a relationship
– Example:
(Hayes, A-102) $\in$ depositor
1) Unary relationship set: only one entity set participates in the relationship set.
2) Binary relationship set: two entity sets participate in the relationship set.
3) Ternary relationship set: three entity sets participate in the relationship set.
4) N-ary relationship set: n entity sets participate in the relationship set.
22. Relationship Sets (Cont.)
• An attribute can also be property of a relationship set.
• For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
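A relationship set with its own attribute can be sketched as a mapping from entity pairs to attribute values. The customer names other than Hayes, the account numbers other than A-102, and all dates are made up for illustration.

```python
# Entity sets, each entity identified here by its key value (hypothetical data).
customers = {"Hayes", "Jones", "Smith"}
accounts = {"A-101", "A-102", "A-215"}

# depositor is a subset of customers x accounts. Each pair maps to the
# relationship's own attribute, access-date.
depositor = {
    ("Hayes", "A-102"): "2024-05-24",
    ("Jones", "A-101"): "2024-06-03",
    ("Jones", "A-215"): "2024-06-10",
}


def is_valid_relationship_set(relation, left, right):
    """Every (e1, e2) pair must take e1 from `left` and e2 from `right`."""
    return all(e1 in left and e2 in right for (e1, e2) in relation)


print(is_valid_relationship_set(depositor, customers, accounts))
```

Note that access-date belongs to the pair, not to either entity: Jones has a different access date for each account.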
23. Degree of a Relationship Set
• Refers to number of entity sets that participate in a relationship set.
• Relationship sets that involve two entity sets are binary (or degree
two). Generally, most relationship sets in a database system are
binary.
• Relationship sets may involve more than two entity sets.
– E.g. Suppose employees of a bank may have jobs (responsibilities) at
multiple branches, with different jobs at different branches. Then there is a
ternary relationship set between entity sets employee, job and branch
– Degree of relationship set= number of entity sets participating in a
relationship set
• Relationships between more than two entity sets are rare. Most
relationships are binary.
24. Mapping Cardinalities
• Express the number of entities to which another entity can be
associated via a relationship set.
• Most useful in describing binary relationship sets.
• For a binary relationship set the mapping cardinality must be
one of the following types:
– One to one
– One to many
– Many to one
– Many to many
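The four types can be told apart by counting how many partners each entity has. The function below is a hypothetical helper that classifies one concrete set of (a, b) pairs in the direction A to B. A mapping cardinality is really a rule about all permitted instances, so this only reports what a given instance exhibits.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs, from A to B."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)  # number of B entities each a maps to
    b_counts = Counter(b for _, b in pairs)  # number of A entities each b maps to
    a_many = any(n > 1 for n in a_counts.values())
    b_many = any(n > 1 for n in b_counts.values())
    if not a_many and not b_many:
        return "one to one"
    if a_many and not b_many:
        return "one to many"
    if not a_many and b_many:
        return "many to one"
    return "many to many"


print(mapping_cardinality([("c1", "l1"), ("c1", "l2")]))
```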
25. Mapping Cardinalities
[Figure: one-to-one and one-to-many mappings between sets A and B]
Note: Some elements in A and B may not be mapped to any
elements in the other set
26. Mapping Cardinalities
[Figure: many-to-one and many-to-many mappings between sets A and B]
Note: Some elements in A and B may not be mapped to any
elements in the other set
27. Mapping Cardinalities affect ER Design
• access-date can be made an attribute of account, instead of a
relationship attribute, if each account can have only one customer.
• That is, the relationship from account to customer is many to one,
or equivalently, customer to account is one to many.
28. ER Diagram
• An Entity–relationship model (ER model) describes the structure
of a database with the help of a diagram, which is known as Entity
Relationship Diagram (ER Diagram).
• An ER model is a design or blueprint of a database that can later be
implemented as a database.
• The main components of E-R model are: entity set and relationship
set.
• The ER model is a logical representation of enterprise data. It
is a diagrammatic representation of the logical structure of a database.
• The ER model describes the relationships among entities and their attributes.
• The ER model was first proposed by Peter Chen in 1976.
29. What is an Entity Relationship Diagram (ER Diagram)?
• An ER diagram shows the relationship among entity sets.
• An entity set is a group of similar entities and these entities can
have attributes.
• In DBMS terms, an entity set corresponds to a table and an attribute
to a column of that table, so by showing the relationships among tables and their
attributes, an ER diagram shows the complete logical structure of a
database. Let's have a look at a simple ER diagram to understand
this concept.
30. ER Diagram:
• In the following diagram we have two entities Student and College and their relationship.
• The relationship between Student and College is many to one: a college can have many students,
but a student cannot study in multiple colleges at the same time.
• Student entity has attributes such as Stu_Id, Stu_Name & Stu_Addr and College entity has attributes such
as Col_ID & Col_Name.
31. E-R Diagrams
Rectangle: Represents Entity sets.
Ellipses: Attributes
Double Ellipses: Multivalued Attributes
Dashed Ellipses: Derived Attributes
Diamonds: Relationship Set
Lines: They link attributes to Entity Sets and Entity sets to Relationship Set
Double Rectangles: Weak Entity Sets
Double Lines: Total participation of an entity in a relationship set
33. Cont…
An ER diagram has three main components:
1. Entity
2. Attribute
3. Relationship
1. Entity
• An entity is an object or component of data. An entity is represented as rectangle in an ER
diagram.
For example: In the following ER diagram we have two entities Student and College and
these two entities have many to one relationship as many students study in a single college.
34. Cont…
Weak Entity:
An entity that cannot be uniquely identified by its own attributes and relies on
its relationship with another entity is called a weak entity.
• The weak entity is represented by a double rectangle.
• For example – a bank account cannot be uniquely identified without knowing
the bank to which the account belongs, so bank account is a weak entity.
35. Attribute
• An attribute describes the property of an entity. An attribute is represented as Oval in an ER diagram. There
are four types of attributes:
1. Key attribute
2. Composite attribute
3. Multivalued attribute
4. Derived attribute
1. Key attribute:
• A key attribute can uniquely identify an entity from an entity set.
• For example, student roll number can uniquely identify a student from a set of students.
• A key attribute is drawn as an oval like other attributes, but its text is underlined.
36. Cont…
2. Composite attribute:
• An attribute that is a combination of other attributes is known as composite
attribute.
• For example, In student entity, the student address is a composite attribute as
an address is composed of other attributes such as pin code, state, country.
37. Cont…
3.Multivalued attribute:
• An attribute that can hold multiple values is known as multivalued attribute. It is represented with double ovals in
an ER Diagram.
• For example – A person can have more than one phone number, so the phone number attribute is multivalued.
4. Derived attribute:
• A derived attribute is one whose value is dynamic and derived from another attribute. It is represented by dashed
oval in an ER Diagram.
• For example – Person age is a derived attribute as it changes over time and can be derived from another attribute
(Date of birth).
E-R diagram with multivalued and derived attributes:
38. E-R Diagram With Key ,Composite, Multivalued, and Derived
Attributes
39. Relationship
• A relationship is represented by a diamond shape in an ER diagram. It
shows the relationship among entities.
• There are four types of relationships:
1. One to One
2. One to Many
3. Many to One
4. Many to Many
41. Roles
• Entity sets of a relationship need not be distinct
• The labels “manager” and “worker” are called roles; they specify how employee entities interact via the
works-for relationship set.
• Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles.
• Role labels are optional, and are used to clarify semantics of the relationship
42. Cardinality Constraints
• We express cardinality constraints by drawing either a directed line (→), signifying
“one,” or an undirected line (—), signifying “many,” between the relationship set
and the entity set.
• E.g.: One-to-one relationship:
– A customer is associated with at most one loan via the relationship borrower
– A loan is associated with at most one customer via borrower
43. One-To-Many Relationship
• In a one-to-many relationship, a loan is associated with at most one
customer via borrower, while a customer is associated with several (including 0)
loans via borrower
44. Many-To-One Relationships
• In a many-to-one relationship, a loan is associated with several (including 0)
customers via borrower, while a customer is associated with at most one loan via
borrower
45. Many-To-Many Relationship
• A customer is associated with several (possibly
0) loans via borrower
• A loan is associated with several (possibly 0)
customers via borrower
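A many-to-many relationship such as borrower is commonly stored as its own table holding one row per related pair. The sketch below assumes SQLite through Python's sqlite3 module; the key values, amounts, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY);
CREATE TABLE loan     (loan_number TEXT PRIMARY KEY, amount INTEGER);
-- One row per (customer, loan) pair in the borrower relationship set.
CREATE TABLE borrower (
    customer_id TEXT NOT NULL REFERENCES customer(customer_id),
    loan_number TEXT NOT NULL REFERENCES loan(loan_number),
    PRIMARY KEY (customer_id, loan_number)
);
INSERT INTO customer VALUES ('C-1'), ('C-2'), ('C-3');
INSERT INTO loan VALUES ('L-17', 1000), ('L-23', 2000);
INSERT INTO borrower VALUES ('C-1', 'L-17'), ('C-1', 'L-23'), ('C-2', 'L-17');
""")

# Number of loans per customer, including customers with no loans at all.
loans_per_customer = dict(conn.execute("""
    SELECT c.customer_id, COUNT(b.loan_number)
    FROM customer AS c
    LEFT JOIN borrower AS b ON b.customer_id = c.customer_id
    GROUP BY c.customer_id
"""))
print(loans_per_customer)
```

Adding a UNIQUE constraint on borrower.loan_number would restrict each loan to at most one customer, turning the same table into the one-to-many case described earlier.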
46. Keys
• Keys play an important role in the relational database.
• A key is used to uniquely identify any record or row of data in a table. Keys are also used to establish and
identify relationships between tables.
• For example: In Student table, ID is used as a key because it is unique for each student. In PERSON
table, passport_number, license_number, SSN are keys since they are unique for each person.
48. Cont…
Primary key
• The primary key is the key chosen to identify one and only one instance of an entity uniquely. An entity can have multiple
keys, as we saw in the PERSON table. The most suitable of those keys becomes the primary key.
• In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. We could
equally select License_Number or Passport_Number as the primary key, since they are also unique.
• For each entity, the primary key is selected based on the requirements and the developers' judgment.
49. Cont…
Candidate key
• A candidate key is a minimal attribute or set of attributes that can uniquely identify a tuple.
• The primary key is chosen from the candidate keys. The candidate keys that are not
chosen are as strong as the primary key.
• For example: In the EMPLOYEE table, id is best suited for the primary key. The other
unique attributes, such as SSN, Passport_Number, and License_Number, are also
candidate keys.
50. Cont…
Super Key
• A super key is a set of attributes that can uniquely identify a tuple.
Every super key is a superset of a candidate key.
• For example: In the above EMPLOYEE table, consider (EMPLOYEE_ID,
EMPLOYEE_NAME). The names of two employees can be the same, but
their EMPLOYEE_ID can't be the same. Hence, this combination can also
be a key.
• The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID,
EMPLOYEE_NAME), etc.
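The difference between a super key and a candidate key (a super key with no removable attribute) can be checked mechanically on sample data. The rows and helper functions below are hypothetical. Because keys are properties of the schema, a check against one set of rows can rule a key out but cannot prove it holds for all future data.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows; two employees share a name.
EMPLOYEE = [
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "Ravi", "SSN": "111", "Passport_Number": "P1"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "Ravi", "SSN": "222", "Passport_Number": "P2"},
    {"EMPLOYEE_ID": 3, "EMPLOYEE_NAME": "Mina", "SSN": "333", "Passport_Number": "P3"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: no proper subset is itself a super key."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip any combination that contains an already-found smaller key.
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if is_super_key(rows, combo) and not contains_smaller_key:
                keys.append(combo)
    return keys


print(candidate_keys(EMPLOYEE))
```

Here (EMPLOYEE_ID, EMPLOYEE_NAME) is a super key but not a candidate key, because EMPLOYEE_ID alone already identifies each row.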
51. Cont…
Foreign key
• A foreign key is a column (or set of columns) of a table that points to the primary key of another table.
• In a company, every employee works in a specific department, and employee and department are two different
entities. So we should not store the department's information in the employee table. Instead, we link the two
tables through the primary key of one of them.
• We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
• Now in the EMPLOYEE table, Department_Id is the foreign key, and the two tables are related.
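The EMPLOYEE and DEPARTMENT link can be sketched as follows, assuming SQLite via Python's sqlite3 module. The columns other than Department_Id, the sample rows, and the helper add_employee are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Department_Id   INTEGER PRIMARY KEY,
    Department_Name TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    Employee_Id   INTEGER PRIMARY KEY,
    Employee_Name TEXT NOT NULL,
    Department_Id INTEGER REFERENCES DEPARTMENT(Department_Id)  -- foreign key
);
INSERT INTO DEPARTMENT VALUES (10, 'Sales');
INSERT INTO EMPLOYEE VALUES (1, 'Ravi', 10);
""")


def add_employee(emp_id, name, dept_id):
    """Return 'ok' if the row is inserted, else the constraint error message."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (emp_id, name, dept_id))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Department details are looked up through the foreign key, not duplicated.
rows = conn.execute("""
    SELECT e.Employee_Name, d.Department_Name
    FROM EMPLOYEE AS e
    JOIN DEPARTMENT AS d ON d.Department_Id = e.Department_Id
""").fetchall()
print(rows)
```

Calling add_employee with a Department_Id that does not exist in DEPARTMENT is rejected, which is how the foreign key keeps the two tables consistent.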
52. Constraints
• Constraints are rules used to limit the type of data that can go into a table, to maintain the accuracy and
integrity of the data inside table.
• Constraints can be divided into the following two types,
Column level constraints: Limits only column data.
Table level constraints: Limits whole table data.
• Constraints are used to make sure that the integrity of data is maintained in the database. Following are the
most used constraints that can be applied to a table.
NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints – PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints
53. Cont…
NOT NULL Constraint
• By default, a column can hold NULL values. If you do not want a column to have a NULL value, use the NOT NULL constraint.
• It restricts a column from having a NULL value.
• We use ALTER statement and MODIFY statement to specify this constraint.
• One important point to note about this constraint is that it cannot be defined at table level.
Example using NOT NULL constraint:
CREATE TABLE Student ( s_id int NOT NULL, name varchar(60), age int );
• The above query will declare that the s_id field of Student table will not take NULL value.
• If you wish to alter the table after it has been created, then we can use the ALTER command for it:
ALTER TABLE Student MODIFY s_id int NOT NULL;
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);
54. Cont…
UNIQUE Constraint
• It ensures that a column will only have unique values. A UNIQUE constraint field cannot have any duplicate data.
• It prevents two records from having identical values in a column
Example of UNIQUE Constraint:
• Here we have a simple CREATE query to create a table, which will have a column s_id with unique values.
CREATE TABLE Student ( s_id int NOT NULL UNIQUE, name varchar(60), age int );
The above query declares that the s_id field of the Student table will only have unique values and won't take NULL values.
• If you wish to alter the table after it has been created, you can use the ALTER command:
ALTER TABLE Student MODIFY s_id INT NOT NULL UNIQUE;
• The above query specifies that the s_id field of the Student table will only have unique values.
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
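How NOT NULL and UNIQUE reject rows can be seen in a short run. This is a sketch assuming SQLite via Python's sqlite3 module (error messages and some syntax differ in MySQL), with both constraints placed on s_id and with hypothetical sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Student (
    s_id INTEGER NOT NULL UNIQUE,
    name VARCHAR(60),
    age  INTEGER
)
""")


def try_insert(s_id, name, age):
    """Return None on success, or the constraint error message on failure."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)", (s_id, name, age))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


first = try_insert(1, "Asha", 20)        # accepted
duplicate = try_insert(1, "Ben", 21)     # rejected: s_id 1 already exists (UNIQUE)
missing = try_insert(None, "Cara", 22)   # rejected: s_id is NULL (NOT NULL)
print(first, duplicate, missing)
```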
55. Cont…
DEFAULT:
• The DEFAULT constraint provides a default value to a column when there is
no value provided while inserting a record into a table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
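The effect of DEFAULT shows up when a column is left out of an INSERT. The sketch below runs the table definition above on SQLite via Python's sqlite3 module; the inserted rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE STUDENT (
    ROLL_NO     INT NOT NULL,
    STU_NAME    VARCHAR(35) NOT NULL,
    STU_AGE     INT NOT NULL,
    EXAM_FEE    INT DEFAULT 10000,
    STU_ADDRESS VARCHAR(35),
    PRIMARY KEY (ROLL_NO)
)
""")

# EXAM_FEE is omitted in the first insert, so the default value is stored.
conn.execute(
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1001, 'Asha', 20)"
)
# An explicit value overrides the default.
conn.execute(
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE, EXAM_FEE) "
    "VALUES (1002, 'Ben', 21, 7500)"
)

fees = dict(conn.execute("SELECT ROLL_NO, EXAM_FEE FROM STUDENT"))
print(fees)
```

The default applies only when the column is omitted. An explicitly inserted NULL is stored as NULL.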
56. Cont…
CHECK:
• This constraint is used for specifying range of values for a particular column of a table. When this constraint
is being set on a column, it ensures that the specified column must have the value falling in the specified
range.
• In other words, CHECK restricts the values of a column to a range. The check is performed on the
values before they are stored in the database. It's like condition checking before saving data into a column.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
• In the above example we have set the check constraint on ROLL_NO column of STUDENT table. Now, the
ROLL_NO field must have the value greater than 1000.
57. Cont…
Using a CHECK constraint when creating the table (declared inline with the column)
CREATE table Student( s_id int NOT NULL CHECK(s_id > 0), Name
varchar(60) NOT NULL, Age int );
The above query restricts the s_id value to be greater than zero.
Adding a CHECK constraint to an existing table
ALTER table Student ADD CHECK(s_id > 0);
58. Key Constraints:
PRIMARY KEY:
• Primary key uniquely identifies each record in a table.
• A Primary Key must contain unique value and it must not contain null value. Usually
Primary Key is used to index the data inside the table.
• In the below example the ROLL_NO field is marked as primary key, that means the
ROLL_NO field cannot have duplicate and null values.
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
59. Cont…
PRIMARY KEY constraint when creating the table (declared inline with the column)
• CREATE table Student ( s_id int PRIMARY KEY, Name varchar(60) NOT NULL, Age
int);
• The above command creates a PRIMARY KEY on s_id.
Adding a PRIMARY KEY constraint to an existing table
• ALTER table Student ADD PRIMARY KEY (s_id);
• The above command creates a PRIMARY KEY on s_id.
60. Cont…
Foreign Key Constraint
• A foreign key is used to relate two tables. The relationship matches the primary key in one of the tables with a foreign
key in the second table. The foreign key is also called a referencing key.
• Foreign keys are the columns of a table that point to the primary key (or another candidate key) of another table. They act as a cross-reference between tables.
• The ALTER statement with ADD can be used to specify this constraint on an existing table.
To understand FOREIGN KEY, let's see its use, with help of the below tables:
• Customer_Detail Table
• Order_Detail Table
• In the Customer_Detail table, c_id is the primary key, and it is set as a foreign key in the Order_Detail table. Any value entered in c_id of
Order_Detail must already be present in c_id of Customer_Detail. This prevents invalid data from
being inserted into the c_id column of the Order_Detail table.
• If you try to insert a value that violates this rule, the DBMS returns an error and does not insert the data.
61. Cont…
FOREIGN KEY constraint at Table Level
• CREATE table Order_Detail( order_id int PRIMARY KEY, order_name
varchar(60) NOT NULL, c_id int, FOREIGN KEY (c_id) REFERENCES
Customer_Detail(c_id) );
• In this query, c_id in table Order_Detail is made a foreign key,
which references the c_id column in the Customer_Detail table.
Adding a FOREIGN KEY constraint to an existing table
• ALTER table Order_Detail ADD FOREIGN KEY (c_id) REFERENCES
Customer_Detail(c_id);
62. Cont…
Domain constraints:
• Each table has a certain set of columns, and each column allows only one type of
data, based on its data type. The column does not accept values of any other
data type.
• A domain constraint can be thought of as a user-defined data type, and we can define it like
this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE /
PRIMARY KEY / FOREIGN KEY / CHECK / DEFAULT)
64. Advantages of ER Model
• Conceptually it is very simple: If we know the relationships between entities and
their attributes, we can easily draw an ER diagram.
• Better visual representation: The ER model is a diagrammatic representation of the logical
structure of a database. By looking at an ER diagram, we can easily understand the relationships
among entities.
• Effective communication tool: It is an effective communication tool for database designers.
• Highly integrated with relational model: An ER model can be easily converted into the relational
model by converting its entity sets and relationship sets into tables.
• Easy conversion to any data model: An ER model can be easily converted into another data
model, such as the hierarchical data model or the network data model.
65. Disadvantages of ER Model
• Limited constraints and specification
• Loss of information content: Some information may be lost or hidden in the ER model.
• Limited relationship representation: The ER model represents relationships in a limited way
compared with other data models, such as the relational model.
• No representation of data manipulation: It is difficult to show data manipulation in the ER
model.
• Mainly suited to high-level design: The ER model is popular chiefly for high-level design.
• No industry standard for notation
66. Mapping Cardinality
• Whenever an attribute of one entity type refers to another entity type, then some
relationship exists between them.
• The attribute Manager of the department refers to an employee who manages the
department.
• In the ER model, these references are represented as relationships.
67. Cont…
• Relationship :
The relationship in the ER model is represented using a diamond-shaped box.
• ‘Buys’ is a relationship between the customer entity and products. This relationship can be read as ‘A customer
buys a product or products.’
• Therefore, a relationship is a way to connect multiple entities.
• When a customer buys a product, there is a timestamp associated with it, so the attribute “Time” will be an
attribute of ‘Buys’.
68. Cont…
• According to the Relation/table perspective or relational model :
It can be represented as
69. Cont…
• Mapping cardinality/cardinality ratio :
Mapping cardinality is the maximum number of relationship instances in which an entity can participate.
• Mathematically, here (e1, e2,e3…) are instances of an entity set Employee and (d1,d2, d3 ….) are the
instances of entity type department and (r1, r2, r3 …) are relationship instances of relationship type.
• Each instance ri(where i = 1,2,3,….) in R, is an association of entities, and the association includes exactly
one entity from each participating entity type. Each such relationship instance, ri represents that the entities
participating in ri are related in some way by any constraint/condition provided by the user to a designer.
70. Participation or existence constraint
• It represents the minimum number of relationship instances in which each entity can participate, and it is also
called the minimum cardinality constraint. There are two types of participation constraints: total
and partial.
Example:
• In the above example, suppose the company policy is that every employee must work for a department. Then every
entity in the employee entity set must be related to a department entity through the works_for relationship.
Therefore, the participation of the employee entity type in works_for is total. Total participation is also
called existence dependency.
• If, on the other hand, a new department need not have employees, then some entities in the department
entity set may not be related to any employee entity through works_for. Therefore, the participation of
the department entity type in this relationship (works_for) is partial.
• In the ER diagram, total participation is represented by a double line connecting the participating entity
type to the relationship, and a single line is used for partial participation.
• The cardinality ratio and the participation constraint together are called the structural constraints of the relationship type.
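One possible way to carry these constraints into a relational schema is sketched below. It assumes that works_for is stored as a foreign key column on the employee table: a NOT NULL foreign key then forces every employee to have a department (total participation), while nothing forces a department to have employees (partial participation). All names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        -- works_for: NOT NULL makes employee participation total
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
""")
conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO department VALUES (2, 'New Dept')")  # no employees yet: allowed
conn.execute("INSERT INTO employee VALUES (100, 'Ravi', 1)")

# An employee without a department violates total participation.
try:
    conn.execute("INSERT INTO employee VALUES (101, 'Meena', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Departments that take no part in works_for (partial participation).
empty_departments = conn.execute("""
    SELECT name FROM department
    WHERE dept_id NOT IN (SELECT dept_id FROM employee)
""").fetchall()
print(rejected, empty_departments)
```

Total participation on the department side (every department has at least one employee) cannot be expressed with a simple column constraint and would need application logic or triggers.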
71. Cardinality ratios for binary relationships
1. One to one relationship (1:1):
• It is represented using an arrow (⇢, ⇠). (Many notations are possible for ER diagrams.)
Example:
• In this ER diagram, both entities, customer and driving license, have an arrow, which means that each
participates in the relationship “has a” in a one-to-one fashion. It can be read as ‘Each customer has
exactly one driving license and every driving license is associated with exactly one customer’.
• E.g.: There may be customers who do not have a credit card, but every credit card is associated with exactly one
customer. Therefore, the entity credit card has total participation in the relationship, while the participation of customer is partial.
72. Cont…
2. One to many relationship (1:M) :
Example:
• This relationship is one to many because some employees manage
more than one team, while each team is managed by only one
manager.
73. Cont…
3. Many to one relationship (M:1) :
Example :
• It is the same kind of relationship as one-to-many; the difference is only one of perspective.
Any number of credit cards can belong to a customer, and there might be some customers who do not have any
credit card, but every credit card in the system has to be associated with a customer (i.e. total participation),
and a single credit card cannot belong to multiple customers.
74. Cont…
4. Many to many relationship (M:N) :
Example:
A customer can buy any number of products and a product can be bought by many customers.
• Any of the four cardinality ratios of a binary relationship can have both sides partial, both sides total, or one side
partial and the other total, depending on the constraints specified by the user requirements.
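The cardinality ratios above are commonly mapped to tables as follows: a UNIQUE foreign key for 1:1, a plain foreign key on the "many" side for 1:M and M:1, and a separate junction table for M:N. The sketch below follows that convention with hypothetical table and column names; it is one possible mapping, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY);
    -- 1:1  UNIQUE foreign key: at most one license per customer
    CREATE TABLE driving_license (
        license_no  TEXT PRIMARY KEY,
        customer_id INTEGER NOT NULL UNIQUE REFERENCES customer(customer_id)
    );
    -- M:1  foreign key on the "many" side: many cards, one customer each
    CREATE TABLE credit_card (
        card_no     TEXT PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    -- M:N  junction table: one row per (customer, product) pair
    CREATE TABLE buys (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        product_id  INTEGER NOT NULL REFERENCES product(product_id),
        PRIMARY KEY (customer_id, product_id)
    );
""")
conn.execute("INSERT INTO customer VALUES (1)")
conn.execute("INSERT INTO driving_license VALUES ('DL-1', 1)")
conn.executemany("INSERT INTO credit_card VALUES (?, ?)", [("CC-1", 1), ("CC-2", 1)])

# A second license for the same customer breaks the 1:1 ratio.
try:
    conn.execute("INSERT INTO driving_license VALUES ('DL-2', 1)")
    second_license_rejected = False
except sqlite3.IntegrityError:
    second_license_rejected = True

# Two cards for one customer are fine under M:1.
cards = conn.execute(
    "SELECT COUNT(*) FROM credit_card WHERE customer_id = 1"
).fetchone()[0]
print(second_license_rejected, cards)
```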
75. Extended (or) Enhanced Entity-Relationship (EE-R) Model
• EER is a high-level data model that incorporates extensions to the original ER
model.
• Enhanced ER diagrams are high-level models that represent the requirements and complexities
of a complex database.
– Created to design more accurate database schemas
– Reflect the data properties and constraints more precisely
– Support more complex requirements than traditional applications
• The EER model includes all modeling concepts of the ER model.
• In addition to ER model concepts, EE-R includes:
– Subclasses and superclasses
– Specialization and generalization
– Attribute and relationship inheritance
– Category or union type
– Aggregation
• These concepts are used to create EE-R diagrams.
76. Subclasses and Super class
• A superclass is an entity type that can be divided into further subtypes.
• For example, consider a Shape superclass.
• The superclass Shape has the subgroups Triangle, Square and Circle.
• Subclasses are groups of entities with some unique attributes. A subclass inherits the properties and attributes of its superclass.
• An entity type may have additional meaningful subgroupings of its entities
– Example: EMPLOYEE may be further grouped into:
• SECRETARY, ENGINEER, TECHNICIAN, … (based on the EMPLOYEE’s job)
• MANAGER (EMPLOYEEs who are managers)
• SALARIED_EMPLOYEE, HOURLY_EMPLOYEE (based on the EMPLOYEE’s method of pay)
• EER diagrams extend ER diagrams to represent these additional subgroupings, called subclasses or subtypes.
78. Subclasses and Superclasses
• Each of these subgroupings is a subset of EMPLOYEE entities
• Each is called a subclass of EMPLOYEE
• EMPLOYEE is the superclass for each of these subclasses
• These are called superclass/subclass relationships:
– EMPLOYEE/SECRETARY
– EMPLOYEE/TECHNICIAN
– EMPLOYEE/MANAGER
– …
79. Subclasses and Superclasses
• These are also called IS-A relationships
– SECRETARY IS-A EMPLOYEE, TECHNICIAN IS-A EMPLOYEE, ….
• Note: An entity that is a member of a subclass represents the same real-world
entity as some member of the superclass:
– The subclass member is the same entity in a distinct specific role
– An entity cannot exist in the database merely by being a member of a
subclass; it must also be a member of the superclass
– A member of the superclass can be optionally included as a member of any
number of its subclasses
80. Subclasses and Superclasses
• Examples:
– A salaried employee who is also an engineer belongs to the two subclasses:
• ENGINEER, and
• SALARIED_EMPLOYEE
– A salaried employee who is also an engineering manager belongs to the
three subclasses:
• MANAGER,
• ENGINEER, and
• SALARIED_EMPLOYEE
• It is not necessary that every entity in a superclass be a member of some
subclass
81. Attribute Inheritance in Superclass / Subclass Relationships
• An entity that is a member of a subclass inherits
– All attributes of the entity as a member of the superclass
– All relationships of the entity as a member of the superclass
• Example:
– In the previous slide, SECRETARY (as well as TECHNICIAN and
ENGINEER) inherit the attributes Name, SSN, …, from EMPLOYEE
– Every SECRETARY entity will have values for the inherited attributes
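Attribute inheritance can be mimicked with class inheritance in a programming language. The sketch below is only an analogy, not part of the EER notation: it assumes EMPLOYEE has the attributes Name and SSN and gives SECRETARY one attribute of its own; the sample values are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    # Attributes defined on the superclass.
    name: str
    ssn: str


@dataclass
class Secretary(Employee):
    # Attribute that exists only for this subclass.
    typing_speed: int


# Every SECRETARY entity must supply values for the inherited attributes too.
sec = Secretary(name="Anita", ssn="123-45-6789", typing_speed=70)

attribute_names = [f.name for f in fields(Secretary)]
print(attribute_names)            # inherited attributes come first
print(isinstance(sec, Employee))  # a SECRETARY IS-A EMPLOYEE
```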
82. Specialization
• Specialization is the process of defining a set of subclasses of a
superclass
• The set of subclasses is based upon some distinguishing
characteristics of the entities in the superclass
–Example: {SECRETARY, ENGINEER, TECHNICIAN} is a
specialization of EMPLOYEE based upon job type.
• May have several specializations of the same superclass
84. Specialization
• Example: Another specialization of EMPLOYEE based on method of pay is
{SALARIED_EMPLOYEE, HOURLY_EMPLOYEE}.
– Superclass/subclass relationships and specialization can be
diagrammatically represented in EER diagrams
– Attributes of a subclass are called specific or local attributes.
• For example, the attribute TypingSpeed of SECRETARY
– The subclass can also participate in specific relationship types.
• For example, a relationship BELONGS_TO of HOURLY_EMPLOYEE
85. Specialization and Generalization
• Generalization is the process of combining several entity types that share common attributes or properties into a single, more general entity type.
• It is a bottom-up process. For example, the three sub-entities Car, Truck and Motorcycle can be
generalized into one superclass named Vehicle.
• Specialization
– Process of defining a set of subclasses or identifying subsets of an entity type.
– Defined on the basis of some distinguishing characteristic of the entities in the superclass
• Specialization is a top-down approach in which one entity type is broken down into lower-level entity types.
• In the above example, the Vehicle entity can be specialized into Car, Truck or Motorcycle.
• Subclass can define:
– Specific attributes
– Specific relationship types
87. Generalization
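• Reverse of the specialization process: differences among several entity types are suppressed and their common features are identified
• Generalize into a single superclass
– Original entity types become special subclasses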
• Generalization
– Process of defining a generalized entity type from the given entity types
88. Constraints on Specialization and Generalization
• If we can determine exactly those entities that will become
members of each subclass by a condition, the subclasses are
called predicate-defined (or condition-defined) subclasses
–Condition is a constraint that determines subclass members
–Display a predicate-defined subclass by writing the predicate
condition next to the line attaching the subclass to its
superclass
89. Constraints on Specialization and Generalization
• If all subclasses in a specialization have their membership condition on the same
attribute of the superclass, the specialization is called an attribute-defined
specialization
– The attribute is called the defining attribute of the specialization
– Example: JobType is the defining attribute of the specialization
{SECRETARY, TECHNICIAN, ENGINEER} of EMPLOYEE
• If no condition determines membership, the subclass is called user-defined
– Membership in a subclass is determined by the database users by applying
an operation to add an entity to the subclass
– Membership in the subclass is specified individually for each entity in the
superclass by the user
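The difference between attribute-defined and user-defined subclasses can be sketched in a few lines. In the sketch, membership in the job-based subclasses is computed from a defining attribute, while membership in a user-defined subclass is assigned entity by entity. The employee records and the subclass name project_leads are hypothetical.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE entities; names and values are made up for illustration.
employees = [
    {"name": "Anita", "job_type": "Secretary"},
    {"name": "Bilal", "job_type": "Engineer"},
    {"name": "Chen", "job_type": "Technician"},
    {"name": "Divya", "job_type": "Engineer"},
]


def attribute_defined_specialization(entities, defining_attribute):
    """Group entities into subclasses by the value of one defining attribute."""
    subclasses = defaultdict(list)
    for entity in entities:
        subclasses[entity[defining_attribute]].append(entity["name"])
    return dict(subclasses)


# Attribute-defined: the predicate job_type = <value> decides membership.
by_job = attribute_defined_specialization(employees, "job_type")
print(by_job)

# User-defined: no condition exists, so a user adds members one at a time.
project_leads = set()
project_leads.add("Divya")
print(project_leads)
```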
91. Constraints on Specialization and Generalization
• Two basic constraints can apply to a specialization/generalization:
– Disjointness Constraint.
– Completeness Constraint.
• Disjointness Constraint:
– Specifies that the subclasses of the specialization must be disjoint:
• an entity can be a member of at most one of the subclasses of the specialization
– Specified by d in EER diagram
– If not disjoint, specialization is overlapping:
• that is the same entity may be a member of more than one subclass of the
specialization
– Specified by o in EER diagram
92. Constraints on Specialization and Generalization
• Completeness Constraint:
– Total specifies that every entity in the superclass must be a member of some subclass in
the specialization/generalization
– Shown in EER diagrams by a double line
– Partial allows an entity not to belong to any of the subclasses
– Shown in EER diagrams by a single line
• Hence, we have four types of specialization/generalization:
– Disjoint, total
– Disjoint, partial
– Overlapping, total
– Overlapping, partial
• Note: Generalization usually is total because the superclass is derived from the subclasses.
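The two constraints can be checked mechanically once the superclass and its subclasses are viewed as sets of entities. The function below is a minimal sketch under that assumption; the entity identifiers e1 to e4 and the subclass contents are invented for the example.

```python
def classify_specialization(superclass, subclasses):
    """Return (disjointness, completeness) for a specialization.

    superclass: set of entity ids
    subclasses: dict mapping subclass name -> set of entity ids
    """
    members = [set(s) for s in subclasses.values()]
    union = set().union(*members)
    if not union <= set(superclass):
        raise ValueError("every subclass member must also be in the superclass")
    # Disjoint: no entity is counted twice across subclasses.
    disjoint = sum(len(m) for m in members) == len(union)
    # Total: every superclass entity belongs to some subclass.
    total = union == set(superclass)
    return ("disjoint" if disjoint else "overlapping",
            "total" if total else "partial")


employees = {"e1", "e2", "e3", "e4"}

by_job = {"SECRETARY": {"e1"}, "ENGINEER": {"e2", "e3"}}
by_pay = {"SALARIED_EMPLOYEE": {"e1", "e2"}, "HOURLY_EMPLOYEE": {"e3", "e4"}}
by_role = {"ENGINEER": {"e2", "e3"}, "MANAGER": {"e3"}}

print(classify_specialization(employees, by_job))
print(classify_specialization(employees, by_pay))
print(classify_specialization(employees, by_role))
```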
95. Specialization/Generalization Hierarchies, Lattices & Shared Subclasses
• A subclass may itself have further subclasses specified on it
– forms a hierarchy or a lattice
• Hierarchy has a constraint that every subclass has only one superclass (called single inheritance); this is
basically a tree structure
• In a lattice, a subclass can be subclass of more than one superclass (called multiple inheritance)
• Example of a shared subclass: “Engineering_Manager”
96. Specialization/Generalization Hierarchies, Lattices & Shared Subclasses
• In a lattice or hierarchy, a subclass inherits attributes not only of its direct superclass, but also of all its
predecessor superclasses
• A subclass with more than one superclass is called a shared subclass (multiple inheritance)
• We can have:
– specialization hierarchies or lattices, or
– generalization hierarchies or lattices,
depending on how they were derived
• In specialization, start with an entity type and then define subclasses of the entity type by successive
specialization
– called a top down conceptual refinement process
• In generalization, start with many entity types and generalize those that have common properties
– Called a bottom up conceptual synthesis process
• In practice, a combination of both processes is usually employed
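A shared subclass behaves much like multiple inheritance in a programming language. As an analogy only, the sketch below models Engineering_Manager as a class with three direct superclasses and collects the attributes it inherits from all of its predecessors. The attribute names engineer_type and salary are assumptions made for the example.

```python
class Employee:
    local_attributes = ("name", "ssn")


class Engineer(Employee):
    local_attributes = ("engineer_type",)


class Manager(Employee):
    local_attributes = ()


class SalariedEmployee(Employee):
    local_attributes = ("salary",)


# Shared subclass: more than one direct superclass, so this is a lattice.
class EngineeringManager(Engineer, Manager, SalariedEmployee):
    local_attributes = ()


def inherited_attributes(cls):
    """Collect local attributes of cls and of all its predecessor classes."""
    names = []
    for klass in reversed(cls.__mro__):
        # vars() looks only at attributes defined directly on each class.
        for attr in vars(klass).get("local_attributes", ()):
            if attr not in names:
                names.append(attr)
    return names


print(inherited_attributes(EngineeringManager))
print(len(EngineeringManager.__bases__))  # number of direct superclasses
```

Although Employee is reachable through three paths, its attributes are collected only once.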
98. Category or Union
• A category (union type) is a subclass related to more than one
superclass; it represents a subset of the union of those superclasses.
• For example, Owner is a subset of the union of two superclasses: Vehicle and House.
99. Aggregation
• Represents a relationship between a whole object and its components.
• Consider a ternary relationship Works_On among Employee, Branch and Manager. A good way to model this
situation is to use aggregation: the relationship set Works_On is treated as a higher-level entity set,
in the same manner as any other entity set. We can then create a binary relationship, Manages, between Works_On
and Manager to represent who manages which tasks.
100. ER Design Issues
1. Choosing Entity Set vs Attributes
• Choosing between an entity set and an attribute can change the
semantics of the whole ER design.
• To understand this, let’s take an example: say we have an entity
set Student with attributes such as student-name and student-id.
• Alternatively, student-id itself can be modeled as an entity set with
attributes such as student-class and student-section.
• Comparing the two cases: in the first case a student can have only
one student id, whereas in the second case, where student id is an
entity set, a student can have more than one student id.
101. Cont…
2. Choosing Entity Set vs. Relationship Sets
• It is often hard to decide whether an object is best represented by an entity set or by a relationship set.
• To choose between the two (entity vs relationship), the designer needs to consider whether the
object would need a new relationship if a requirement arises in the future. If so, it is
better to choose an entity set rather than a relationship set.
• Let’s take an example: a person takes a loan from a bank. Here we have two entities, person
and bank, and their relationship is loan. This is fine until there is a need to disburse a
joint loan. In that case a new relationship would have to be created to relate the
individuals who have taken the joint loan. In this scenario, it is better to model loan as
an entity set rather than a relationship set.
102. Cont…
3. Choosing Binary vs n-ary Relationship Sets
• In most cases, the relationships described in an ER diagram are binary. An n-ary
relationship is one that involves more than two entity sets; if only two entity sets are
involved, the relationship is binary.
• N-ary relationships can make an ER design complex; however, an n-ary relationship can be
converted into and represented by multiple binary relationships.
• For example, say we have to describe a relationship among four family members: father,
mother, son and daughter. This can be represented by multiple binary relationships:
the father-mother relationship as “spouse”, the son-daughter relationship as
“siblings”, and the relationship of the father and mother with each child as “child”.
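The family example can be written out as data to make the decomposition concrete. In this sketch one 4-ary relationship instance is split into three binary relationships, each stored as a set of pairs. The person names are invented, and the decomposition follows only the spouse, siblings and child split described above.

```python
# One instance of a hypothetical 4-ary "family" relationship.
family = {"father": "Raj", "mother": "Sita", "son": "Arun", "daughter": "Diya"}


def to_binary_relationships(f):
    """Split one family instance into binary relationships (sets of pairs)."""
    parents = (f["father"], f["mother"])
    children = (f["son"], f["daughter"])
    return {
        "spouse": {(f["father"], f["mother"])},
        "siblings": {(f["son"], f["daughter"])},
        # Each parent is related to each child.
        "child": {(p, c) for p in parents for c in children},
    }


binary = to_binary_relationships(family)
for name, pairs in binary.items():
    print(name, sorted(pairs))
```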
103. Cont…
4. Placing Relationship Attributes
• The cardinality ratio can help us determine where to place
relationship attributes. It is recommended to associate the attributes
of a one-to-one or one-to-many relationship set with one of the
participating entity sets rather than with the relationship set.
• If, however, an attribute cannot be determined by a single entity
set but only by the combination of participating entity sets, it is
better to associate the attribute with the relationship set. This is
the case for many-to-many relationship sets.
104. Relational Model concept
• The relational model represents data as a table with columns and rows. Each row is
known as a tuple. Each column of the table has a name, called an attribute.
• Domain: A set of atomic values that an attribute can take.
• Attribute: The name of a column in a particular table. Each attribute Ai
must have a domain, dom(Ai).
• Relational instance: In a relational database system, a relation instance is
a finite set of tuples. Relation instances do not have duplicate tuples.
• Relational schema: A relational schema contains the name of the relation and the names
of all its columns or attributes.
• Relational key: A set of one or more attributes whose values
identify each row in the relation uniquely.
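A key can be tested against a given relation instance by checking that no two rows agree on all key attributes. The helper below is a minimal sketch with made-up rows. Passing the check shows only that the attributes are unique in this instance, not that they are a key of the schema in general.

```python
def is_unique_in_instance(rows, key_attrs):
    """Return True if no two rows share the same values for key_attrs."""
    seen = set()
    for row in rows:
        key_value = tuple(row[a] for a in key_attrs)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True


# Hypothetical relation instance; the values are illustrative only.
students = [
    {"NAME": "Ram", "ROLL_NO": 1, "AGE": 20},
    {"NAME": "Ram", "ROLL_NO": 2, "AGE": 21},
    {"NAME": "Sita", "ROLL_NO": 3, "AGE": 20},
]

print(is_unique_in_instance(students, ["ROLL_NO"]))      # candidate key
print(is_unique_in_instance(students, ["NAME"]))         # two rows share a name
print(is_unique_in_instance(students, ["NAME", "AGE"]))  # composite candidate
```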
105. Example: STUDENT Relation
• In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
• The instance of schema STUDENT has 5 tuples.
• t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
106. Properties of Relations
• The name of the relation is distinct from all other relations.
• Each relation cell contains exactly one atomic (single) value
• Each attribute has a distinct name
• The order of attributes has no significance
• No two tuples are duplicates
• The order of tuples has no significance
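Two of these properties, no duplicate tuples and no significance of tuple order, follow directly if a relation instance is treated as a mathematical set. The sketch below uses a Python set of named tuples to show this. The tuple t3 is taken from the STUDENT example, while the second tuple is invented for illustration.

```python
from typing import NamedTuple


class Student(NamedTuple):
    name: str
    roll_no: int
    phone_no: int
    address: str
    age: int


t3 = Student("Laxman", 33289, 8583287182, "Gurugram", 20)
other = Student("Ram", 14795, 7305758992, "Noida", 24)  # hypothetical tuple

# Inserting t3 twice has no effect: a set keeps no duplicate tuples.
relation_a = {t3, other, t3}
# Listing the tuples in a different order gives the same relation.
relation_b = {other, t3}

print(len(relation_a))
print(relation_a == relation_b)
```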