1. D.GAYA, Assistant Professor, Department of Computer Science, PUCC.
Pondicherry University Community College
Department of Computer Science
Course : B.Voc [Software Development]
Year : II
Semester : III
Subject : Relational DataBase Management System
Unit II Study Material
Prepared by
D.GAYA
Assistant Professor,
Department of Computer Science,
Pondicherry University Community College,
Lawspet, Puducherry-08.
Unit-II
Database Management Systems – Tree Structures – Plex Structures – Data Description
Languages. Relational Databases – Third Normal Form – Canonical Data structures –
Varieties of data independences.
Introduction
Database Management System (DBMS) refers to the technology solution used to
optimize and manage the storage and retrieval of data from databases. DBMS offers a
systematic approach to manage databases via an interface for users as well as workloads
accessing the databases via apps.
The management responsibilities for DBMS encompass information within the
databases, the processes applied to databases (such as access and modification), and the
database's logical structure.
DBMS also facilitates additional administrative operations such as change
management, disaster recovery, compliance, and performance monitoring, among others.
In order to facilitate these functions, DBMS has the following key components:
• Software. DBMS is primarily a software system that can be considered as a
management console or an interface to interact with and manage databases. The interfacing
also spreads across real-world physical systems that contribute data to the backend databases.
The OS, networking software, and the hardware infrastructure are involved in creating,
accessing, managing, and processing the databases.
• Data. DBMS contains operational data, access to database records and metadata as a
resource to perform the necessary functionality. The data may include files such as
index files, administrative information, and data dictionaries used to represent data
flows, ownership, structure, and relationships to other records or objects.
• Procedures. While not a part of the DBMS software, procedures can be considered as
instructions on using DBMS. The documented guidelines assist users in designing,
modifying, managing, and processing databases.
• Database languages. These are components of the DBMS used to access, modify,
store, and retrieve data items from databases; specify database schema; control user
access; and perform other associated database management operations. Types of
DBMS languages include Data Definition Language (DDL), Data Manipulation
Language (DML), Database Access Language (DAL) and Data Control Language
(DCL).
• Query processor. As a fundamental component of the DBMS, the query processor
acts as an intermediary between users and the DBMS data engine in order to
communicate query requests. When users enter an instruction in SQL language, the
command is translated from the high-level language instruction to a low-level
language that the underlying machine can understand and process to perform the
appropriate DBMS functionality. In addition to instruction parsing and translation, the
query processor also optimizes queries to ensure fast processing and accurate results.
• Runtime database manager. A centralized management component of DBMS that
handles functionality associated with runtime data, which is commonly used for
context-based database access. This component checks for user authorization to
request the query; processes the approved queries; devises an optimal strategy for
query execution; supports concurrency so that multiple users can simultaneously work
on the same databases; and ensures integrity of data recorded into the databases.
• Database manager. Unlike the runtime database manager that handles queries and
data at runtime, the database manager performs DBMS functionality associated with
the data within databases. The database manager provides a set of commands to perform
different DBMS operations that include creating, deleting, backing up, restoring, cloning,
and other database maintenance tasks. The database manager may also be used to
update the database with patches from vendors.
• Database engine. This is the core software component within the DBMS solution that
performs the core functions associated with data storage and retrieval. A database
engine is also accessible via APIs that allow users or apps to create, read, write, and
delete records in databases.
• Reporting. The report generator extracts useful information from DBMS files and
displays it in structured format based on defined specifications. This information may
be used for further analysis, decision making, or business intelligence.
Benefits of DBMS
DBMS was designed to solve the fundamental problems associated with storing,
managing, accessing, securing, and auditing data in traditional file systems. Traditional
database applications were developed on top of the databases, which led to challenges such as
data redundancy, isolation, integrity constraints, and difficulty managing data access. A layer
of abstraction was required between users or apps and the databases at a physical and logical
level.
Introducing DBMS software to manage databases results in the following benefits:
• Data security. DBMS allows organizations to enforce policies that enable
compliance and security. The databases are available for appropriate users according
to organizational policies. The DBMS is also responsible for maintaining optimum
performance of querying operations while ensuring the validity, security and
consistency of data items updated to a database.
• Data sharing. Fast and efficient collaboration between users.
• Data access and auditing. Controlled access to databases. Logging associated access
activities allows organizations to audit for security and compliance.
• Data integration. Instead of operating islands of database resources, a single interface
is used to manage databases with logical and physical relationships.
• Abstraction and independence. Organizations can change the physical schema of
database systems without necessitating changes to the logical schema that governs
database relationships. As a result, organizations can upgrade storage and scale the
infrastructure without impacting database operations. Similarly, changes to the logical
schema can be applied without altering the apps and services that access the databases.
• Uniform management and administration. A single console interface to perform
basic administrative tasks makes the job easier for database admins and IT users.
Applications of DBMS
A database is a collection of related data, and data is a collection of facts and figures that can be
processed to produce information.
Mostly, data represents recordable facts. Data aids in producing information, which is based
on facts. For example, if we have data about marks obtained by all students, we can then
draw conclusions about the toppers and the average marks.
A database management system stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information. Following are the important characteristics and
applications of DBMS.
• ACID Properties
DBMS follows the concepts of Atomicity, Consistency, Isolation,
and Durability (normally shortened as ACID). These concepts are applied on
transactions, which manipulate data in a database. ACID properties help the database
stay healthy in multi-transactional environments and in case of failure.
• Multiuser and Concurrent Access
DBMS supports a multi-user environment and allows users to access and
manipulate data in parallel. There are restrictions on transactions when users
attempt to handle the same data item, but users are generally unaware of them.
• Multiple views
DBMS offers multiple views for different users. A user who is in the Sales
department will have a different view of database than a person working in the
Production department. This feature enables the users to have a concentrated view of
the database according to their requirements.
• Security
Features like multiple views offer security to some extent where users are
unable to access data of other users and departments. DBMS offers methods to
impose constraints while entering data into the database and retrieving the same at a
later stage. DBMS offers many different levels of security features, which enable
multiple users to have different views with different features.
For example, a user in the Sales department cannot see the data that belongs to
the Purchase department. Additionally, how much data of the Sales department is
displayed to the user can also be controlled. Since a DBMS does not keep its data on
disk in the directly readable form that traditional file systems use, it is harder for
miscreants to get at the data by going around the DBMS.
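The Sales and Purchase example can be sketched with SQL views. This is only a minimal illustration using Python's built-in sqlite3 module; the table `orders`, its columns, and the view names are hypothetical. SQLite has no user accounts, so the step where a full DBMS grants each department access to its own view only is assumed, not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id   INTEGER PRIMARY KEY,
                         department TEXT NOT NULL,
                         item       TEXT NOT NULL,
                         amount     REAL NOT NULL);
    INSERT INTO orders VALUES (1, 'Sales',    'Laptop', 55000.0),
                              (2, 'Purchase', 'Desk',    7000.0),
                              (3, 'Sales',    'Mouse',    500.0);
    -- Each department sees only its own rows; the Sales view also hides amount.
    CREATE VIEW sales_view AS
        SELECT order_id, item FROM orders WHERE department = 'Sales';
    CREATE VIEW purchase_view AS
        SELECT order_id, item, amount FROM orders WHERE department = 'Purchase';
""")

sales_rows = conn.execute("SELECT * FROM sales_view ORDER BY order_id").fetchall()
purchase_rows = conn.execute("SELECT * FROM purchase_view").fetchall()
print(sales_rows)      # rows visible to a Sales user
print(purchase_rows)   # rows visible to a Purchase user
```

The same base table serves both departments, while each view limits both the rows and the columns that are exposed.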
Data Model in DBMS
• A model is an abstraction that represents essential features without including
the background details or explanations. It hides superfluous details while highlighting
details pertinent to the application at hand.
• A data model is a mechanism that provides this abstraction for database applications.
Data modelling is used for representing entities of interest and their relationships in
the database.
• A data model defines the logical structure of a database, that is, how data items are
connected to each other and how they are processed and stored inside a system.
• A number of models for representing data have been developed. As with
programming languages, there is no best choice for all applications, but each model
maintains the integrity of the data by enforcing a set of constraints.
Data models differ in their method of representing the associations amongst entities and
attributes. The main models or approaches are:
• The Hierarchical Model – Tree Structure
• The Network Model – Plex Structure
• The Relational Model – Normalised Structure
• The ER Model – Conceptual Model
Data Model Structure and Constraints –
• Constructs are used to define the database structure.
• Constructs typically include elements (and their data types) as well as groups of
elements (for example, Entity, Record, Table), and relationships among such groups.
• Constraints specify restrictions on valid data; these constraints must be enforced
at all times.
Data Model Operations –
These operations are used for specifying database retrievals and updates by referring
to constructs of the data model. The Operations may include basic model operations as well
as user defined operations.
Basic Model Operations :
• Insert
• Delete
• Update
The Hierarchical Model – Tree Structure
The hierarchical model is a data model that uses the tree as its basic structure. So, let's
define the basics of the tree.
Basics of Tree :
A tree is a data structure that consists of a hierarchy of nodes with a single node, called
the root, at the highest level.
A node may have any number of children, but each child node may have only one
parent node on which it is dependent. Thus the parent to child relationship in a tree is one to
many, whereas the child to parent relationship is many to one: each child has exactly one parent.
In figure 1, the node at level 1 is called the root node and the nodes that have no children are
called leaves. For example, nodes 4, 5, 7, 8, 9, 10 and 11 are leaves.
• Nodes that are children of the same parent are called siblings. For example, nodes 2, 3,
4 are siblings.
• For any node there is a single path, called the hierarchical path, from the root node. The
nodes along this path are called that node's ancestors.
• Similarly, for a given node, any node along a path from that node to a leaf is called
its descendant.
• For example, suppose we have to find out the hierarchical path of node 10, then it will
be 1→2→6→10 and the ancestors of node 10 are 1, 2 and 6.
• The height of a tree is the number of levels on the longest hierarchical path from the
root to a leaf. The above tree has a height of 4.
• A tree is said to be balanced if every path from the root node to a leaf has the same
length.
Figure 2 shows a balanced and an unbalanced tree.
A binary tree is one in which each node has not more than two children.
Figure 3 shows a binary tree
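The tree terms above can be tried out in a few lines of Python. This is only a sketch: the exact shape of the tree is an assumption, chosen to agree with the facts stated in the text (root 1, siblings 2, 3 and 4, hierarchical path 1→2→6→10, height 4), and the function names are illustrative.

```python
# Parent -> children map. The exact shape is an assumption that matches the
# facts given in the text: root 1, siblings 2/3/4, path 1 -> 2 -> 6 -> 10.
CHILDREN = {1: [2, 3, 4], 2: [5, 6], 3: [7, 8], 6: [9, 10, 11]}
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}
ROOT = 1

def hierarchical_path(node):
    """Return the single path from the root down to `node`."""
    path = [node]
    while path[-1] in PARENT:          # climb until the root (it has no parent)
        path.append(PARENT[path[-1]])
    return path[::-1]

def leaves():
    """Nodes with no children."""
    nodes = set(PARENT) | {ROOT}
    return sorted(n for n in nodes if n not in CHILDREN)

def height(node=ROOT):
    """Number of levels on the longest path from `node` down to a leaf."""
    return 1 + max((height(c) for c in CHILDREN.get(node, [])), default=0)

def is_balanced():
    """True when every root-to-leaf path has the same length."""
    return len({len(hierarchical_path(leaf)) for leaf in leaves()}) == 1

print(hierarchical_path(10))    # [1, 2, 6, 10]
print(leaves())                 # [4, 5, 7, 8, 9, 10, 11]
print(height(), is_balanced())  # 4 False
```

Because each child stores exactly one parent, the path from the root to any node is unique, which is what makes the ancestors of a node well defined.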
Example of Hierarchical Model :
• Figure 4 shows a data structure diagram for a tree representing the STUDENT,
FACULTY and CLASS.
• The root node chosen is FACULTY, with CLASS as a child of FACULTY and STUDENT as a
child of CLASS.
• The cardinality between FACULTY and CLASS is one to many, as a
FACULTY teaches one or more CLASSes.
• The cardinality between a CLASS and a STUDENT is also one to many,
because a CLASS has many STUDENTs.
Figure 5 shows an occurrence of the FACULTY-CLASS-STUDENT.
Operations on Hierarchical Model
• Deletion- If CS02 is deleted, then all the students in class CS02 are deleted with it. So
deletion of a parent is difficult. However, deletion of leaf nodes, that is students, does not
create any difficulty.
• Insertion- A new class, say CS03, cannot be introduced unless some faculty is
available at the root level. So insertion is also difficult.
• Updation- Suppose a student has changed his subject from Hindi to Sanskrit. First
a search is performed to find the Hindi subject, and then the update is made. The
search is a time-consuming process here.
So problems occur in all three operations.
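The deletion and insertion problems can be seen in a small sketch that stores one FACULTY-CLASS-STUDENT occurrence as nested Python dictionaries. The faculty identifier and student names are hypothetical, and a real hierarchical DBMS would use pointers on disk rather than dictionaries.

```python
# One occurrence of FACULTY -> CLASS -> STUDENT; all names are hypothetical.
faculty = {
    "F01": {"CS01": ["Asha", "Ravi"], "CS02": ["Meena", "Kumar"]},
}

def insert_class(db, faculty_id, class_id):
    """A class can only be stored under an existing faculty (its parent)."""
    if faculty_id not in db:
        raise KeyError(f"no faculty {faculty_id!r}: a child needs a parent")
    db[faculty_id][class_id] = []

def delete_class(db, faculty_id, class_id):
    """Deleting a class removes its whole subtree; returns the students lost."""
    return db[faculty_id].pop(class_id)

lost = delete_class(faculty, "F01", "CS02")   # the CS02 students go with it
try:
    insert_class(faculty, "F99", "CS03")      # F99 does not exist
    inserted = True
except KeyError:
    inserted = False

print(lost)       # ['Meena', 'Kumar']
print(inserted)   # False
```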
Advantages of Hierarchical Model
• Easy to understand
• Performance can be better than the relational data model when access follows the
predefined parent-child paths
Disadvantages of Hierarchical Model
• Difficult to access values at lower levels
• This model may not be flexible enough to accommodate the dynamic needs of an
organisation
• Deletion of a parent node forces the deletion of its child nodes
• Extra space is required for the storage of pointers
The Network Model – Plex Structure
The network database or network model uses the plex structure as its basic data
structure. A network is a directed graph consisting of nodes connected by links or directed
arcs. The nodes correspond to record types and the links to pointers or relationships.
All the relationships are hardwired or pre-computed and built into the structure of the
database itself, which makes them very efficient in space utilization and query execution time.
The network data structure looks like a tree structure, except that a dependent node, which
is called a child or member, may have more than one parent or owner node.
The figure shows the network model –
A diagram called a Bachman diagram is used to represent a network data structure.
The nodes in the network are replaced by rectangles that represent records and links are
shown by lines connecting the rectangles.
Example of Network Database : a plex structure with two record types is shown.
Operations on Network Model/Network Database
• Insertion- In the above figure, it is clear that a new part or supplier can easily be
inserted.
• Deletion- For deletion, only the link is to be removed and no information will be lost.
For example, to remove PART 2 from a supplier, we delete the connector line between
that supplier and PART 2.
• Updation- Updation is also easy. For example, suppose SUPPLIER B supplies
PART 1 in place of PART 2; the update is done by
changing the link of SUPPLIER B from PART 2 to PART 1.
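A plex structure can be sketched as two sets of records plus a set of owner-to-member links. The supplier and part names follow the text, but the starting links are an assumption, since the figure is not reproduced here.

```python
# Records of the two record types.
suppliers = {"SUPPLIER A", "SUPPLIER B"}
parts = {"PART 1", "PART 2"}

# Each link is (owner, member). The starting links are assumed for illustration.
links = {
    ("SUPPLIER A", "PART 1"),
    ("SUPPLIER A", "PART 2"),
    ("SUPPLIER B", "PART 2"),
}

def owners_of(part):
    """All owner records linked to a member record."""
    return sorted(s for s, p in links if p == part)

before = owners_of("PART 2")   # two owners: not possible in a tree

# Updation: SUPPLIER B now supplies PART 1 in place of PART 2.
links.remove(("SUPPLIER B", "PART 2"))
links.add(("SUPPLIER B", "PART 1"))
after = owners_of("PART 1")

print(before)   # ['SUPPLIER A', 'SUPPLIER B']
print(after)    # ['SUPPLIER A', 'SUPPLIER B']
print(parts)    # both part records still exist; only a link changed
```

Only the link set changed during the update, so no supplier or part record was lost.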
Advantages of Network Model/ Network Database :
• Easy access to data.
• Flexible
• Efficient
• This model can be applied to real world problems, that require routine transactions.
Disadvantages of Network Model/ Network Database :
• Complex to design and develop.
• Extra memory is required for storage of pointers
• The structure is inflexible and difficult to use.
• Operation and maintenance are time consuming and expensive for large databases.
The Relational Model – Normalised Structure
The relational model is a lower level model. It is based on the concept of a relation,
which is represented as a table. A table is a collection of rows and columns. The
relational model uses a collection of tables to represent both data and the relationships among
those data.
The tables are used to hold information about the objects to be represented in the
database. A relation or a table is represented in a two dimensional form in which the rows of
the table correspond to individual records and the columns correspond to attributes.
Each row is called a tuple and each column is called an attribute. For example, a
student relation is represented by the STUDENT table having columns for attributes SID,
NAME and BRANCH.
SID : Key
Number of Records = Cardinality
Number of Fields = Arity
Student (SID, Name, Branch) = Relational Schema (Table Abstraction)
• The SID here is the primary key, as it identifies a student record or tuple uniquely. (A
primary key is a key defined on an attribute, here SID, that uniquely identifies a tuple.)
• The Cardinality of the relation or table is defined as the number of records in it; for the
STUDENT relation this is 4.
• The Arity is defined as the number of fields or columns in the relation.
Domain of an Attribute –
Domain of an attribute is the set of allowable values for that attribute. It is a pool of
values from which the actual values appearing in a given column are drawn. For example, the
values appearing in the SID column are drawn from the domain of all SID.
Domains may be distinct, or two or more attributes may have same domain.
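The terms above can be checked against a small STUDENT table. This is a sketch using Python's built-in sqlite3 module; the four rows and the set of allowed branches are hypothetical, and the CHECK constraint is only one possible way to approximate a domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        sid    TEXT PRIMARY KEY,                        -- key attribute
        name   TEXT NOT NULL,
        branch TEXT NOT NULL
               CHECK (branch IN ('CS', 'IT', 'EC'))     -- assumed domain of BRANCH
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("S1", "Anu", "CS"), ("S2", "Bala", "CS"),
     ("S3", "Chitra", "IT"), ("S4", "Deepak", "EC")],
)

# Cardinality = number of tuples (rows); arity = number of attributes (columns).
cardinality = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
arity = len(conn.execute("SELECT * FROM student").description)

# A value outside the domain of BRANCH is rejected.
try:
    conn.execute("INSERT INTO student VALUES ('S5', 'Esha', 'XYZ')")
    out_of_domain_accepted = True
except sqlite3.IntegrityError:
    out_of_domain_accepted = False

print(cardinality, arity, out_of_domain_accepted)   # 4 3 False
```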
Operations in Relational Model –
• Insertion – A new student record can be easily inserted in the table.
• Deletion – An existing student record or tuple can easily be deleted from the
STUDENT relation.
• Updation – An existing student record can be updated easily. For example, if a student
S2 changes BRANCH from CS to IT, the record can easily be changed.
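The three operations map directly onto SQL statements. A minimal sketch with Python's sqlite3 module follows; the student names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT, branch TEXT)")

# Insertion: add new tuples.
conn.execute("INSERT INTO student VALUES ('S1', 'Anu', 'CS')")
conn.execute("INSERT INTO student VALUES ('S2', 'Bala', 'CS')")

# Updation: student S2 changes BRANCH from CS to IT.
conn.execute("UPDATE student SET branch = 'IT' WHERE sid = 'S2'")

# Deletion: remove an existing tuple.
conn.execute("DELETE FROM student WHERE sid = 'S1'")

rows = conn.execute("SELECT sid, name, branch FROM student").fetchall()
print(rows)   # [('S2', 'Bala', 'IT')]
```

Each statement names the tuple by its key value, so no search path through parent records is needed.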
Advantages of Relational Model –
• Easy to use and understand
• Very flexible.
• Widely used.
• Provides excellent support for ad hoc queries.
• Users need not consider issues such as storage structure and access strategy.
• Security control and authorization can be implemented more easily.
• Data independence is achieved more easily with the normalised structure used in a
relational database.
Disadvantages of Relational Model –
• For large databases, the performance in responding to queries can be
degraded.
• Processing is required to construct the indexes. The index portion
of the file must be created and maintained along with the file records themselves.
• The file index must be searched before the actual file records are
obtained, which adds time.
Data Description Languages
SQL statements executed within a database fall into two main types. Before you can
manipulate data residing in a database using SQL Data
Manipulation Language (DML), you have to create the logical structure to store information.
Data Definition Language (DDL) is the portion of SQL that deals with how data
should reside in the database at a logical level. Each database has its own set of object types
that it allows. Most include tables, indexes, views, stored procedures, functions, synonyms,
and triggers. Each database has its own syntax for DDL statements and the clauses that can
be included. There are some basic key words that you will find in almost every RDBMS.
• CREATE
• ALTER
• DROP
• TRUNCATE
The CREATE Statement
The basic building blocks of the Relational Database Management System are tables.
A table can be envisioned as a set of rows and columns. The columns represent fields of information.
The rows represent records in the table. In the following graphic, the persons table has four fields
and four records.
You could simply create the table with the following statement:
CREATE TABLE books
( book_id VARCHAR(100),
book_name VARCHAR(100),
author_id NUMBER,
editor_id NUMBER);
The problem with this table definition is that it allows rows to be created without regard to
whether the data makes any sense. Envision our table looking like this:
Your database design should make sure that data inserted into a table is sensible. Let us
create a second table called the "persons" table. This time we will add constraints to make
sure that data entered into the table will make sense. It makes sense that each entry in the
table will be a unique person, so we give it a PRIMARY KEY. We will also want to make sure
to track when each row was last updated and who updated it by making those fields NOT
NULL.
CREATE TABLE persons
( person_id NUMBER NOT NULL PRIMARY KEY,
name VARCHAR(100),
birth_date DATE,
gender VARCHAR(30),
last_update DATE NOT NULL,
updated_by NUMBER NOT NULL
);
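To see what these constraints do, the same table definition can be run in SQLite through Python's sqlite3 module. This is a sketch: SQLite accepts the Oracle-style type names but treats them only as type affinities, and the sample rows and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE persons
    ( person_id   NUMBER NOT NULL PRIMARY KEY,
      name        VARCHAR(100),
      birth_date  DATE,
      gender      VARCHAR(30),
      last_update DATE NOT NULL,
      updated_by  NUMBER NOT NULL
    )
""")
conn.execute(
    "INSERT INTO persons VALUES (1, 'Asha', '1990-01-15', 'F', '2024-01-01', 1)"
)

def try_insert(row):
    """Insert a row; return 'ok' or the constraint error raised by the database."""
    try:
        conn.execute("INSERT INTO persons VALUES (?, ?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

# Same person_id as an existing row: rejected by the PRIMARY KEY.
duplicate = try_insert((1, "Ravi", None, None, "2024-01-02", 1))
# Missing last_update: rejected by NOT NULL.
missing = try_insert((2, "Ravi", None, None, None, 1))

print(duplicate)
print(missing)
```

Both inserts are refused by the database itself, so no application code has to repeat these checks.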
The ALTER Statement
The definition of an object in a database can be changed using the ALTER statement.
Example: Add constraints to the "books" table to assure the fields "book_name" and
"author_id" contain data.
ALTER TABLE books MODIFY (book_name NOT NULL);
ALTER TABLE books MODIFY (author_id NOT NULL);
A FOREIGN KEY constraint can be added to the fields "author_id" and
"editor_id", limiting the available values to ones that currently exist in the
"person_id" field of the persons table.
ALTER TABLE books ADD CONSTRAINT fk_author
FOREIGN KEY (author_id) REFERENCES persons (person_id);
ALTER TABLE books ADD CONSTRAINT fk_editor
FOREIGN KEY (editor_id) REFERENCES persons (person_id);
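The effect of a foreign key can be tried in SQLite through Python's sqlite3 module. This is a sketch with two assumptions that differ from the Oracle statements: SQLite cannot add a constraint with ALTER TABLE, so the references are declared when the table is created, and foreign key checking must be switched on with a PRAGMA. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE persons (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        book_id   TEXT,
        book_name TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES persons (person_id),
        editor_id INTEGER REFERENCES persons (person_id)
    );
    INSERT INTO persons VALUES (1, 'Author One');
""")

# author_id 1 exists in persons, so this row is accepted.
conn.execute("INSERT INTO books VALUES ('B1', 'First Book', 1, NULL)")

# author_id 42 does not exist in persons, so this row is rejected.
try:
    conn.execute("INSERT INTO books VALUES ('B2', 'Orphan Book', 42, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(rejected, book_count)   # True 1
```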
What if we wanted to add a publication date to our books table? Use the 'ALTER' statement
to add the field.
ALTER TABLE books ADD ( publish_date DATE);
You can alter more than just tables. Here are examples of some other ALTER statements.
ALTER ROLE book_reader IDENTIFIED BY r2Xe135DEw;
ALTER INDEX editor_indx DISABLE;
ALTER TRIGGER persons_update RENAME TO persons_trig;
The TRUNCATE Statement
TRUNCATE TABLE books;
The TRUNCATE statement removes all the data from a table. This is very similar to the DML
statement:
DELETE FROM books;
In the Oracle database, there is a difference between the two. TRUNCATE removes
all data, whereas a DELETE can be specific about the rows it deletes. Also, if you make a
mistake with a DELETE statement you can use the transactional control
statement ROLLBACK to undo the changes.
The TRUNCATE command has no rollback capability. The biggest positive to using
the TRUNCATE statement is that it can be faster than the DELETE statement, especially if
the table has numerous rows, triggers, indexes, and other dependencies.
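The rollback behaviour of DELETE can be demonstrated with Python's sqlite3 module. This is a sketch only: SQLite has no TRUNCATE statement, so just the DELETE side is shown, and the two sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (book_id TEXT, book_name TEXT)")
conn.executemany("INSERT INTO books VALUES (?, ?)", [("B1", "One"), ("B2", "Two")])
conn.commit()                        # the two rows are now permanent

conn.execute("DELETE FROM books")    # a mistake: every row is deleted
during = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]

conn.rollback()                      # undo the uncommitted DELETE
after = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]

print(during, after)   # 0 2
```

The DELETE ran inside a transaction that had not been committed, which is why ROLLBACK could restore the rows.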
The DROP Statement
Removing an object from the database is accomplished with the DROP statement.
DROP TABLE books;
DROP TABLE persons;
When you drop a table, the database removes all its rows, invalidates dependent objects, and
removes the indexes, constraints and privileges that anyone had on that table. Just as with
the CREATE and ALTER statements, there are other DROP statement types.
3rd Normal Form Definition
A table is in third normal form if it satisfies the following conditions:
• It is in second normal form
• There is no transitive functional dependency
By transitive functional dependency, we mean we have the following relationships in the
table: A determines B, and B determines C (that is, B is functionally dependent on A, and C
is functionally dependent on B). In this case, C is transitively dependent on A via B.
3rd Normal Form Example
Consider the following example:
In the table above, [Book ID] determines [Genre ID], and [Genre ID] determines
[Genre Type]. Therefore, [Book ID] determines [Genre Type] via [Genre ID] and we have a
transitive functional dependency, so this structure does not satisfy third normal form.
To bring this table to third normal form, we split the table into two as follows:
Now all non-key attributes are fully functionally dependent only on the primary key. In
[TABLE_BOOK], both [Genre ID] and [Price] are only dependent on [Book ID]. In
[TABLE_GENRE], [Genre Type] is only dependent on [Genre ID].
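The split into two tables can be sketched with Python's sqlite3 module. The table and column names follow the text; the genre names and prices are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_genre (
        genre_id   INTEGER PRIMARY KEY,
        genre_type TEXT NOT NULL          -- depends only on genre_id
    );
    CREATE TABLE table_book (
        book_id  INTEGER PRIMARY KEY,
        genre_id INTEGER NOT NULL REFERENCES table_genre (genre_id),
        price    REAL NOT NULL            -- genre_id and price depend only on book_id
    );
    INSERT INTO table_genre VALUES (1, 'Gardening'), (2, 'Sports');
    INSERT INTO table_book  VALUES (1, 1, 25.99), (2, 2, 14.99), (3, 1, 10.00);
""")

# The genre type is stored once, so renaming it touches a single row.
conn.execute("UPDATE table_genre SET genre_type = 'Home & Garden' WHERE genre_id = 1")

# A join rebuilds the original single-table view of the data.
rows = conn.execute("""
    SELECT b.book_id, g.genre_type, b.price
    FROM table_book AS b
    JOIN table_genre AS g ON g.genre_id = b.genre_id
    ORDER BY b.book_id
""").fetchall()
print(rows)
```

In the unsplit table the same rename would have to be repeated in every book row of that genre, which is the update anomaly that third normal form removes.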
Canonical Data structures
• A canonical data model (CDM) is a type of data model that presents data entities and
relationships in the simplest possible form.
• It is generally used in system/database integration processes where data is exchanged
between different systems, regardless of the technology used.
• A canonical data model is also known as a common data model.
A canonical data model primarily enables an organization to create and distribute a
common definition of all its data units. The design of a CDM requires identifying all
entities, their attributes and the relationships between them.
The importance of a CDM is particularly evident in integration processes where data units
are shared between different information system platforms. It utilizes a generalized data
format to present/define data that makes it simple to share data among multiple applications.
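The idea can be sketched in Python: two systems hold the same customer in different shapes, and each is mapped once to a shared canonical form. Everything here is hypothetical, including the system names (billing, CRM), the field names, and the class `CanonicalCustomer`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CanonicalCustomer:
    """The common definition shared by every system (hypothetical fields)."""
    customer_id: str
    full_name: str
    email: str

def from_billing(record: dict) -> CanonicalCustomer:
    """Map the billing system's record layout to the canonical form."""
    return CanonicalCustomer(
        customer_id=str(record["cust_no"]),
        full_name=record["name"],
        email=record["mail"].lower(),
    )

def from_crm(record: dict) -> CanonicalCustomer:
    """Map the CRM system's record layout to the canonical form."""
    return CanonicalCustomer(
        customer_id=record["id"],
        full_name=f'{record["first"]} {record["last"]}',
        email=record["email"].lower(),
    )

a = from_billing({"cust_no": 17, "name": "Asha Rao", "mail": "Asha@Example.com"})
b = from_crm({"id": "17", "first": "Asha", "last": "Rao", "email": "asha@example.com"})
print(a == b)   # True: both systems now describe the same customer the same way
```

With a canonical form, each system needs one mapping to and from the common model instead of a separate mapping for every other system it exchanges data with.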
Varieties of data independences.
Data Independence is defined as a property of DBMS that helps you to change the
Database schema at one level of a database system without having to change the schema at
the next higher level. Data independence helps you to keep data separated from all programs
that make use of it.
You can use this stored data for computing and presentation. In many systems, data
independence is an essential function for components of the system.
Importance of Data Independence
• Helps you to improve the quality of the data
• Database system maintenance becomes affordable
• Enforcement of standards and improvement in database security
• You don't need to alter data structure in application programs
• Permit developers to focus on the general structure of the Database rather than
worrying about the internal implementation
• Helps keep the data in an undamaged, undivided state
• Database incongruity is vastly reduced.
• Modifications at the physical level, when needed to improve the
performance of the system, can be made easily.
Types of Data Independence
In DBMS there are two types of data independence
• Physical data independence
• Logical data independence.
Levels of Database
The database has 3 levels as shown in the diagram below
• Physical/Internal
• Conceptual
• External
Consider an example of a University Database. At the different levels, this is how the
implementation will look:
Type of Schema       Implementation
External Schema      View 1: Course info(cid: int, cname: string)
                     View 2: studentinfo(id: int, name: string)
Conceptual Schema    Students(id: int, name: string, login: string, age: integer)
                     Courses(id: int, cname: string, credits: integer)
                     Enrolled(id: int, grade: string)
Physical Schema      Relations stored as unordered files.
                     Index on the first column of Students.
Physical Data Independence
Physical data independence helps you to separate conceptual levels from the
internal/physical levels. It allows you to provide a logical description of the database without
the need to specify physical structures. Compared to Logical Independence, it is easy to
achieve physical data independence.
With Physical independence, you can easily change the physical storage structures or
devices without an effect on the conceptual schema. Any change done would be absorbed by the
mapping between the conceptual and internal levels. Physical data independence is achieved
by the presence of the internal level of the database and then the transformation from the
conceptual level of the database to the internal level.
Examples of changes under Physical Data Independence
Due to Physical independence, none of the changes below will affect the
conceptual layer.
• Using a new storage device like a Hard Drive or Magnetic Tapes
• Modifying the file organization technique in the Database
• Switching to different data structures.
• Changing the access method.
• Modifying indexes.
• Changes to compression techniques or hashing algorithms.
• Change of location of the Database, say from C drive to D drive
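One of these changes, modifying indexes, can be tried with Python's sqlite3 module. This is a sketch: the Students relation follows the university example, while the rows and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, login TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [(1, "Asha", "asha", 19), (2, "Ravi", "ravi", 20)],
)

# The application's query is written against the conceptual schema only.
QUERY = "SELECT name FROM students WHERE login = ?"
before = conn.execute(QUERY, ("ravi",)).fetchall()

# Physical change: add an index. The table definition and the query are untouched.
conn.execute("CREATE INDEX idx_students_login ON students (login)")
after = conn.execute(QUERY, ("ravi",)).fetchall()

print(before == after, after)   # True [('Ravi',)]
```

The index may change how the rows are located, but the query text and its result stay the same.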
Logical Data Independence
Logical Data Independence is the ability to change the conceptual schema without
changing
• External views
• External API or programs
Any change made will be absorbed by the mapping between external and conceptual
levels. When compared to Physical Data independence, it is challenging to achieve logical
data independence.
Examples of changes under Logical Data Independence
Due to Logical independence, none of the changes below will affect the
external layer.
• Adding, modifying, or deleting an attribute, entity or relationship is possible without a
rewrite of existing application programs
• Merging two records into one
• Breaking an existing record into two or more records
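Breaking a record into two can be sketched with a view that keeps the external schema stable, again using Python's sqlite3 module. The view name studentinfo follows the university example; the new table names `student_names` and `student_accounts` and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER);
    INSERT INTO students VALUES (1, 'Asha', 'asha', 19), (2, 'Ravi', 'ravi', 20);
    CREATE VIEW studentinfo AS SELECT id, name FROM students;
""")

# The application only ever queries the external view.
APP_QUERY = "SELECT id, name FROM studentinfo ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual change: break the Students record into two records,
# then redefine the view so the external schema stays the same.
conn.executescript("""
    DROP VIEW studentinfo;
    CREATE TABLE student_names    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_accounts (id INTEGER PRIMARY KEY, login TEXT, age INTEGER);
    INSERT INTO student_names    SELECT id, name FROM students;
    INSERT INTO student_accounts SELECT id, login, age FROM students;
    DROP TABLE students;
    CREATE VIEW studentinfo AS SELECT id, name FROM student_names;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before == after, after)   # True [(1, 'Asha'), (2, 'Ravi')]
```

The change is absorbed by the view definition, which plays the role of the mapping between the external and conceptual levels.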
Difference between Physical and Logical Data Independence
• Logical: mainly concerned with the structure or with changing the data definition.
  Physical: mainly concerned with the storage of the data.
• Logical: it is difficult, as the retrieving of data is mainly dependent on the logical
  structure of data.
  Physical: it is easy to retrieve.
• Logical: compared to Physical independence, it is difficult to achieve logical data
  independence.
  Physical: compared to Logical independence, it is easy to achieve physical data
  independence.
• Logical: you need to make changes in the Application program if new fields are added or
  deleted from the database.
  Physical: a change in the physical level usually does not need change at the Application
  program level.
• Logical: modification at the logical levels is significant whenever the logical structures
  of the database are changed.
  Physical: modifications made at the internal levels may or may not be needed to improve
  the performance of the structure.
• Logical: concerned with the conceptual schema.
  Physical: concerned with the internal schema.
• Logical example: adding, modifying, or deleting an attribute.
  Physical example: change in compression techniques, hashing algorithms, storage
  devices, etc.