2. CONTENTS
Database System Architecture: Data Abstraction, Data
Independence, Data Definition Language (DDL), Data
Manipulation Language (DML)
Data Models: Entity-relationship model, network model,
relational and object oriented data models, integrity
constraints, data manipulation operations
3. INTRODUCTION
In a computerized information system, data are the basic resource of the
organization, so proper organization and management of data is
required for the organization to run smoothly. A database management
system is concerned with how data is stored and managed in a
computerized information system. Every organization requires
accurate and reliable data for better decision making, for ensuring the
privacy of data, and for controlling data efficiently.
Examples include depositing to or withdrawing from a bank account,
making a hotel, airline, or railway reservation, and purchasing items
from a supermarket. In all these cases, a database is accessed.
5. What is Data?
Data is a collection of distinct small units of information. It can take a variety of
forms, such as text, numbers, media, or bytes, and it can be stored on paper or in
electronic memory.
In computing, data is information that has been translated into a form suitable for
efficient movement and processing. Data is interchangeable.
Information is processed data on which decisions and actions are based. It can be
defined as data that has been organized and classified to provide meaningful values.
What is Database Management System?
A database is an organized collection of data that can be easily accessed and managed.
A database management system (DBMS) is software used to store data in a database and
retrieve it. Oracle and MySQL are examples of popular DBMS products.
● A DBMS provides an interface for operations such as creation, deletion, and
modification of data.
● A DBMS allows users to create their own databases as per their requirements.
● A DBMS accepts requests from an application and provides the requested data through the
operating system.
● A DBMS contains a group of programs that act according to the user's instructions.
● It provides security to the database.
6. Advantages of DBMS:
1. Reduction of redundancies: Centralized control of data by the DBA avoids
unnecessary duplication of data and effectively reduces the total amount of data
storage required. Avoiding duplication also eliminates the inconsistencies that
tend to be present in redundant data files.
2. Sharing of Data: A database allows the sharing of data under its control by any
number of application programs or users.
3. Data Integrity: Data integrity means that the data contained in the database is
both accurate and consistent. Data values being entered for storage can therefore
be checked to ensure that they fall within a specified range and are of the
correct format.
4. Data Security: The DBA, who has the ultimate responsibility for the data in the
DBMS, can ensure that proper access procedures are followed, including proper
authentication for access to the database system and additional checks before
permitting access to sensitive data.
5. Conflict Resolution: The DBA resolves conflicts between the requirements of various
users and applications. The DBA chooses the best file structure and access method to
get optimal performance for the application.
7. Disadvantages of DBMS:
1. The cost of DBMS software and hardware (including network
installation) is high.
2. The DBMS adds processing overhead to implement security,
integrity, and sharing of the data.
3. Database control is centralized, which makes the central system a
single point of failure.
4. Setting up the database system requires more knowledge, money,
skills, and time.
5. The complexity of the database may result in poor performance.
8. What is RDBMS?
RDBMS stands for 'Relational Database Management System.' In an RDBMS, data is
represented as tables that contain rows and columns. A relational database contains the
following components:
● Table
● Record/ Tuple
● Field/Column name /Attribute
● Instance
● Schema
● Keys
An RDBMS is a tabular DBMS that maintains the security, integrity, accuracy, and
consistency of the data.
9. DBMS vs. File System
● File System Approach:
File-based systems were an early attempt to computerize the manual system. This is also
called the traditional approach. It was decentralized: each department stored and
controlled its own data with the help of a data processing specialist. The main role of
a data processing specialist was to create the necessary computer file structures,
manage the data within those structures, and design application programs that create
reports based on the file data.
● DBMS Approach:
In the database approach, data is kept as a well-organized collection of items that are
related in a meaningful way. The data can be accessed by different users but is stored
only once in the system. The operations performed by the DBMS include insertion,
deletion, selection, and sorting.
10. Comparison of the DBMS Approach and the File System Approach
● Sharing of data
DBMS approach: Due to the centralized approach, data sharing is easy.
File system approach: Data is distributed across many files, possibly in different
formats, so it isn't easy to share data.
● Security and Protection
DBMS approach: The DBMS provides a good protection mechanism.
File system approach: It isn't easy to protect a file under the file system.
● Recovery Mechanism
DBMS approach: The DBMS provides a crash recovery mechanism, i.e., it protects the user
from system failure.
File system approach: The file system doesn't have a crash recovery mechanism, i.e., if
the system crashes while some data is being entered, the content of the file may be lost.
● Manipulation Techniques
DBMS approach: The DBMS contains a wide variety of sophisticated techniques to store and
retrieve the data.
File system approach: The file system can't efficiently store and retrieve the data.
● Concurrency Problems
DBMS approach: The DBMS takes care of concurrent access to data using some form of
locking.
File system approach: Concurrent access causes many problems, such as one user reading a
file while another is deleting or updating some information in it.
● Data Redundancy and Inconsistency
DBMS approach: Due to the centralization of the database, the problems of data
redundancy and inconsistency are controlled.
File system approach: The files and application programs are created by different
programmers, so there is a lot of duplication of data, which may lead to inconsistency.
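The recovery row of the comparison can be illustrated with a transaction that is undone when it cannot complete. This is a minimal sketch, assuming Python's standard sqlite3 module with an in-memory database as a stand-in for a full DBMS; the account table and the transfer function are hypothetical names chosen for illustration, and a rollback after a failed statement is only one part of what crash recovery covers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)
# The second transfer would make account 1 negative, so the CHECK constraint
# fails and the credit already applied to account 2 is rolled back.
failed = transfer(conn, 1, 2, 500)
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

With plain files, the program itself would have to detect the half-finished update and undo it.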
11. DBMS Architecture
● The design of a DBMS depends on its architecture. The basic client/server architecture is used to deal
with a large number of PCs, web servers, database servers, and other components that are connected
by networks.
● The DBMS architecture depends on how users connect to the database to have their requests served.
12. 1-Tier Architecture
● In this architecture, the database is directly available to the user. It means the
user works directly on the DBMS itself.
● Any changes made here are applied directly to the database. It doesn't
provide a handy tool for end users.
● The 1-Tier architecture is used for developing local applications, where
programmers can communicate directly with the database for a quick
response.
13. 2-Tier Architecture
● The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier architecture, applications on
the client end communicate directly with the database on the server side. APIs such as ODBC and JDBC are used for this interaction.
● The user interfaces and application programs run on the client side.
● The server side is responsible for providing functionality such as query processing and transaction management.
● To communicate with the DBMS, the client-side application establishes a connection with the server side.
14. 3-Tier Architecture
● The 3-Tier architecture contains another layer between the client and the server. In this architecture, the client can't
communicate directly with the server.
● The application on the client end interacts with an application server, which in turn communicates with the database
system.
● The end user has no idea about the existence of the database beyond the application server. The database also has no idea
about any other user beyond the application.
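The separation of the three tiers can be sketched in a single process. This is only an illustration under stated assumptions: in a real deployment each tier usually runs on a separate machine and communicates over a network, and the class names Database and ApplicationServer, the client function, and the request format are all hypothetical.

```python
import sqlite3


class Database:
    """Data tier: the only component that touches storage."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)"
        )
        self._conn.execute("INSERT INTO account VALUES (1, 100)")

    def query_balance(self, account_id):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]


class ApplicationServer:
    """Middle tier: validates requests and talks to the database."""

    def __init__(self, db):
        self._db = db

    def handle(self, request):
        if request.get("action") != "balance":
            return {"status": "error", "message": "unsupported action"}
        balance = self._db.query_balance(request["account_id"])
        if balance is None:
            return {"status": "error", "message": "unknown account"}
        return {"status": "ok", "balance": balance}


def client(server, account_id):
    """Client tier: knows only the application server, never the database."""
    return server.handle({"action": "balance", "account_id": account_id})


server = ApplicationServer(Database())
reply = client(server, 1)
missing = client(server, 99)
```

The client never issues SQL, so the database can be replaced or moved without changing client code.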
15. Data Abstraction
● Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It provides a
different view and helps in achieving data independence which is used to enhance the security of data.
● It is done through the three schema architecture or three-level architecture.
● The three schema architecture is also used to separate the user applications and physical database.
● The main objective of three level architecture is to enable multiple users to access the same data with a
personalized view while storing the underlying data only once. Thus it separates the user's view from the
physical structure of the database.
16. 1. Internal Level or Physical Level
● The internal level has an internal schema which describes the physical storage
structure of the database.
● The internal schema is also known as a physical schema.
● It uses the physical data model. It defines how the data will be stored in
blocks.
● The physical level is used to describe complex low-level data structures in detail.
17. 2. Conceptual Level or Logical Level
● The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.
● The conceptual schema describes the structure of the whole database.
● The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
● In the conceptual level, internal details such as an implementation of the data
structure are hidden.
● Programmers and database administrators work at this level.
18. 3. External Level or View Level
● At the external level, a database contains several schemas, sometimes
called subschemas. A subschema describes one particular view of
the database.
● An external schema is also known as a view schema.
● Each view schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group.
● The view schema describes the end user interaction with database systems.
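In SQL, an external schema is commonly realized with views. The sketch below assumes Python's standard sqlite3 module and a hypothetical employee table; the view employee_directory exposes only the columns one user group needs and hides the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full employee relation (hypothetical example data).
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000), (2, "Ravi", "Accounts", 61000)],
)

# External level: a view for users who must not see salaries.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT emp_id, name, department FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

Users who query only the view see the same data as the base table minus the hidden column, and the underlying rows are still stored once.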
19. Data Independence
● Data independence can be explained using the three-schema architecture.
● Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
2. Physical Data Independence
20. 1. Logical Data Independence
● Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.
● Logical data independence is used to separate the external level from the
conceptual view.
● If we make any changes to the conceptual view of the data, the user view of the
data is not affected.
● Logical data independence occurs at the user interface level.
21. 2. Physical Data Independence
● Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
● If we make any changes to the storage size of the database system server, the
conceptual structure of the database is not affected.
● Physical data independence is used to separate conceptual levels from the internal
levels.
● Physical data independence occurs at the logical interface level.
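One way to observe physical data independence is to change a storage-level structure and confirm that queries written against the conceptual schema are unaffected. The sketch below assumes Python's standard sqlite3 module and a hypothetical book table; adding an index changes how the data can be located on disk, but the query text and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (bookcode INTEGER PRIMARY KEY, booktitle TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "Databases", 450.0), (2, "Networks", 380.0), (3, "Compilers", 520.0)],
)

query = "SELECT booktitle FROM book WHERE price > 400 ORDER BY booktitle"
before = conn.execute(query).fetchall()

# Internal-level change: add an access structure. No table or column changes.
conn.execute("CREATE INDEX idx_book_price ON book (price)")

after = conn.execute(query).fetchall()
```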
22. Database Language
● A DBMS has appropriate languages and interfaces to express database queries and updates.
● Database languages can be used to read, store and update the data in the database.
23. 1. Data Definition Language (DDL)
● DDL stands for Data Definition Language. It is used to define the structure or pattern of the database.
● It is used to create schemas, tables, indexes, constraints, etc. in the database.
● DDL statements record metadata such as the number of tables and schemas, their
names, the indexes and columns in each table, constraints, etc.
Here are some tasks that come under DDL:
● Create: It is used to create objects in the database.
● Alter: It is used to alter the structure of the database.
● Drop: It is used to delete objects from the database.
● Truncate: It is used to remove all records from a table.
● Rename: It is used to rename an object.
● Comment: It is used to add comments to the data dictionary.
These commands change the database schema, which is why they come under Data Definition Language.
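A few of the DDL tasks above can be tried out as follows. This is a sketch, assuming Python's standard sqlite3 module; the student table and the helper functions table_names and column_names are hypothetical. SQLite does not provide TRUNCATE or COMMENT statements, so only Create, Alter, Rename, and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(conn):
    """List the tables currently defined in the schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def column_names(conn, table):
    """List the columns of a table (the name is the second field of each row)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Create: define a new object.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL)"
)
# Alter: change the structure of an existing object.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
columns_after_alter = column_names(conn, "student")

# Rename: give the object a new name.
conn.execute("ALTER TABLE student RENAME TO learner")
tables_after_rename = table_names(conn)

# Drop: remove the object and its data.
conn.execute("DROP TABLE learner")
tables_after_drop = table_names(conn)
```

Each statement changes the schema rather than the rows stored in it.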
24. 2. Data Manipulation Language (DML)
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a database. It handles
user requests.
Here are some tasks that come under DML:
● Select: It is used to retrieve data from a database.
● Insert: It is used to insert data into a table.
● Update: It is used to update existing data within a table.
● Delete: It is used to delete records from a table, either those that match a condition or, if no condition is given, all of them.
● Merge: It performs an UPSERT operation, i.e., an insert or an update.
● Call: It is used to call a stored subprogram, for example a PL/SQL or Java subprogram in Oracle.
● Explain Plan: It describes the access path the DBMS will use to execute a statement.
● Lock Table: It controls concurrency.
26. DATA MODELS:
Data models define how the logical
structure of a database is modeled.
Data Models are fundamental entities to
introduce abstraction in a DBMS.
Data models define how data is connected
to each other and how they are processed
and stored inside the system.
27. TYPES OF DATA MODELS
01 Relational data model
02 Entity-Relationship data model
03 Object-based data model
04 Semi-structured data model
28. Relational data model
This type of model organizes data in the form of
rows and columns within a table. Thus, a relational
model uses tables to represent both data and the
relationships among those data. Tables are also called
relations. This model was initially described by
Edgar F. Codd in 1969. The relational data model is
the most widely used model and is primarily used by
commercial data processing applications.
29. Entity-relationship data model
An ER model is the logical representation of data as objects
and relationships among them. These objects are known as
entities, and a relationship is an association among these
entities. This model was designed by Peter Chen and
published in a 1976 paper. It is widely used in database
design. A set of attributes describes each entity. For
example, student_name and student_id describe the 'student'
entity. A set of entities of the same type is known as an 'entity
set', and a set of relationships of the same type is known as a
'relationship set'.
30. Object-based data model
The object-based data model is an extension of the ER model
with notions of functions, encapsulation, and object
identity. This model supports a rich type system that
includes structured and collection types. In the
1980s, various database systems following the
object-oriented approach were developed. Here,
an object is data together with its
properties.
31. Semi-structured data model
This type of data model is different from the other three data models
(explained above). The semi-structured data model permits data
specifications in which individual data items of the same
type may have different sets of attributes. The Extensible Markup
Language, also known as XML, is widely used for representing
semi-structured data. Although XML was initially designed for
adding markup information to text documents, it has gained
importance because of its application in the exchange of data.
33. Basic Concepts
The E-R data model employs three basic notions :
entity sets, relationship sets and attributes.
34. Entity Sets:
An entity is a “thing” or “object” in the real world that is distinguishable from all
other objects. For example, each person in an enterprise is an entity. An entity
has a set of properties, and the values for some set of properties may uniquely
identify an entity. For example, BOOK is an entity and its properties (called
attributes) are bookcode, booktitle, price, etc.
An entity set is a set of entities of the same type that share the same properties,
or attributes.
Relationship Sets:
A relationship is an association among several entities. A relationship set
is a set of relationships of the same type.
35. Attributes: An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each member of an
entity set.
For example, Customer is an entity and its attributes are customerid, customername,
custaddress, etc.
An attribute, as used in the E-R model, can be characterized by the
following attribute types.
a) Simple and Composite Attributes
b) Single-Valued and Multi-Valued Attributes
c) Stored and Derived Attributes
d) NULL-Valued Attributes
36. Mapping Cardinalities: Mapping cardinalities, or cardinality ratios, express the
number of entities to which another entity can be associated via a relationship set.
1. One to One: An entity in A is associated with at most one entity in B, and an entity in
B is associated with at most one entity in A. E.g., the relationship between college and
principal.
2. One to Many: An entity in A is associated with any number of entities in B. An entity
in B is associated with at most one entity in A. E.g., the relationship between
department and faculty.
3. Many to One: An entity in A is associated with at most one entity in B. An entity in B
is associated with any number of entities in A.
4. Many to Many: An entity in A is associated with any number of entities in B, and an
entity in B is associated with any number of entities in A.
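When an E-R design is mapped to tables, the cardinalities above are commonly enforced as follows: a UNIQUE foreign key for one to one, a plain foreign key for one to many (or many to one, seen from the other side), and a separate linking table for many to many. This is a sketch, assuming Python's standard sqlite3 module; the student, course, and enrollment tables are hypothetical additions used only to show the many-to-many case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE college (college_id INTEGER PRIMARY KEY, name TEXT);
-- One to one: UNIQUE on the foreign key allows at most one principal per college.
CREATE TABLE principal (
    principal_id INTEGER PRIMARY KEY, name TEXT,
    college_id INTEGER UNIQUE REFERENCES college(college_id));

CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
-- One to many: many faculty rows may reference the same department.
CREATE TABLE faculty (
    faculty_id INTEGER PRIMARY KEY, name TEXT,
    dept_id INTEGER REFERENCES department(dept_id));

CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
-- Many to many: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id));
""")

conn.execute("INSERT INTO college VALUES (1, 'City College')")
conn.execute("INSERT INTO principal VALUES (1, 'Dr. Rao', 1)")
conn.execute("INSERT INTO department VALUES (10, 'Physics')")
conn.executemany("INSERT INTO faculty VALUES (?, ?, ?)",
                 [(1, "Meera", 10), (2, "John", 10)])
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO course VALUES (?, ?)", [(100, "DBMS"), (200, "OS")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)",
                 [(1, 100), (1, 200), (2, 100)])

try:
    conn.execute("INSERT INTO principal VALUES (2, 'Dr. Sen', 1)")
    second_principal_allowed = True
except sqlite3.IntegrityError:
    second_principal_allowed = False

faculty_in_physics = conn.execute(
    "SELECT COUNT(*) FROM faculty WHERE dept_id = 10").fetchone()[0]
courses_of_student_1 = conn.execute(
    "SELECT COUNT(*) FROM enrollment WHERE student_id = 1").fetchone()[0]
students_in_course_100 = conn.execute(
    "SELECT COUNT(*) FROM enrollment WHERE course_id = 100").fetchone()[0]
```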
38. RELATIONAL MODEL
The relational model is a simple model in which a database is
represented as a collection of “relations”, where each relation is represented
by a two-dimensional table.
Properties:
1) It is column homogeneous. In other words, in any given column of a table,
all items are of the same kind.
2) Each item is a simple number or a character string. That is, a table must be in
first normal form.
3) All rows of a table are distinct.
4) The ordering of rows within a table is immaterial.
5) The columns of a table are assigned distinct names, and the ordering of these
columns is immaterial.
39. Domains, attributes, tuples, and relations:
Tuple: Each row in a table represents a record and is called a tuple. A row of a
table with ‘n’ attributes is called an n-tuple.
Attributes: The name of each column in a table is used to interpret its meaning
and is called an attribute. Each table is called a relation. In an account table,
for example, account_number, branch name, and balance are the attributes.
Domain: A domain is the set of values that can be given to an attribute, so
every attribute in a table has a specific domain. Values outside its domain
cannot be assigned to an attribute.
Relation: A relation consists of 1) a relation schema and 2) a relation instance.
40. Keys:
Super key: A super key is an attribute or a set of attributes that uniquely
identifies the records in a relation. For example, customer-id, (cname, customer-id),
and (cname, telno) are super keys.
Candidate key: A super key of a relation can contain extra attributes. Candidate
keys are minimal super keys, i.e., keys that contain no extraneous attribute.
An attribute is called extraneous if, after it is removed from the key,
the remaining attributes still have the properties of a key (that is, they still
identify every row of the table).
Primary key: The primary key is the candidate key chosen by the
database designer as the principal means of identifying entities within an
entity set. The remaining candidate keys, if any, are called alternate keys.
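The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is plain Python; the functions is_super_key and candidate_keys and the three customer rows are hypothetical. One caveat: a key is really a property of the schema (it must hold for every legal instance), so testing a single instance can only show that an attribute set is not a key, never prove that it is.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys of this particular relation instance."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it has an extraneous attribute
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found


customers = [
    {"customer_id": 1, "cname": "Anil", "telno": "555-0101"},
    {"customer_id": 2, "cname": "Anil", "telno": "555-0102"},
    {"customer_id": 3, "cname": "Bina", "telno": "555-0101"},
]

keys = candidate_keys(customers, ["customer_id", "cname", "telno"])
# (cname, customer_id) identifies every row but is not minimal,
# because customer_id alone already does.
pair_is_super_key = is_super_key(customers, ("cname", "customer_id"))
```

On this sample, customer_id and (cname, telno) are the candidate keys, and the designer would pick one of them as the primary key.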
41. Integrity Constraints:
Integrity constraints are the rules that the data in a table's columns
must follow. They restrict the types of information
that can be entered into a table, which helps ensure that the data in the
database is accurate and reliable. Integrity constraints may be
applied at the column level or the table level. Table-level
constraints apply to the entire table, while column-level
constraints apply to only one column. When authorized users
make changes to the database, integrity constraints ensure that
the data remains consistent.
43. 1) Domain Integrity Constraint
A domain integrity constraint is a set of rules that restricts the kind of
attributes or values a column or relation can hold in the database table.
For example, we can specify if a particular column can hold null values
or not, if the values have to be unique or not, the data type or size of
values that can be entered in the column, the default values for the
column, etc.
44. For example, we want to create a “customer_details” table, with
information such as customer id, customer name, the number of
items purchased, date of purchase, etc. So, in order to ensure
domain integrity, we can specify the customer_id has to be unique,
the quantity of items purchased has to be an integer number only
and the date of purchase has to be a date or timestamp, etc.
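The customer_details example can be written down as column constraints. This is a sketch, assuming Python's standard sqlite3 module; the exact column names and the helper try_insert are hypothetical. SQLite has flexible typing and no dedicated DATE type, so the integer and date rules are expressed here as CHECK constraints, with the date rule reduced to a simple YYYY-MM-DD pattern check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer_details (
    customer_id   INTEGER NOT NULL UNIQUE,
    customer_name TEXT    NOT NULL,
    quantity      INTEGER NOT NULL DEFAULT 1
                  CHECK (typeof(quantity) = 'integer' AND quantity > 0),
    purchase_date TEXT    NOT NULL
                  CHECK (purchase_date GLOB
                         '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')
)
""")


def try_insert(values):
    """Return 'accepted' or 'rejected' depending on the domain constraints."""
    try:
        conn.execute(
            "INSERT INTO customer_details "
            "(customer_id, customer_name, quantity, purchase_date) "
            "VALUES (?, ?, ?, ?)",
            values,
        )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


valid = try_insert((1, "Asha", 3, "2024-01-15"))
bad_quantity = try_insert((2, "Ravi", "many", "2024-01-16"))   # not an integer
bad_date = try_insert((3, "Mina", 1, "not-a-date"))            # wrong format
duplicate_id = try_insert((1, "Omar", 2, "2024-01-17"))        # id already used
missing_name = try_insert((4, None, 1, "2024-01-18"))          # NULL not allowed
```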
45. 2) Entity Integrity Constraint
The entity integrity constraint is used to ensure the uniqueness of each record or
row in the data table. Two constraints primarily help ensure the uniqueness of
each row: the UNIQUE constraint and the PRIMARY KEY constraint. A unique key
helps in uniquely identifying a record in the data table. It is similar to the
primary key in that both guarantee the uniqueness of a record. But unlike the
primary key, a unique key can accept NULL values, and a table can have more than
one unique key.
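The behaviour of the two constraints can be compared side by side. This is a sketch, assuming Python's standard sqlite3 module and a hypothetical customer table. How many NULLs a UNIQUE column accepts differs between database products; SQLite, used here, treats NULLs as distinct and accepts several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT UNIQUE
)
""")


def try_insert(customer_id, email):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, email))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(1, "a@example.com")
duplicate_pk = try_insert(1, "b@example.com")     # primary key value repeated
duplicate_email = try_insert(2, "a@example.com")  # unique value repeated
null_email_1 = try_insert(3, None)                # NULL allowed in a UNIQUE column
null_email_2 = try_insert(4, None)                # and, in SQLite, more than once
```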
46. 3) Referential Integrity Constraint
The referential integrity constraint ensures that there always exists a valid
relationship between two tables. It makes sure that if a foreign key exists in
one table, its value either references a corresponding value in the second
table or is null.
For example, suppose we create a “Department” table and then an “Employees”
table in which the “department” attribute references “Department_ID” in the
former table.
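The Department and Employees relationship described above can be sketched as follows, assuming Python's standard sqlite3 module. Columns other than department and Department_ID, and the helper add_employee, are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute(
    "CREATE TABLE Department (Department_ID INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
CREATE TABLE Employees (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    department  INTEGER REFERENCES Department(Department_ID)
)
""")
conn.execute("INSERT INTO Department VALUES (1, 'Accounts')")


def add_employee(employee_id, name, department):
    """Return True if the row satisfies referential integrity."""
    try:
        conn.execute(
            "INSERT INTO Employees VALUES (?, ?, ?)",
            (employee_id, name, department),
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid_reference = add_employee(1, "Asha", 1)     # department 1 exists
null_reference = add_employee(2, "Ravi", None)   # NULL is permitted
dangling_reference = add_employee(3, "Mina", 99) # no such department

# Deleting a department that is still referenced is refused as well.
try:
    conn.execute("DELETE FROM Department WHERE Department_ID = 1")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False
```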
47. 4) Key Constraint
There are a number of key constraints in SQL that ensure that an entity or
record is uniquely identified in the database. A table can have more
than one key, but it can have only one primary key.
Some of the key constraints in SQL are :
1. Primary Key Constraint
2. Foreign Key Constraint
3. Unique Key Constraint
48. Data Manipulation operations
DBMS provides a set of operations or a language called data manipulation
language (DML) for modification of the data.
Data manipulation can be performed either by typing SQL statements or by
using a graphical interface, typically called Query-By-Example (QBE).
49. The main Data Manipulation operations are-
● Select: For Retrieval
● Insert into: For insertion
● Delete: For Deletion
● Update: For modification
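The four operations can be run in sequence on a small table. This is a sketch, assuming Python's standard sqlite3 module and a hypothetical book table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (bookcode INTEGER PRIMARY KEY, booktitle TEXT, price REAL)"
)

# Insert: add new rows.
conn.execute(
    "INSERT INTO book (bookcode, booktitle, price) VALUES (1, 'Databases', 450.0)"
)
conn.execute(
    "INSERT INTO book (bookcode, booktitle, price) VALUES (2, 'Networks', 380.0)"
)

# Update: modify the rows that match the condition.
conn.execute("UPDATE book SET price = 400.0 WHERE bookcode = 2")

# Delete: remove the rows that match the condition.
deleted = conn.execute("DELETE FROM book WHERE bookcode = 1").rowcount

# Select: retrieve what is left.
remaining = conn.execute("SELECT bookcode, booktitle, price FROM book").fetchall()
```

Without the WHERE clause, the UPDATE and DELETE statements would apply to every row of the table.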
50. Select
A query in SQL can consist of up to six clauses, but only the first two, SELECT
and FROM, are mandatory. The clauses are specified in the following order.
SELECT: The SELECT clause specifies the attributes or functions to be retrieved
from the database.
FROM: The FROM clause specifies all relations needed in the query.
WHERE: The WHERE clause specifies the conditions for selection and joining of
tuples from the relations specified in the FROM clause.
GROUP BY: GROUP BY specifies grouping attributes.
HAVING: HAVING specifies a condition for selection of groups.
ORDER BY: ORDER BY specifies an order for displaying the result of a query.
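A single query using all six clauses in the order listed might look like the sketch below. It assumes Python's standard sqlite3 module; the account table and its five rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number INTEGER PRIMARY KEY, branch_name TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        (101, "Downtown", 500),
        (102, "Downtown", 700),
        (103, "Uptown", 900),
        (104, "Uptown", 50),
        (105, "Midtown", 300),
    ],
)

rows = conn.execute("""
    SELECT branch_name, COUNT(*) AS accounts, SUM(balance) AS total
    FROM account
    WHERE balance >= 100          -- filters individual rows
    GROUP BY branch_name          -- forms one group per branch
    HAVING SUM(balance) > 400     -- filters whole groups
    ORDER BY total DESC           -- orders the final result
""").fetchall()
```

WHERE removes account 104 before grouping, and HAVING then removes the Midtown group, whose total is only 300.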
52. Insert
Insert statement is used to insert data into database tables. In its simplest form,
it is used to add one or more tuples to a relation.
Attribute values should be listed in the same order as when the attributes were
specified in the CREATE TABLE command.
General syntax-
INSERT INTO <TABLE NAME>(<COLUMNS TO INSERT>) VALUES (<VALUES TO INSERT>)
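The general syntax can be exercised with and without a column list. This is a sketch, assuming Python's standard sqlite3 module; the customer table, its default value, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customerid INTEGER PRIMARY KEY,"
    " customername TEXT NOT NULL,"
    " custaddress TEXT DEFAULT 'unknown')"
)

# With a column list, values follow the order of the list, and omitted
# columns receive their default value.
conn.execute(
    "INSERT INTO customer (customername, customerid) VALUES (?, ?)",
    ("Asha", 1),
)

# Without a column list, values must follow the CREATE TABLE column order.
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (2, "Ravi", "12 Lake Road"))

# Several tuples can be added by repeating the statement with new values.
conn.executemany(
    "INSERT INTO customer (customerid, customername, custaddress) VALUES (?, ?, ?)",
    [(3, "Mina", "4 Hill St"), (4, "Omar", "9 Park Ave")],
)

rows = conn.execute(
    "SELECT customerid, customername, custaddress FROM customer ORDER BY customerid"
).fetchall()
```

Naming the columns explicitly keeps the statement valid even if columns are later added to the table.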
53. Delete
The DELETE command deletes records from a database table according to the
given conditions.
General syntax-
To delete all records from the table-