2. DBMS: A DBMS makes it possible for end users to create, read,
update and delete data in a database. The DBMS essentially serves
as an interface between the database and end users or application
programs, ensuring that data is consistently organized and remains
easily accessible.
•The DBMS manages three important things: the data, the
database engine that allows data to be accessed, locked and
modified, and the database schema, which defines the database’s
logical structure.
•These three foundational elements help provide concurrency,
security, data integrity and uniform administration procedures.
• Typical database administration tasks supported by the DBMS
include change management, performance monitoring/tuning
and backup and recovery.
•Many database management systems are also responsible for
automated rollbacks, restarts and recovery as well as
the logging and auditing of activity.
3. •The DBMS is perhaps most useful for providing a centralized view
of data that can be accessed by multiple users, from multiple
locations, in a controlled manner.
•A DBMS can limit what data the end user sees, as well as how that
end user can view the data, providing many views of a single
database schema.
•End users and software programs are free from having to
understand where the data is physically located or on what type of
storage media it resides because the DBMS handles all requests.
•The DBMS can offer both logical and physical data independence.
That means it can protect users and applications from needing to
know where data is stored or having to be concerned about
changes to the physical structure of data (storage and hardware).
•As long as programs use the application programming interface
(API) for the database that is provided by the DBMS, developers
won't have to modify programs just because changes have been
made to the database.
4.
5. Popular types of DBMS:
Popular database models and their management systems include:
Relational database management system (RDBMS) - adaptable to
most use cases, but Tier-1 RDBMS products can be quite expensive.
NoSQL DBMS - well-suited for loosely defined data structures that
may evolve over time.
In-memory database management system (IMDBMS) - provides
faster response times and better performance.
Columnar database management system (CDBMS) - well-suited
for data warehouses that have a large number of similar data items.
Cloud-based data management system - the cloud service provider
is responsible for providing and maintaining the DBMS.
6. Advantages of a DBMS:
Using a DBMS to store and manage data comes with
advantages, but also overhead. One of the biggest advantages
of using a DBMS is that it lets end users and application
programmers access and use the same data while managing
data integrity. Data is better protected and maintained when it
can be shared using a DBMS instead of creating new iterations
of the same data stored in new files for every new application.
The DBMS provides a central store of data that can be
accessed by multiple users in a controlled manner.
Central storage and management of data within the DBMS
provides:
•Data abstraction and independence
•Data security
•A locking mechanism for concurrent access
7. •An efficient handler to balance the needs of multiple
applications using the same data
•The ability to swiftly recover from crashes and errors,
including restartability and recoverability
•Robust data integrity capabilities
•Logging and auditing of activity
•Simple access using a standard application
programming interface (API)
•Uniform administration procedures for data
•Another advantage of a DBMS is that it can be used to
impose a logical, structured organization on the data. A
DBMS delivers economy of scale for processing large
amounts of data because it is optimized for such
operations.
8. Characteristics of DBMS:
•Atomicity requires that each transaction be "all or nothing": if one
part of the transaction fails, then the entire transaction fails, and the
database state is left unchanged. An atomic system must guarantee
atomicity in each and every situation, including power failures, errors
and crashes. To the outside world, a committed transaction appears
(by its effects on the database) to be indivisible ("atomic"), and an
aborted transaction does not happen.
•The consistency property ensures that any transaction will bring the
database from one valid state to another. Any data written to the
database must be valid according to all defined rules,
including constraints, cascades, triggers, and any combination
thereof. This does not guarantee correctness of the transaction in all
ways the application programmer might have wanted (that is the
responsibility of application-level code), but merely that any
programming errors cannot result in the violation of any defined
rules.
9. •The isolation property ensures that the concurrent execution of
transactions results in a system state that would be obtained if
transactions were executed sequentially, i.e., one after the other.
Providing isolation is the main goal of concurrency control.
Depending on the concurrency control method (i.e., if it uses strict
- as opposed to relaxed - serializability), the effects of an
incomplete transaction might not even be visible to another
transaction.
•The durability property ensures that once a transaction has been
committed, it will remain so, even in the event of power
loss, crashes, or errors. In a relational database, for instance, once a
group of SQL statements execute, the results need to be stored
permanently (even if the database crashes immediately
thereafter). To defend against power loss, transactions (or their
effects) must be recorded in a non-volatile memory.
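The atomicity and consistency properties above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, its CHECK rule and the `transfer` helper are hypothetical names chosen for this illustration only; they do not come from the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, then debit, so a failed debit must undo the credit.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # A defined rule (the CHECK constraint) was violated.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # debit breaks the CHECK, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the credit succeeds but the debit would make a balance negative, so the whole transaction is rolled back and the balances stay as they were after the first transfer.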
10. •Hierarchical Database Management System:
•A hierarchical database model is a data model in which the data is organized
into a tree-like structure. The data is stored as records which are connected to
one another through links. A record is a collection of fields, with each field
containing only one value. The entity type of a record defines which fields the
record contains.
•Example of a hierarchical model:
•A record in the hierarchical database model corresponds to a row in
the relational database model and an entity type corresponds to a table.
• The hierarchical database model mandates that each child record has only one
parent, whereas each parent record can have one or more child records.
•In order to retrieve data from a hierarchical database the whole tree needs to
be traversed starting from the root node. This model is recognized as the first
database model created by IBM in the 1960s.
•The Hierarchical Data Model is a way of organizing a database with multiple
one-to-many relationships. The structure is based on the rule that one parent can
have many children but each child is allowed only one parent. This structure
allows information to be repeated through the parent-child relations. It was
created by IBM and was implemented mainly in their Information Management
System (IMS).
11. •The model allows easy addition and deletion of new information.
Data at the top of the Hierarchy is very fast to access. It was very
easy to work with the model because it worked well with linear type
data storage such as tapes.
•The model relates very well to natural hierarchies such as assembly
plants and employee organization in corporations. It relates well to
anything that works through a one-to-many relationship.
•For example, there is a president with many managers below them,
and those managers have many employees below them, but each
employee has only one manager.
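The president/manager/employee hierarchy can be sketched in a few lines of Python. The `Record` class and the helper functions are hypothetical names used only for this illustration; the point is that each record keeps exactly one parent and that lookups start from the root.

```python
class Record:
    """A node in a hierarchical model: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # at most one parent per record
        self.children = []
        if parent is not None:
            parent.children.append(self)

def find(root, name):
    """Depth-first search that starts at the root, as the model requires."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None

def path_from_root(record):
    """Follow the single parent link upward, then reverse the names."""
    names = []
    while record is not None:
        names.append(record.name)
        record = record.parent
    return list(reversed(names))

president = Record("President")
manager_a = Record("Manager A", president)
manager_b = Record("Manager B", president)
Record("Employee 1", manager_a)
Record("Employee 2", manager_b)

employee = find(president, "Employee 2")
print(path_from_root(employee))
```

Because every record has a single parent, the path from the root to any record is unique, which is what makes the model a tree.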
13. •A DBMS is said to be a hierarchical DBMS when the data is
organized in a tree structure, that is, it represents the data
using parent-child relationships. It is the oldest style of
organizing and storing data, and some organizations still use it.
•It follows a one-to-many relationship: a parent can have more
than one child, but each child has only one parent.
•All records are organized according to the relations between them.
The parent record at the top of the tree structure is called the root
record. This parent record is linked to all its related child
records, and each child record is linked to its parent. This type
of DBMS has a drawback: adding a new field or record may, in some
cases, require redefining the entire database.
•An XML document is a familiar example of hierarchically organized
data.
14. NDBMS - Network Database Management System:
•Network database: Network databases are mainly used on large
digital computers. Because more connections can be made between
different types of data, network databases are considered more
efficient. They also have limitations that must be considered when
using this kind of database.
•Network databases are similar to hierarchical databases in that they
also have a hierarchical structure.
•A network database looks more like a cobweb or interconnected
network of records.
•In network databases, children are called members and parents are
called owners.
•The difference is that each child or member can have more than one
parent. The popularity of the network data model coincided with the
popularity of the hierarchical data model. Some data were more
naturally modeled with more than one parent per child.
•The network model allowed the modeling of many-to-many
relationships in data.
15. •The network model is very similar to the hierarchical model.
•Actually the hierarchical model is a subset of the network model. However, instead of
using a single-parent tree hierarchy, the network model uses set theory to provide a
tree-like hierarchy with the exception that child tables were allowed to have more than
one parent.
•It supports many-to-many relationships.
There are some differences between a hierarchical DBMS and a network DBMS:
•In a hierarchical DBMS, a child can have only one parent. In a network DBMS, a
child can have more than one.
•Unlike a hierarchical DBMS, a network DBMS does not necessarily follow a downward
tree structure. In some cases it may follow an upward tree structure.
16. •ABC College has two children, Department A and College library.
This represents a one-to-many relationship.
•Even though there is no relation between Department A and
College library, a student can be a member of both Department A
and College library. This represents a many-to-one relationship.
•So in the above example the student has two parents, which tells
us this is the network DBMS model. It is a simple and good
example of a network DBMS.
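The ABC College example can be sketched with two owner/member maps in Python. The `NetworkDB` class and its `link` method are hypothetical names for illustration; real network DBMSs maintain these links with pointers between records.

```python
from collections import defaultdict

class NetworkDB:
    """Owner/member links in which a member may have several owners."""

    def __init__(self):
        self.members = defaultdict(set)  # owner  -> its member records
        self.owners = defaultdict(set)   # member -> its owner records

    def link(self, owner, member):
        self.members[owner].add(member)
        self.owners[member].add(owner)

db = NetworkDB()
db.link("ABC College", "Department A")
db.link("ABC College", "College library")
db.link("Department A", "Student")
db.link("College library", "Student")

college_children = sorted(db.members["ABC College"])
student_parents = sorted(db.owners["Student"])
print(college_children, student_parents)
```

The student record ends up with two owners, which a strict tree (hierarchical) structure could not represent.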
17. •The network model is a database model conceived as a
flexible way of representing objects and their relationships.
•Its distinguishing feature is that the schema, viewed as a
graph in which object types are nodes and relationship types
are arcs, is not restricted to being a hierarchy or lattice.
•The network model replaces the hierarchical model with a
graph thus allowing more general connections among the
nodes.
•The main difference between the network model and the
hierarchical model is its ability to handle many-to-many
relationships. In other words, it allows a record to have more
than one parent.
18. ADVANTAGES OF NETWORK MODEL:
The major advantages of the network model are:
1.) Conceptual simplicity - Just like the hierarchical model, the
network model is also conceptually simple and easy to design.
2.) Capability to handle more relationship types - The network
model can handle one-to-many and many-to-many
relationships, which is a real help in modeling real-life
situations.
3.) Ease of data access - Data access is easier and more flexible
than in the hierarchical model.
4.) Data integrity- The network model does not allow a member
to exist without an owner.
5.) Data independence- The network model is better than the
hierarchical model in isolating the programs from the complex
physical storage details.
6.) Database standards
19. DISADVANTAGES OF NETWORK MODEL:
1.) System complexity - All the records are maintained using
pointers, and hence the whole database structure becomes very
complex.
2.) Operational anomalies - The insertion, deletion and updating
of any record require a large number of pointer
adjustments.
3.) Absence of structural independence - Structural changes to the
database are very difficult.
20. Relational database management system:
•A relational database management system (RDBMS) is a program
that lets you create, update, and administer a relational database.
•Most commercial RDBMSs use the Structured Query Language
(SQL) to access the database, although SQL was invented after the
development of the relational model and is not necessary for its
use.
•The leading RDBMS products are Oracle, IBM's DB2 and
Microsoft's SQL Server. Despite repeated challenges by competing
technologies, as well as the claim by some experts that no current
RDBMS has fully implemented relational principles, the majority of
new corporate databases are still being created and managed with
an RDBMS.
21. Relational database management system:
•A relational database management system (RDBMS) is a database
management system (DBMS) based on the relational model invented by Edgar F.
Codd at IBM's San Jose Research Laboratory. Most databases in
widespread use today are based on his relational database model.
•RDBMSs have been a common choice for the storage of information in
databases used for financial records, manufacturing and logistical information,
personnel data, and other applications since the 1980s.
•Relational databases have often replaced legacy hierarchical
databases and network databases because they were easier to implement and
administer. Nonetheless, relational databases received continued, unsuccessful
challenges from object database management systems in the 1980s and 1990s
(which were introduced in an attempt to address the so-called object-relational
impedance mismatch between relational databases and object-oriented
application programs), as well as from XML database management systems in the
1990s.
•However, due to the expansion of technologies such as horizontal
scaling of computer clusters, NoSQL databases have recently begun to peck
away at the market share of RDBMSs.
22. The data in an RDBMS is stored in database objects called
tables. A table is a collection of related data entries
and consists of columns and rows.
Remember, a table is the most common and simplest form of data
storage in a relational database. The following is an
example of a CUSTOMERS table:
+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
|  7 | Muffy    |  24 | Indore    | 10000.00 |
+----+----------+-----+-----------+----------+
23. •Every table is broken up into smaller entities called fields. The
fields in the CUSTOMERS table consist of ID, NAME, AGE, ADDRESS
and SALARY.
•A field is a column in a table that is designed to maintain specific
information about every record in the table.
•A record, also called a row of data, is each individual entry
that exists in a table. For example, there are 7 records in the above
CUSTOMERS table. The following is a single row of data, or record, in
the CUSTOMERS table:
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
•A column is a vertical entity in a table that contains all
information associated with a specific field in a table.
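The terms table, field, record and column can be tried out on the CUSTOMERS data with a minimal sketch using Python's built-in sqlite3 module. The column types below are assumptions, since the slides show only the values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for illustration; the slides list only the values.
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    " ID INTEGER PRIMARY KEY,"
    " NAME TEXT NOT NULL,"
    " AGE INTEGER NOT NULL,"
    " ADDRESS TEXT,"
    " SALARY REAL)")
rows = [
    (1, "Ramesh", 32, "Ahmedabad", 2000.00),
    (2, "Khilan", 25, "Delhi", 1500.00),
    (3, "kaushik", 23, "Kota", 2000.00),
    (4, "Chaitali", 25, "Mumbai", 6500.00),
    (5, "Hardik", 27, "Bhopal", 8500.00),
    (6, "Komal", 22, "MP", 4500.00),
    (7, "Muffy", 24, "Indore", 10000.00),
]
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# Fields: the names of the columns defined for the table.
fields = [col[1] for col in conn.execute("PRAGMA table_info(CUSTOMERS)")]
# Records: one entry per row.
record_count = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
first_record = conn.execute(
    "SELECT * FROM CUSTOMERS WHERE ID = 1").fetchone()
# Column: every value stored for a single field.
age_column = [r[0] for r in conn.execute(
    "SELECT AGE FROM CUSTOMERS ORDER BY ID")]
print(fields, record_count, first_record, age_column)
```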
24.
25. Object-Oriented Database Management System (OODBMS):
•An object-oriented database management system (OODBMS) is a database
management system that supports the creation and modeling of data as
objects.
•OODBMS also includes support for classes of objects and the inheritance of
class properties, and incorporates methods, subclasses and their objects.
•Most of the object databases also offer some kind of query language,
permitting objects to be found through a declarative programming approach.
Also called an object database management system (ODMS).
•An object-oriented database management system represents information in
the form of objects as used in object-oriented programming.
•OODBMS allows object-oriented programmers to develop products, store
them as objects and replicate or modify existing objects to produce new ones
within OODBMS.
•OODBMS allows programmers to enjoy the consistency that comes with one
programming environment because the database is integrated with the
programming language and uses the same representation model. Certain
object-oriented databases are designed to work with object-oriented
programming languages such as Delphi, Python, Java, Perl, Objective C and
Visual Basic .NET.
26.
27. •Object-oriented modeling (OOM) is the construction
of objects using a collection of objects that contain stored values
of the instance variables found within an object.
Unlike models that are record-oriented, object-oriented values
are solely objects.
•There is currently no widely agreed-upon standard for what
constitutes an OODBMS, and OODBMS products are considered to
be still in their infancy. In the meantime, the object-relational
database management system (ORDBMS), the idea that object-
oriented database concepts can be superimposed on relational
databases, is more commonly encountered in available products.
• An object-oriented database interface standard is being
developed by an industry group, the Object Data Management
Group (ODMG). The Object Management Group (OMG) has
already standardized an object-oriented data brokering interface
between systems in a network.
28. •DBMS stands for database management system. The most popular
DBMSs are relational database management systems, in which we
store everything as relations between entities. Entities are
represented as tables.
•E.g., Customer entity data is stored in the CUSTOMER table and Order
entity data is stored in the ORDER table. We then establish a
relation between the CUSTOMER and ORDER tables by using a foreign
key.
•OODBMS stands for Object Oriented Database Management
System. In an OODBMS we store data in Object form. One Object
can be composed of more Objects. An Object can inherit another
Object. We use OODBMS with Object Oriented Programming
languages.
•In Java programming, for example, the Hibernate framework helps
us map the object domain of our program to a relational
database.
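The CUSTOMER/ORDER relation described on this slide can be sketched with Python's built-in sqlite3 module. The column names are hypothetical, and the order table is named ORDERS here because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE CUSTOMER (ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE ORDERS ("
    " ID INTEGER PRIMARY KEY,"
    " CUSTOMER_ID INTEGER NOT NULL REFERENCES CUSTOMER(ID),"  # the foreign key
    " ITEM TEXT NOT NULL)")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Ramesh')")
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?)",
    [(10, 1, "keyboard"), (11, 1, "monitor")])
conn.commit()

# The foreign key lets us relate each order back to its customer.
joined = conn.execute(
    "SELECT c.NAME, o.ITEM FROM CUSTOMER c"
    " JOIN ORDERS o ON o.CUSTOMER_ID = c.ID ORDER BY o.ID").fetchall()

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO ORDERS VALUES (12, 99, 'mouse')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(joined, orphan_rejected)
```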
29. •OODBMS is a DBMS which allows information to be
represented in the form of objects as used in object-oriented
programming.
•An RDBMS is based on the relational model, and data in an RDBMS is
stored in the form of related tables.
•OODBMSs use object-oriented model while the RDBMSs use the
relational model.
•OODBMS can store/ access complex data more efficiently than
RDBMS. But learning OODBMS can be complex due to the
object-oriented technology, compared to learning RDBMS.
30. Below is a list of advantages and disadvantages of using an OODBMS over an RDBMS:
31. Query Processing:
It is the step-by-step process of translating a query written in a high-level
language into a low-level form that the machine can understand, and then
performing the requested action for the user. The query processor in the DBMS
performs this task.
32. •Above diagram depicts how a query is processed in the database
to show the result. When a query is submitted to the database, it
is received by the query compiler. It then scans the query and
divides it into individual tokens.
•Once the tokens are generated, they are verified for their
correctness by the parser. Then the tokenized queries are
transformed into different possible relational expressions,
relational trees and relational graphs (Query Plans).
•The query optimizer then evaluates these plans to identify the best
query plan to process. It checks the system catalog for constraints
and indexes and generates different execution plans for the
query.
•The best, most optimized execution plan is then chosen for
execution.
•The command processor then uses this execution plan to retrieve
the data from the database and returns the result. This is an
overview of how query processing works.
33. There are four phases in typical query processing:
•Parsing and Translation
•Query Optimization
•Evaluation or query code generation
•Execution in DB’s runtime processor.
Parsing and Translation:
•This is the first step of any query processing. The user typically
writes his requests in SQL language. In order to process and
execute this request, DBMS has to convert it into low level –
machine understandable language.
•Any query issued to the database is first picked by query
processor. It scans and parses the query into individual tokens and
examines for the correctness of query. It checks for the validity of
tables / views used and the syntax of the query.
•Once the query passes these checks, each token is converted into
relational expressions, trees and graphs. These are easily processed
by the other components of the DBMS.
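The scanning step described above, in which a query is split into individual tokens, can be sketched as follows. This is a deliberately tiny tokenizer written for illustration, not how any particular DBMS implements its scanner; the token kinds and the keyword list are assumptions.

```python
import re

# One alternative per token kind; leading whitespace is skipped.
TOKEN = re.compile(
    r"\s*(?:(?P<string>'[^']*')"
    r"|(?P<number>\d+(?:\.\d+)?)"
    r"|(?P<word>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<symbol><=|>=|<>|[=<>.,;*()]))")
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}  # small illustrative subset

def tokenize(sql):
    """Split a query into (kind, text) tokens or raise on an invalid character."""
    sql = sql.rstrip()
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "word":
            kind = "keyword" if text.upper() in KEYWORDS else "identifier"
        tokens.append((kind, text))
        pos = match.end()
    return tokens

tokens = tokenize("SELECT STD_NAME FROM STUDENT WHERE CLASS_ID = 1;")
print(tokens)
```

A real parser would go on to check the token sequence against the SQL grammar and the catalog (table and view names) before building relational expressions.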
34. •Let us try to understand these steps using an example. Suppose
a user wants to see the details of the students who are studying in
the DESIGN_01 class. If the user says ‘Retrieve Student details who are
in DESIGN_01 class’, the DBMS will not understand.
•Hence the DBMS provides a language, SQL, which both the user and the
DBMS can understand and use to communicate with each other. SQL is
written in a simple English-like form. So the user would write the
request in SQL as below:
SELECT STD_ID, STD_NAME, ADDRESS, DOB
FROM STUDENT s, CLASS c
WHERE s.CLASS_ID = c.CLASS_ID
AND c.CLASS_NAME = 'DESIGN_01';
When the user issues this query, the DBMS reads it and converts it into
a form which the DBMS can use to further process and synthesize it. This
phase of query processing is known as the parsing and translation
phase.
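To see the query above run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table definitions and sample rows are assumptions made for the example (the slides give only the column names used in the query), and an ORDER BY is added so the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the column names come from the query.
conn.executescript("""
CREATE TABLE CLASS (CLASS_ID INTEGER PRIMARY KEY, CLASS_NAME TEXT NOT NULL);
CREATE TABLE STUDENT (
    STD_ID INTEGER PRIMARY KEY,
    STD_NAME TEXT NOT NULL,
    ADDRESS TEXT,
    DOB TEXT,
    CLASS_ID INTEGER REFERENCES CLASS(CLASS_ID));
INSERT INTO CLASS VALUES (1, 'DESIGN_01'), (2, 'TESTING_01');
INSERT INTO STUDENT VALUES
    (100, 'Asha', 'Pune', '2001-04-12', 1),
    (101, 'Ravi', 'Delhi', '2000-11-30', 2),
    (102, 'Meena', 'Kota', '2001-01-05', 1);
""")

query = """
SELECT STD_ID, STD_NAME, ADDRESS, DOB
FROM STUDENT s, CLASS c
WHERE s.CLASS_ID = c.CLASS_ID
AND c.CLASS_NAME = 'DESIGN_01'
ORDER BY STD_ID
"""
result = conn.execute(query).fetchall()

# Ask SQLite which plan its optimizer chose; the wording differs by version.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
print(result)
print(plan_steps)
```

The EXPLAIN QUERY PLAN output is one way to look at the result of the optimization phase in a real engine.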
35. Structured Query Language (SQL) is a database query language used
for storing and managing data in a relational DBMS. SQL was the first
commercial language introduced for E. F. Codd's relational model of
databases.
Today almost all RDBMSs (MySQL, Oracle, Informix, Sybase, MS
Access) use SQL as the standard database query language. SQL is
used to perform all types of data operations in an RDBMS.
SQL defines following ways to manipulate data stored in an RDBMS.
DDL: Data Definition Language
This includes changes to the structure of the table like creation of
table, altering table, deleting a table etc.
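In many database systems (Oracle and MySQL, for example), DDL commands
are auto-committed, which means the changes are saved permanently in
the database as soon as the command runs. Some systems, such as
PostgreSQL, allow DDL inside a transaction.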
36. Command - Description
create - to create a new table or database
alter - to alter the structure of a table
truncate - to delete all data from a table
drop - to drop a table
rename - to rename a table
37. DML: Data Manipulation Language
DML commands are used for manipulating the data stored in the
table and not the table itself.
DML commands are not auto-committed. This means the changes are
not permanent in the database until committed, and they can be
rolled back.
Command - Description
insert - to insert a new row
update - to update an existing row
delete - to delete a row
merge - to merge two rows or two tables
38. TCL: Transaction Control Language
These commands keep a check on other commands and
their effect on the database. They can annul changes
made by other commands by rolling the data back to its original
state. They can also make any temporary change permanent.
Command - Description
commit - to permanently save
rollback - to undo changes
savepoint - to save temporarily
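The interplay of DML with commit, rollback and savepoint can be sketched with Python's built-in sqlite3 module. The table `t` and its contents are hypothetical. The connection is opened with `isolation_level=None` so that the example issues BEGIN, COMMIT and ROLLBACK itself instead of relying on the module's implicit transaction handling.

```python
import sqlite3

# isolation_level=None: no implicit transactions, we control them explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")  # DDL

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1, 'kept')")    # DML
conn.execute("SAVEPOINT sp1")                       # mark a point to return to
conn.execute("INSERT INTO t VALUES (2, 'undone')")  # DML after the savepoint
conn.execute("ROLLBACK TO sp1")                     # undo only the second insert
conn.execute("COMMIT")                              # make the first insert permanent
after_commit = conn.execute("SELECT id, note FROM t").fetchall()

conn.execute("BEGIN")
conn.execute("DELETE FROM t")                       # DML, not yet permanent
conn.execute("ROLLBACK")                            # annul the delete
after_rollback = conn.execute("SELECT id, note FROM t").fetchall()
print(after_commit, after_rollback)
```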
39. DCL: Data Control Language
Data control language consists of the commands used to grant and take
back authority from any database user.
Command - Description
grant - to grant a permission or right
revoke - to take back a permission
DQL: Data Query Language
Data query language is used to fetch data from tables based on
conditions that we can easily apply.
Command - Description
select - to retrieve records from one or more tables
40. DBMS - Concurrency Control
In a multiprogramming environment where multiple transactions can be
executed simultaneously, it is highly important to control the concurrency
of transactions. We have concurrency control protocols to ensure atomicity,
isolation, and serializability of concurrent transactions. Concurrency control
protocols can be broadly divided into two categories:
•Lock based protocols
•Time stamp based protocols
Lock-based Protocols:
Database systems equipped with lock-based protocols use a mechanism by
which any transaction cannot read or write data until it acquires an
appropriate lock on it. Locks are of two kinds −
Binary Locks − A lock on a data item can be in two states; it is either locked
or unlocked.
Shared/exclusive − This type of locking mechanism differentiates the locks
based on their uses. If a lock is acquired on a data item to perform a write
operation, it is an exclusive lock. Allowing more than one transaction to
write on the same data item would lead the database into an inconsistent
state. Read locks are shared because no data value is being changed.
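The shared/exclusive rule can be sketched as a small lock table in Python. The `LockManager` class is a hypothetical, single-threaded illustration of the compatibility rule only: it reports whether a lock would be granted and leaves out waiting queues, deadlock handling and real concurrency.

```python
class LockManager:
    """Tracks shared (S) and exclusive (X) locks per data item."""

    def __init__(self):
        self._locks = {}  # item -> (mode, set of transactions holding it)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # any number of readers may share the item
            return True
        if holders == {txn}:
            # The sole holder may re-acquire, or upgrade from S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[item] = (new_mode, holders)
            return True
        return False  # a writer is involved and someone else holds the item

    def release(self, txn, item):
        held = self._locks.get(item)
        if held is None:
            return
        _, holders = held
        holders.discard(txn)
        if not holders:
            del self._locks[item]

lm = LockManager()
r1 = lm.acquire("T1", "A", "S")  # first reader
r2 = lm.acquire("T2", "A", "S")  # second reader shares the lock
r3 = lm.acquire("T3", "A", "X")  # writer must wait while readers hold A
lm.release("T1", "A")
lm.release("T2", "A")
r4 = lm.acquire("T3", "A", "X")  # writer gets the lock once A is free
r5 = lm.acquire("T1", "A", "S")  # reader must wait while the writer holds A
print(r1, r2, r3, r4, r5)
```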
41. Timestamp-based Protocols:
•A commonly used concurrency protocol is the timestamp-based
protocol. This protocol uses either system time or a logical
counter as a timestamp.
•Lock-based protocols manage the order between the conflicting
pairs among transactions at the time of execution, whereas
timestamp-based protocols start working as soon as a transaction
is created.
•Every transaction has a timestamp associated with it, and the
ordering is determined by the age of the transaction. A transaction
created at 0002 clock time would be older than all other
transactions that come after it. For example, any transaction 'y'
entering the system at 0004 is two seconds younger and the
priority would be given to the older one.
•In addition, every data item is given the latest read and write-
timestamp. This lets the system know when the last ‘read and
write’ operation was performed on the data item.
42. Timestamp Ordering Protocol:
The timestamp-ordering protocol ensures serializability among
transactions in their conflicting read and write operations. It is
the responsibility of the protocol that each conflicting pair of
operations is executed according to the timestamp values of the
transactions.
The timestamp of transaction Ti is denoted as TS(Ti).
Read time-stamp of data-item X is denoted by R-timestamp(X).
Write time-stamp of data-item X is denoted by W-timestamp(X).
Timestamp ordering protocol works as follows −
If a transaction Ti issues a read(X) operation −
    If TS(Ti) < W-timestamp(X)
        Operation rejected.
    If TS(Ti) >= W-timestamp(X)
        Operation executed.
        All data-item timestamps updated.
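The read rule above can be sketched in Python as follows. The slide gives only the read(X) case; the `write` function below follows the usual basic timestamp-ordering rule (reject when TS(Ti) is older than either timestamp on X) and is included as an assumption for completeness. The class and function names are hypothetical.

```python
class DataItem:
    """A data item X carrying R-timestamp(X) and W-timestamp(X)."""

    def __init__(self):
        self.r_ts = 0  # timestamp of the youngest transaction that read X
        self.w_ts = 0  # timestamp of the youngest transaction that wrote X

def read(ts, item):
    """read(X) issued by a transaction with timestamp ts."""
    if ts < item.w_ts:
        return False                   # X was already written by a younger transaction
    item.r_ts = max(item.r_ts, ts)     # update the read timestamp
    return True

def write(ts, item):
    """write(X); assumed basic rule, not taken from the slide."""
    if ts < item.r_ts or ts < item.w_ts:
        return False                   # a younger transaction already read or wrote X
    item.w_ts = ts
    return True

x = DataItem()
steps = [
    write(5, x),  # accepted, W-timestamp(X) becomes 5
    read(3, x),   # rejected, 3 < W-timestamp(X)
    read(7, x),   # accepted, R-timestamp(X) becomes 7
    write(6, x),  # rejected, 6 < R-timestamp(X)
    write(8, x),  # accepted, W-timestamp(X) becomes 8
]
print(steps, x.r_ts, x.w_ts)
```

In a full implementation a rejected operation would cause its transaction to be rolled back and restarted with a new timestamp.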
43. Data Warehouse and Data Mart:
Data Warehouse:
•Holds multiple subject areas
•Holds very detailed information
•Works to integrate all data sources
•Does not necessarily use a dimensional model but feeds dimensional models.
Data Mart:
•Often holds only one subject area, for example Finance or Sales
•May hold more summarised data (although many hold full detail)
•Concentrates on integrating information from a given subject area or set of
source systems
•Is built focused on a dimensional model using a star schema.
44. Data warehouses and data marts are both used as data repositories
and serve the same purpose. They can be differentiated by the
quantity of data or information they store.
The vital difference between a data warehouse and a data mart is
that a data warehouse is a database that stores information oriented
to satisfy decision-making requests, whereas a data mart is a
complete logical subset of an entire data warehouse.
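45. Comparison Chart:
+------------------------+----------------------------------------+-------------------------------------+
| BASIS FOR COMPARISON   | DATA WAREHOUSE                         | DATA MART                           |
+------------------------+----------------------------------------+-------------------------------------+
| Basic                  | Data warehouse is application          | Data marts are specific to decision |
|                        | independent.                           | support system application.         |
| Type of system         | Centralised                            | Decentralised                       |
| Form of data           | Detailed                               | Summarized                          |
| Use of denormalisation | The data is slightly denormalised.     | The data is highly denormalised.    |
| Data model             | Top-down                               | Bottom-up                           |
| Nature                 | Flexible, data-oriented and long life. | Restrictive, project-oriented and   |
|                        |                                        | short life.                         |
| Type of schema used    | Fact constellation                     | Star and snowflake                  |
| Ease of building       | Hard to build                          | Simple to build                     |
+------------------------+----------------------------------------+-------------------------------------+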
46. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
collection of data that supports the management decision-making process.
Alternatively, it is a repository of information gathered from multiple sources,
stored in a unified schema at a single site, that allows integration of a variety of
application systems. Once this data is collected it is stored for a long time, and
hence it has a long life and permits access to historic information.
47. •Consequently, data warehouse provides the user with a
single integrated interface to the data through which user
can write decision-support queries easily.
•A data warehouse helps in turning data into
information. Designing a data warehouse uses a
top-down approach.
•It gathers information about subjects that span the entire
organization, such as customers, items, sales, assets and
personnel and therefore its scope is enterprise-wide.
•Generally, fact constellation schema is used in it, which
covers a wide variety of subjects.
•A data warehouse is not a static structure;
it evolves continuously.
48. A data mart can be described as a subset of a data warehouse, or a subset of
corporate-wide data that is of value to a specific group of users. A data warehouse
involves several departmental and logical data marts, which must be consistent in
their data representation to ensure the robustness of the data warehouse. A data
mart is a set of tables that concentrates on a single task. Data marts are designed
using a bottom-up approach.
49. •Data mart scope is confined to some specific selected
subject, thus its scope is department-wide. These are
usually implemented on low-cost departmental servers.
•The implementation cycle of data marts is measured in
weeks instead of months or years.
•The star and snowflake schemas are commonly used in
data marts since both are geared towards single-subject
modeling, although the star schema is more
popular than the snowflake schema.
•Depending on the data source the data marts can be
classified into two types:
Dependent and
Independent data marts.
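As an illustration of the star schema mentioned above, here is a minimal sketch of a single-subject sales data mart using Python's built-in sqlite3 module. All table names, column names and figures are invented for the example: one central fact table references two dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical star schema: fact_sales in the middle, dimensions around it.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 75.0);
""")

# A typical decision-support query: total sales per product and year.
totals = conn.execute("""
SELECT p.name, d.year, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_date d ON d.date_id = f.date_id
GROUP BY p.name, d.year
ORDER BY p.name, d.year
""").fetchall()
print(totals)
```

A snowflake schema would further normalize the dimension tables, for example by splitting a product category out of `dim_product` into its own table.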
50. Key Differences Between Data Warehouse and Data Mart:
1. Data warehouse is application independent whereas data mart is
specific to decision support system application.
2. The data is stored in a single, centralized repository in a data
warehouse. In contrast, a data mart stores data decentrally in the user
area.
3. Data warehouse contains a detailed form of data. In contrast, data
mart contains summarized and selected data.
4. The data in a data warehouse is slightly denormalized, while in a
data mart it is highly denormalized.
5. The construction of data warehouse involves top-down approach.
Conversely, while constructing a data mart the bottom-up approach
is used.
6. A data warehouse is flexible, information-oriented and long-lived.
On the contrary, a data mart is restrictive, project-oriented
and has a shorter existence.
7. Fact constellation schema is usually used for modeling a data
warehouse whereas in data mart star schema is more popular.