2. What is a database?
• A database or database system is a collection
of related data. In its simplest form a database
consists of a collection of records and fields.
Each record contains the same set of fields,
each of which contains one piece of
information.
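The idea of records and fields can be sketched in a few lines of Python. The field names and values below are hypothetical, chosen only for illustration; they do not come from any real system.

```python
# Assumed set of fields that every record must contain.
FIELDS = ("customer_id", "surname", "town")

# Each dictionary is one record; each key is one field.
records = [
    {"customer_id": 101, "surname": "Adams", "town": "Leeds"},
    {"customer_id": 102, "surname": "Baker", "town": "York"},
]


def has_same_fields(record):
    """Check that a record holds exactly the agreed set of fields."""
    return set(record) == set(FIELDS)


print(all(has_same_fields(r) for r in records))
```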
3. Database Management System (DBMS)
Definition: A database management system (DBMS) is, as
its name suggests, the software used to manage a
database system.
• It manages:
– the structure of the individual data files
– the relationships between data items and between data
files
– how the data is interrogated (i.e. how you get information
from the database)
– the properties of the database, i.e. ensuring that all
queries, updates and amendments to structure are
processed reliably.
4. Sequential Files
• In a sequential file, records are stored one after the other, in the order in
which they were added to the storage medium, usually magnetic tape.
To read data from or write data to tape, sequential files must be used.
• There are two ways that records can be arranged in a sequential file. One
way is to have the records in some sort of order using a key field. A key
field is one which is unique to every record, i.e. every record has a
different value in that field. This is called ordered sequential.
5. • Alternatively, the records might be arranged with no
thought given to their order so they appear to be
unordered. Whether the file is ordered or unordered
affects the way in which the data is processed as well as
the type of processing that can be used.
• An unordered sequential file is often referred to as a serial
file, as the only method for retrieving information is to go
through each record one by one.
6. • In an ordered file, the records are put in order
of a key field such as customer ID, as shown
above. In an unordered file, the records are
not in any particular order.
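One way the ordering can affect processing is sketched below. This is only an illustration under assumed data: the customer IDs and the two function names are hypothetical. In the serial file a search for a missing ID must read every record, whereas in the ordered file the search can stop as soon as a larger key is met.

```python
def find_serial(records, wanted_id):
    """Unordered (serial) file: every record may need to be read."""
    reads = 0
    for record in records:
        reads += 1
        if record["customer_id"] == wanted_id:
            return record, reads
    return None, reads


def find_ordered(records, wanted_id):
    """Ordered sequential file: stop once the wanted key has been passed."""
    reads = 0
    for record in records:
        reads += 1
        if record["customer_id"] == wanted_id:
            return record, reads
        if record["customer_id"] > wanted_id:
            break  # later records all have larger keys
    return None, reads


# Assumed sample data: the same five records in two arrangements.
ordered = [{"customer_id": n} for n in (1, 3, 5, 7, 9)]
unordered = [{"customer_id": n} for n in (7, 1, 9, 3, 5)]

print(find_serial(unordered, 4))  # (None, 5): all five records read
print(find_ordered(ordered, 4))   # (None, 3): stops at customer 5
```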
7. Disadvantages to using sequential files
There are a number of disadvantages to using
sequential files:
• The only way to add new records to a sequential
file is to store them at the end of the file.
• A record can only be replaced if the new record is
exactly the same length as the original.
• Records can only be updated if the data item used
to replace the existing data is exactly the same
length.
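The same-length restriction can be illustrated with a small simulation. This is a sketch only: an in-memory byte stream stands in for the storage medium, and the 10-byte record size, the names and the helper functions are all assumptions made for the example.

```python
import io

RECORD_SIZE = 10  # assumed fixed record length in bytes


def pad(name):
    """Pad a name with spaces to the assumed record length."""
    return name.encode("ascii").ljust(RECORD_SIZE)


def replace_record(f, index, new_record):
    """Overwrite record number `index` in place; refuse any other length."""
    if len(new_record) != RECORD_SIZE:
        raise ValueError("replacement must be exactly the same length")
    f.seek(index * RECORD_SIZE)
    f.write(new_record)


def append_record(f, new_record):
    """New records can only go at the end of the file."""
    f.seek(0, io.SEEK_END)
    f.write(new_record)


f = io.BytesIO(pad("ADAMS") + pad("BAKER") + pad("CLARK"))
replace_record(f, 1, pad("BROWN"))  # same length, so allowed
append_record(f, pad("DAVIS"))      # added after the last record
print(f.getvalue())
```

A longer replacement would overwrite the start of the following record, which is why the sketch rejects it.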
8. • The processing of records in a sequential file is slower than
with other types of file.
• In order to process a particular record all the records before
the one you want have to be read in sequence until you get
to the one you want.
• The use of sequential files is recommended only for those
types of application where most or all the records have to
be processed at one time.
• Adding records to the end of the file is fairly
straightforward. However, amending or deleting records is
not so easy.
• If the file is an unordered sequential file, then amending or
deleting records cannot be easily done.
• If it is an ordered sequential file, then the changes can be
made relatively easily, provided the transaction file (which
contains the actions to be carried out on the records) has
been sorted into the same order as the master file, using
the key field.
9. The letter in the Trans. column is the type of transaction. D is a deletion of, C is a
change to and A is an addition of a record.
The computer reads the first record in the transaction file and the first record in
the old master file. If the ID doesn't match, the computer writes the master file
record to the new master file. The next record of the old master file is read and if
it matches, as it does in this example, the computer carries out the transaction.
10. • In this case the record has to be deleted, so instead of
writing this old master file record to the new master file the
computer ignores it and reads the next old master file
record and the next transaction record.
• We are now on the second record of the transaction file
and the third record of the old master file. If they don't
match, the old master file record is written to the new
master file and the next record (the fourth) of the old
master file is read. This carries on until the next old master
file record is found which matches the transaction file
record.
• In this case, the fifth old master file record ID matches the
second transaction record. This requires a change, so data
in the transaction file is written to the new master file (not
the old master file record). This whole procedure carries on
until the transaction type ‘A’ is met. After this, all the
remaining records of the old master file are written
unchanged to the new master file and then the remaining
records of the transaction file are added to the master file.
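The update procedure described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the record IDs, names and the function `update_master` are hypothetical, both files are assumed to be sorted by ID, and additions are assumed to come last in the transaction file. A deletion or change whose ID is missing from the old master file is not handled.

```python
def update_master(old_master, transactions):
    """Build a new master file from the old one plus sorted transactions.

    old_master:   list of (record_id, data), sorted by record_id.
    transactions: list of (kind, record_id, data), sorted by record_id,
                  where kind is "D" (delete), "C" (change) or "A" (add).
    """
    new_master = []
    pending = iter(transactions)
    trans = next(pending, None)

    for record_id, data in old_master:
        # Additions are held back until the old master file is exhausted.
        if trans is not None and trans[0] != "A" and trans[1] == record_id:
            if trans[0] == "C":
                # Change: write the transaction data, not the old record.
                new_master.append((record_id, trans[2]))
            # Delete: write nothing, so the old record is dropped.
            trans = next(pending, None)
        else:
            # No match: copy the old record across unchanged.
            new_master.append((record_id, data))

    # Remaining transactions are additions, written after the old records.
    while trans is not None:
        if trans[0] == "A":
            new_master.append((trans[1], trans[2]))
        trans = next(pending, None)
    return new_master


# Assumed sample files for illustration only.
old_master = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Di"), (5, "Ed")]
transactions = [("D", 2, None), ("C", 5, "Edward"), ("A", 6, "Flo")]

print(update_master(old_master, transactions))
# [(1, 'Ann'), (3, 'Cy'), (4, 'Di'), (5, 'Edward'), (6, 'Flo')]
```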
11. Indexed sequential files
• Indexed sequential files are stored in order.
Ordinary sequential or serial files can be stored
on tape.
• An indexed sequential file is stored on disk to
enable some form of direct access.
• Each record consists of fixed length fields.
• This is a leftover from the use of magnetic tapes
where records had to be stored in the order they
were written to the file.
• The use of ordering facilitated a greater speed of
access.
12. • With an indexed sequential system the records are in
some form of order, for example by surname for a file of
employee records. The index is a pointer to whereabouts
on the disk the record is stored.
• In simple terms, the index table might be numbered 1 to 26
(A to Z), and it stores whereabouts on the disk all the As
can be found, all the Bs, and so on.
• This means that when a name beginning with S is
required the part of the file containing all the As to Rs
can be ignored and the disk is accessed where the Ss
begin. All the records beginning with S still have to be
read one by one until the appropriate record is found,
but it does mean that not every record from A onwards
has to be read.
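The lookup described above might be sketched like this. The surnames and the function names are hypothetical, and a Python list stands in for the file on disk; the index simply maps each initial letter to the position of the first record with that initial.

```python
# Assumed employee file, held in surname order.
employees = ["Adams", "Ahmed", "Baker", "Jones",
             "Shah", "Singh", "Smith", "Taylor"]


def build_index(sorted_names):
    """Map each initial letter to the position of its first record."""
    index = {}
    for position, name in enumerate(sorted_names):
        index.setdefault(name[0], position)
    return index


def find(sorted_names, index, wanted):
    """Jump to the first record with the right initial, then read in sequence."""
    start = index.get(wanted[0])
    if start is None:
        return None, 0  # no records with this initial at all
    reads = 0
    for position in range(start, len(sorted_names)):
        reads += 1
        name = sorted_names[position]
        if name == wanted:
            return position, reads
        if name[0] != wanted[0]:
            break  # ran past the block for this letter
    return None, reads


index = build_index(employees)
print(find(employees, index, "Smith"))  # (6, 3): found after 3 reads, not 7
```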
13. Applications of indexed sequential files
• Banks use sequential access systems for batch processing
cheques.
• This system would have to be at least indexed sequential
for faster access to records for online banking.
• Indexed sequential files are used with hybrid batch-processing
systems, such as employee records. The index
will allow for direct access when individual records are
required for human resource/personnel use.
• The records will be held sequentially to allow for serial
access when producing a payroll, since all records will be
processed one after the other.
14. Random Access files
• Random access is the quickest form of access.
• It does not matter whereabouts in the file the desired
record is; it will take the same amount of time to
access any particular record.
• Each record is fixed length and each has a key. The
computer looks up the key and goes to the
appropriate place on the disk to access it.
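A minimal sketch of this idea follows, assuming the simplest possible scheme in which the key is the record number, so the position is the key multiplied by the record length. Real systems may calculate the position differently (for example by hashing the key). The record layout and names are assumptions, and an in-memory byte stream stands in for the disk.

```python
import io
import struct

# Assumed layout: a 4-byte integer key followed by a 12-byte name.
RECORD = struct.Struct("<i12s")


def write_record(f, key, name):
    """Store a record at the position calculated from its key."""
    f.seek(key * RECORD.size)
    f.write(RECORD.pack(key, name.encode("ascii")))


def read_record(f, key):
    """Go straight to the record for `key` without reading any others."""
    f.seek(key * RECORD.size)
    stored_key, raw_name = RECORD.unpack(f.read(RECORD.size))
    return stored_key, raw_name.rstrip(b"\x00").decode("ascii")


# Pre-allocate space for five records, then fill them in any order.
f = io.BytesIO(bytes(RECORD.size * 5))
for key, name in [(3, "Jones"), (0, "Adams"), (4, "Smith"),
                  (1, "Baker"), (2, "Clark")]:
    write_record(f, key, name)

print(read_record(f, 3))  # (3, 'Jones')
```

Reading record 3 takes one seek and one read, the same as for any other record in the file.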
16. Hierarchical database management systems
• Hierarchical DBMS are no
longer used as a form of file
management to any extent, as
they suffer from the problem of
one-way relationships.
• Hierarchical DBMS use a tree-
like structure similar to a family
tree system.
• Their main use is in file
organization within computer
directory structures.
• They enable fast access to data,
however, as large amounts of
data are bypassed as you go
down the levels.
17. History
• The hierarchical structure was used in early mainframe DBMS. Records'
relationships form a treelike model. This structure is simple but
inflexible because the relationship is confined to a one-to-many
relationship. The IBM Information Management System (IMS) and the
RDM Mobile are examples of a hierarchical database system with
multiple hierarchies over the same data. RDM Mobile is a newly
designed embedded database for a mobile computer system.
• The hierarchical data model lost traction as Codd's relational model
became the de facto standard used by virtually all mainstream database
management systems. A relational-database implementation of a
hierarchical model was first discussed in published form in 1992.
Hierarchical data organization schemes resurfaced with the advent of
XML in the late 1990s. The hierarchical structure is used primarily today
for storing geographic information and file systems. Currently the most
widely used hierarchical databases are IMS and Windows Registry by
Microsoft.
18. Network database management systems
• Network DBMS were developed to
overcome a lot of the faults of the
hierarchical type. Although the
technology is outdated, many existing
databases still rely on this form of
DBMS.
• Many are distributed database
systems. Parts of the database are
usually stored on a number of
computers that are linked through a
WAN or LANs.
• Many of the parts of the database are
duplicated so that it is unlikely that any
data is lost.
• Despite this, it appears to each user to
be a single system. The duplication also
enables faster processing.
19. • The system caters for very complex searches or
filters but does not necessarily carry out the
processing at the site where the user is.
• Another type of network database is stored on
one device but can be accessed from a number
of network locations through either a LAN or a
WAN.
• Users of the database can access the system
simultaneously without affecting the speed of
accessing data. Examples of this type are the
Police National Computer (PNC) and the Driver
and Vehicle Licensing Agency (DVLA) in the
UK. Both of these can be accessed by police
officers from their cars.
20. Relational database systems
• The term "relational database" was invented by E. F. Codd at
IBM in 1970. Codd introduced the term in his seminal paper "A
Relational Model of Data for Large Shared Data Banks".
• In this paper and later papers, he defined what he meant by
"relational". One well-known definition of what constitutes a
relational database system is composed of Codd's 12 rules.
• However, many of the early implementations of the relational
model did not conform to all of Codd's rules, so the term
gradually came to describe a broader class of database
systems, which at a minimum:
– Present the data to the user as relations (a presentation in tabular
form, i.e. as a collection of tables with each table consisting of a set
of rows and columns);
– Provide relational operators to manipulate the data in tabular form.
21. • A relational database consists of a
number of separate tables that are
related in some way.
• Each table has a key field that is a field
in at least one other table. Data from
one table can then be combined with
data from another table when
producing reports.
• It is possible to select different fields
from each table for output, using the
key field as a reference point. For
example, relational tables could be used
to represent data from a payroll
application and from a human resources
application.
• The key field could be the works
number. Fields of personal data from
the human resources table could be
combined with fields from the payroll in
a report.
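The payroll and human resources example can be sketched with Python's built-in `sqlite3` module. The table names, column names and data below are assumptions made for illustration; only the idea of joining two tables on the works number comes from the text above.

```python
import sqlite3

# In-memory database with two hypothetical tables sharing a key field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE human_resources (
        works_number INTEGER PRIMARY KEY,
        name         TEXT,
        address      TEXT
    );
    CREATE TABLE payroll (
        works_number INTEGER PRIMARY KEY,
        salary       INTEGER
    );
    INSERT INTO human_resources VALUES (1001, 'A. Adams', '1 High Street');
    INSERT INTO human_resources VALUES (1002, 'B. Baker', '2 Low Road');
    INSERT INTO payroll VALUES (1001, 30000);
    INSERT INTO payroll VALUES (1002, 32000);
""")

# Combine personal data with payroll data, matching on the works number.
report = conn.execute("""
    SELECT hr.name, p.salary
    FROM human_resources AS hr
    JOIN payroll AS p ON p.works_number = hr.works_number
    ORDER BY hr.works_number
""").fetchall()

print(report)  # [('A. Adams', 30000), ('B. Baker', 32000)]
```

Each worker's name is stored once, in the human resources table, and is brought into the report only when needed.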
22. • The standard language for working with relational tables in large
applications is Structured Query Language (SQL), which is used
for queries and producing reports.
• An advantage of relational databases is that data is not repeated
and therefore doesn't waste valuable storage capacity.
• In contrast, the problem with flat file databases is that they repeat
data. A payroll file may have the name and contact details of a
worker and this would be duplicated in a human resources file.
• In a relational database, these would be in separate tables
connected by the key field - worker number.
• Data retrieval is quicker.
• Duplicated data can mean that hackers have
easier access to personal data that might be
repeated across different files, so relational
databases reduce this risk.
• Allows room for expansion.