2. Overview of Transaction Management
6 September 2019
Transaction management consists of two parts:
Recovery
Concurrency
Recovery and concurrency control are both concerned with the general
business of data protection, that is, protecting the data in the
database against loss or damage. Possible problems might be:
The system might crash in the middle of executing some program, thereby leaving the DB in an unknown state.
Two programs executing at the same time ("concurrently") might interfere with each other and hence produce incorrect results, either inside the database or in the outside world.
We shall discuss the two, recovery and concurrency, separately.
Dr Rafi Ullah / SM Irteza
3. Contents
1. Recovery
2. Overview
3. Transactions
4. Log Based Recovery
5. Two Phase Commit
4. Overview
The DBMS is subject to several kinds of failures. The system must cope
with these failures in a way that guarantees the consistency of the
information in the database in spite of the failure having occurred.
The recovery subsystem is the part of the DBMS which ensures that the
system is recovered to a consistent state after a failure occurs and
before any new requests are accepted.
This does not mean that this part of the DBMS only runs when a failure
occurs. To ensure that recovery is possible, some sort of replication of
information must exist in the database and take place at regular times.
5. Motivation
Consider the following requests:
raising the salaries of all employees by 5%;
transferring Rs. 100000 from account A to account B;
removing a client from the video shop database, together with the details of all of his loans.
Suppose some sort of failure occurs after the operations start being
executed but before they are fully completed.
What happens to the data in the tables?
6. Let us have a closer look at the first request
How do we finish the operation after restarting?
How do we know which employees had the salary updated?
7. The Log or Journal
A solution to the first question is to undo the operations that
were done and run the whole process again from the start (it would be
too complicated to simply continue from where the transaction stopped).
To answer the second question, we must keep a record of
everything that was actually updated. This is done in a special file
called the log (or journal).
The log keeps the details of all update operations, including the
values of the data items before and after each operation is
executed.
8. More on the Log file
Log records are of several kinds. Some identify the status of a
transaction (started, committed, etc.); others, called update records,
contain the details of the update operations described earlier.
A typical update record contains the following information:
A transaction identifier
A data item identifier (what was changed)
The old value of the item (before-value)
The new value of the item (after-value)
The before-value is used to undo an operation and the after-value to redo it.
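The update record described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the database is modelled as a dictionary, and the names UpdateRecord, undo and redo are hypothetical, not taken from any particular DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    """One update record in the log (field names are illustrative)."""
    txn_id: str     # transaction identifier
    item_id: str    # data item identifier (what was changed)
    before: object  # old value of the item (before-value)
    after: object   # new value of the item (after-value)


def undo(db: dict, rec: UpdateRecord) -> None:
    """Undo an update by restoring the before-value."""
    db[rec.item_id] = rec.before


def redo(db: dict, rec: UpdateRecord) -> None:
    """Redo an update by reapplying the after-value."""
    db[rec.item_id] = rec.after


# A 5% raise on a salary of 1000, recorded as a single update record.
db = {"salary:E1": 1000}
rec = UpdateRecord("T1", "salary:E1", before=1000, after=1050)

redo(db, rec)
value_after_redo = db["salary:E1"]
undo(db, rec)
value_after_undo = db["salary:E1"]
```

Because each record carries both values, the same record serves for rolling a change back and for reapplying it.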
9. Redundancy is necessary to achieve Recovery
We need to make sure that we store enough information during
normal processing so as to allow the recovery from a failure. This
is the main function of the log.
Notice that this is basically redundant information. The principle
is that most information in the database should be reconstructible
from some other information stored in a different location or medium.
Different media have different strengths and weaknesses.
10. Storage types
Volatile storage: quick access, high risk (memory, cache).
Non-volatile storage: slower access, more resistant than volatile storage, but still subject to failures (disks, tapes, optical devices).
Stable storage: theoretically safe. In practice it is only approximated, for example by RAID (redundant array of inexpensive/independent disks) systems or by storage at different sites.
11. Possible causes of failures
a logical/program error, e.g., an overflow, an exception not caught, etc.
a disk crash
a power failure
a disaster, such as flood, fire, etc.
many others
12. The picture so far ...
A failure of some sort might occur ...
We need to keep the database consistent: some sets of operations must
be executed on an all-or-nothing basis.
If a failure does occur, we need to restore the consistency of the
database.
For that, we replicate information in the system so that the information
not affected by the failure can be used to restore the consistency of
the database as a whole.
The replication must be made at least on non-volatile storage, but
ideally on more stable storage as well.
13. Transactions
We have mentioned the term transaction when we spoke about the
records in the log file.
The term transaction is used to describe a set of database operations
that is required to be treated as a single logical unit. It is of fundamental
importance for the topic of recovery.
Examples of such operations are:
raising the salaries of all employees by 5%;
transferring £100.00 from account A to account B;
removing a client from the video shop database, together with the details of all of his loans.
Transactions have special properties called the ACID properties.
14. Transactions pseudocode
BEGIN TRANSACTION ;
   INSERT INTO SP
      RELATION { TUPLE { S# S#('S5'),
                         P# P#('P1'),
                         QTY QTY(1000) } } ;
   IF any error occurred THEN GO TO UNDO ; END IF ;
   UPDATE P WHERE P# = P#('P1')
      TOTQTY := TOTQTY + QTY(1000) ;
   IF any error occurred THEN GO TO UNDO ; END IF ;
   COMMIT ;
   GO TO FINISH ;
UNDO :
   ROLLBACK ;
FINISH :
   RETURN ;
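For readers who want to run the pattern above, here is a minimal sketch in Python using the standard sqlite3 module with an in-memory database. The table layout, the column names s_no, p_no, qty and totqty, the CHECK limit of 1500 and the function add_shipment are all assumptions made for the illustration; they are not part of the pseudocode.

```python
import sqlite3


def add_shipment(conn, s_no, p_no, qty):
    """Insert a shipment and update the part total as one unit of work."""
    try:
        # sqlite3 (default mode) opens the transaction implicitly before the INSERT.
        conn.execute(
            "INSERT INTO SP (s_no, p_no, qty) VALUES (?, ?, ?)", (s_no, p_no, qty)
        )
        conn.execute(
            "UPDATE P SET totqty = totqty + ? WHERE p_no = ?", (qty, p_no)
        )
        conn.commit()      # corresponds to COMMIT
        return True
    except sqlite3.Error:
        conn.rollback()    # corresponds to the UNDO label: ROLLBACK
        return False


conn = sqlite3.connect(":memory:")
# The CHECK constraint is an assumption used only to force an error on the UPDATE.
conn.execute(
    "CREATE TABLE P (p_no TEXT PRIMARY KEY, totqty INTEGER CHECK (totqty <= 1500))"
)
conn.execute("CREATE TABLE SP (s_no TEXT, p_no TEXT, qty INTEGER)")
conn.execute("INSERT INTO P VALUES ('P1', 0)")
conn.commit()

first_ok = add_shipment(conn, "S5", "P1", 1000)   # both statements succeed
second_ok = add_shipment(conn, "S6", "P1", 1000)  # UPDATE violates the CHECK

total = conn.execute("SELECT totqty FROM P WHERE p_no = 'P1'").fetchone()[0]
sp_rows = conn.execute("SELECT COUNT(*) FROM SP").fetchone()[0]
```

In the second call the INSERT succeeds but the UPDATE fails, so the rollback also removes the already inserted shipment row: the two statements take effect together or not at all.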
15. ACID properties
Atomicity: Transactions are executed on an all-or-nothing
basis.
Consistency: The set of operations in a transaction, when
executed as a whole, takes the database from one consistent
state to another.
Isolation: Concurrent transactions do not interfere with each
other. If two transactions update the same piece of
information, one of them has to wait until the other finishes
and the data becomes available.
Durability: Once a transaction commits, its updates survive,
even if there is a failure immediately afterwards.
18. How are transactions specified?
A transaction starts with the successful execution of a
BEGIN TRANSACTION statement and finishes with the
successful execution of either a COMMIT or a ROLLBACK
statement.
The COMMIT statement tells the transaction manager that
the particular piece of work in the transaction is finished and
that all updates made can now be made permanent.
The ROLLBACK statement signals the opposite: something
has gone wrong and all updates done so far must be
undone (rolled back).
19. Commit Points
A commit point is established by the successful execution of a
COMMIT. At a commit point:
All updates made by the executing program or operation
since the previous commit point are made permanent.
Prior to the COMMIT, all such updates are tentative only,
in the sense that they can still be cancelled by a ROLLBACK.
All database positioning is lost and any tuple locks held by
the program are released (more on this under concurrency).
Database positioning refers to the fact that, at any given time,
an executing program will have addressability to certain tuples.
20. More on transactions
A program can contain several transactions.
If the program itself aborts, some mechanism must be
provided to roll back its unsuccessful transactions.
The system will issue an implicit ROLLBACK for the
uncommitted transactions started in a program that does
not terminate normally.
21. Write-ahead log rule
When a transaction commits, it is possible that its updates
have been made to the data in memory but not yet physically
written to the disk.
If the system then fails, these updates could be lost. To
prevent this, the log records for the updates must be
physically written to the log before COMMIT processing
completes.
This is known as the write-ahead log rule.
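The rule can be sketched as follows. This is a simplified illustration, not how any specific DBMS implements its log: the class WriteAheadLog, the JSON record layout and the function commit are hypothetical names, and os.fsync stands in for "physically written".

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log file; force() pushes buffered records to the disk."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def append(self, record):
        self._file.write(json.dumps(record) + "\n")

    def force(self):
        self._file.flush()              # Python buffer -> operating system
        os.fsync(self._file.fileno())   # operating system -> physical storage

    def close(self):
        self._file.close()


def commit(log, txn_id, updates, buffers):
    """Apply updates in memory, but force the log before commit completes."""
    for item, before, after in updates:
        log.append({"type": "update", "txn": txn_id,
                    "item": item, "before": before, "after": after})
        buffers[item] = after           # in-memory change only, not yet on disk
    log.append({"type": "commit", "txn": txn_id})
    log.force()                         # COMMIT is complete only after this returns


log_path = os.path.join(tempfile.mkdtemp(), "db.log")
log = WriteAheadLog(log_path)
buffers = {}
commit(log, "T1", [("A", 500, 400), ("B", 200, 300)], buffers)
log.close()

with open(log_path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
```

If the system failed right after commit returned, the data buffers would be gone, but the forced log would still hold everything needed to redo T1.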
22. More on System Failure
If the system itself fails, the integrity of the whole database is
compromised. In particular, all transactions in progress at the time of
the failure are affected.
Because the contents of main memory are lost, it is impossible to
determine how much of each such transaction was actually accomplished,
and therefore all of them must be rolled back at system restart.
If a transaction did commit, but the buffers holding its data were not
written to the disk, it needs to be redone.
The problem now becomes: which transactions should be redone and
which should be undone?
23. Checkpoints
The previous problem could be solved by searching the
entire log, but this would take too long.
That is what checkpoints are for: a checkpoint is a special record in
the log file that helps in the recovery from a system failure.
At regular intervals, the system takes a checkpoint. Taking a
checkpoint requires that:
the database buffers are physically written to the disk;
a record listing all transactions currently in progress is written to the log.
25. What happened to transactions T1–T5?
Transactions T1, T2 and T3 were all committed, but only the updates of
transaction T2 were physically written to the database (at checkpoint time).
Transaction T2 is not involved in the restart process.
Transactions T1 and T3 need to be redone.
Transactions T4 and T5 were only partially done, so they need to be undone.
26. What happens at system startup after a global failure
The system builds an undo list and a redo list in the following way:
Search for the latest checkpoint record in the log and put all
transactions in progress at that time into the undo list. Set the redo
list to empty.
Now process the log forward from the checkpoint record until its end.
Whenever a begin-transaction record is found for a transaction T,
add T to the undo list.
Whenever a commit is found in the log, remove the associated
transaction from the undo list and put it in the redo list.
When the end of the log is reached, the database is ready to be
restored. The system does this by processing the two lists above.
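The procedure above can be sketched as a short Python function. The log layout (a list of (kind, payload) pairs) and the name build_lists are assumptions for the illustration. The sample log is one possible timeline, chosen so that it agrees with the T1–T5 discussion; the original timeline figure is not reproduced here, so the exact ordering of events is assumed.

```python
def build_lists(log):
    """Return (undo, redo) sets built from the latest checkpoint onward."""
    # Position of the latest checkpoint record in the log.
    cp = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    undo = set(log[cp][1])  # transactions in progress at the checkpoint
    redo = set()            # the redo list starts empty
    for kind, txn in log[cp + 1:]:
        if kind == "begin":
            undo.add(txn)       # started after the checkpoint
        elif kind == "commit":
            undo.discard(txn)   # committed: move from undo to redo
            redo.add(txn)
    return undo, redo


# Assumed timeline: T2 commits before the checkpoint; T1 and T4 are in
# progress at the checkpoint; T3 and T5 start afterwards; T4 and T5 never commit.
log = [
    ("begin", "T2"), ("begin", "T1"), ("commit", "T2"), ("begin", "T4"),
    ("checkpoint", ["T1", "T4"]),
    ("begin", "T3"), ("commit", "T1"), ("begin", "T5"), ("commit", "T3"),
]
undo_list, redo_list = build_lists(log)
```

With this log, T4 and T5 end up in the undo list, T1 and T3 in the redo list, and T2 appears in neither because it completed before the checkpoint.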
27. Processing the undo and the redo lists
Once the end of the log is reached, the system works backwards through
the log, undoing all transactions in the undo list. When that is
finished, it goes forward through the log again, this time redoing all
transactions in the redo list.
These procedures are called backward recovery and forward recovery,
respectively.
The undo and redo operations must guarantee the correct behavior of
the recovery process.
After the two lists are processed, the system is ready to resume normal
activity.
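Backward and forward recovery can be sketched as two passes over the update records. This is a simplified model under stated assumptions: the log holds only update records of the form (transaction, item, before-value, after-value), the database is a dictionary, and the function name recover is hypothetical.

```python
def recover(db, log, undo_list, redo_list):
    """Undo backwards, then redo forwards, using before- and after-values."""
    # Backward recovery: scan from the end, restoring before-values.
    for txn, item, before, after in reversed(log):
        if txn in undo_list:
            db[item] = before
    # Forward recovery: scan from the start, reapplying after-values.
    for txn, item, before, after in log:
        if txn in redo_list:
            db[item] = after
    return db


log = [
    ("T1", "x", 1, 2),
    ("T4", "y", 10, 20),
    ("T1", "x", 2, 3),
    ("T4", "y", 20, 30),
]
# Assumed disk state at the crash: T1's last update never reached the disk,
# while both of T4's (uncommitted) updates did.
db = {"x": 2, "y": 30}
recovered = recover(db, log, undo_list={"T4"}, redo_list={"T1"})
```

Because undo and redo simply set the stored value from the log, running the procedure a second time, for instance after a failure during recovery itself, leaves the result unchanged in this model.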
29. Media recovery
In this case, there was no global failure, but a portion of the database
has been physically damaged.
To recover from this kind of failure, the system needs to restore a recent
backup copy or dump of the database and then use the log to make the
backup copy up to date.
Only completed transactions need to be (re)processed (the
uncompleted transactions never altered the state of the backup copy
anyway).
Note that backup copies must be made regularly.
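A minimal sketch of this idea, assuming the backup is a dictionary and the log since the backup is a list of tuples. The record layout and the name media_recover are hypothetical.

```python
def media_recover(backup, log):
    """Restore the backup copy, then reapply updates of committed transactions."""
    db = dict(backup)  # start from the backup copy (dump)
    committed = {txn for kind, txn, *rest in log if kind == "commit"}
    for kind, txn, *rest in log:
        if kind == "update" and txn in committed:
            item, after = rest
            db[item] = after  # bring the copy up to date
    return db


backup = {"A": 500, "B": 200}
# Log written since the backup was taken: T1 completed, T2 did not.
log = [
    ("update", "T1", "A", 400),
    ("update", "T1", "B", 300),
    ("commit", "T1"),
    ("update", "T2", "A", 0),
]
restored = media_recover(backup, log)
```

T2's update is skipped because no commit record for T2 exists in the log.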
32. Two-Phase Commit
This technique is used whenever a transaction involves items
associated with independent systems that have their own recovery
procedures, i.e., more than one resource manager or more than one
database, as in a distributed environment.
We want to ensure that if the transaction completes successfully, then
the updates to all such systems are committed and if it does not
complete, then all updates are rolled back.
The “global” commit or rollback is handled by a system component
called the coordinator.
An example is a transaction on the web for buying a book.
33. The Coordinator
The coordinator ensures consistency of the “distributed” transaction.
First, it asks all participants involved to attempt to complete the
transaction and waits for their replies: success or failure.
It then analyses all replies so that it can issue a global status for the
transaction.
If all replies were success, then the global status is success and the
global transaction is committed; otherwise the global transaction needs
to be rolled back.
In either case, the coordinator informs all participants of the global
decision so that they can take the appropriate action.
34. The Coordinator continued
Here is how it works.
Assume that the transaction has completed successfully, so
the system-wide instruction it issues is COMMIT, not
ROLLBACK.
On receiving the COMMIT request, the coordinator goes
through two steps.
35. 1st Step
First, it instructs all resource managers to get ready to "go either way"
on the transaction.
In practice this means that each participant in the process must force all
log entries for local resources used by the transaction out to its own
physical log.
Assuming the forced write was successful, the resource manager now
replies OK to the coordinator; otherwise it replies NOT OK.
36. 2nd Step
When the coordinator has received replies from all the participants, it
forces an entry to its own physical log, recording its decision regarding
the transaction.
If all replies were OK, then its decision is COMMIT.
If any reply was NOT OK, its decision is ROLLBACK.
The decision behaves like a logical AND over the replies.
After the decision, it informs the participants of the decision, either
COMMIT or ROLLBACK. Each participant MUST do what the
coordinator has decided.
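The two steps can be sketched in Python as follows. This is a toy, single-process model: the class Participant, its can_prepare flag, the function two_phase_commit and the use of lists as "physical logs" are all assumptions for the illustration, and message loss and timeouts are ignored.

```python
class Participant:
    """A resource manager taking part in a global transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # hypothetical switch to simulate a failure
        self.log = []                   # stand-in for the local physical log
        self.state = "active"

    def prepare(self):
        """1st step: force local log entries, then reply OK or NOT OK."""
        if not self.can_prepare:
            return False                # NOT OK
        self.log.append("prepared")     # ready to go either way
        return True                     # OK

    def finish(self, decision):
        """2nd step: do what the coordinator has decided."""
        self.log.append(decision)
        self.state = "committed" if decision == "COMMIT" else "rolled back"


def two_phase_commit(participants, coordinator_log):
    votes = [p.prepare() for p in participants]         # ask every participant
    decision = "COMMIT" if all(votes) else "ROLLBACK"   # logical AND of replies
    coordinator_log.append(decision)                    # record decision first
    for p in participants:
        p.finish(decision)                              # inform all participants
    return decision


ok_log, bad_log = [], []
good = [Participant("orders"), Participant("payments")]
mixed = [Participant("orders"), Participant("payments", can_prepare=False)]

decision_good = two_phase_commit(good, ok_log)
decision_mixed = two_phase_commit(mixed, bad_log)
states_mixed = [p.state for p in mixed]
```

A single NOT OK reply is enough to roll back every participant, including those that replied OK.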
37. SQL Facilities
Commit is issued with COMMIT, which is short for
COMMIT WORK in SQL.
Rollback is issued with ROLLBACK, which is short for
ROLLBACK WORK in SQL.
One difference is that there is no BEGIN TRANSACTION
statement. Instead, a transaction begins whenever the
program executes a transaction-initiating statement.
The SET TRANSACTION command is used in SQL to define
characteristics of the next transaction.
SET TRANSACTION can only be used when no transaction is under way.
SET TRANSACTION <option commalist>
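The implicit start of a transaction can be observed from a host program. The sketch below uses Python's standard sqlite3 module in its default mode, which opens a transaction implicitly before an INSERT or UPDATE. It illustrates implicit transaction start, COMMIT and ROLLBACK only; SQLite does not support SET TRANSACTION, and the table and values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.commit()

before_stmt = conn.in_transaction    # no transaction under way yet
conn.execute("INSERT INTO account VALUES ('A', 100)")
during_stmt = conn.in_transaction    # the INSERT started one implicitly
conn.commit()                        # COMMIT makes the insert permanent
after_commit = conn.in_transaction

conn.execute("UPDATE account SET balance = 0 WHERE id = 'A'")
conn.rollback()                      # ROLLBACK cancels the tentative update
balance = conn.execute(
    "SELECT balance FROM account WHERE id = 'A'"
).fetchone()[0]
```

No explicit begin statement appears anywhere: the first data-changing statement opens the transaction, and COMMIT or ROLLBACK ends it.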