1. Identify the basic functions of a file system
2. Describe the following file organization
techniques
3. Identify which file organization technique is
appropriate for a specific device
4. Describe the three types of file structure
5. Describe the various methods of file
allocation
6. State the benefits and weaknesses of using
each of the file allocation methods
7. Describe the two free space management
techniques
8. Describe the various techniques for
implementing file access control
9. Explain the techniques used to prevent data
loss
10. Implement a specific backup technique in a
given situation
Files
• Named collection of data that is manipulated
as a unit
• Reside on secondary storage devices
• Operating systems can create an interface that
facilitates navigation of a user’s files
• File systems can protect such data from
corruption or total loss from disasters
• Systems that manage large amounts of shared
data can benefit from databases as an
alternative to files
File: a named collection of data that may be
manipulated as a unit by operations such as:
• Open
• Close
• Create
• Destroy
• Copy
• Rename
• List
Individual data items within a file may be manipulated by
operations like:
• Read
• Write
• Update
• Insert
• Delete
File characteristics include:
• Location
• Accessibility
• Type
• Volatility
• Activity
Files can consist of one or more records
Directories:
• Files containing the names and locations of
other files in the file system, to organize and
quickly locate files
Directory entry stores information such as:
• File name
• Location
• Size
• Type
• Access, modification and creation times
File systems
• Organize files and manage
access to data
• Responsible for file management, auxiliary
storage management, file integrity
mechanisms and access methods
• Primarily are concerned with managing
secondary storage space, particularly disk
storage
File organization is about how records are arranged and the
characteristics of the medium used to store them.
It is HOW the records of a file are arranged on
secondary storage.
On magnetic disks, files can be organized as
• sequential
• direct
• indexed sequential
• Partitioned
*Refer to the notes below.
Sequential
• Easiest to implement because records are stored and
retrieved serially, one after the other.
• Optimization features can be built into the system to speed up
the search process.
E.g., select a key field from the record and then sort
the records by that field before storing them.
This aids the search process (filtering).
• Sorting complicates the maintenance algorithms because the
original order must be preserved every time records are
added or deleted.
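The trade-off above can be sketched in Python. This is only an illustration, assuming records are (key, data) pairs held in a list that stands in for the file; `insert_record` and `find_record` are hypothetical names, not part of any real file system.

```python
def insert_record(records, key, data):
    """Insert a record while preserving the key order of the file."""
    pos = 0
    # Walk serially to the first record whose key is not smaller.
    while pos < len(records) and records[pos][0] < key:
        pos += 1
    # Every later record shifts by one slot: this is the maintenance cost.
    records.insert(pos, (key, data))


def find_record(records, key):
    """Serial search that stops early thanks to the sorted key field."""
    for k, data in records:
        if k == key:
            return data
        if k > key:  # sorted order: the key cannot appear further on
            break
    return None


records = []
for key, data in [(30, "c"), (10, "a"), (20, "b")]:
    insert_record(records, key, data)
```

Sorting lets an unsuccessful search stop as soon as a larger key is seen, but each insertion has to find the right slot and move the records behind it.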
Direct (a.k.a. Random)
• Uses direct access files, which can be implemented only
on direct access storage devices (DASDs, pronounced DAZ-dee,
e.g., magnetic disks or drums).
• Gives users the flexibility of accessing any record in any
order without having to begin the search from the beginning of
the file.
• Records are identified by their relative addresses (their
addresses relative to the beginning of the file).
• Logical addresses are computed when records are stored and
again when records are retrieved.
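A minimal sketch of relative addressing, assuming fixed-length records of 16 bytes (an arbitrary value chosen for the example) and an in-memory `io.BytesIO` object standing in for the direct access device. The names `write_record` and `read_record` are hypothetical.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def write_record(f, rel_addr, payload):
    """Store a record at its relative address (record number)."""
    f.seek(rel_addr * RECORD_SIZE)  # byte offset computed from the address
    f.write(payload.ljust(RECORD_SIZE, b"\x00")[:RECORD_SIZE])


def read_record(f, rel_addr):
    """Fetch a record directly, without scanning earlier records."""
    f.seek(rel_addr * RECORD_SIZE)  # same computation on retrieval
    return f.read(RECORD_SIZE).rstrip(b"\x00")


f = io.BytesIO()
write_record(f, 2, b"third")  # records can be written in any order
write_record(f, 0, b"first")
```

The same offset computation is performed when a record is stored and again when it is retrieved, which is what makes access in any order possible.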
Direct – Advantages
• Fast access to records.
• Can be accessed sequentially by starting at first
relative address & incrementing it by one to get to next
record.
• Can be updated more quickly than sequential files
because records can be quickly rewritten to their original
addresses after modifications.
• No need to preserve the order of the records, so adding
or deleting them takes very little time.
Indexed Sequential
• Combines best of sequential & direct access.
• Records are arranged in a logical sequence
according to a key contained in each record.
• The system maintains an index containing the
physical addresses of certain records.
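The idea can be sketched as a sparse index over a key-sorted list. This is a simplified model, assuming three records per block and an index that holds the key of the first record in each block; all names and values are hypothetical.

```python
import bisect

BLOCK = 3  # assumed number of records per block

# Records kept in logical (key) sequence.
records = [(k, f"rec{k}") for k in (5, 8, 12, 17, 21, 30, 34, 40)]

# Sparse index: the key of the first record of every block.
index_keys = [records[i][0] for i in range(0, len(records), BLOCK)]


def lookup(key):
    """Use the index to pick a block, then scan that block sequentially."""
    b = bisect.bisect_right(index_keys, key) - 1
    if b < 0:
        return None  # key is smaller than every indexed key
    start = b * BLOCK
    for k, data in records[start:start + BLOCK]:
        if k == key:
            return data
    return None
```

Here `lookup(21)` consults the index to jump straight to the second block and then reads at most three records, combining a direct jump with a short sequential scan.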
Partitioned
• This refers to a file of sequential sub-files.
Each sequential sub-file is called a member of
the partitioned file.
• An improvement on the sequential technique.
The OS considers the file to be an unstructured sequence of bytes.
The OS does not know what is in the file.
All it sees are bytes.
Any meaning must be imposed by user-level
programs.
Used by UNIX, Windows and most modern OSs (e.g., the FAT
family of file systems).
The byte sequence provides maximum
flexibility (easy and fast to read).
The file is a sequence of fixed-length records,
each with some internal structure (head, body,
tail).
Read operation: returns one record.
Write operation: overwrites or appends
(adds) one record.
The OS can optimize operations on records.
The file consists of a tree of records.
Records are not necessarily all the same length.
Each record contains a key field in a fixed
position in the record.
The tree is sorted on the key field to
allow fast searching for a particular key.
The basic operation here is to get the record with
a specific key (NTFS family filesystem).
Furthermore, new records can be added to the
file, with the operating system deciding where to
place them.
This type of file is clearly quite different from the
unstructured byte sequence used in UNIX and
Windows 98 (FAT file system).
Widely used on large mainframe computers.
Still used in some commercial data processing.
For the zoo file of Figure (c), one could ask the
system to get the record whose key is pony, for
example, without worrying about its exact
position in the file.
Problem of allocating space and freeing space
on secondary storage.
Contiguous allocation systems have generally
been replaced by more dynamic non-contiguous
allocation systems. Why?
• Files tend to grow or shrink over time
• Users rarely know in advance how large their
files will be
File allocation can use one of THREE (3) placement algorithms to
store files:
• Best fit (the closest match)
chooses the free area that fits the new data most tightly, to
minimise the wasted space
• First fit (the first match that is large enough)
scans from the beginning of the available space and uses the
first free area that is at least big enough to accept the
data
• Worst fit (a large space for small data)
selects the largest available free area in which the
information can be stored
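The three strategies can be compared with a small Python sketch. It assumes free space is described simply as a list of hole sizes in blocks; the function names and the sample values are illustrative only.

```python
def first_fit(holes, size):
    """Index of the first hole that is big enough, or None."""
    for i, hole in enumerate(holes):
        if hole >= size:
            return i
    return None


def best_fit(holes, size):
    """Index of the smallest hole that is still big enough, or None."""
    candidates = [(hole, i) for i, hole in enumerate(holes) if hole >= size]
    return min(candidates)[1] if candidates else None


def worst_fit(holes, size):
    """Index of the largest hole that is big enough, or None."""
    candidates = [(hole, i) for i, hole in enumerate(holes) if hole >= size]
    return max(candidates)[1] if candidates else None


holes = [10, 4, 20, 6]  # hypothetical free areas, sizes in blocks
```

For a request of 5 blocks, first fit picks the 10-block hole, best fit the 6-block hole, and worst fit the 20-block hole. First fit stops at the first match, while the other two must examine every hole.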
File Allocation
Contiguous allocation: store each file as a contiguous run of disk
blocks.
Can be wasteful of space (the dynamic storage-allocation
problem).
Simple: only the starting location and the length
(number of blocks) are required.
E.g., on a disk with 1 KB blocks, a 50 KB
file is allocated 50 consecutive blocks.
Placement strategy: best fit or first fit.
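Because only a start block and a length are stored, locating any byte of the file is simple arithmetic. The sketch below assumes 1 KB blocks as in the example above; the directory entry, the file name and the function name are hypothetical.

```python
BLOCK_SIZE = 1024  # 1 KB blocks, as in the example

# Hypothetical directory: file name -> (first block, number of blocks)
directory = {"report.dat": (100, 50)}


def block_for_offset(start, length, offset):
    """Return the disk block that holds the given byte offset of a file."""
    index = offset // BLOCK_SIZE
    if not 0 <= index < length:
        raise ValueError("offset lies outside the file")
    return start + index  # one addition, no pointers to follow


start, length = directory["report.dat"]
```

With the 50 KB file stored from block 100, byte 0 is in block 100 and the last byte is in block 149.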
File Allocation
Advantages
• Easy to implement
• Two numbers needed for each file:
1 - disk address of the first block
2 - number of blocks in the file
• Read performance is excellent
File Allocation
Disadvantages
• Fragmentation of blocks
• Needs periodic compaction, which takes time
• Needs free lists to be managed
• A file’s maximum possible size has to be known
at the time it is created
Still a good fit for CD-ROMs and DVDs:
• All file sizes are known in advance
• Files are never deleted
File Allocation
The first word of each block is used as a pointer
to the next one.
The rest of the block is for data
Unlike contiguous allocation, every empty disk
block can be used in this method
No space is lost to disk fragmentation (except for
internal fragmentation in the last block of each file)
The FAT used by DOS is a variation of linked allocation,
in which all the links are stored in a separate table at
the beginning of the disk.
Benefit: the table can be cached in memory, improving random
access speed.
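A toy model of a FAT-style table, written as a Python dictionary that maps each block to the next block of the same file. The block numbers and the `EOF` marker are made up for the example and do not reflect the on-disk FAT format.

```python
EOF = -1  # assumed end-of-chain marker for this sketch

# Hypothetical table: block number -> next block of the same file
fat = {4: 7, 7: 2, 2: EOF}


def chain(fat, first):
    """List all blocks of a file by following the links from its first block."""
    blocks = []
    block = first
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first, n):
    """Find the n-th block (0-based): n links must be followed one by one."""
    block = first
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("file has fewer blocks than requested")
    return block
```

Reaching the n-th block still needs n lookups, but with the table held in memory those lookups do not require reading the data blocks themselves.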
File Allocation
Advantages
• No external fragmentation on disk (only internal
fragmentation in the last block of a file)
• No need to pre-specify file sizes (files can
grow/shrink).
• Never necessary to defragment disk.
• For sequentially accessed files, performance
is optimal.
File Allocation
Disadvantages
• Random access is slow, because the links must be followed
block by block.
• Can only be used effectively for sequentially
accessed files.
• The pointers use additional disk space.
• Reliability: if a pointer gets corrupted, the blocks after it
can no longer be reached.
File Allocation
Indexed allocation avoids the main problems of
contiguous allocation (external fragmentation and fixed file sizes).
It also supports random access to a file.
With indexed allocation, each file is like a set of
linked blocks, except that the pointers are all stored
in an index (the index block, unique to each
file).
The directory contains the location (disk block) of
the index block for each file.
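A minimal sketch of the lookup path, assuming 1 KB blocks and using dictionaries in place of the directory and the on-disk index block. The file name and all block numbers are hypothetical.

```python
BLOCK_SIZE = 1024  # assumed block size

# Hypothetical directory: file name -> disk block holding its index block
directory = {"notes.txt": 19}

# Hypothetical index blocks: index block number -> list of data blocks
index_blocks = {19: [9, 16, 1, 10, 25]}


def block_for_offset(name, offset):
    """Directory -> index block -> data block, with no chain to follow."""
    index = index_blocks[directory[name]]
    position = offset // BLOCK_SIZE
    if not 0 <= position < len(index):
        raise ValueError("offset lies outside the file")
    return index[position]
```

Any block of the file is reached in a fixed number of steps, which is why random access is supported, at the cost of one extra block per file for the index.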
File Allocation
Advantage
• No external fragmentation.
• No need to pre-specify file sizes (files can
grow/shrink).
• Never necessary to defragment disk.
Disadvantage
• The pointers use an additional disk block
(wastes more space than linked allocation
does).
File Allocation
Since the amount of disk space is limited, it is
necessary to reuse the space released by
deleted files.
In general, file systems keep a list of free disk
blocks (initially, all the blocks are free) and
manage this list by one of the following
techniques :
• a. using free lists
• b. using bitmaps
Free list: a linked list of blocks containing the
locations of free blocks
Managed as a LIFO or FIFO list, kept on disk and cached in
main memory
Blocks are allocated from the beginning of the
free list
Newly freed blocks are appended to the end of
the list
As a result, files are likely to be allocated in noncontiguous
blocks, which
increases file access time
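The behaviour described above can be imitated with a Python `deque`. This is a sketch assuming a disk of eight blocks that are all free at the start; `allocate` and `release` are hypothetical names.

```python
from collections import deque

free_list = deque(range(8))  # initially, all 8 blocks are free


def allocate(n):
    """Take n blocks from the beginning of the free list."""
    if n > len(free_list):
        raise MemoryError("not enough free blocks")
    return [free_list.popleft() for _ in range(n)]


def release(blocks):
    """Append newly freed blocks to the end of the free list."""
    free_list.extend(blocks)


a = allocate(3)  # first file takes blocks 0, 1, 2
b = allocate(2)  # second file takes blocks 3, 4
release(a)       # deleting the first file frees 0, 1, 2
c = allocate(4)  # third file gets 5, 6, 7 and then 0
```

The third file ends up with blocks 5, 6, 7 and 0, showing how allocation from a free list quickly produces noncontiguous files.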
Bitmaps are used by some file systems to track allocated
sectors
If a block is free: bit 0
If a block is occupied: bit 1
A bitmap (bit array) contains one bit for each
block on the storage device
• The Xth bit corresponds to the Xth block on the
storage device
May be too large to hold in main memory
Slow to search: the larger the storage device, the
longer the search
Advantage:
• Simple: Each bit directly corresponds to a
sector
• Efficient to find first free block
Disadvantage :
• The file system may need to search the entire
bitmap to find a free block
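A small sketch of such a bitmap, assuming the convention above (0 = free, 1 = occupied) and packing eight blocks into each byte of a `bytearray`. The class and method names are hypothetical.

```python
class BlockBitmap:
    """One bit per block: 0 means free, 1 means occupied."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # all blocks start free

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def first_free(self):
        """Linear scan: in the worst case every bit is examined."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                return block
        return None


bitmap = BlockBitmap(10)
for block in (0, 1, 2):
    bitmap.mark_used(block)
```

Checking or changing the state of one block is a single bit operation, while `first_free` shows why a search may have to walk the whole bitmap.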
Files are often used to store sensitive data such
as:
• Credit card numbers
• Passwords
• Social security numbers
Therefore, file systems should include mechanisms to
control user access to data:
• Access control matrix
• Access control by user classes
Two-dimensional access control matrix:
In an installation with a large number of
users and a large number of files, this
matrix generally would be large.
Inappropriate for most systems
Figure 13.12 Access control matrix (labels A to J; entry 0: cannot access, entry 1: can access).
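A toy access control matrix in Python, with one row per user and one column per file. The user names, file names and entries are invented for the illustration.

```python
users = ["alice", "bob", "carol"]  # hypothetical users (rows)
files = ["a.txt", "b.txt"]         # hypothetical files (columns)

# matrix[i][j] is 1 if user i can access file j, otherwise 0
matrix = [
    [1, 0],
    [1, 1],
    [0, 0],
]


def can_access(user, filename):
    """Look up the entry at (user row, file column)."""
    return matrix[users.index(user)][files.index(filename)] == 1


entries = len(users) * len(files)  # the matrix grows as users x files
```

The lookup is trivial, but the number of entries is the product of the two dimensions: 1,000 users and 100,000 files would already need 100 million entries.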
A technique that requires considerably less
space is to control access by user
class
User classes can include:
• The file owner
• A specified user
• Group
• Project
• Public
Access control data
• Can be stored as part of the file control block
• Often consumes an insignificant amount of
space
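A simplified sketch with only three of the classes listed above (owner, group, public), stored as part of a hypothetical file control block. The field names and operation names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FileControlBlock:
    owner: str
    group: set        # user names that belong to the file's group
    owner_can: set    # operations allowed for the owner
    group_can: set    # operations allowed for group members
    public_can: set   # operations allowed for everyone else


def allowed(fcb, user, operation):
    """Decide access from the class the user falls into."""
    if user == fcb.owner:
        return operation in fcb.owner_can
    if user in fcb.group:
        return operation in fcb.group_can
    return operation in fcb.public_can


fcb = FileControlBlock(
    owner="alice",
    group={"bob"},
    owner_can={"read", "write"},
    group_can={"read"},
    public_can=set(),
)
```

Only a few small sets are stored per file, regardless of how many users the system has, which is where the space saving over a full matrix comes from.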
Backup techniques
• Store redundant copies of information
Recovery techniques
• Enable the system to restore data after a
system failure
Physical safeguards such as locks and fire
alarms are the lowest level of data protection
Performing periodic backups is the most
common technique used to ensure the continued
availability of data
Physical backups
• Duplicate a storage device’s data at the bit
level
Logical backups
• Store file system data and its logical structure
• Inspect the directory structure to determine
which files need to be backed up, then write
these files to a backup device in a common,
often compressed, archival format
Incremental backups are logical backups that
store only file system data that has changed
since the previous backup
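One possible way to sketch an incremental logical backup in Python is to walk the directory tree and copy only files modified after the previous backup. Using the modification time as the change test, and the function name `incremental_backup`, are assumptions of this example rather than a description of any particular backup tool.

```python
import os
import shutil


def incremental_backup(src_dir, dst_dir, last_backup_time):
    """Copy files changed since last_backup_time (seconds since the epoch)."""
    copied = []
    for root, _dirs, names in os.walk(src_dir):  # inspect the directory structure
        for name in names:
            src = os.path.join(root, name)
            if os.path.getmtime(src) > last_backup_time:
                rel = os.path.relpath(src, src_dir)
                dst = os.path.join(dst_dir, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)  # copies data and timestamps
                copied.append(rel)
    return sorted(copied)
```

Calling the function with the time of the previous run copies only the files changed since then, so each incremental backup stays small compared with a full copy of the file system.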
Editor's notes
Sequential - the file records are stored strictly in the same order as they occur physically in the file.
Direct - the records are placed in any order that suits the application. The system supports random (direct) access to any record in the file.
Indexed - the records are arranged in a logical sequence according to a key contained in each record. The system maintains an index containing the physical addresses of certain records.
Partitioned - this refers to a file of sequential sub-files. Each sequential sub-file is called a member of the partitioned file.