4. Image Types
1. Hard Copy Vs Soft Copy
Hard copy – the image exists on a physical surface such as paper, plastic, cloth, wood etc.
Soft copy – the image exists in electronic form.
2. Continuous Tone, Half Tone and Bitone
Continuous tone – photographs
Halftone – newspaper / magazine photographs
Bitone – black & white images
5. Image Data Types
Some file formats used in Macromedia Director.
File Import :-
Image – BMP, GIF, JPEG etc
Palette – PAL, ACT
Sound – AIFF, AU, MP3, WAV
Video – AVI, MOV
Animation – DIR, FLA, GIF, PPT
File Export :-
Image – BMP
Video – AVI, MOV
Native – DIR, DXR, EXE
6. Image Data Types (Ref: Fundamentals of Multimedia by Ze-Nian Li and Mark S. Drew)
1 bit images – the image consists of pixels (or pels) that are either on
or off; it is also referred to as a binary image. It is also called a
1 bit monochrome image, since it contains no colour.
8 bit gray level images – the entire image can be thought of as a 2
dimensional array (stored in hardware called a frame buffer) of pixel
values, i.e. a bitmap. Image resolution refers to the number of
pixels in a digital image.
7. Dithering
Full-colour photographs may contain an almost infinite range of
colour values. Dithering is the most common means of reducing
the colour range of images down to the 256 (or fewer) colours
seen in 8-bit GIF images.
Dithering is the process of juxtaposing pixels of two colours to
create the illusion that a third colour is present. A simple
example is an image with only black and white in the colour
palette. By combining black and white pixels in complex patterns,
a graphics program like Adobe Photoshop can create the illusion
of grey values.
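The juxtaposition idea above can be sketched with a tiny ordered-dither routine. The 2×2 Bayer threshold matrix used here is a standard pattern, not something specified in the text, and the gray levels are restricted to 0..3 for brevity.

```python
# Ordered (Bayer) dithering: approximate gray shades using only black (0)
# and white (1) pixels. A minimal sketch, not any particular tool's method.

BAYER_2X2 = [[0, 2],
             [3, 1]]  # threshold pattern tiled across the image

def dither(gray):
    """Map a 2-D list of gray values in 0..3 to a 0/1 bitmap."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, g in enumerate(row):
            threshold = BAYER_2X2[y % 2][x % 2]
            out_row.append(1 if g > threshold else 0)
        out.append(out_row)
    return out

# A flat mid-gray patch becomes a checker-like pattern of black and white:
patch = [[2, 2], [2, 2]]
print(dither(patch))   # -> [[1, 0], [0, 1]]
```

Seen from a distance, half the pixels being on reads as the "third colour" gray.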
10. Image Data Types
The 2 most common data types for graphics and image file formats:
24 bit color – each pixel (picture element in a digital image) is
represented by 3 bytes for RGB, i.e. it supports 256 × 256 × 256
possible combined colors.
Many 24 bit images are actually stored as 32 bit images, with the
extra byte of data for each pixel storing an alpha value
representing special-effect information.
The alpha channel is used for compositing several overlapping
objects.
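As a sketch of how the alpha byte drives compositing, here is the standard "over" blend of a foreground colour on a background; the colour values are made up for illustration.

```python
# "Over" compositing with an alpha channel: a sketch of how the 4th byte of
# a 32-bit RGBA pixel blends a foreground object over a background.
def composite_over(fg, bg, alpha):
    """fg, bg: (R, G, B) tuples with 0..255 channels; alpha in 0.0..1.0."""
    return tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg, bg))

red, blue = (255, 0, 0), (0, 0, 255)
print(composite_over(red, blue, 0.5))   # -> (128, 0, 128), a purple blend
print(composite_over(red, blue, 1.0))   # fully opaque -> foreground red
```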
11. Image Data Types
8 bit color – used for space considerations: the color information is
quantized to collapse it. The concept of a Look-Up Table is used to
store the color information. A data structure called a color
histogram is also used to store the number of occurrences of each
particular color.
CLUTs (palettes) – if a pixel stores the value 25, go to row 25 of the
CLUT for its color. Images are usually stored in row-column order as simply
a long series of values.
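The CLUT lookup above can be sketched in a few lines; the 4-entry palette is invented purely for illustration.

```python
# Indexed (palette) colour via a colour look-up table: each pixel stores only
# a row index into the CLUT, and the table holds the actual RGB triple.
clut = [
    (0, 0, 0),        # row 0: black
    (255, 0, 0),      # row 1: red
    (0, 255, 0),      # row 2: green
    (255, 255, 255),  # row 3: white
]

indexed_image = [1, 1, 3, 0, 2]    # pixels stored as a long series of indices

rgb_image = [clut[p] for p in indexed_image]   # pixel value 1 -> row 1 of CLUT
print(rgb_image[0])    # -> (255, 0, 0)
```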
Note : Formats can be platform independent or platform dependent.
12. Image File Formats
Each file format is characterized by a specific compression type and
colour depth. The choice of file formats would depend on :
-the final image quality required and
-the import capabilities of the authoring system.
Popular file formats are :
-BMP (Bitmap)
-JPEG (Joint Photographic Experts Group)
-GIF (Graphics Interchange Format)
-TIFF (Tagged Image File Format)
- PNG (Portable Network Graphics)
-PICT (Picture)
-TGA (Targa)
- PSD (Photoshop Document)
13. BMP
It is the standard Windows image format on DOS- and Windows-compatible
computers.
It supports RGB, Indexed Color, Greyscale, and Bitmap color
modes and does not support alpha channels.
JPEG
It is used to display photographs and other continuous tone images in
HTML documents over the www.
It supports CMYK, RGB and Greyscale color modes and does not support
alpha channels.
14. GIF
It is used to display indexed-color graphics and images in HTML
over the www.
It preserves transparency in indexed color images. It uses 8 bit
color and efficiently compresses solid areas of color while
preserving sharp detail.
It can represent at most 256 colors; images of higher color depth
are mapped to these via a Color Look-Up Table (CLUT).
15. TIFF
It is used to exchange files between applications and computer
platforms. It is a flexible bitmap image format supported by
virtually all paint, image-editing, and page layout applications.
It supports pixel depths of up to 48 bits (16 bits for each of R, G and B),
and images can be stored in a number of different color models
including CMYK, RGB, indexed color and greyscale.
It uses a lossless compression method and hence is an appropriate
format for printing purposes.
16. PNG, PICT, TGA, PSD
PNG supports 24 bit images and produces background
transparency without jagged edges.
PICT format is especially effective at compressing images with
large areas of solid color.
TGA format supports 24 bit RGB images (8 bits x 3 color channels)
and 32 bit RGB images (8 bits x 3 color + 8 bit alpha channel).
PSD format is used by the Adobe Photoshop package and is the only
format supporting all available image models, guides, alpha
channels, spot channels and layers.
17. Image Acquisition
Image Input / Acquisition is the first step of image processing.
It deals with conversion of analog images into digital form, mainly
done with 2 devices:
Scanner – converts a printed image or document into digital
form.
Digital Camera – digitizes real-world images, similar to how a
conventional camera works.
18. Scanner
The scan head contains a source of white light which on getting
reflected by the paper image is made to fall on a grid of
electronic sensors, by an arrangement of mirrors and lenses.
The electronic sensors are called Charge Coupled Devices
(CCD) and are basically converters of the light energy into
voltage pulses.
After a complete scan, the image is converted from a continuous
entity into a discrete form represented by a series of voltage
pulses. This process is called Sampling.
19. Scanner ...
Scanner Types :
1. Flatbed scanner – head with a source of white light, mirrors.
2. Drum scanners – cylindrical drum, photo multiplier tube
(PMT).
3. Bar-code scanners – machine readable representation of
information in a visual format.
Color Scanning
20. Image Acquisition
(Ref : Digital Image Processing – Gonzales and Woods)
Elements of Visual Perception
1. Structure of the human eye – cornea, sclera, choroid
and retina
2. Image formation in the eye – the radius of curvature of the
anterior surface of the lens is greater than the radius of its
posterior surface. The distance between the center of the lens
and the retina, called the focal length, is variable. When the eye
focuses on an object farther away than about 3 m, the lens
exhibits its lowest refractive power.
3. Brightness adaptation and discrimination
24. Image Sensing and Acquisition
The types of images in which we are interested are generated by the
combination of an illumination source and the reflection or absorption of
energy from that source by the elements of the scene being imaged.
Transforming illumination energy into digital images :
1. Incoming energy is transformed into a voltage by the combination
of input electrical power and sensor material that is responsive to the
particular type of energy being detected.
2. The output voltage waveform is the response of the sensors, and a
digital quantity is obtained from each sensor by digitizing its response.
25. Image Acquisition...
Using a Single Sensor
A photodiode is constructed of silicon material and its output voltage
waveform is proportional to light. A filter in front of the sensor improves
selectivity. For generating a 2-D image, there is relative displacement in
the x and y directions between the sensor and the imaged area.
Such mechanical digitizers are referred to as microdensitometers.
Eg: flat bed with a bidirectional sensor
26. Image Acquisition...
Using Sensor Strips
Sensor strips mounted in a ring configuration are used in medical and
industrial imaging to obtain cross sectional images of 3-D objects.
Sensing devices with 4000 or more in-line sensors are possible. In-line
sensors are used routinely in airborne imaging applications.
Eg : Flat bed scanners
27. ...
Basic steps of image processing : Input, Editing and Output
Basic Concepts in Sampling and Quantization
Suppose there is a continuous image, f(x, y), that is to be
converted to digital form. An image may be continuous with respect to
the x and y coordinates and also in amplitude.
Digitizing the coordinate values is called Sampling.
Digitizing the amplitude values is called Quantization.
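The two steps above can be sketched directly: sample f(x, y) on a discrete grid, then quantize each amplitude to an integer level. The sinusoidal f and the sampling extent are invented stand-ins for the analog scene, not anything from the text.

```python
# Sampling and quantization of a "continuous" image f(x, y).
import math

def f(x, y):                        # analytic stand-in, values in 0.0..1.0
    return (math.sin(x) * math.cos(y) + 1) / 2

def digitize(f, width, height, levels, extent=math.pi):
    img = []
    for j in range(height):                  # sampling: discrete (x, y) grid
        row = []
        for i in range(width):
            x, y = i * extent / width, j * extent / height
            v = f(x, y)
            row.append(min(int(v * levels), levels - 1))   # quantization
        img.append(row)
    return img

img = digitize(f, 4, 4, levels=256)   # 4x4 samples, 8-bit amplitudes
print(img[0][0])                      # f(0, 0) = 0.5 -> gray level 128
```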
28. Storage Processing
(Ref : Multimedia in Practice by Judith Jeffcoate)
The factors influencing the choice of a suitable storage system for
multimedia will vary according to user's circumstances. They include :
1. Quantity of data to be stored, required access time and
acceptable transfer rate.
2. Type of information to be stored: alphanumeric data, text, line art,
halftones or grey scale, color, audio or video.
3. Stability of data, rates at which it is acquired and changed, its
expected life span and any legal requirements.
4. Number of copies of data, its distribution, whether the system must
be portable between sites.
5. Cost of data preparation, capture, storage media and related
equipment.
6. Skill and experience of users, their training needs.
7. Interfaces required to existing system, backups, security.
8. Conversion of existing data, e.g. microfiche to optical disk.
29. Storage processing ...
1. Magnetic media
RAID – Redundant Arrays of Inexpensive Disks.
2. Optical media
Analogue media
Digital media
3. Compact Disk
CD – DA (compact disk digital audio)
CD – ROM (compact disk read only memory)
Recordable compact disk
CD ROM XA (CD ROM Extended Architecture)
CD I (CD Interactive)
30. Communication
1. Building multimedia networks
Bandwidth – high capacity
Synchronization – video, sound and data.
Different types of information flow – isochronous (continuous) and
asynchronous (bursty)
Variable Demand
31. Image Enhancement (Ref: Fundamentals of Digital Image Processing by Anil K. Jain)
1. Enhancement by point processing – contrast stretching,
clipping, window slicing, histogram modelling.
2. Spatial filtering – smoothing, filtering, unsharp masking,
zooming
3. Colour image processing.
32. Image Enhancement...
Image Enhancement refers to the accentuation or sharpening of image
features such as edges, boundaries or contrast, to make a graphic
display more useful for display and analysis.
It does not increase the inherent information content of the data.
The greatest difficulty in image enhancement is quantifying the
criterion for enhancement.
33. Point Operations
Point operations are zero-memory operations where a given gray
level u ∈ [0, L] is mapped into a gray level v ∈ [0, L] according to a
transformation
v = f(u)
Following are the transformations.
1. Contrast Stretching
Low contrast images often occur due to poor or nonuniform
lighting conditions, or due to nonlinearity or the small dynamic
range of the imaging sensor.
The slope of the transformation is chosen greater than unity in
the region of stretch.
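A minimal sketch of such a transformation: piecewise-linear v = f(u) with slope greater than 1 inside a stretched band [a, b] and slope less than 1 outside it. The breakpoints below are illustrative choices, not values from the text.

```python
# Piecewise-linear contrast stretching for 8-bit gray levels.
L = 255

def stretch(u, a=100, b=150, va=50, vb=200):
    if u < a:                                   # compress the dark end
        return round(u * va / a)
    if u <= b:                                  # stretch mid-range (slope 3)
        return round(va + (u - a) * (vb - va) / (b - a))
    return round(vb + (u - b) * (L - vb) / (L - b))  # compress bright end

print(stretch(100), stretch(125), stretch(150))  # -> 50 125 200
```

The 50-level input band [100, 150] is spread over 150 output levels, so mid-range detail becomes more visible.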
35. Explanation
For example, the gray scale intervals where pixels occur most frequently
would be stretched most to improve the overall visibility of a scene.
36. Point Operations...
2. Clipping and thresholding
Clipping is a special case of contrast stretching. This is useful
for noise reduction when the input signal is known to lie in the
range.
Thresholding is a special case of clipping. A binary image may
not give a binary output when scanned because of sensor noise
and background illumination variations. Thresholding is used to
make such an image binary.
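A thresholding sketch for the scenario above; the threshold T = 128 is an arbitrary choice for illustration, not a value from the text.

```python
# Thresholding to restore a binary image corrupted by sensor noise and
# illumination variation: gray levels at or above T map to white, below to black.
def threshold(row, T=128):
    return [255 if u >= T else 0 for u in row]

noisy = [12, 3, 250, 247, 130, 90]      # a scanned "binary" line with noise
print(threshold(noisy))                  # -> [0, 0, 255, 255, 255, 0]
```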
37. Point Operations...
3. Digital Negative
A negative image can be obtained by reverse scaling of the gray levels
according to the transformation.
4. Intensity Level Slicing
These transformations permit segmentation of certain gray level
regions from the rest of the image. This technique is useful when
different features of an image are contained in different gray levels,
for example segmentation of low-temperature regions (clouds,
hurricanes) in satellite images where high gray-level intensities
correspond to low temperatures.
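Both point operations above fit in a few lines for 8-bit pixels; the slice band [120, 180] is an arbitrary example, not from the text.

```python
# Digital negative and intensity-level slicing for 8-bit gray levels.
L = 255

def negative(u):
    return L - u                     # reverse scaling of the gray levels

def slice_levels(u, lo=120, hi=180, background=0):
    # with background=0 the selected band is shown white, the rest black;
    # pass background=u instead to keep the rest of the image unchanged
    return L if lo <= u <= hi else background

print(negative(0), negative(255))            # -> 255 0
print(slice_levels(150), slice_levels(50))   # -> 255 0
```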
39. Point Operations...
5. Bit Extraction
Suppose each image pixel is quantized to B bits. It is desired to
extract the nth most significant bit and display it.
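The extraction above is a shift and a mask; B = 8 is assumed here for a typical gray-level image.

```python
# Extract the n-th most significant bit plane of a pixel quantized to B bits.
def bit_plane(u, n, B=8):
    """Return bit n of pixel u, counting n = 1 as the most significant."""
    return (u >> (B - n)) & 1

print(bit_plane(0b10110010, 1))   # most significant bit -> 1
print(bit_plane(0b10110010, 2))   # second bit -> 0
print(bit_plane(0b10110010, 3))   # third bit -> 1
```

Displaying bit plane 1 for every pixel gives a binary image carrying most of the visually significant information.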
40. Image Compression
Process for file size reduction using mathematical algorithms.
Raw / uncompressed media data – an analog signal that has been
digitized and stored on disk as a digital file. To reduce its size, it
needs to be filtered by software called a CODEC –
Compressor / Decompressor or Coder / Decoder.
The compression process must be reversible; for lossy CODECs,
however, the decompressed data may not be identical to the original
uncompressed data.
41. Types of Compression
Lossless – the CODECs represent the existing information in a
more compact form without actually discarding any data, ensuring
good quality, but the compression ratio is not very high.
Eg: medical images.
Lossy – parts of the original data are discarded permanently to
reduce the file size; quality is compromised but the compression
ratio is very high.
Eg: multimedia presentations, web page content.
43. Symmetrical and Asymmetrical
Symmetrical – a compression system that requires the same
processing power and time scale to compress and decompress
an image.
Asymmetrical – a compression system that requires different
processing power and time scales to compress and decompress
an image.
44. Intraframe and Interframe
Based on the kind of redundancies:
Intraframe – applicable within a still image or a single video frame.
Redundancies which occur when different portions of an image
are identical are detected for file compression.
Interframe – applicable when redundancies occur between adjacent
frames in a video sequence (temporal redundancy).
Statistical Redundancy – relationships existing within the media data.
Psycho-Visual Redundancy – visual information is not perceived
equally.
45. Lossless / Statistical Compression Techniques
These are also known as Entropy Encoding, in which the
compression techniques do not consider the nature of the
information to be compressed; the semantics of the
information are ignored. A few methods are:
RLE – Run Length Encoding
Shannon–Fano algorithm
Arithmetic coding
46. RLE Method
Sequence of repetitive characters may be replaced by a more compact
form. 'n' successive characters may be replaced by a single instance of
the character and the number of occurrences.
47. Shannon–Fano Algorithm
This is a basic information-theoretic algorithm. A simple example will be
used to illustrate the algorithm:
Symbol A B C D E
----------------------------------
Count 15 7 6 6 5
Encoding for the Shannon–Fano algorithm:
* A top-down approach
1. Sort symbols in decreasing order of their frequencies/probabilities, e.g., ABCDE.
2. Recursively divide into two parts, each with approx. the same number of
counts.
48. Shannon–Fano Algorithm Example
Symbol Count log(1/p) Code Subtotal (# of bits)
------ ----- -------- --------- --------------------
A 15 1.38 00 30
B 7 2.48 01 14
C 6 2.70 10 12
D 6 2.70 110 18
E 5 2.96 111 15
TOTAL (# of bits): 89
49. Lossy Compression
Also called Source Coding. The nature of the input signal is considered
for the compression.
Human audio-visual capabilities and limitations are considered for
isolating portions of the media that cannot be perceived by average
human senses.
Eg: colours which cannot be perceived by the human eye may be
discarded, and frequencies not audible to the human ear may be
filtered out.
51. Vector quantization
Vector quantization (VQ) is a lossy data compression method
based on the principle of block coding. It is a fixed-to-fixed length
algorithm.
Block coding - divide message into blocks, each of k bits, called
datawords, add r redundant bits to each block to make the length
n = k + r. The resulting n-bit blocks are called codewords.
Vector quantization is used for lossy data compression, lossy data
correction and density estimation.
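A minimal VQ sketch of the block-coding idea above: the image is cut into fixed-length blocks (vectors) and each block is replaced by the index of its nearest codeword. The 4-entry codebook and 2-pixel block size are invented for illustration.

```python
# Fixed-to-fixed-length vector quantization: 2 pixels in -> 1 codebook index out.
codebook = [
    (0, 0),        # codeword 0: dark pair
    (128, 128),    # codeword 1: mid-gray pair
    (255, 255),    # codeword 2: bright pair
    (0, 255),      # codeword 3: dark/bright edge
]

def nearest(vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)))

def vq_encode(pixels):
    return [nearest(pixels[i:i + 2]) for i in range(0, len(pixels), 2)]

def vq_decode(indices):                      # lossy: codewords, not originals
    return [p for i in indices for p in codebook[i]]

row = [10, 5, 120, 140, 250, 255, 3, 240]
idx = vq_encode(row)
print(idx)              # -> [0, 1, 2, 3]
print(vq_decode(idx))   # reconstruction differs slightly from the input
```

The loss comes from the codebook itself: decoded pixels can only take codeword values, which is why VQ is a lossy method.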
52. fractal compression technique
The method is best suited for textures and natural images, relying
on the fact that parts of an image often resemble other parts of
the same image. Fractal algorithms convert these parts into
mathematical data called "fractal codes" which are used to
recreate the encoded image.
53. transform coding
In transform coding, knowledge of the application is used to
choose information to discard, thereby lowering its
bandwidth. The remaining information can then be
compressed via a variety of methods. When the output is
decoded, the result may not be identical to the original
input, but is expected to be close enough for the purpose of
the application.
54. Psycho-Visual / Psycho-Acoustic Analysis
This stage is responsible for analyzing the transformed data and
identifying which portions may be irrelevant with respect to the human
visual or acoustic system.
Ear – psycho-acoustic model: frequency masking (sensitivity to
sound), temporal masking (differences in sound level)
Eye – spatial frequency (closely spaced light and dark patterns are
difficult to detect)
55. Interframe Correlation
An inter frame is a frame in a video compression stream which is
expressed in terms of one or more neighboring frames. The
"inter" part of the term refers to the use of inter frame prediction.
This kind of prediction tries to take advantage of temporal
redundancy between neighboring frames, allowing higher
compression rates to be achieved. Two techniques:
Frame Replenishment
Motion Compensation
56. Hybrid – JPEG DCT
JPEG is a compression standard for continuous-tone gray-scale or
colour images. It uses a combination of the discrete cosine
transform (DCT), quantization and run-length encoding, and supports
various modes of operation including lossless and lossy modes.
Its performance depends on the complexity of the image.
DCT transforms each block from the spatial domain to the
frequency domain.
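The 2-D DCT applied to each 8×8 block can be written straight from the DCT-II definition; this is a direct (slow) sketch rather than the fast transform a real JPEG codec uses. A flat block transforms to a single DC coefficient with all AC terms zero, which is what makes smooth image regions so compressible.

```python
# Direct 2-D DCT-II of an 8x8 block (spatial domain -> frequency domain).
import math

N = 8

def dct2(block):
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
            cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = cu * cv * s
    return out

flat = [[100] * N for _ in range(N)]     # a uniform 8x8 block
coeffs = dct2(flat)
print(round(coeffs[0][0]))   # DC term -> 800: all of the block's energy
print(round(coeffs[0][1]))   # every AC term -> 0
```

Quantization then discards or coarsens the high-frequency coefficients, and run-length encoding compresses the resulting runs of zeros.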