1. • Noise can be defined as disagreeable or undesired sound.
• What is sound to one person may be a noise to somebody else.
• The amplitude of a sound wave corresponds to its loudness, the
frequency to its pitch, and the waveform to its quality.
• Sound of frequency 20 Hz to 20,000 Hz can be heard by the human ear
and constitutes the audible frequency range.
• To process sound the main system components required are
microphones for sound input, amplifiers for boosting the loudness
levels and loudspeakers for output or playback of sound.
• When sound needs to be processed in a computer it should first be
converted to digital format. This is done by Sampling, Quantization
and code-word generation.
• To compress the sound files both lossy as well as lossless
compression algorithms can be used.
• For example, the MP3 audio format for distributing music on the Internet
is based on a lossy compression algorithm.
• Music generated by synthesizers is pure digital sound, which does not
need to be converted to digital format before processing.
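A minimal Python sketch of the digitization chain described above (sampling, quantization, code-word generation), assuming an illustrative 1 kHz test tone, an 8 kHz sampling rate and 8-bit quantization; all values are chosen for demonstration only:

```python
import numpy as np

fs = 8000                      # sampling rate (samples per second)
f = 1000                       # tone frequency in Hz
t = np.arange(fs) / fs         # one second of sample instants

analog = np.sin(2 * np.pi * f * t)    # "analog" signal, range [-1, 1]

# Quantization: map each sample to one of 256 discrete levels (8 bits)
levels = 256
quantized = np.round((analog + 1) / 2 * (levels - 1)).astype(np.uint8)

# Code-word generation: each quantized level becomes an 8-bit binary word
code_words = [format(int(q), "08b") for q in quantized[:5]]
print(code_words)              # first five 8-bit code words
```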
2. Power of Sound
• When something vibrates in the air, moving back and forth, it creates
waves of pressure.
• These waves spread like ripples from a pebble tossed into a still
pool, and when they reach the eardrum the change of pressure, or
vibration, is experienced as sound.
• The sound pressure levels are measured in
decibels (dB).
3. Unit IV
Audio and Video
Sound – It is the result of pressure variations or
oscillations in an elastic medium such as air, water or
solids, generated by a vibrating surface or turbulent
fluid flow.
Sound is a form of energy capable of flowing from
one place to another through a material medium. It is
generated from vibrating objects when part of the
energy of vibration is converted to sound energy.
4. Multimedia Sound
• Windows system default file format for audio
is WAV, which resides in the Windows\Media
subdirectory.
• Macintosh systems use SND as default format
for audio.
5. Acoustics
• It is the branch of physics that studies sound.
• It studies wave motion in gases, liquids and solids, and the
effects of such wave motion.
• Thus acoustics ranges from physical acoustics to
bioacoustics, psychoacoustics and music, the design of
theatres and concert halls, etc.
• Sound is a form of energy. It is generated from
vibrating objects and can flow through a material
medium from one place to another.
• During the generation of sound the kinetic energy of
the vibrating body is converted to sound energy.
• Sound energy flowing outwards from its point of
generation can be compared to a wave spreading over
the surface of water.
6. Acoustic Engineering
• The application of acoustics in technology is called acoustic engineering.
• The subdivisions of acoustics are:
1. Aero-acoustics – Concerned with how gas flows produce sound; applied in
aeronautics, e.g. the sound and shock waves produced by jets.
2. Architectural acoustics – Concerned with the study of sound in buildings such as
halls and auditoriums.
3. Bio-acoustics – Concerned with the study of sound in medicine, e.g.
ultrasonography.
4. Physical acoustics – Concerned with the interaction of sound with materials and
fluids, e.g. how shock waves travel through the solid and liquid portions of
the earth's crust.
5. Speech communication – Concerned with the production, analysis,
transmission and recognition of speech.
6. Ultrasonics – Concerned with the study of high-frequency sounds beyond the
human hearing range.
7. Musical acoustics – Concerned with the study of sound in relation to
musical instruments.
7. Characteristics of Sound
• Amplitude – It is the maximum displacement of a particle in
the path of a wave and is a measure of the peak-to-peak height
of the wave. For sound waves, amplitude means the loudness
of the sound: the larger the energy of the sound wave, the
greater its amplitude and the louder its effect on our ears.
• Wavelength – The wavelength (λ) is the distance travelled by
the wave during one cycle; its unit is the metre.
• Frequency – The frequency (f) is the number of particle
vibration cycles in the medium per unit time, i.e. the number of
cycles per second, and is expressed in Hz.
• Period – The period (T) is the time taken for one cycle
of a wave to pass a fixed point. It is related to frequency by
T = 1/f seconds.
8. • Speed/Velocity – The speed of sound
propagation c, the frequency f and the
wavelength λ are related by the following
equation:
c = fλ
• Waveform – This indicates the actual shape of
the wave when represented pictorially.
Shapes of the waves can be sinusoidal, square,
triangular etc. Two sounds having the same
loudness and pitch but different waveforms
sound different to our ears.
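As a quick worked check of these relations (taking the speed of sound in air to be approximately 343 m/s at room temperature, a standard reference value): a 1000 Hz tone has wavelength λ = c/f = 343/1000 ≈ 0.343 m and period T = 1/f = 1/1000 s = 1 ms.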
9. Hearing Threshold and Masking
• Two fundamental phenomena that govern
human hearing are the minimum hearing
threshold and amplitude masking.
• Minimum threshold is the least audible sound
that the normal human ear can detect and hear.
• The minimum threshold values, i.e. the amplitudes
of the least audible sounds, when plotted against
frequency values give rise to the minimum
threshold curve.
10. • The portion above the curve denotes the audible
region while the portion below it denotes the
inaudible range.
• The minimum value on the curve occurs
around the frequency of 1 kHz, where the
sensitivity of the human ear is greatest.
11. • Amplitude Masking occurs because an audible
sound has a tendency to distort the threshold
curve and shift it upwards.
• The human ear contains about 30,000 hair cells
which detect local vibrations and convey audio
information to the brain via electrical impulses.
• These hair cells respond to the strongest
stimulation in their local region.
• The amount of distortion of the curve is
restricted to a small region surrounding the
strongest sounds.
• The entire range of audible frequencies is divided
into a number of such regions known as critical
bands.
12. • The critical bands are much narrower at low
frequencies than at high frequencies, and the
width of a critical band determines the extent of
the distortion of the threshold curve.
• Within a critical band of frequencies only the
strongest sound will be heard while the others
are subdued.
• Sound 1 distorts the curve in such a way that
another sound, sound 2, which would normally
have been audible had it occurred alone, falls
below the threshold curve and is rendered
inaudible. Sound 1 is called the masker and
sound 2 the masked sound.
13. Temporal Masking
• This occurs when tones are sounded close in time
but not simultaneously.
• A louder tone occurring just before a softer tone
makes the latter inaudible.
• Temporal masking increases as the time
difference is reduced.
• The concept of masking is used in making digital
audio compressors by analyzing which portions
of sound are inaudible to the average human ear
and selectively discarding those portions to
reduce the file size of the digital audio.
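The following toy Python sketch illustrates the masking idea behind such compressors; it is not a real psychoacoustic model, and the threshold values and band width are invented for illustration. Spectral components weaker than a threshold that is raised near the strongest component (the masker) are zeroed out, mimicking how a coder discards inaudible portions:

```python
import numpy as np

def apply_toy_masking(magnitudes, base_threshold=0.01, mask_width=3, mask_ratio=0.5):
    out = magnitudes.copy()
    masker = np.argmax(magnitudes)                 # strongest component
    for i in range(len(out)):
        threshold = base_threshold
        if abs(i - masker) <= mask_width and i != masker:
            # raise the threshold in the masker's local region (critical band)
            threshold = max(threshold, mask_ratio * magnitudes[masker])
        if out[i] < threshold:
            out[i] = 0.0                           # treated as inaudible: discard
    return out

spectrum = np.array([0.02, 0.9, 0.3, 0.02, 0.5, 0.005])
print(apply_toy_masking(spectrum))   # weak neighbours of the masker vanish
```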
14. Raster Scanning
• It is a rectangular pattern of image capture and
reconstruction in television.
• In this the scanning is done line by line which creates a
raster.
• It is the systematic process of covering the area
progressively one line at a time.
• In raster scan an image is subdivided into a sequence
of strips known as "scan lines".
• Each scan line can be transmitted in the form of an
analog signal as it is read from the video source.
• This ordering of pixels by rows is known as raster order
or raster scan.
15. • Raster scanning is the pattern of image
detection and reconstruction in television and
is the pattern of image storage and
transmission used in most computer bitmap
image systems.
• In raster scan an image is cut up into
successive samples called pixels along scan
lines.
• Each scan line can be transmitted as it is read
from the detector or can be stored as a row of
pixels values in an array in a computer system.
16. • After each scan line the position of the scan
line is advanced typically downward across
the image in a process known as vertical
scanning.
• A raster is a series of adjacent parallel lines
which together form an image on a display
screen.
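A minimal Python sketch of raster order, showing how the pixels of a 2-D array are visited scan line by scan line, left to right and top to bottom (the image data is invented for illustration):

```python
def raster_scan(image):
    """Yield (row, column, value) in raster order for a 2-D pixel array."""
    for row_index, scan_line in enumerate(image):      # vertical advance
        for col_index, pixel in enumerate(scan_line):  # one scan line
            yield row_index, col_index, pixel

image = [[10, 20], [30, 40]]
print(list(raster_scan(image)))
# [(0, 0, 10), (0, 1, 20), (1, 0, 30), (1, 1, 40)]
```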
17. Scanning and its types
• One sweep from top to bottom of the screen is
called a field.
• Moving the beam from left to right and back is
called scanning.
• There are two types of scanning:
1) Interlaced scanning
2) Non-interlaced scanning
18. Interlace Scanning
• In this the scan-lines for one field are offset or
delayed and interleaved with the next field.
• After every other field the scan-line pattern
repeats; therefore two fields are required to make a
complete picture or frame.
• The beam sweeps 262.5 times horizontally for
each vertical sweep.
• Interlaced scanning is used to get a better
picture and avoid flicker.
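A small Python sketch of how interlacing splits one frame's scan lines into two fields that together reconstruct the complete picture (the frame data here is invented for illustration):

```python
def split_into_fields(frame):
    field1 = frame[0::2]   # every other scan line, starting at line 0
    field2 = frame[1::2]   # the interleaved lines in between
    return field1, field2

frame = ["line0", "line1", "line2", "line3"]
print(split_into_fields(frame))
# (['line0', 'line2'], ['line1', 'line3'])
```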
19. Non-Interlace Scanning
• The CRTs used for computers usually have
non-interlace scanning.
• The vertical sweep rate is 60 Hz, with 260
sweep lines per field.
• In non-interlaced scanning there is only a
single sweep from top to bottom.
• Here the picture comprises one frame.
20. Sensors for Camera
• The primary functions of an image sensor are the
conversion of light photons falling onto the
image plane into a corresponding spatial
distribution of electric charge, the accumulation
and storage of this charge at the point of
generation, and the conversion of the charge to a
usable voltage signal.
• Sensors can be classified into two groups:
1) Vacuum tube 2) Solid-state devices.
21. Vacuum tube
• Vacuum tube devices are those in which the charge
readout is accomplished by an electron beam
sweeping across the charge in a raster fashion,
similar to that in a television picture tube.
• Example of vacuum tubes – TV camera tube
22. Solid State device
• It is based on charge coupled devices or
photodiodes.
• Examples of Solid State device – Charge
Coupled Device (CCD)
• There are two types of CCDs – Linear CCD
and Area CCD.
23. Vacuum Tubes – TV Camera Tubes
• Vacuum tubes were the only technology available for television for
many years.
• Light from the scene is imaged by a lens onto a photoconductive target
formed on the inner surface of an end window in a vacuum tube.
• An electron gun is placed at the opposite end of the tube to the window,
and it provides a source of electrons that are focused into a beam which is
accelerated towards the target by a positive potential on a fine mesh
placed just in front of the target and scanned across the target by an
electrostatic or magnetic deflector system.
• The target consists of a glass faceplate on the inner surface, on which is
placed a transparent electrically conducting coating of indium tin oxide.
• On top of this is deposited a thin layer of photoconductive material in a
pattern of tiny squares, each insulated laterally from its neighbors.
• The transparent coating is connected to a positive voltage through an
electrical load resistor across which the signal voltage is developed.
24. • In the absence of light the electron beam causes
the external surface of the photoconductor to be
charged to near zero potential.
• Light causes the resistance of the
photoconductor to decrease and its surface to
acquire a more positive potential, resulting in an
accumulation of positive charge.
• At each point on the surface touched by the
electron beam some of the beam current is
deposited to neutralize the positive charge
present due to the illumination.
• The rest passes through the load resistor
generating an output voltage that is a function of
the light intensity at that point.
25. CCD
• CCD is another sensor. A CCD is fabricated on a single-crystal wafer of p-type silicon
and consists of a one- or two-dimensional array of charge storage cells.
• Each cell has several closely spaced electrodes, i.e. gates, on top, separated from the
silicon by an insulating layer of silicon dioxide.
• The charge is stored under one of the electrodes and its location within the cell is
defined by the pattern of positive voltage applied to the electrodes.
• By applying a coordinated sequence of clock pulses to all the electrodes in the
array, packets of stored charge are transferred from one cell to the next until they
finally reach a sensing amplifier which generates a voltage signal proportional to
charge.
• The result is a device that has a linear variation of output voltage with light from
the minimum useful level set by noise to the maximum useful level set by
saturation of the output amplifier or the limited capacity of the charge storage
cells.
• Compared with photographic film CCDs are from 10 to 100 times more sensitive,
linear in response rather than nonlinear and have a much greater dynamic range
so that both faint and bright objects can be recorded in the same exposure.
26. Types Of CCDs
• There are two types of CCDs –
1) Linear Charge Coupled Devices and 2) Area Charge Coupled Device.
Linear Charge Coupled Device –
The Linear CCD consists of a line of up to several thousand photosites and a parallel CCD shift
register terminated by a sensing amplifier.
Each photosite is separated from a shift register cell by a transfer gate.
During operation a voltage is applied to each photosite gate to create empty storage wells
which then accumulate amounts of charge proportional to the integral of the light intensity
over time.
At the end of the desired integration period the application of a transfer pulse causes the
accumulated charge packets to be transferred simultaneously to the shift register cells
through the transfer gates.
The charges are clocked through the shift register to the sensing amplifier at a rate up to 20
MHz, producing a sequence of voltage pulses with amplitudes proportional to the integrated
light falling on the photosites.
There is a limit, between 10^5 and 10^6 electrons depending on photosite size and dimensions,
to the number of electrons that can be stored in a particular cell, beyond which electrons
start to spill over into adjacent cells.
The saturation charge in electrons is roughly 1000 to 2000 times the area of the photosite in
square micrometres.
27. Area Charge Coupled Devices
• Area CCDs use three basic architectures:
1) The simplest is the full-frame CCD consisting of an imaging area separated from a
horizontal CCD shift register by a transfer gate.
2) In the imaging area, each photosite is one stage of a vertical shift register
separated from neighboring shift registers by channel stops and antiblooming
structures. During the light integration period the vertical clocks are stopped
creating potential wells which collect photoelectrons.
3) At the end of this period the charge is clocked out vertically one row at a time into
the horizontal shift register. The charge in the horizontal shift register is then very
rapidly shifted towards the output amplifier by the application of horizontal clock
signal.
4) A typical full-frame CCD has pixels arranged in a 1024x1024
configuration, with dual horizontal shift registers and outputs to achieve a 30 frame
per second readout rate.
5) To avoid image smearing during the readout period, full-frame CCD sensors
must be operated with external shutters, as in the digital cameras produced by most
companies.
28. Color Fundamentals
• Within the computers the pixels are stored according to
RGB color space.
• There are three components that specify a color:
hue, saturation and brightness.
• Hue –
It is the basic property of what the human eye perceives
as color.
It refers to the color quality of the light and corresponds
to the color names that we use, such as orange, indigo,
cyan etc.
When we say "color" we usually mean hue; in other words,
hue is the quality of color.
29. Saturation –
•It refers to the purity of light.
•The more saturated the stimulus, the stronger
the color experience; the less saturated, the
more it appears white, gray or black, which are
called achromatic (no color).
Brightness –
•It is the experience of intensity.
30. Primary and Secondary Colors
Primary Colors –
These are the colors that cannot be created by
mixing other colors.
There are three primary colors, i.e. red, green and
blue.
Secondary Colors –
•A color created by mixing equal amounts of two
primary colors is called a secondary color. Ex –
Green+Blue=Cyan, Blue+Red=Magenta,
Red+Green=Yellow
31. Color Mixing
• There are two different ways to combine colors:
1) Additive mixing
2) Subtractive mixing
Additive Mixing –
It refers to combining lights of two different colors, e.g. by
mixing the beams of two colored spotlights on the same
white wall.
Additive color mixing comes about when two or more hues
mix together to create a third hue.
The additive color model is the one used in computer
displays, as the image is formed on the face of the monitor
by combining beams of red, green and blue light in
different proportions.
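A minimal Python sketch of additive mixing, assuming 8-bit RGB components that add and clamp at 255: mixing full green and full blue light yields cyan, matching the secondary colors listed earlier.

```python
def additive_mix(c1, c2):
    # Lights add: sum each RGB channel, clamped to the display maximum
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))

GREEN = (0, 255, 0)
BLUE = (0, 0, 255)
print(additive_mix(GREEN, BLUE))   # (0, 255, 255) -> cyan
```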
32. Subtractive Mixing –
•It describes how two colored paints or inks combine on
a piece of paper.
•The three subtractive primary colors are Cyan, Magenta
and Yellow.
•It subtracts all the wavelengths not in the color filter.
That is when we put a Magenta filter in front of a light we
are removing all the wave lengths that are not Magenta.
Magenta is composed of Red and Blue, so when we put
that filter in place it is as if we were using a Red and Blue
filter together. If we want to make Red we need to remove
the Blue.
•By adding Yellow to the Magenta we are in essence
canceling the Blue and are left with Red.
•Color printers use the subtractive color model and use
Cyan, Magenta and Yellow inks.
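A small Python sketch of subtractive mixing under a simple per-channel transmission model (the ink values are illustrative, not a real ink model): magenta removes green and yellow removes blue, leaving red, as in the filter example above.

```python
def subtractive_mix(white, *inks):
    r, g, b = white
    for ir, ig, ib in inks:        # each ink transmits a fraction of each channel
        r, g, b = r * ir, g * ig, b * ib
    return (round(r), round(g), round(b))

WHITE = (255, 255, 255)
MAGENTA_INK = (1.0, 0.0, 1.0)      # absorbs green
YELLOW_INK = (1.0, 1.0, 0.0)       # absorbs blue
print(subtractive_mix(WHITE, MAGENTA_INK, YELLOW_INK))   # (255, 0, 0) -> red
```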
33. Color Gamut
• The dyes, pigments and phosphors used to create
colors on paper or computer screens are imperfect and
cannot recreate the full range of visible colors.
• The actual range of colors achievable by a particular
device or medium is called its color gamut.
• Different devices such as computer monitors, printers,
scanners and photographic film all have different color
gamuts, so the problem of achieving consistent color is
quite challenging.
• Different media also differ in their total dynamic range:
how dark the darkest achievable black is, and how light
the brightest white.
34. File Formats
• Audio File Formats:
I. WAV (Waveform Audio)
II. MID or MIDI (Musical Instrument Digital Interface)
III. AU (Audio)
IV. MP3 (MPEG Layer III)
V. VOC (Voice)
VI. RMF (Rich Music Format)
VII. MP3Pro
VIII. WMA (Windows Media Audio)
IX. RA (Real Audio)
X. WavPack
XI. MPC (Musepack)
XII. MOD (Module)
35. • WAV (Waveform Audio) –
This is the format for sampled sounds defined by Microsoft for use
with Windows.
It is an expandable format which supports multiple data formats
and compression schemes.
• AIFF (Audio Interchange File Format) –
The Interchange File Format (IFF) is a generic file format developed
by Electronic Arts in 1985 to facilitate data transfer between
software programs of different vendors, so IFF-based files do not
share any common extension.
An IFF file is built up from chunks.
Each chunk begins with a TypeID or OSType, followed by a 32-bit
integer specifying the size of the following data.
Chunks can hold different data types like text, numerical data or
raw data.
Examples of IFF-based file formats are 8SVX (audio),
AIFF (audio), ANIM (animation), DOC (pre-Word97 text) and ILBM
(raster image).
36. MIDI –
MIDI files contain instructions on how to play
a piece of music rather than the sound itself.
The actual music is generated from a digital synthesizer chip which
can recognize the instructions and retrieve corresponding audio
samples using a repository of sounds.
These files are very compact in size and ideal for web applications.
AU –
This is developed by Sun Microsystems.
This audio file format consists of a header of six 32-bit words which
defines the metadata about the actual audio data following it.
The six words are:
oWord 0 is an identifier defining the file type,
oWord 1 is a data offset value,
oWord 2 is the data size in bytes,
oWord 3 is the data encoding format,
oWord 4 is the number of samples per second ,
oWord 5 indicates the number of channels.
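A minimal Python sketch of reading these six 32-bit header words, assuming a big-endian layout (which is how AU files store them; in a real file the word 0 identifier is the bytes ".snd") and a hypothetical file name:

```python
import struct

def read_au_header(path):
    with open(path, "rb") as f:
        words = struct.unpack(">6I", f.read(24))   # six big-endian unsigned 32-bit words
    magic, data_offset, data_size, encoding, sample_rate, channels = words
    return {
        "magic": magic,              # word 0: file type identifier
        "data_offset": data_offset,  # word 1: offset to the audio data
        "data_size": data_size,      # word 2: data size in bytes
        "encoding": encoding,        # word 3: data encoding format
        "sample_rate": sample_rate,  # word 4: samples per second
        "channels": channels,        # word 5: number of channels
    }

# Example (assuming a file named 'sound.au' exists):
# print(read_au_header("sound.au"))
```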
37. • MP3 –
It is a highly compressed audio format providing
almost CD-quality sound.
MP3 can compress a song to about 5 MB, which is
why it is extensively used for putting audio content
on the Internet.
The file can be coded at a variety of bit rates and
provides good results at bit rates of 96 kbps.
The coding scheme does not provide any error
correction or encryption.
VOC-
It is used for Sound Blaster Sound Cards.
Sound up to 16-bit stereo is supported, along with
compressed formats.
38. • RMF –
Beatnik is a software based high performance
music and audio playback technology created by
Beatnik Inc.
Beatnik Inc. developed their own audio file
format called the Rich Music Format (RMF).
MP3Pro –
It is an extended MP3 version from Coding
Technologies that uses Spectral Band Replication
(SBR) to increase its efficiency at bit
rates below 96 kbps/stereo.
MP3Pro has been implemented in only a few
software and hardware products like Thomson
demo player, Nero etc.
39. • WMA –
It is a proprietary compressed audio file
format used by Microsoft.
It is a part of Windows Media framework.
WMA is always encapsulated in an Advanced
Systems Format (ASF) file.
The resulting file may have the filename suffix
".wma" or ".asf", with the ".wma" suffix being
used only if the file is strictly audio. Files in this
format can be played using Windows Media
Player, Winamp and many other alternative
media players.
40. • RA –
It is developed by RealNetworks.
It is designed for low bandwidths and can be
used as a streaming audio format, i.e. it can be
played at the same time as it is being
downloaded.
Many radio stations use RealAudio to stream
their programming over the Internet in real
time.
The file extension of RA is .ra, .rm or .ram.
The main player for RealAudio content is
RealNetworks' RealPlayer.
41. Video File Formats
The various video file formats are:
I. AVI (Audio/Video Interleaved)
II. MPEG (Motion Picture Expert Group)
III. Real Video
IV. Cinepak
V. Windows Media Video (WMV)
AVI
The native video file format on the Windows platform is AVI, or audio-video interleaved.
The name implies that both audio and video media are stored in the same file, since a video clip
can contain both types of media.
The term interleaved means that within the file the video data and the corresponding audio data
are kept in small alternating chunks instead of widely separated blocks.
AVI is an uncompressed format, i.e. the image frames and audio are stored without any type of
compression, and hence the sizes of AVI files can be large.
An AVI file can be played on Windows using Windows Media Player or a web browser.
42. MOV
The QuickTime movie format was developed by
Apple for both the Windows and the Macintosh
platforms.
These files have the extension MOV and require a
program called Movie Player for playback, which is
freely downloadable from the Apple website.
MOV is a compressed format and supports a
number of CODECs.
It is widely used by web developers to create
cross-platform video clips that can be downloaded
from the Internet.
43. MPEG
The MPEG file format developed by the Moving Pictures
Experts Group is a compressed format based on both
intra-frame and inter-frame compression.
There are several versions of MPEG:
I. MPEG-1 – designed for CD-ROM based applications and
Video CDs; provides a quality comparable to VHS.
II. MPEG-2 – designed for DVD applications; provides a
quality comparable to SVHS.
III. MPEG-4 – provides efficient methods for object-oriented,
content-based storage and retrieval of multimedia content.
IV. MPEG-7 – a scheme for describing multimedia
content through a set of standardized descriptors so that
media objects may be retrieved using queries.
44. REAL VIDEO
The RM file format was developed by RealNetworks
for playing video files from Web pages.
It supports streaming, which means that the
video file starts playing even before it has been
fully downloaded from the Internet.
A program called RealPlayer, freely downloadable
from the RealNetworks web site, is required to
play back an RM file.
45. CINEPAK
CINEPAK was originally developed to play small movies
on 386 and 030 systems from a single-speed CD-ROM
drive.
Its greatest advantage is its extremely low CPU
requirements.
Developers are using CINEPAK at higher data rates than
it was originally designed for and making ever-larger
movies.
VDO LIVE
VDOLive is an architecture for web video delivery
created by VDOnet Corporation. It is a server-based,
"true streaming" architecture that actually adjusts
to viewers' connections as they watch movies.
VDOLive's true streaming approach differs from
QuickTime's progressive download approach.
46. WMV-
It is a set of proprietary streaming video technologies
developed by Microsoft and part of the Windows Media
Framework.
The Windows Media Audio format is used for the audio track.
Some third-party players like MPlayer for Linux can also play
back WMV files.
WMV also implements digital rights management facilities.
Microsoft has submitted version 9 of the WMV CODEC to the
Society of Motion Picture and Television Engineers for
approval as an international standard.
47. Unit V
Compression and Coding
• The different types of media used in multimedia applications (text, fax, images,
speech, audio and video) are represented in digital form.
• These different applications in multimedia need different storage capacity.
• Since the space needed increases according to different applications it is difficult
to store and transfer such data.
• Following are the difficulties in processing such data:
I. Since their size is huge, storage requirements increase with multiple media files,
which in turn increases the cost.
II. Even if there is sufficient storage space such files require a large data transfer rate
that may be beyond the capabilities of both the processor and hard disk.
Due to these facts it is necessary to subject the media files to a process called
compression.
Compression reduces their file sizes using mathematical algorithms after which it
becomes much easier to manipulate these files.
The amount of compression which can be achieved depends both on the original
media data as well as the compression technique applied.
48. Need for Compression
• Compression is the key to the future expansion of the web; it is the key to
the increasing use of multimedia and 3-D technology.
• When multimedia data objects like binary document images, grey scale
images, color images, photographic or video images , animated images,
full- motion video etc are digitized large volume of data are generated.
• Every bit incurs a cost when stored, retrieved, transmitted or
displayed.
• The data objects require a large amount of storage, which
increases the access time for retrieving the data.
• In order to manage large multimedia data objects efficiently these data
objects need to be compressed to reduce the file size for the storage of
these objects.
• The goal of data compression is to represent an information source as
accurately as possible using the fewest bits.
• Thus compression techniques make the data easier to store and
reduce the cost of storing large data.
49. Types of Compression
• The two types of compression are Lossy and Lossless compression.
Lossless data compression
It makes use of data compression algorithms that allow the exact original
data to be reconstructed from the compressed data.
Examples of data requiring lossless treatment are executable programs and source code.
Some image file formats like PNG use only lossless compression and GIF
uses a lossless compression method.
In lossless compression data is not altered or lost in the process of
compression or decompression.
Decompression generates an exact replica of the original objects.
Lossless compression technique is good for text data and for repetitive
data in images like binary images and grey scale images.
Lossless compression techniques have been able to achieve reduction in
size in the range of 1/10 to 1/50 of the original uncompressed size
without visibly affecting image quality.
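A short Python sketch of a lossless round trip using the standard-library zlib module (a lossless algorithm): decompression yields an exact replica of the original bytes, as described above. The sample data is invented for demonstration.

```python
import zlib

original = b"AAAAABBBBBCCCCC" * 100        # repetitive data compresses well
compressed = zlib.compress(original)

print(len(original), "->", len(compressed))        # size reduction
assert zlib.decompress(compressed) == original     # exact replica of the original
```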
50. Lossy Compression
When compression methods are used that may result in loss of some
amount of information the key issue is the effect of this loss.
Lossy compression is often used for compressing audio, grey scale or color
images and video objects in which absolute data accuracy is not essential.
Gray-scale or color images are known as continuous-tone images.
When a video image is decompressed on a frame by frame basis the loss of
data in one frame will not be perceived by the eye.
In all cases there is usually a time factor that helps reduce the loss of
information.
Lossy compression techniques can be used alone or in combination with
other compression methods in multimedia objects consisting of audio, color
images and video, as well as other specialized data types.
Some lossy compression mechanisms are:
1. Joint Photographic Experts Group (JPEG)
2. Moving Picture Experts Group (MPEG)
3. Intel DVI
4. Fractals
51. Lossy versus Lossless Techniques
I. A lossy method can produce a much smaller compressed file than
any known lossless method.
II. Lossy methods are most often used for compressing sound,
images or videos.
III. The compression ratios of lossy video codecs are nearly always far
superior to those of audio and still-image codecs.
IV. Audio can be compressed at 10:1 with no noticeable loss of
quality.
V. Video can be compressed immensely with little visible quality loss
e.g. 300:1.
VI. When a user acquires a lossy-compressed file, the retrieved file
can be quite different from the original at the bit level while being
indistinguishable to the human ear or eye for most practical
purposes.
52. I. Lossless compression algorithms usually exploit
statistical redundancy in such a way as to
represent the sender’s data more concisely.
II. Lossless compression is possible because most
real-world data has statistical redundancy.
III. Lossless compression schemes are reversible so
that the original data can be reconstructed .
IV. Lossless data compression algorithms will
always fail to compress some files.
V. Attempts to compress data that has been
compressed already will result in an expansion.
53. Video Compression Techniques
[Block diagram: the color components of the video feed into individual compression
techniques – Simple (truncation, CLUT, run-length), Interpolative (subsample),
Predictive (DPCM, motion compensation), Transform (DCT) and Statistical (Huffman,
fixed or adaptive) – whose outputs pass through a bit-assignment stage to form the
video compression algorithm.]
54. • In video compression the word technique refers to a single method
of compression.
• Algorithm refers to the collection of all the techniques used by any
particular video compression system.
• The block diagram above explains how techniques are used to
create an algorithm.
• Most compression systems will deal with the color components
separately, processing each one by itself.
• In decompression the components are similarly recovered
separately and then combined into the appropriate display format.
• The output of a compression process is a bit stream it is usually no
longer a bitmap and individual pixels may not be recognizable.
• The structure of the bit stream is important because it can also
affect the compression efficiency and the behaviour of the system
when errors occur in transmission or storage.
• In the above diagram the separate box called bit assignment is
where the bit stream structure is imposed on the compressed data.
55. Various video compression techniques are:
1) Simple compression technique
2) Interpolative compression technique
3) Predictive compression technique
4) Transform compression technique
5) Statistical compression technique
Simple Video Compression Techniques –
This compression technique has three types:
I. Truncation
II. CLUT
III. Run-length
56. Truncation –
•In truncation the data is reduced by lowering the number of
bits per pixel.
•This is done by removing some of the least significant
bits from every pixel.
•If truncation removes significant pixel information, then
contouring occurs and the image will start looking like a cartoon.
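A one-function Python sketch of truncation, dropping the two least significant bits of each 8-bit pixel value (the pixel list is invented for illustration):

```python
def truncate_pixels(pixels, bits_to_drop=2):
    # Shift out the low bits, then shift back to keep the original scale;
    # nearby grey levels collapse together, which causes contouring.
    return [(p >> bits_to_drop) << bits_to_drop for p in pixels]

print(truncate_pixels([13, 14, 15, 200]))   # [12, 12, 12, 200]
```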
CLUT – Color Lookup Table
•It is a mechanism used to transform a range of input colors
into another range of colors.
•It can be a hardware device built into an imaging system or a
software function built into an image processing application.
•The hardware color look-up table converts the logical
color numbers stored in each pixel of video memory into
physical colors, normally represented as RGB triplets, that can
be displayed on a computer monitor.
57. • The palette is simply a block of fast RAM which is
addressed by the logical color and whose output is
split into the red, green and blue levels which drive the
actual display.
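A minimal Python sketch of a CLUT in software: each pixel stores a small logical color number and the lookup table converts it to a displayable RGB triplet (the three-entry palette and the image are invented for illustration).

```python
palette = [
    (0, 0, 0),        # 0: black
    (255, 0, 0),      # 1: red
    (0, 255, 255),    # 2: cyan
]
indexed_image = [[0, 1], [2, 1]]            # pixels hold palette indices

# Convert logical color numbers to physical RGB triplets via the table
rgb_image = [[palette[i] for i in row] for row in indexed_image]
print(rgb_image)
```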
Run Length Encoding
• In this technique blocks of repeated pixels are replaced
with a single value and a count of how many times to
repeat that value.
• It works well on images which have areas of solid
colors, for example computer-generated images,
cartoons or CLUT images.
• Depending on the kind of image, RL coding can
achieve a large amount of compression, but its effectiveness
is limited to images that contain large runs of
repeated values.
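A minimal Python sketch of run-length encoding on a list of pixel values (the data is invented for illustration): each run of identical pixels becomes a (value, count) pair, which pays off exactly when runs are long.

```python
def run_length_encode(pixels):
    encoded = []
    for p in pixels:
        if encoded and encoded[-1][0] == p:
            encoded[-1][1] += 1            # extend the current run
        else:
            encoded.append([p, 1])         # start a new run
    return encoded

print(run_length_encode([5, 5, 5, 5, 9, 9, 7]))   # [[5, 4], [9, 2], [7, 1]]
```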
58. 2) Interpolative Compression Technique –
•In this the compression is done only on the
chrominance part of the image while the
luminance part is not interpolated.
3) Predictive Compression Technique –
•In this compression technique only the part that
differs from the prediction is transmitted, since the
receiver can predict the part that stays the same.
•Different predictive techniques are PCM,
DPCM and ADPCM.
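A small Python sketch of the DPCM idea, using a trivial "previous sample" predictor chosen for illustration: only the difference between each sample and the prediction is transmitted, and the receiver accumulates the differences to rebuild the signal.

```python
def dpcm_encode(samples):
    prev = 0
    diffs = []
    for s in samples:
        diffs.append(s - prev)   # send only what differs from the prediction
        prev = s
    return diffs

def dpcm_decode(diffs):
    out, prev = [], 0
    for d in diffs:
        prev += d                # rebuild each sample from the running sum
        out.append(prev)
    return out

samples = [100, 102, 103, 103, 101]
assert dpcm_decode(dpcm_encode(samples)) == samples
print(dpcm_encode(samples))   # [100, 2, 1, 0, -2] -- small differences dominate
```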
59. Transform –
•Transform coding is a process that converts a
bundle of data into an alternate form which is
more convenient for some particular purpose.
Statistical –
•Statistical coding takes advantage of the
distribution of the pixel values of an image.
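As an illustration of statistical coding, here is a compact Huffman sketch in Python that assigns shorter code words to more frequent pixel values (the pixel list is invented for demonstration):

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    # Each heap entry: [total count, unique tiebreaker, {symbol: code}]
    heap = [[count, i, {sym: ""}]
            for i, (sym, count) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)            # two least frequent subtrees
        hi = heapq.heappop(heap)
        lo[2] = {s: "0" + c for s, c in lo[2].items()}   # prepend branch bits
        hi[2] = {s: "1" + c for s, c in hi[2].items()}
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], {**lo[2], **hi[2]}])
    return heap[0][2]

pixels = [0, 0, 0, 0, 255, 255, 128]
print(huffman_codes(pixels))   # the frequent value 0 gets the shortest code
```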
60. JPEG
The JPEG standard was designed to provide a
common methodology for compression of
continuous-tone images, i.e. for images not
restricted to dual tones only.
This standard allows compressed files for gray-
scale images, photographic images and still
video which can be utilized by various
multimedia storage and communication
applications.
61. JPEG Compression Technique
• The most commonly accepted single-frame image compression standard is JPEG.
• JPEG specifies a minimum implementation, which all implementations are required to support, plus various
extensions for specific applications.
• JPEG compressor chips and PC boards are available to speed up the compression/decompression
operation.
• JPEG compression algorithms involves eliminating redundant data.
• The amount of loss is determined by the compression ratio; at about 16:1 there is no visible degradation.
• JPEG compression involves several processing stages, starting with an image from a camera or other video
source.
• The image frame consists of three 2-D patterns of pixels, one for luminance and two for chrominance. As
the human eye is less sensitive to high-frequency color information, JPEG calls for the coding of
chrominance information at a reduced resolution compared to the luminance information.
• In the pixel format there is usually a large amount of low-spatial-frequency information and relatively
small amounts of high-frequency information.
• The image information is then transformed from the pixel domain to the frequency domain by a discrete
cosine transform, a DSP algorithm similar to the fast Fourier transform.
• This produces two-dimensional spatial-frequency components many of which will be zero and discarded.
• Near-zero components are truncated to zero and need not be sent.
• JPEG, while designed for still images, is often applied to moving images or video.
• Motion JPEG is possible if the compression/decompression algorithm is executed fast enough to keep up
with the video data stream at typical compression ratios of about 16:1.
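A short Python sketch of the frequency-domain step described above, applying a 2-D DCT to one 8x8 block using SciPy's dctn; the block data and the truncation threshold are illustrative, not part of the JPEG standard:

```python
import numpy as np
from scipy.fft import dctn

rng = np.random.default_rng(0)
block = rng.integers(100, 120, size=(8, 8)).astype(float)   # a smooth-ish 8x8 block

coeffs = dctn(block - 128, norm="ortho")      # level shift, then 2-D DCT
coeffs[np.abs(coeffs) < 1.0] = 0.0            # truncate near-zero components

# Most of the energy sits in a few low-frequency coefficients, so many
# coefficients vanish and need not be sent at all.
print(np.count_nonzero(coeffs), "of 64 coefficients survive")
```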