Compression

Multimedia Compression
• Audio, image and video require vast amounts of data
  • 320x240x8 bits grayscale image: 77 KB
  • 1100x900x24 bits color image: 3 MB
  • 640x480x24 bits x 30 frames/sec: 27.6 MB/sec
• Low network bandwidth does not allow real-time video transmission
• Slow storage or processing devices do not allow fast playback
• Compression reduces storage requirements
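
The sizes quoted above follow from width x height x bits per pixel (x frames per second for video). A minimal sketch of that arithmetic; the helper name `raw_size_bytes` is illustrative and not part of the original slides:

```python
def raw_size_bytes(width, height, bits_per_pixel, frames=1):
    """Uncompressed size in bytes of `frames` images of the given dimensions."""
    return width * height * bits_per_pixel * frames // 8

gray = raw_size_bytes(320, 240, 8)         # 76,800 bytes, about 77 KB
color = raw_size_bytes(1100, 900, 24)      # 2,970,000 bytes, about 3 MB
video = raw_size_bytes(640, 480, 24, 30)   # 27,648,000 bytes for one second of video
```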

E.G.M. Petrakis Multimedia Compression 1

Classification of Techniques
Lossless: recover the original representation
Lossy: recover a representation similar to the original one
  high compression ratios
  more practical use
Hybrid: JPEG, MPEG, px64 combine several approaches

Compression Standards
[Figure: Furht et al. 96]


Lossless Techniques

[Figure: Furht et al. 96]


Lossy Techniques

[Figure: Furht et al. 96]


JPEG Modes of Operation
Sequential DCT: the image is encoded in
one left-to-right, top-to-bottom scan
Progressive DCT: the image is encoded
in multiple scans (if the transmission
time is long, a rough version of the image can
be decoded early and refined by later scans)
Hierarchical: encoding at multiple
resolutions
Lossless: exact reproduction

JPEG Block Diagrams

[Figure: Furht et al. 96]


JPEG Encoder
Three main blocks:
Forward Discrete Cosine Transform (FDCT)
Quantizer
Entropy Encoder
Essentially the sequential JPEG encoder
Main component of progressive, lossless
and hierarchical encoders
For gray level and color images


Sequential JPEG
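• Pixels in $[0, 2^p - 1]$ are shifted to $[-2^{p-1}, 2^{p-1} - 1]$
• The image is divided into 8x8 blocks
• Each 8x8 block is DCT transformed

$$F(u,v) = \frac{C(u)}{2}\,\frac{C(v)}{2} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

$$C(u) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases} \qquad C(v) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } v = 0 \\ 1 & \text{for } v > 0 \end{cases}$$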
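
The forward transform can be written directly from the double sum. The following is a minimal, unoptimized sketch in pure Python; the function names `c` and `fdct_8x8` are illustrative, and real encoders use fast factorizations instead of this four-level loop:

```python
import math

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def fdct_8x8(f):
    """Forward DCT of an 8x8 block f[x][y] of level-shifted samples."""
    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (f[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            F[u][v] = c(u) * c(v) / 4 * s
    return F

# A flat block (every sample equal to 100) has energy only in the DC term.
flat = [[100] * 8 for _ in range(8)]
coeffs = fdct_8x8(flat)
```

For the flat block, the DC coefficient comes out as 8 times the sample value and every AC coefficient is zero up to rounding error.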


DCT Coefficients
F(0,0) is the DC coefficient: proportional to the
average value over the 64 samples (8 times the mean)
The remaining 63 coefficients are
the AC coefficients
Pixels in [-128,127]: DCTs in
[-1024,1023]
Most coefficients are 0 or close to 0
and need not be encoded
This is what makes compression possible

Quantization Step
All 64 DCT coefficients are quantized
Fq(u,v) = Round[F(u,v)/Q(u,v)]
Reduces to 0 the amplitude of coefficients
which contribute little or nothing to image quality
Discards information which is not visually
significant
Quantization coefficients Q(u,v) are
specified by quantization tables
A set of 4 tables is specified by JPEG

Quantization Tables
[Figure: Furht et al. 96]

• for (i = 0; i < 8; i++)
      for (j = 0; j < 8; j++)
          Q[i][j] = 1 + (1 + i + j) * quality;
• quality = 1: best quality, lowest compression
• quality = 25: poor quality, highest compression
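
As a rough illustration, the table formula Q[i,j] = 1 + (1+i+j)·quality and the quantization rule Fq(u,v) = Round[F(u,v)/Q(u,v)] can be combined as below. This is a sketch for an 8x8 block; the function names are illustrative, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the slide's Round[·]:

```python
def quantization_table(quality):
    """8x8 table with Q[i][j] = 1 + (1 + i + j) * quality."""
    return [[1 + (1 + i + j) * quality for j in range(8)] for i in range(8)]

def quantize(F, Q):
    """Fq(u,v) = Round[F(u,v) / Q(u,v)] for an 8x8 coefficient block."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

Q2 = quantization_table(2)                  # step sizes grow from 3 to 31
F = [[100.0] * 8 for _ in range(8)]         # hypothetical coefficient block
Fq = quantize(F, Q2)
```

Higher `quality` values give larger step sizes, so more high-frequency coefficients round to 0.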


AC Coefficients
[Figure: Furht et al. 96]

• The 63 AC coefficients are ordered by a “zig-zag” sequence
• Places low frequencies before high frequencies
• High frequencies are likely to be 0
• Sequences of such 0 coefficients will be encoded by fewer bits
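
One way to generate the zig-zag ordering is to walk the anti-diagonals of the block, alternating direction. A minimal sketch (function names are illustrative):

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col indexes one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def zigzag_scan(block):
    """Flatten a square block into a list, DC coefficient first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

first_six = zigzag_order()[:6]
```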

DC Coefficients
• Predictive coding of DC coefficients
• Adjacent blocks have similar DC intensities
• Coding the differences yields high compression
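
The differential coding of DC values can be sketched as follows. The helper names are illustrative, and predicting the first block from 0 is an assumption of this sketch:

```python
def dc_differences(dc_values):
    """Replace each block's DC coefficient by its difference from the previous block's."""
    diffs, previous = [], 0                 # assumption: the first block is predicted from 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def dc_restore(diffs):
    """Decoder side: a running sum of the differences restores the DC values."""
    values, previous = [], 0
    for d in diffs:
        previous += d
        values.append(previous)
    return values

diffs = dc_differences([120, 124, 123, 130])
```

Because neighbouring blocks have similar DC values, the differences are small numbers that need few bits.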


Entropy Encoding
• Encodes sequences of quantized DCT coefficients into binary sequences
• AC: (runlength, size) (amplitude)
• DC: (size, amplitude)
• runlength: number of consecutive 0’s, up to 15
  • takes up to 4 bits for coding
  • longer runs are split, with (15,0) standing for a run of 16 zeros: (39,4)(12) = (15,0)(15,0)(7,4)(12)
• amplitude: value of the first non-zero coefficient after the run
• size: number of bits needed to encode the amplitude
  • 0 0 0 0 0 0 476: (6,9)(476)

Huffman coding
Converts each symbol sequence into a binary sequence
First the DC coefficient, followed by the ACs
Huffman tables are specified in JPEG
Each (runlength, size) is encoded using
Huffman coding
Each (amplitude) is encoded using a
variable length integer code
(1,4)(12) => (11111101101100)

Example of Huffman table
[Figure: Furht et al. 96]


JPEG Encoding of an 8x8 Block

[Figure: Furht et al. 96]


Compression Measures
• Compression ratio (CR): increases with higher compression
  • CR = OriginalSize/CompressedSize
• Root Mean Square Error (RMS): better quality with lower RMS

$$RMS = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - x_i)^2}$$

  • $X_i$: original pixel values
  • $x_i$: restored pixel values
  • $n$: total number of pixels
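
Both measures are short to compute. The sketch below uses the usual definition of RMS error (square root of the mean squared difference); the function names are illustrative:

```python
import math

def compression_ratio(original_size, compressed_size):
    """CR = OriginalSize / CompressedSize."""
    return original_size / compressed_size

def rms_error(original, restored):
    """Root mean square error between two equally long pixel sequences."""
    n = len(original)
    return math.sqrt(sum((X - x) ** 2 for X, x in zip(original, restored)) / n)

cr = compression_ratio(76800, 7680)
rms = rms_error([10, 20, 30, 40], [12, 18, 30, 44])
```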

[Figure: Furht et al. 96]


JPEG Decoder
The same steps in reverse order
The binary sequences are converted to
symbol sequences using the Huffman tables
Dequantization: F’(u,v) = Fq(u,v)Q(u,v)
Inverse DCT

$$f(x,y) = \frac{1}{4} \left[ \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\,C(v)\,F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \right]$$
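
The last two decoder steps, dequantization and the inverse DCT, can be sketched as below. The function names and the flat quantization table are illustrative assumptions, and the direct four-level loop is chosen for clarity rather than speed:

```python
import math

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dequantize(Fq, Q):
    """F'(u,v) = Fq(u,v) * Q(u,v)."""
    return [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]

def idct_8x8(F):
    """Inverse DCT: rebuild the 8x8 samples f[x][y] from coefficients F[u][v]."""
    f = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    s += (c(u) * c(v) * F[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            f[x][y] = s / 4
    return f

# A block whose only non-zero quantized coefficient is the DC term.
Fq = [[0] * 8 for _ in range(8)]
Fq[0][0] = 50
Q = [[16] * 8 for _ in range(8)]            # hypothetical flat quantization table
samples = idct_8x8(dequantize(Fq, Q))
```

A lone DC coefficient of 800 decodes to a flat block of value 100, since the DC term is 8 times the block mean. The decoder would then undo the level shift.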


Progressive JPEG
• When image encoding or transmission takes a long time, there may be a need to produce an approximation of the original image which is improved gradually

[Figure: Furht et al. 96]


Progressive Spectral Selection
The DCT coefficients are grouped into
several bands
Low-frequency bands are sent first
band1: DC coefficient only
band2: AC1,AC2 coefficients
band3: AC3, AC4, AC5, AC6 coefficients
band4: AC7, AC8 coefficients


Lossless JPEG
Simple predictive encoding
[Figure: prediction schemes (Furht et al. 96)]


Hierarchical JPEG
• Produces a set of images at multiple resolutions
• Begins with small images and continues with larger images (down-sampling)
• The reduced image is scaled up to the next resolution and used as predictor for the higher resolution image


Encoding
1. Down-sample the image by $2^a$ in each of x, y
2. Encode the reduced size image
(sequential, progressive ..)
3. Up-sample the reduced image by 2
4. Interpolate by 2 in x, y
5. Use the up-sampled image as predictor
6. Encode differences (predictive coding)
7. Repeat from step 3 until the full resolution is
encoded

[Figure: Furht et al. 96]


JPEG for Color images
Encoding of 3 bands (RGB, HSV etc.) in
two ways:
Non-interleaved data ordering: encodes
each band separately
Interleaved data ordering: different bands
are combined into Minimum Coded Units
(MCUs)
Display, print or transmit images in parallel with
decompression


Interleaved JPEG
• Minimum Coded Unit (MCU): the smallest group of interleaved data blocks (8x8)

[Figure: Furht et al. 96]


Video Compression
Various video encoding standards:
QuickTime, DVI, H.261, MPEG etc
Basic idea: compute motion between
adjacent frames and transmit only
differences
Motion is computed between blocks
Effective encoding of camera and object
motion


MPEG
The Moving Picture Coding Experts
Group (MPEG) is a working group for the
development of standards for
compression, decompression, processing,
and coded representation of moving
pictures and audio
MPEG groups are open and have
attracted large participation
http://mpeg.telecomitalialab.com

MPEG Features
Random access
Fast forward / reverse searches
Reverse playback
Audio – visual synchronization
Robustness to errors
Auditability
Cost trade-off


MPEG-1, 2
At least 4 MPEG standards finished or
under construction
MPEG-1: storage and retrieval of moving
pictures and audio on storage media
352x288 pixels/frame, 25 fps, at 1.5 Mbps
Real-time encoding even on an old PC
MPEG-2: higher quality, same principles
720x576 pixels/frame, 2-80 Mbps


MPEG-4
Encodes video content as objects
Based on identifying, tracking and
encoding object layers which are
rendered on top of each other
Enables objects to be manipulated
individually or collectively on an
audiovisual scene (interactive video)
Only a few implementations
Higher compression ratios

MPEG-7
Standard for the description of
multimedia content
XML Schema for content description
Does not standardize extraction of
descriptions
MPEG-1, 2 and 4 make content
available
MPEG-7 makes content semantics
available

MPEG-1,2 Compression
• Compression of full motion video, interframe compression, stores differences between frames
• A stream contains I, P and B frames in a given pattern
• Equivalent blocks are compared and motion vectors are computed and stored as P and B frames

[Figure: Furht et al. 96]


Frame Structures
• I frames: self-contained, JPEG encoded
  • Random access frames in MPEG streams
  • Low compression
• P frames: predictive coding with reference to the previous I or P frame
  • Higher compression
• B frames: bidirectional or interpolated coding using past and future I or P frames
  • Highest compression


Example of MPEG Stream

[Figure: Furht et al. 96]

• B frames 2, 3, 4 are bi-directionally coded using I frame 1 and P frame 5
• P frame 5 must be decoded before B frames 2, 3, 4
• I frame 9 must be decoded before B frames 6, 7, 8
• Frame order for transmission: 1 5 2 3 4 9 6 7 8
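
The reordering rule (every reference frame is sent before the B frames that depend on it) can be sketched as follows. The function name and the string encoding of the frame pattern are illustrative; the display-order pattern `IBBBPBBBI` is the one described in the bullets above:

```python
def transmission_order(frame_types):
    """Map display order (e.g. "IBBBPBBBI") to 1-based frame numbers in coding order."""
    order, pending_b = [], []
    for number, kind in enumerate(frame_types, start=1):
        if kind == "B":
            pending_b.append(number)        # wait for the next reference frame
        else:
            order.append(number)            # I or P: send the reference first
            order.extend(pending_b)         # then the B frames that depend on it
            pending_b = []
    return order + pending_b                # trailing B frames without a later reference

order = transmission_order("IBBBPBBBI")
```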

MPEG Coding Sequences
The MPEG application determines a
sequence of I, P, B frames
For fast random access, code the
whole video as I frames (MJPEG)
High compression is achieved by using
a large number of B frames
Good sequence: (IBBPBBPBB)
(IBBPBBPBB)...

Motion Estimation
The motion estimator finds the best
matching block for each block of a P or B frame
Block: 8x8 or 16x16 pixels
P frames use only forward prediction: a
block in the current frame is predicted
from a past frame
B frames use forward prediction, backward prediction, or
prediction by interpolation: the average of the
forward and backward predicted blocks

Motion Vectors

[Figure: motion vectors for a block of 16x16 pixels (Furht et al. 96)]

• One or two motion vectors per block
  • One vector for forward predicted P or B frames or backward predicted B frames
  • Two vectors for interpolated B frames


MPEG Encoding
• I frames are JPEG compressed
• P, B frames are encoded in terms of future or previous frames
• Motion vectors are estimated and differences between predicted and actual blocks are computed
• These error terms are DCT encoded
• Entropy encoding produces a compact binary code
• Special cases: static and intracoded blocks


MPEG Encoder
[Figure: MPEG encoder, with the JPEG encoding stage labeled (Furht et al. 96)]


MPEG Decoder
[Figure: Furht et al. 96]


Motion Estimation Techniques
Not specified by MPEG
Block matching techniques
Estimate the motion of an n x m block in the
present frame in relation to pixels in
previous or future frames
The block is compared with candidate blocks of a previous or
future frame within a search area of size
(m+2p) x (n+2p)
m = n = 16
p = 6

Block Matching

[Figure: Furht et al. 96]

• Search area in block matching techniques
• Typical case: n=m=16, p=6
• F: block in current frame
• G: search area in previous (or future) frame


Cost functions
• The block has moved to the position that minimizes a cost function
I. Mean Absolute Difference (MAD)

$$MAD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left| F(i,j) - G(i+dx,\, j+dy) \right|$$

• F(i,j): a block in the current frame
• G(i,j): the same block in a previous or future frame
• (dx,dy): vector for the search location
• dx, dy in [-p, p]
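
A minimal sketch of the MAD computation. It assumes 0-based array indices instead of the centred indices in the formula, with G stored as the whole (n+2p) x (m+2p) search area so that offset (p, p) corresponds to zero displacement; the function name and data layout are illustrative:

```python
def mad(F, G, dx, dy, p):
    """Mean absolute difference between block F and the block of G displaced by (dx, dy).

    F is an n x m block; G is the (n + 2p) x (m + 2p) search area.
    """
    n, m = len(F), len(F[0])
    total = 0
    for i in range(n):
        for j in range(m):
            total += abs(F[i][j] - G[i + p + dx][j + p + dy])
    return total / (m * n)

# Tiny example: a 4x4 search area (p = 1) and the 2x2 block found at displacement (1, 0).
G = [[r * 4 + col for col in range(4)] for r in range(4)]
F = [[9, 10], [13, 14]]
```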

More Cost Functions
II. Mean Squared Difference (MSD)

$$MSD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left[ F(i,j) - G(i+dx,\, j+dy) \right]^2$$

III. Cross-Correlation Function (CCF)

$$CCF(dx,dy) = \frac{\sum_i \sum_j F(i,j)\, G(i+dx,\, j+dy)}{\left[ \sum_i \sum_j F^2(i,j) \right]^{1/2} \left[ \sum_i \sum_j G^2(i+dx,\, j+dy) \right]^{1/2}}$$

• the matching block maximizes CCF


More Cost Functions
IV. Pixel Difference Classification (PDC)

$$PDC(dx,dy) = \sum_i \sum_j T(dx,dy,i,j)$$

$$T(dx,dy,i,j) = \begin{cases} 1 & \text{if } \left| F(i,j) - G(i+dx,\, j+dy) \right| \le t \\ 0 & \text{otherwise} \end{cases}$$

• t: predefined threshold
• each pixel is classified as a matching pixel (T=1) or a mismatching pixel (T=0)
• the matching block maximizes PDC
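
The three remaining measures (MSD, CCF, PDC) differ only in how each pixel pair is combined. The sketch below assumes 0-based indices with G stored as the full search area (offset (p, p) meaning zero displacement); the function names and the small example data are illustrative:

```python
import math

def _pairs(F, G, dx, dy, p):
    """Yield (F(i,j), G(i+dx, j+dy)) for every pixel of block F."""
    for i, row in enumerate(F):
        for j, value in enumerate(row):
            yield value, G[i + p + dx][j + p + dy]

def msd(F, G, dx, dy, p):
    """Mean squared difference: lower is better."""
    pairs = list(_pairs(F, G, dx, dy, p))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def ccf(F, G, dx, dy, p):
    """Normalised cross-correlation: higher is better."""
    pairs = list(_pairs(F, G, dx, dy, p))
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0

def pdc(F, G, dx, dy, p, t):
    """Number of pixels whose absolute difference is at most t: higher is better."""
    return sum(1 for a, b in _pairs(F, G, dx, dy, p) if abs(a - b) <= t)

# Tiny example: a 4x4 search area (p = 1) and the 2x2 block found at displacement (1, 0).
G = [[r * 4 + col for col in range(4)] for r in range(4)]
F = [[9, 10], [13, 14]]
```

MSD is minimized at the matching position, while CCF and PDC are maximized there.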

Block Matching Techniques
Exhaustive: very slow but accurate
Approximation: faster but less accurate
  Three-step search
  2-D logarithmic search
  Conjugate direction search
  Parallel hierarchical 1-D search (not discussed here)
  Pixel difference classification (not discussed here)


Exhaustive Search
Evaluates the cost function at every
location in the search area
Requires $(2p+1)^2$ computations of the cost
function
For p=6 requires 169 computations per
block!
Very simple to implement but very slow
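
Exhaustive search is a double loop over all displacements in [-p, p]. The sketch below uses MAD as the cost and assumes 0-based indexing with G stored as the full search area; the function names and the synthetic random frame are illustrative:

```python
import random

def mad(F, G, dx, dy, p):
    """Mean absolute difference between block F and the block of G displaced by (dx, dy)."""
    n, m = len(F), len(F[0])
    return sum(abs(F[i][j] - G[i + p + dx][j + p + dy])
               for i in range(n) for j in range(m)) / (m * n)

def exhaustive_search(F, G, p):
    """Return ((dx, dy), cost, evaluations) minimising MAD over all displacements in [-p, p]."""
    best, best_cost, evaluations = None, float("inf"), 0
    for dx in range(-p, p + 1):
        for dy in range(-p, p + 1):
            cost = mad(F, G, dx, dy, p)
            evaluations += 1
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost, evaluations

# Synthetic example: cut a 16x16 block out of a random 28x28 search area at (dx, dy) = (1, 6).
p, n = 6, 16
rng = random.Random(0)
G = [[rng.randrange(256) for _ in range(n + 2 * p)] for _ in range(n + 2 * p)]
F = [row[p + 6:p + 6 + n] for row in G[p + 1:p + 1 + n]]
best, cost, evaluations = exhaustive_search(F, G, p)
```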


Three-Step Search
Computes the cost function at the
center and 8 surrounding locations in
the search area
The location with the minimum cost
becomes the center location for the next
step
The search range is reduced by half


Three-Step Motion Vector Estimation (p=6)
[Figure: Furht et al. 96]


Three-Step Search
1. Compute cost (MAD) at 9 locations
• Center + 8 locations at distance 3 from center
2. Pick min MAD location and recompute MAD
at 9 locations at distance 2 from center
3. Pick the min MAD location and do the same at
distance 1 from center
• The smallest MAD from all locations indicates
the final estimate
• M24 at (dx,dy)=(1,6)
• Requires 25 computations of MAD
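
The three steps can be sketched as a loop over the distances 3, 2 and 1. To keep the example independent of any frame layout, the cost is passed in as a function `cost(dx, dy)` (for example a MAD evaluation); that interface, the function name and the quadratic test surface are assumptions of this sketch:

```python
def three_step_search(cost, steps=(3, 2, 1)):
    """Three-step search; `cost(dx, dy)` returns the matching cost of a displacement.

    Returns (best displacement, its cost, number of cost evaluations).
    """
    center = (0, 0)
    best_cost = cost(*center)
    evaluations = 1
    for step in steps:
        best = center
        for ox in (-step, 0, step):
            for oy in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue                # the centre cost is already known
                cand = (center[0] + ox, center[1] + oy)
                cand_cost = cost(*cand)
                evaluations += 1
                if cand_cost < best_cost:
                    best, best_cost = cand, cand_cost
        center = best                       # next step searches around the current minimum
    return center, best_cost, evaluations

# Hypothetical smooth cost surface with its minimum at (1, 6).
result = three_step_search(lambda dx, dy: (dx - 1) ** 2 + (dy - 6) ** 2)
```

Reusing the centre cost gives 1 + 3 x 8 = 25 evaluations, against 169 for exhaustive search with p=6. On cost surfaces with several local minima the result may differ from the exhaustive one.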

2-D Logarithmic Search
Combines cost function and predefined
threshold T
Check cost at M(0,0), 2 horizontal and 2
vertical locations and take the minimum
If cost at any location is less than T
then search is complete
If not, search again along the
direction of minimum cost, within a
smaller region

[Figure: Furht et al. 96]

• if cost at M(0,0) < T then search ends
• compute cost at M1, M2, M3, M4 and take their minimum
• if min cost < cost at M(0,0)
  • if min cost < T then search ends
  • else compute cost along the direction of minimum cost (M5, M6 in the example)
• else compute cost in the neighborhood of the minimum cost location, within p/2 (M5 in the example)

Conjugate Direction Search
[Figure: Furht et al. 96]

• Repeat
  • find min MAD along dx=0,-1,1 (y fixed): M(1,0) in example
  • find min MAD along dy=0,-1,1 starting from previous min (x fixed): M(2,2)
  • search similarly along the direction connecting the above mins

Other Compression Techniques
Digital Video Interactive (DVI)
  similar to MPEG-2
Fractal Image Compression
  Find regions resembling fractals
  Image representation at various resolutions
Sub-band image and video coding
  Split signal into smaller frequency bands
Wavelet-based coding

References
• B. Furht, S. W. Smoliar, H.-J. Zhang, “Video and Image Processing in Multimedia Systems”, Kluwer Academic Publishers, 1996

