2. Page 2
Image Compression
• Compression techniques are used to reduce the redundant
information in the image data in order to facilitate the storage,
transmission and distribution of images (e.g. GIF, TIFF, PNG, JPEG).
• Figure 7.1 depicts a general data compression scheme, in which
compression is performed by an encoder and decompression is
performed by a decoder.
3. Page 3
The JPEG Standard
• We call the output of the encoder codes or code words. The
intermediate medium could either be data storage or a
communication/computer network. If the compression and
decompression processes introduce no information loss, the
compression scheme is lossless; otherwise, it is lossy.
• JPEG is an image compression standard that was developed by the
“Joint Photographic Experts Group”. JPEG was formally accepted as an
international standard in 1992.
• JPEG is a lossy image compression method.
4. Page 4
• If the total number of bits required to represent the data before
compression is B0 and the total number of bits required to represent
the data after compression is B1, then we define the compression ratio
as:
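The formula itself did not survive extraction; from the definitions above it reads:

```latex
\text{compression ratio} = \frac{B_0}{B_1}
```

A ratio of 10, for example, means the compressed representation needs one tenth of the original bits.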
• the compression ratio for image data using lossless compression
techniques (e.g., Huffman Coding, Arithmetic Coding, LZW) is low when
the image histogram is relatively flat.
5. Page 5
• For image compression in multimedia applications, where a higher
compression ratio is required, lossy methods are usually adopted. In
lossy compression, the compressed image is usually not the same as the
original image but is meant to form a close approximation to the
original image perceptually.
• An image is a function of i and j (or conventionally x and y) in the spatial
domain. The 2D DCT is used as one step in JPEG in order to yield a
frequency response which is a function F(u, v) in the spatial frequency
domain, indexed by two integers u and v.
Main Steps in JPEG Image
Compression
6. Page 6
• The effectiveness of the DCT transform coding method in JPEG relies on
3 major observations:
Observation 1: Useful image contents change relatively slowly across the
image, i.e., it is unusual for intensity values to vary widely several times
in a small area, for example, within an 8×8 image block.
• much of the information in an image is repeated, hence “spatial
redundancy”.
Observations for JPEG Image
Compression
7. Page 7
Observations for JPEG Image
Compression
(continued)
Spatial frequency indicates how many times pixel values change across
an image block. The DCT formalizes this notion with a measure of how
much the image contents change in relation to the number of cycles of a
cosine wave per block.
8. Page 8
Observations for JPEG Image
Compression
(continued)
Observation 2: Psychophysical experiments suggest that humans are much
less likely to notice the loss of very high spatial frequency components
than the loss of lower frequency components.
• the spatial redundancy can be reduced by largely reducing
the high spatial frequency contents.
9. Page 9
Observations for JPEG Image
Compression
(continued)
Observation 3: Visual acuity (accuracy in distinguishing closely spaced
lines) is much greater for gray (“black and white”) than for color.
• chroma subsampling (4:2:0) is used in JPEG.
11. Page 11
Main Steps in JPEG Image
Compression
• Transform RGB to YIQ or YUV and subsample color.
• DCT on image blocks.
• Quantization.
• Zig-zag ordering and run-length encoding.
• Entropy coding.
12. Page 12
DCT on image blocks
• Each image is divided into 8 × 8 blocks. The 2D DCT is applied to each
block image f(i, j), with output being the DCT coefficients F(u, v) for
each block.
• Using blocks, however, has the effect of isolating each block from its
neighboring context. This is why JPEG images look choppy (“blocky”)
when a high compression ratio is specified by the user.
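As a sketch of this step, the 2D DCT of an 8×8 block can be computed directly from its definition. This is a naive O(N⁴) implementation for illustration only; real encoders use fast factorized transforms.

```python
import math

N = 8

def dct_2d(block):
    """Naive 2D DCT-II of an 8x8 block, per the JPEG definition:
    F(u,v) = (C(u)C(v)/4) * sum_{i,j} f(i,j) cos((2i+1)u*pi/16) cos((2j+1)v*pi/16)."""
    def C(k):  # normalization factor: 1/sqrt(2) for the zero-frequency term
        return math.sqrt(0.5) if k == 0 else 1.0
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            F[u][v] = 0.25 * C(u) * C(v) * s
    return F

# A constant block has all of its energy in the DC coefficient F(0, 0).
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

For the flat block above, F(0, 0) is 800 and every AC coefficient is zero, illustrating Observation 1: slowly varying content concentrates in the low frequencies.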
14. Page 14
Quantization
•The quantization step in JPEG is aimed at reducing the total number of
bits needed for a compressed image.
•It consists of simply dividing each entry in the frequency space block
by an integer, then rounding:
15. Page 15
Quantization
(9.1)
• F(u, v) represents a DCT coefficient, Q(u, v) is a “quantization matrix” entry, and
F^(u, v) represents the quantized DCT coefficients, which JPEG will use in the
succeeding entropy coding.
– The quantization step is the main source for loss in JPEG compression.
– The entries of Q(u, v) tend to have larger values towards the lower right
corner. This aims to introduce more loss at the higher spatial frequencies
— a practice supported by Observations 1 and 2.
– Table 9.1 and 9.2 show the default Q(u, v) values obtained from
psychophysical studies with the goal of maximizing the compression ratio
while minimizing perceptual losses in JPEG images.
F^(u, v) = round( F(u, v) / Q(u, v) )
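Equation (9.1) can be sketched in a few lines; the table below is the default luminance quantization matrix from the JPEG standard (cf. Table 9.1).

```python
# Default JPEG luminance quantization table (Table 9.1; Annex K of the standard).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(F, Q):
    """Eq. (9.1): divide each DCT coefficient by the matching Q entry, then round.

    Note: Python's round() rounds halves to even; real encoders typically
    round half away from zero, which only matters for borderline values.
    """
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

# Illustrative coefficients (hypothetical values, not from Figure 9.2).
F = [[0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0] = 800, 33, -31
Fq = quantize(F, Q_LUMA)
```

Note how the larger Q entries toward the lower right would crush high-frequency coefficients to zero, which is exactly the loss Observations 1 and 2 justify.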
22. Page 22
Preparation for Entropy Coding
We have so far seen two of the main steps in JPEG compression: DCT and
quantization. The remaining small steps shown in the block diagram in Fig.
9.1 all lead up to entropy coding of the quantized DCT coefficients. These
additional data compression steps are lossless. Interestingly, the DC and AC
coefficients are treated quite differently before entropy coding: run-length
encoding on ACs versus DPCM on DCs.
23. Page 23
Run-length Coding (RLC) on AC
coefficients
Notice in Figure 9.2 the many zeros in F^(u, v) after quantization is applied.
Run-length Coding (RLC) (or Run-length Encoding, RLE) is therefore useful in
turning the values into pairs {#-zeros-to-skip, next nonzero value}. RLC
is even more effective when we use an addressing scheme that makes it most
likely to hit a long run of zeros: a zigzag scan turns the 8 × 8 matrix
into a 64-vector, as Figure 9.4 illustrates.
24. Page 24
Fig. 9.4: Zig-Zag Scan in JPEG
•The zigzag scan order has a good chance of concatenating long runs of
zeros. For example, F^(u, v) in Figure 9.2 will be turned into (32, 6, −1, −1, 0,
−1, 0, 0, 0, −1, 0, 0, 1, 0, 0, ... , 0)
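The zigzag visiting order itself can be generated by sorting the 64 positions by anti-diagonal, alternating direction on each diagonal (a sketch):

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the JPEG zigzag scan."""
    # Sort by anti-diagonal index i + j; within a diagonal, even sums run
    # bottom-left to top-right (row decreasing), odd sums the opposite way.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 -p[0] if (p[0] + p[1]) % 2 == 0 else p[0]))

order = zigzag_order()
```

Applying this order to a quantized block produces the 64-vector described above, with the DC coefficient first and the highest frequencies last.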
25. Page 25
• The RLC step replaces values by a pair (RUNLENGTH, VALUE) for each
run of zeros in the AC coefficients of F^, where RUNLENGTH is the
number of zeros in the run and VALUE is the next nonzero coefficient.
• To further save bits, a special pair (0, 0) indicates the end-of-block after
the last nonzero AC coefficient is reached. In the above example, not
considering the first (DC) component, we will thus have:
(0,6) (0,−1) (0,−1) (1,−1) (3,−1) (2,1) (0,0)
• Note: unlike the run-length coding of the AC coefficients, which is
performed on each individual block, DPCM for the DC coefficients in
JPEG is carried out on the entire image at once.
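The run-length pairing illustrated above can be sketched in a few lines. This is a simplified version: a full encoder also emits special (15, 0) symbols for zero runs longer than 15, omitted here.

```python
def rlc_ac(zigzag):
    """Run-length code the AC coefficients of a 64-long zigzag vector
    into (RUNLENGTH, VALUE) pairs, ending with the (0, 0) end-of-block."""
    pairs, run = [], 0
    for v in zigzag[1:]:          # skip the DC coefficient
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))          # end-of-block marker; trailing zeros are dropped
    return pairs

# The zigzag vector from the example on this page.
vec = [32, 6, -1, -1, 0, -1, 0, 0, 0, -1, 0, 0, 1] + [0] * 51
pairs = rlc_ac(vec)
```

Running this on the example vector reproduces exactly the pair sequence shown above.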
26. Page 26
DPCM on DC coefficients
The DC coefficients are coded separately from the AC ones. Each 8×8 image
block has only one DC coefficient. The values of the DC coefficients for
various blocks could be large and different, because the DC value reflects
the average intensity of each block, but consistent with Observation 1
above, the DC coefficient is unlikely to change drastically within a short
distance. This makes DPCM an ideal scheme for coding the DC coefficients.
27. Page 27
DPCM on DC coefficients
If the DC coefficients for the first five image blocks are 150, 155, 149, 152,
144, DPCM would produce 150, 5, −6, 3, −8, assuming the predictor for the
i th block is simply di = DCi − DCi−1, and d0 = DC0. We expect DPCM codes to
generally have smaller magnitude and variance, which is beneficial for the
next entropy coding step. It is worth noting that unlike the run-length coding
of the AC coefficients, which is performed on each individual block, DPCM
for the DC coefficients in JPEG is carried out on the entire image at once .
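The worked example can be reproduced directly with the stated predictor:

```python
def dpcm(dc_values):
    """DPCM with predictor d_i = DC_i - DC_{i-1}, and d_0 = DC_0."""
    return [dc_values[0]] + [dc_values[i] - dc_values[i - 1]
                             for i in range(1, len(dc_values))]

diffs = dpcm([150, 155, 149, 152, 144])
```

The differences have visibly smaller magnitude than the raw DC values, which is what makes the subsequent entropy coding cheaper.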
28. Page 28
Entropy Coding
• The DC and AC coefficients finally undergo an entropy coding step .
• Below, we will discuss only the basic entropy coding method, which
uses Huffman coding and supports only 8-bit pixels in the original
images (or color image components).
29. Page 29
Huffman Coding of DC Coefficients
• Each DPCM-coded DC coefficient is represented by a pair of symbols
(SIZE, AMPLITUDE)
• where SIZE indicates how many bits are needed for representing the
coefficient and AMPLITUDE contains the actual bits.
30. Page 30
• Notice that DPCM values could require more than 8 bits and could be
negative values.
• The ones' complement scheme is used for negative numbers; that is,
binary code 10 for 2, 01 for −2; 11 for 3, 00 for −3; and so on. In the
example we are using, the codes 150, 5, −6, 3, −8 will be turned into:
(8, 10010110), (3, 101), (3, 001), (2, 11), (4, 0111)
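The (SIZE, AMPLITUDE) encoding with ones' complement negatives can be sketched as:

```python
def size_amplitude(v):
    """Encode a DPCM value as (SIZE, AMPLITUDE bit string), JPEG-style."""
    if v == 0:
        return (0, "")
    size = abs(v).bit_length()            # number of bits needed
    # Ones' complement for negatives: flip the bits of |v| within SIZE bits.
    amp = v if v > 0 else v + (1 << size) - 1
    return (size, format(amp, "0{}b".format(size)))

coded = [size_amplitude(v) for v in [150, 5, -6, 3, -8]]
```

This reproduces the five pairs listed above; note how −6 (binary 110) becomes 001 under ones' complement.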
31. Page 31
Huffman Coding of AC Coefficients
• Recall we said that the AC coefficients are run length coded and are
represented by pairs of numbers :
( RUNLENGTH, VALUE) .
• However, in an actual JPEG implementation, VALUE is further
represented by SIZE and AMPLITUDE, as for the DCs. To save bits,
RUNLENGTH and SIZE are allocated only 4 bits each and squeezed into
a single byte; let's call this Symbol 1. Symbol 2 is the AMPLITUDE value,
whose number of bits is indicated by SIZE:
Symbol 1: (RUNLENGTH, SIZE), Symbol 2: (AMPLITUDE)
32. Page 32
Four Commonly Used JPEG
Modes
• Sequential Mode
• Progressive Mode.
• Hierarchical Mode.
• Lossless Mode
33. Page 33
Sequential Mode
•Sequential Mode : the default JPEG mode, implicitly assumed in the
discussions so far. Each graylevel image or color image component is
encoded in a single left-to-right, top-to-bottom scan.
34. Page 34
Progressive Mode
Progressive JPEG delivers low quality versions of the image quickly,
followed by higher quality passes.
1. Spectral selection: Takes advantage of the “spectral” (spatial frequency
spectrum) characteristics of the DCT coefficients: higher AC components
provide detail information.
Scan 1: Encode DC and first few AC components, e.g.,
AC1, AC2.
Scan 2: Encode a few more AC components, e.g., AC3,
AC4, AC5.
...
Scan k: Encode the last few ACs, e.g., AC61, AC62, AC63.
35. Page 35
Progressive Mode (Cont’d)
2. Successive approximation: Instead of gradually encoding spectral bands,
all DCT coefficients are encoded simultaneously but with their most
significant bits (MSBs) first.
Scan 1: Encode the first few MSBs, e.g., Bits 7, 6, 5, 4.
Scan 2: Encode a few more less significant bits, e.g., Bit 3.
... Scan m: Encode the least significant bit (LSB), Bit 0.
36. Page 36
Hierarchical Mode
• The encoded image at the lowest resolution is basically a compressed
low-pass filtered image, whereas the images at successively higher
resolutions provide additional details (differences from the lower
resolution images).
• Similar to Progressive JPEG, the Hierarchical JPEG images can be
transmitted in multiple passes progressively improving quality.
38. Page 38
Lossless Mode
Lossless JPEG is a very special case of JPEG which indeed has no loss in its
image quality. As discussed in Chap. 7, however, it employs only a simple
differential coding method, involving no transform coding. It is rarely used,
since its compression ratio is very low compared to other, lossy modes. On
the other hand, it meets a special need, and the subsequently developed
JPEG-LS standard is specifically aimed at lossless image compression.
39. Page 39
The JPEG2000 Standard
• The JPEG standard is no doubt the most successful and popular image
format to date. The main reason for its success is the quality of its
output for relatively good compression ratio. However, in anticipating
the needs and requirements of next-generation imagery applications,
the JPEG committee also defined a new standard: JPEG2000. The main
part, the core coding system, is specified in ISO/IEC 15444-1. The new
JPEG2000 standard aims to provide not only a better rate-distortion
tradeoff and improved subjective image quality but also additional
functionalities the current JPEG standard lacks.
41. Page 41
In particular, the JPEG2000 standard addresses the following problems:
• Low bitrate compression. The current JPEG standard offers excellent
rate-distortion performance at medium and high bitrates. However, at
bitrates below 0.25 bpp, subjective distortion becomes unacceptable.
This is important if we hope to receive images on our web-enabled
ubiquitous devices, such as web-aware wristwatches, and so on.
• Lossless and lossy compression. Currently, no standard can provide
superior lossless compression and lossy compression in a single
bitstream .
42. Page 42
• Large images. The new standard allows image resolutions greater than
64k × 64k pixels without tiling. It can handle image dimensions of up to
2^32 − 1.
•Single decompression architecture. The current JPEG standard has 44
modes, many of which are application-specific and not used by the
majority of JPEG decoders.
•Transmission in noisy environments. The new standard will provide
improved error resilience for transmission in noisy environments such as
wireless networks and the Internet.
•Progressive transmission. The new standard provides seamless quality
and resolution scalability from low to high bitrates. The target bitrate and
reconstruction resolution need not be known at the time of compression.
43. Page 43
• Region-of-interest coding. The new standard permits specifying Regions
of Interest (ROI), which can be coded with better quality than the rest
of the image. We might, for example, like to code the face of someone
making a presentation with more quality than the surrounding
furniture.
• Computer-generated imagery. The current JPEG standard is optimized
for natural imagery and does not perform well on computer-generated
imagery.
44. Page 44
• Compound documents. The new standard offers metadata
mechanisms for incorporating additional nonimage data as part of the
file. This might be useful for including text along with imagery, as one
important example
45. Page 45
Main Steps of JPEG2000 Image
Compression
• The main compression method used in JPEG2000 is the Embedded
Block Coding with Optimized Truncation (EBCOT) algorithm, designed
by Taubman. In addition to providing excellent compression efficiency,
EBCOT produces a bitstream with a number of desirable features,
including quality and resolution scalability and random access.
• The basic idea of EBCOT is the partition of each sub-band LL, LH, HL, HH
produced by the wavelet transform into small blocks called code blocks.
Each code block is coded independently, in such a way that no
information for any other block is used.
46. Page 46
• A separate, scalable bitstream is generated for each code block. With
its block-based coding scheme, the EBCOT algorithm has improved error
resilience. The EBCOT algorithm consists of three steps:
1. Block coding and bitstream generation
2. Postcompression rate-distortion (PCRD) optimization
3. Layer formation and representation.
47. Page 47
Features of JPEG2000
Protective image security: the open architecture of the JPEG2000
standard makes it easy to apply digital image protection techniques
such as watermarking, labeling, stamping, or encryption.
Region-of-interest coding: in this mode, regions of interest (ROIs) can be
defined. These ROIs can be encoded and transmitted with better quality
than the rest of the image.
48. Page 48
Example of region of interest coding
• A region of interest in the Barbara image is reconstructed with quality
scalability. The region of interest is decoded first before any background
information .
50. Page 50
Pre-processing
• Three steps:
Image tiling (optional) for each image component.
DC level shifting: the same quantity (determined by the component bit
depth) is subtracted from the samples of each tile.
Color transformation (optional) from RGB to YCbCr.
52. Page 52
Forward Transform
• Discrete Wavelet Transform (DWT) is used to decompose each tile
component into different sub-bands
53. Page 53
• sets of samples are decomposed into low-pass and high-pass samples.
54. Page 54
Quantization
• After transformation, all coefficients are quantized using scalar
quantization.
Quantization reduces the precision of the coefficients.
The operation is lossy.
The process follows the formula:
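The formula itself is missing from the slide; JPEG2000 uses a sign-magnitude deadzone scalar quantizer, q_b(u, v) = sign(y_b(u, v)) · ⌊|y_b(u, v)| / Δ_b⌋, where Δ_b is the step size of sub-band b. A sketch:

```python
import math

def deadzone_quantize(y, step):
    """JPEG2000-style deadzone scalar quantizer:
    q = sign(y) * floor(|y| / step)."""
    s = 1 if y >= 0 else -1
    return s * math.floor(abs(y) / step)

# With step 10, coefficients in (-10, 10) fall into the "deadzone" and map to 0.
q = deadzone_quantize(-27, 10)
```

The deadzone around zero is twice as wide as the other intervals, which pushes more small coefficients to zero than a uniform quantizer would.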
55. Page 55
Modes of Quantization
• Integer mode: integer-to-integer transforms are employed.
Quantization steps are fixed to one. Lossy coding is still achieved by
discarding bit-planes.
• Real mode: real-to-real transforms are employed. Quantization steps
are chosen in conjunction with rate control.
In this mode, lossy compression is achieved by discarding bit-planes,
changing the size of the quantization step, or both.
56. Page 56
Entropy Coding: Bit-planes
• The coefficients in a code block are separated into bit-planes.
57. Page 57
Entropy Coding: Coding Passes
• The coding passes are:
Significance propagation pass: coefficients that are insignificant and have a
certain preferred neighborhood are coded.
Magnitude refinement pass: the current bits of significant coefficients are
coded.
Clean-up pass: the remaining insignificant coefficients for which no
information has yet been coded are coded.
58. Page 58
JPEG2000 Bit-stream
• For each code-block, a separate bit-stream is generated.
• The coded data of each code-block is included in a packet.
• If more than one layer is used to encode the image information, the
code-block bit-streams are distributed across different packets
corresponding to different layers.
60. Page 60
Layers of the JPEG2000 Bit-stream
Therefore, each layer consists of a number of consecutive bit-plane coding
passes from each code-block in the tile, including all sub-bands of all
components for that tile.
61. Page 61
THE JPEG-LS STANDARD
• The main advantage of JPEG-LS over JPEG2000 is that JPEG-LS is based
on a low-complexity algorithm.
• JPEG-LS is part of a larger ISO effort aimed at better compression of
medical images.
• JPEG-LS is in fact the current ISO/ITU standard for lossless or
“near-lossless” compression.
62. Page 62
• The core algorithm in JPEG-LS is called Low Complexity Lossless
Compression for Images (LOCO-I), proposed by Hewlett-Packard [11].
LOCO-I exploits a concept called context modeling. The idea of context
modeling is to take advantage of the structure in the input source.
63. Page 63
LOCO-I Algorithm
• LOCO-I can be broken down into three components:
• Prediction
• Context determination
• Residual coding
64. Page 64
Prediction
A better version of prediction can use an adaptive model based on a
calculation of the local edge direction. However, because JPEG-LS is aimed
at low complexity, the LOCO-I algorithm instead uses a fixed predictor that
performs primitive tests to detect vertical and horizontal edges. The fixed
predictor used by the algorithm is given as follows:
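The formula referred to did not survive extraction; it is the median edge detector (MED) of JPEG-LS, with a the left neighbor, b the neighbor above, and c the upper-left neighbor of the current sample:

```python
def med_predict(a, b, c):
    """JPEG-LS median edge detector (MED) fixed predictor.

    Picks min/max of (a, b) when c suggests an edge, otherwise the
    planar prediction a + b - c."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c
```

For example, with a = 10, b = 100, c = 10 (a vertical edge to the left), the predictor returns b = 100; with smooth neighbors a = 50, b = 60, c = 55 it returns the planar value 55.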
65. Page 65
• It is easy to see that this predictor switches between three simple
predictors.
• It tends to output b when there is a vertical edge to the left of the
current location,
• a when there is a horizontal edge above the current location, and
finally a + b − c when the neighboring samples are relatively smooth.
66. Page 66
Context Model
Two probability models are used: one for flat areas and one for edge areas.
Compute the gradients d1 = d − a, d2 = a − c, d3 = c − b.
Quantize d1, d2, d3 into Q1, Q2, Q3 using thresholds T1, T2, T3; for 8-bit
images the default thresholds are 3, 7, and 21.
Each Q can take one of 9 possible values based on the threshold interval it
falls in. This produces 9^3 − 1 = 728 combinations for {Q1, Q2, Q3}, or 364
using symmetry.
67. Page 67
Residual Coding
For any image, the prediction residual has a finite range α. For a given
prediction x̂, the residual ε is in the range −x̂ ≤ ε < α − x̂. Since the value
x̂ can be generated by the decoder, the dynamic range of the residual ε
can be reduced modulo α and mapped into a value between −⌊α/2⌋ and
⌈α/2⌉ − 1.
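The modulo reduction described above can be sketched for even α (e.g. α = 256 for 8-bit images):

```python
def reduce_residual(eps, alpha=256):
    """Map a residual into [-alpha//2, alpha//2 - 1] modulo alpha."""
    return ((eps + alpha // 2) % alpha) - alpha // 2

# A large positive residual wraps around to a small negative one.
r = reduce_residual(200)
```

A residual of 200 and one of −56 are indistinguishable modulo 256, so the decoder, knowing x̂, can always recover the original sample from the reduced value.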
It can be shown that the error residuals follow a two-sided geometric
distribution (TSGD).
68. Page 68
Bi-level Image Compression
Standards
As more and more documents are handled in electronic form, efficient
methods for compressing bi-level images (those with only 1-bit,
black-and-white pixels) are sometimes called for. A familiar example is fax
images. Algorithms that take advantage of the binary nature of the image
data often perform better than generic image compression algorithms.
69. Page 69
The JBIG Standard
JBIG is the coding standard recommended by the Joint Bi-level Image
Processing Group for binary images. This lossless compression standard is
used primarily to code scanned images of printed or handwritten text,
computer-generated text, and facsimile transmissions. It offers progressive
encoding and decoding capability, in the sense that the resulting bitstream
contains a set of progressively higher resolution images. This standard can
also be used to code grayscale and color images by coding each bitplane
independently, but this is not the main objective.
70. Page 70
The JBIG2 Standard
While the JBIG standard offers both lossless and progressive (lossy to
lossless) coding abilities, the lossy image produced by this standard has
significantly lower quality than the original, because the lossy image
contains at most only one-quarter of the number of pixels in the original
image. By contrast, the JBIG2 standard is explicitly designed for lossy,
lossless, and lossy-to-lossless image compression. The design goal for JBIG2
aims not only at providing superior lossless compression performance
over existing standards but also at incorporating lossy compression at a
much higher compression ratio, with as little visible degradation as
possible.
71. Page 71
Model-Based Coding
The idea behind model-based coding is essentially the same as that of
context-based coding. From the study of the latter, we know we can
realize better compression performance by carefully designing a context
template and accurately estimating the probability distribution for each
context. Similarly, if we can separate the image content into different
categories and derive a model specifically for each, we are much more
likely to accurately model the behavior of the data and thus achieve higher
compression ratio.
72. Page 72
Text Region Coding
Each text region is further segmented into pixel blocks containing
connected black pixels. These blocks correspond to characters that make
up the content of this region. Then, instead of coding all pixels of each
character, the bitmap of one representative instance of this character is
coded and placed into a dictionary. For any character to be coded, the
algorithm first tries to find a match with the characters in the dictionary. If
one is found, then both a pointer to the corresponding entry in the
dictionary and the position of the character on the page are coded.
Otherwise, the pixel block is coded directly and added to the dictionary.
This technique is referred to as pattern matching and substitution in the
JBIG2 specification.
73. Page 73
Halftone-Region Coding
The JBIG2 standard suggests two methods for halftone image coding. The
first is similar to the context-based arithmetic coding used in JBIG. The only
difference is that the new standard allows the context template to include
as many as 16 template pixels, four of which may be adaptive. The second
method is called descreening. This involves converting back to grayscale
and coding the grayscale values.
74. Page 74
Preprocessing and Postprocessing
JBIG2 allows the use of lossy compression but does not specify a method
for doing so. From the decoder's point of view, the decoded bitstream is
lossless with respect to the image encoded by the encoder, although not
necessarily with respect to the original image.