Image Compression Coding Schemes

Resmi N.G.
Reference:
Digital Image Processing, 2nd Edition
Rafael C. Gonzalez
Richard E. Woods
Error-Free Compression
Variable-Length Coding
Huffman Coding
Other Near Optimal Variable Length Codes
Arithmetic Coding
LZW Coding
Bit-Plane Coding
Bit-Plane Decomposition
Constant Area Coding
One-Dimensional Run-Length Coding
Two-Dimensional Run-Length Coding
Lossless Predictive Coding
Lossy Compression
Lossy Predictive Coding
3/24/2012 CS 04 804B Image Processing Module 3 2
Devise an alternative representation of the image in which
its interpixel redundancies are reduced.

Code the representation to eliminate the coding
redundancies.
Variable-Length Coding
Reduces only coding redundancy.

Assign the shortest possible codewords to the most
probable gray levels.
Huffman Coding
Yields the smallest possible number of code symbols per
source symbol.
Creates a series of source reductions by ordering the
probabilities of the symbols.
Combines the lowest probability symbols into a single
symbol that replaces them in the next source reduction.
Codes each reduced source, starting with the smallest
source and working back to the original source.
This operation is repeated for each reduced source until
the original source is reached.
Huffman's procedure creates an optimal code, which is an
instantaneous, uniquely decodable block code.

Block code: each source symbol is mapped into a fixed
sequence of code symbols.
Instantaneous: each codeword can be decoded without
referencing succeeding symbols.
Uniquely decodable: any string of code symbols can be
decoded in only one way.
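
The source-reduction procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own implementation: the function name `huffman_code` and the six illustrative probabilities are assumptions, and ties between equal probabilities are broken arbitrarily (any tie-break gives the same average length).

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bitstring} for a dict of symbol probabilities."""
    tiebreak = count()  # unique counter so the heap never compares the dicts
    # Each heap entry: (probability, tiebreaker, {symbol: partial codeword})
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {s: "0" for s in probabilities}
    while len(heap) > 1:
        # Source reduction: merge the two least probable (compound) symbols.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Working back: prepend one code symbol to every member of each group.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Illustrative (assumed) gray-level probabilities.
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
```

For these probabilities the average length works out to 2.2 code symbols per source symbol, against 3 bits for a fixed-length code.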
Arithmetic Coding
Generates non-block codes.
An entire sequence of source symbols is assigned a single
arithmetic codeword.
The codeword defines an interval of real numbers between
0 and 1.
As the number of symbols in the message increases, the
interval used to represent it becomes smaller and the
number of information units required to represent the
interval becomes larger.
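
The interval narrowing can be illustrated as follows. This is a sketch only: it computes the final interval with exact fractions and does not produce an actual bit stream; the function name, the four-symbol source and the message are assumed for illustration.

```python
from fractions import Fraction

def arithmetic_interval(message, probabilities):
    """Narrow [0, 1) once per symbol; return the final (low, high) interval."""
    # Start of each symbol's sub-interval of [0, 1), in dict order.
    starts, running = {}, Fraction(0)
    for symbol, p in probabilities.items():
        starts[symbol] = running
        running += p
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        low += width * starts[symbol]   # move to the symbol's sub-interval
        width *= probabilities[symbol]  # the interval shrinks with every symbol
    return low, low + width

probs = {"a1": Fraction(1, 5), "a2": Fraction(1, 5),
         "a3": Fraction(2, 5), "a4": Fraction(1, 5)}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probs)
```

Any number inside the final interval, here [0.06752, 0.0688), can serve as the codeword for the whole five-symbol message.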
Lempel-Ziv-Welch (LZW) Coding
Assigns fixed length codewords to variable length
sequences of source symbols.
Requires no a priori knowledge of the probability of
occurrence of the symbols to be encoded.
A codebook or dictionary containing the source symbols
to be coded is constructed while the data are being
encoded.
An LZW decoder builds an identical decompression
dictionary during decoding.
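
A minimal sketch of both sides, assuming 8-bit pixels (dictionary entries 0 to 255 are the gray levels themselves) and a dictionary that is allowed to grow without limit. A practical coder would cap the dictionary size so that codewords stay at a fixed length; the function names are illustrative.

```python
def lzw_encode(pixels, alphabet_size=256):
    """Encode a pixel sequence, building the dictionary on the fly."""
    dictionary = {(v,): v for v in range(alphabet_size)}
    current, codes = (), []
    for value in pixels:
        candidate = current + (value,)
        if candidate in dictionary:
            current = candidate                      # keep growing the sequence
        else:
            codes.append(dictionary[current])        # emit the recognized sequence
            dictionary[candidate] = len(dictionary)  # add the new sequence
            current = (value,)
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes, alphabet_size=256):
    """Rebuild the identical dictionary while decoding."""
    if not codes:
        return []
    dictionary = {v: (v,) for v in range(alphabet_size)}
    previous = dictionary[codes[0]]
    output = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The code was created by the encoder in its very last step.
            entry = previous + (previous[0],)
        output.extend(entry)
        dictionary[len(dictionary)] = previous + (entry[0],)
        previous = entry
    return output

image = [39, 39, 126, 126] * 4   # a 4x4 image scanned row by row
codes = lzw_encode(image)
```

The 16 pixels are represented by 10 codewords, and the decoder never receives the dictionary itself.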
Bit-Plane Coding
Reduces interpixel redundancies.
Decomposes a multilevel image into a series of binary
images.
Compresses each image using one of the binary
compression methods.

Gray levels of an m-bit gray-scale image:

$$a_{m-1}2^{m-1} + a_{m-2}2^{m-2} + \cdots + a_1 2^1 + a_0 2^0$$
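
A small sketch of the decomposition, assuming the image is a list of rows of integers; plane k holds the coefficient $a_k$ of every pixel. The helper names are illustrative.

```python
def bit_planes(image, m=8):
    """Split an m-bit image into m binary images; plane 0 is the LSB plane."""
    return [[[(pixel >> k) & 1 for pixel in row] for row in image]
            for k in range(m)]

def from_bit_planes(planes):
    """Recombine the planes: sum of a_k * 2**k for every pixel."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[k][r][c] << k for k in range(len(planes)))
             for c in range(cols)] for r in range(rows)]

image = [[127, 128], [0, 255]]
planes = bit_planes(image)
```

Each of the m binary images can then be handed to a binary compression method.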
Constant Area Coding
Use special codewords to identify large areas of
contiguous 1s or 0s.

The image is divided into blocks of size p x q pixels, which are
classified as all white, all black, or mixed intensity.

The most probable (most frequently occurring) category is
assigned the 1-bit codeword 0; the other two are assigned the
2-bit codes 10 and 11.
One-dimensional run-length coding
Represents each row of an image or bit plane by a
sequence of lengths that describe successive runs of black
and white pixels.

Code each contiguous group of 0s or 1s encountered in a
left to right scan of a row by its length and specify the
value of the first run of each row.

Black and white run lengths may be coded separately
using variable-length codes.
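
A minimal sketch of coding one row, assuming the row is a list of 0s and 1s. The run lengths returned here would subsequently be variable-length coded; that stage is not shown.

```python
from itertools import groupby

def encode_row(row):
    """Return (value of the first run, list of run lengths) for a binary row."""
    runs = [len(list(group)) for _, group in groupby(row)]
    return (row[0] if row else 0), runs

def decode_row(first_value, runs):
    """Rebuild the row; successive runs alternate between 0 and 1."""
    row, value = [], first_value
    for length in runs:
        row.extend([value] * length)
        value = 1 - value
    return row

row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
first, runs = encode_row(row)
```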
Lossless Predictive Coding
Based on eliminating the interpixel redundancies of
closely spaced pixels by extracting and coding only the
new information in each pixel.

The new information of a pixel is defined as the difference
between the actual and predicted value of that pixel.

System consists of an encoder and a decoder each
containing an identical predictor.

The prediction error is coded using a variable-length code.

$$e_n = f_n - \hat{f}_n$$
The decoder reconstructs the error from the received
variable-length codewords and performs the inverse
operation.

Prediction is usually formed by a linear combination of m
previous pixels.

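The decoder's inverse operation is

$$f_n = e_n + \hat{f}_n$$

and the prediction is

$$\hat{f}_n = \mathrm{round}\left[\sum_{i=1}^{m} \alpha_i f_{n-i}\right]$$

In 1-D linear predictive coding,

$$\hat{f}_n(x,y) = \mathrm{round}\left[\sum_{i=1}^{m} \alpha_i f(x, y-i)\right]$$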
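
The encoder and decoder can be sketched for one image row as below. Assumptions made for illustration: the first m pixels of the row are transmitted unchanged, the default is a previous-pixel predictor (m = 1, $\alpha_1 = 1$), and the variable-length coding of the errors is left out.

```python
def predict(recent, alphas):
    """Rounded linear combination of the m most recent pixels (latest first)."""
    return int(round(sum(a * f for a, f in zip(alphas, recent))))

def encode_row(row, alphas=(1.0,)):
    m = len(alphas)
    errors = list(row[:m])               # first m pixels are sent as they are
    for n in range(m, len(row)):
        recent = row[n - m:n][::-1]      # f[n-1], f[n-2], ..., f[n-m]
        errors.append(row[n] - predict(recent, alphas))   # e_n = f_n - f^_n
    return errors

def decode_row(errors, alphas=(1.0,)):
    m = len(alphas)
    row = list(errors[:m])
    for n in range(m, len(errors)):
        recent = row[n - m:n][::-1]      # identical predictor, identical input
        row.append(errors[n] + predict(recent, alphas))   # f_n = e_n + f^_n
    return row

row = [100, 102, 103, 103, 101, 99]
errors = encode_row(row)
```

The errors cluster around zero, which is what makes them cheaper to code than the pixels themselves.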

Lossy predictive coding
The quantizer absorbs the nearest-integer function of the error-free
encoder and is inserted between the symbol encoder
and the point at which the prediction error is formed.

It maps prediction error into a limited range of outputs and
establishes the amount of compression and distortion
associated with lossy predictive coding.

The lossy encoder's predictor is placed within a feedback
loop, where its input, $\dot{f}_n$, is generated as a function of past
predictions and the corresponding quantized errors:

$$\dot{f}_n = \dot{e}_n + \hat{f}_n$$

The output of the decoder is also given by this equation. Compare the
error-free case:

$$f_n = e_n + \hat{f}_n, \qquad \hat{f}_n = \mathrm{round}\left[\sum_{i=1}^{m} \alpha_i f_{n-i}\right]$$

Optimum predictors minimize the mean-square prediction
error,

$$E\{e_n^2\} = E\{[f_n - \hat{f}_n]^2\}$$

subject to the constraints

$$\dot{f}_n = \dot{e}_n + \hat{f}_n \approx e_n + \hat{f}_n = f_n \quad \text{and} \quad \hat{f}_n = \sum_{i=1}^{m} \alpha_i f_{n-i}$$
Optimum quantizers minimize the mean-square
quantization error, $E\{(s - t_i)^2\}$.
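
The feedback loop of the lossy encoder can be sketched as follows. Everything specific here is an assumption chosen for illustration, not an optimum design: a previous-pixel predictor, a uniform quantizer with step size `step`, and an initial reconstructed value of 0.

```python
def quantize(error, step):
    """Uniform quantizer (illustrative choice): nearest multiple of step."""
    return step * round(error / step)

def lossy_encode(row, step=4):
    quantized_errors, reconstructed = [], 0
    for pixel in row:
        prediction = reconstructed               # predictor sees past reconstructed values
        q = quantize(pixel - prediction, step)   # quantized prediction error
        quantized_errors.append(q)
        reconstructed = q + prediction           # same value the decoder will form
    return quantized_errors

def lossy_decode(quantized_errors):
    row, reconstructed = [], 0
    for q in quantized_errors:
        reconstructed = q + reconstructed        # quantized error plus prediction
        row.append(reconstructed)
    return row

row = [100, 102, 103, 110, 90, 91]
decoded = lossy_decode(lossy_encode(row, step=4))
```

Because the encoder predicts from the same reconstructed values as the decoder, the quantization error does not accumulate along the row; each pixel is off by at most half a step.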
Transform Coding
Transform Selection
Subimage Size Selection
Bit Allocation
Zonal Coding Implementation
Threshold Coding Implementation
Wavelet Coding
Wavelet Selection
Decomposition Level Selection
Quantizer Design
Image Compression Standards
Binary Image Compression Standards
One Dimensional Compression
Two Dimensional Compression
Transform Coding
Predictive coding directly operates on the pixels of an
image (spatial domain method).

Transform coding is based on modifying the transform of
an image.

A reversible linear transform (e.g., the Fourier transform) is
used to map the image into a set of transform coefficients.
These coefficients are then quantized and coded.
Transform Coding
Subimage decomposition: the NxN image is decomposed
into subimages of size nxn.

Transformation: decorrelates the pixels of
each subimage, or packs as much information as possible
into the smallest number of coefficients.

Quantization: selectively eliminates or more coarsely
quantizes the coefficients that carry the least information.

Encoding: codes the quantized coefficients.
Transform Selection
Transform is selected based on the amount of reconstruction
error that can be tolerated and the available computational
resources.

The forward discrete transform T(u,v) of an image f(x,y) of
size NxN can be expressed as

$$T(u,v) = \sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\, g(x,y,u,v), \qquad u, v = 0, 1, \ldots, N-1$$
The inverse discrete transform f(x,y) can be obtained as

$$f(x,y) = \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} T(u,v)\, h(x,y,u,v), \qquad x, y = 0, 1, \ldots, N-1$$

g(x,y,u,v) and h(x,y,u,v) are called the forward and inverse
transformation kernels, respectively; they are also called
basis functions or basis images.

T(u,v) for u,v = 0,1,...,N-1 are called the transform
coefficients.
The forward kernel g(x,y,u,v) is said to be separable if

$$g(x,y,u,v) = g_1(x,u)\, g_2(y,v)$$

The kernel is symmetric if $g_1$ is functionally equal to $g_2$.

The same applies for inverse kernel.

A 2-D transform with a separable kernel can be computed
using row-column or column-row passes of the
corresponding 1-D transform.

The forward and inverse kernels determine the type of
transform that is computed and the overall computational
complexity and reconstruction error of the transform coding
system.

Most well-known transform pair (DFT):

$$g(x,y,u,v) = \frac{1}{N^2}\, e^{-j2\pi(ux+vy)/N}, \qquad h(x,y,u,v) = e^{j2\pi(ux+vy)/N}$$

Walsh-Hadamard transform pair:

$$g(x,y,u,v) = h(x,y,u,v) = \frac{1}{N}\,(-1)^{\sum_{i=0}^{m-1}\left[b_i(x)p_i(u) + b_i(y)p_i(v)\right]}, \qquad \text{where } N = 2^m$$
The summation in the exponent is performed in modulo-2
arithmetic, and $b_k(z)$ is the k-th bit (from right to left) in the
binary representation of z.

If m = 3 and z = 6 (110 in binary), then $b_2(z) = 1$, $b_1(z) = 1$, $b_0(z) = 0$.

$$p_0(u) = b_{m-1}(u)$$
$$p_1(u) = b_{m-1}(u) + b_{m-2}(u)$$
$$p_2(u) = b_{m-2}(u) + b_{m-3}(u)$$
$$\vdots$$
$$p_{m-1}(u) = b_1(u) + b_0(u)$$
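Discrete cosine transform:

$$g(x,y,u,v) = h(x,y,u,v) = \alpha(u)\,\alpha(v)\cos\left[\frac{(2x+1)u\pi}{2N}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right]$$

where

$$\alpha(u) = \begin{cases}\sqrt{1/N} & \text{for } u = 0\\ \sqrt{2/N} & \text{for } u = 1, 2, \ldots, N-1\end{cases}$$

and similarly for $\alpha(v)$.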
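
Because the DCT kernel is separable and symmetric, the 2-D transform of a block can be computed with the 1-D kernel applied along rows and columns. The sketch below uses plain Python lists and illustrative helper names; it is written for clarity, not speed.

```python
import math

def alpha(u, n):
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct_matrix(n):
    """C[u][x] = alpha(u) * cos((2x + 1) u pi / 2n): the 1-D DCT kernel."""
    return [[alpha(u, n) * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             for x in range(n)] for u in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    """Forward 2-D DCT as two 1-D passes: T = C F C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: F = C^T T C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

flat = [[10.0] * 4 for _ in range(4)]
coeffs = dct2(flat)   # all the information of a constant block lands in T(0,0)
```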


Consider a subimage of size nxn:

$$f(x,y) = \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} T(u,v)\, h(x,y,u,v), \qquad x, y = 0, 1, \ldots, n-1$$

In matrix form,

$$\mathbf{F} = \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} T(u,v)\, \mathbf{H}_{uv}$$

where

$$\mathbf{H}_{uv} = \begin{bmatrix} h(0,0,u,v) & h(0,1,u,v) & \cdots & h(0,n-1,u,v)\\ h(1,0,u,v) & h(1,1,u,v) & \cdots & h(1,n-1,u,v)\\ \vdots & \vdots & & \vdots\\ h(n-1,0,u,v) & h(n-1,1,u,v) & \cdots & h(n-1,n-1,u,v) \end{bmatrix}$$
The matrix F, containing the pixels of the input subimage, is a linear
combination of the $n^2$ matrices $\mathbf{H}_{uv}$ of size nxn, for
u,v = 0,1,...,n-1.

These $n^2$ matrices of size nxn are the basis images and the
associated T(u,v) are the expansion coefficients.

F can be compressed by truncating the transform
coefficients (setting transform coefficients to zero based
on a specified truncation criterion).

For this purpose, we define a transform coefficient
masking function.

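$$\gamma(u,v) = \begin{cases} 0 & \text{if } T(u,v) \text{ satisfies a specified truncation criterion}\\ 1 & \text{otherwise} \end{cases}$$

Approximation of F from the truncated expansion:

$$\hat{\mathbf{F}} = \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} \gamma(u,v)\, T(u,v)\, \mathbf{H}_{uv}$$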
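The mean-square error between subimage F and its
approximation $\hat{\mathbf{F}}$ is

$$\begin{aligned}
e_{ms} &= E\left\{\left\|\mathbf{F} - \hat{\mathbf{F}}\right\|^2\right\}\\
&= E\left\{\left\|\sum_{u=0}^{n-1}\sum_{v=0}^{n-1} T(u,v)\,\mathbf{H}_{uv} - \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} \gamma(u,v)\,T(u,v)\,\mathbf{H}_{uv}\right\|^2\right\}\\
&= E\left\{\left\|\sum_{u=0}^{n-1}\sum_{v=0}^{n-1} T(u,v)\,\mathbf{H}_{uv}\,[1 - \gamma(u,v)]\right\|^2\right\}\\
&= \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} \sigma^2_{T(u,v)}\,[1 - \gamma(u,v)]
\end{aligned}$$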
The total mean-square approximation error is thus the sum
of the variances of the discarded transform coefficients
(coefficients for which $\gamma(u,v) = 0$, so that $1 - \gamma(u,v) = 1$).

The mean-square errors of the $(N/n)^2$ subimages of an NxN
image are identical. Thus, the mean-square error of the
NxN image equals that of a single subimage.
Subimage size selection
Subdivide the image so that the correlation or redundancy
between adjacent subimages is reduced to a specified
acceptable level.

Bit Allocation
The retained coefficients are mostly selected on the basis
of maximum variance (zonal coding) or on the basis of
maximum magnitude (threshold coding).

The overall process of truncating, quantizing and coding
the coefficients of a transformed subimage is called bit
allocation.

The transform coefficients of maximum variance carry the
most image information and must be retained in the
coding process.

Zonal sampling: each transform coefficient T(u,v) is
multiplied by the corresponding element of a
zonal mask (1 for locations of maximum variance and 0
for all other locations).

Coefficients of maximum variance are usually located
around the origin of an image transform.

The coefficients retained during zonal coding process
must be quantized and coded.

Mostly, the coefficients are allocated the same number of
bits (coefficients are generally normalized by their
standard deviations and are uniformly quantized) or a
fixed number of bits is distributed among them unequally
(a quantizer is designed for each coefficient).

The retained coefficients, selected on the basis of
maximum variance, are allocated bits proportional to the
logarithm of the coefficient variances.

Zonal coding uses a single fixed mask for all subimages.
Threshold coding
Adaptive transform coding approach.
Based on the concept that for any subimage, the transform
coefficients of the largest magnitude make the most
significant contribution to reconstructed subimage quality.
The location of transform coefficients retained for each
subimage therefore varies from one subimage to another.
The elements of $\gamma(u,v)T(u,v)$ are reordered to form a 1-D
run-length coded sequence.

Three ways to threshold a transformed subimage:
A single global threshold for all subimages: the level of
compression varies from image to image, depending on the
number of coefficients that exceed the threshold.

A different threshold for each subimage (called N-largest
coding): the same number of coefficients is discarded for each
subimage. The code rate is therefore constant and known in
advance.

A threshold that varies as a function of the location of each coefficient
within the subimage: results in a variable code rate.
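
The difference between a fixed zonal mask and an adaptive N-largest mask can be sketched as below. The triangular zone around the origin is only an assumed stand-in for the true maximum-variance locations, and the coefficient values are made up for illustration.

```python
def zonal_mask(n, zone):
    """Fixed mask: 1 where u + v < zone (an assumed zone near the origin)."""
    return [[1 if u + v < zone else 0 for v in range(n)] for u in range(n)]

def n_largest_mask(coeffs, keep):
    """Adaptive mask: 1 for the `keep` largest-magnitude coefficients."""
    ranked = sorted(((abs(t), u, v) for u, row in enumerate(coeffs)
                     for v, t in enumerate(row)), reverse=True)
    chosen = {(u, v) for _, u, v in ranked[:keep]}
    n = len(coeffs)
    return [[1 if (u, v) in chosen else 0 for v in range(n)] for u in range(n)]

def apply_mask(mask, coeffs):
    """Element-wise product: masked-out coefficients become zero."""
    return [[g * t for g, t in zip(mask_row, coeff_row)]
            for mask_row, coeff_row in zip(mask, coeffs)]

coeffs = [[-415, -30, 2], [4, -60, 1], [0, 3, -1]]   # made-up transform values
zonal = zonal_mask(3, 2)
adaptive = n_largest_mask(coeffs, 3)
```

Both masks keep three coefficients here, but the adaptive mask keeps the large coefficient at (1,1), which the fixed zone discards.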

In the third approach, thresholding and quantization can
be combined by replacing $\gamma(u,v)T(u,v)$ in

$$\hat{\mathbf{F}} = \sum_{u=0}^{n-1}\sum_{v=0}^{n-1} \gamma(u,v)\, T(u,v)\, \mathbf{H}_{uv}$$

with

$$\hat{T}(u,v) = \mathrm{round}\left[\frac{T(u,v)}{Z(u,v)}\right]$$

where $\hat{T}(u,v)$ is a thresholded and quantized
approximation of T(u,v) and Z(u,v) is an element of the
transform normalization array Z.

A normalized subimage transform, $\hat{T}(u,v)$, must be
denormalized by multiplying by Z(u,v) before it is
inverse transformed to obtain the approximation of
subimage f(x,y):

$$\dot{T}(u,v) = \hat{T}(u,v)\, Z(u,v)$$
Wavelet Coding
Based on the idea that the coefficients of a transform that
decorrelates the pixels of an image can be coded more
efficiently than the original pixels themselves.

If the transform's basis functions (here, wavelets) can pack
most of the information into a small number of
coefficients, the remaining coefficients can be truncated to
zero.

Discrete wavelet transform of the image is computed
which converts a large portion of the original image to
vertical, horizontal and diagonal decomposition
coefficients.

Many of the computed coefficients carry little information
and hence can be quantized and coded to minimize the
intercoefficient and coding redundancy.

Lossless coding methods (Run-length, Huffman,
Arithmetic, and Bit-plane coding) can be used for the final
symbol coding process.

Main difference between wavelet coding and transform
coding: wavelet coding does not require subdivision of the
original image, and it gives less reconstruction error.
Wavelet selection
Most widely used are Daubechies wavelets and biorthogonal
wavelets.
Decomposition level selection
The number of operations in the computation of the forward
and inverse transforms increases with the number of
decomposition levels.
Quantizer design
Quantization can be made more effective by:
Introducing an enlarged quantization interval around zero (a dead zone).
Adapting the size of the quantization interval from scale to
scale.
The selected intervals are transmitted to the decoder with the
encoded image bit stream.

Image Compression Standards

International Organization for Standardization (ISO)
Consultative Committee of the International Telephone
and Telegraph (CCITT)

Binary and continuous-tone image compression
Still-frame and video (sequential-frame) applications
Continuous Tone Still Image Compression Standards
JPEG
Lossy Baseline Coding System
Extended Coding System
Lossless Independent Coding System
JPEG 2000
Video Compression Standards

Continuous Tone Still Image Compression Standards
Based principally on lossy transform coding techniques.

Original DCT-based JPEG standard
Wavelet-based JPEG 2000 standard
JPEG-LS standard
JPEG
Three different coding systems:
Lossy baseline coding system based on DCT
Extended coding system for higher precision
Lossless independent coding system for reversible
compression

Baseline coding system
Input and output data precision is limited to 8 bits.
Quantized DCT values are restricted to 11 bits.
Compression is performed in 3 steps:
DCT computation
Quantization
Variable-length code assignment

The image is first sub-divided into pixel blocks of size
8x8, which are processed from left to right, top to bottom.

The pixels in each block are level shifted by subtracting
the quantity $2^{n-1}$, where $2^n$ is the maximum number of gray
levels.

2D-DCT of the block is then computed, quantized and
reordered to form a 1D-sequence of quantized
coefficients.
The original 8x8 subimage
Level-shifted subimage (subtract $2^7$ from each pixel)
Apply forward DCT
Normalization array Z

$$\hat{T}(u,v) = \mathrm{round}\left[\frac{T(u,v)}{Z(u,v)}\right]; \qquad \hat{T}(0,0) = \mathrm{round}\left[\frac{T(0,0)}{Z(0,0)}\right] = \mathrm{round}\left[\frac{-415}{16}\right] = -26$$
The coefficients are then reordered in a zig-zag pattern, resulting
in a 1D coefficient sequence.
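
The zig-zag reordering can be sketched by sorting block positions along anti-diagonals. The sorting-key formulation below is just one convenient way to generate the order; the function names are illustrative.

```python
def zigzag_order(n=8):
    """(row, column) positions of an n x n block in zig-zag scan order."""
    def key(position):
        u, v = position
        diagonal = u + v
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the column increasing.
        return (diagonal, u if diagonal % 2 else v)
    return sorted(((u, v) for u in range(n) for v in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a 1-D sequence in zig-zag order."""
    return [block[u][v] for u, v in zigzag_order(len(block))]

order = zigzag_order(3)
sequence = zigzag_scan([[1, 2, 6], [3, 5, 7], [4, 8, 9]])
```

The scan visits coefficients roughly in order of increasing spatial frequency, so the quantized zeros tend to collect in long runs at the end of the sequence.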

The default JPEG code is then constructed for this
sequence.

First, the difference between current DC coefficient
and that of the previously encoded subimage is
computed.

For a DC difference category K, an additional K bits are
needed and are computed as either the K LSBs of the positive
difference or the K LSBs of the negative difference minus 1.

Example:
DC difference = -9
DC difference category = 4 (i.e., K = 4)
DC base code = 101 (requires 4 additional bits)
 9: 1001
-9: 0110 + 0001 = 0111 (two's complement, 4 LSBs)
0111 - 1 = 0110 (the K LSBs)
Therefore, the complete code is 1010110.
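
The additional bits in the example can be reproduced with a short sketch. The base code "101" for category 4 is taken from the example above; the full category-to-base-code table is not reproduced here, and the function names are illustrative.

```python
def dc_category(diff):
    """Category K: number of bits needed to represent |diff| (0 if diff is 0)."""
    return abs(diff).bit_length()

def dc_extra_bits(diff):
    """K LSBs of diff if positive, or of (diff - 1) if negative."""
    k = dc_category(diff)
    if k == 0:
        return ""
    value = diff if diff > 0 else diff - 1
    # Masking a negative Python int yields its two's-complement low bits.
    return format(value & ((1 << k) - 1), "0{}b".format(k))

base_code_category_4 = "101"   # from the worked example above
complete_code = base_code_category_4 + dc_extra_bits(-9)
```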
Decoder regenerates the normalized transform coefficients.
Denormalization:

$$\dot{T}(u,v) = \hat{T}(u,v)\, Z(u,v); \qquad \dot{T}(0,0) = \hat{T}(0,0)\, Z(0,0) = (-26)(16) = -416$$
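
The normalization at the encoder and the denormalization at the decoder, for the single coefficient used in the example, can be checked with a two-line sketch. Note one assumption: Python's `round` sends exact halves to the nearest even integer, which may differ from the rounding rule of a particular codec.

```python
def normalize(t, z):
    """Quantize a transform coefficient by its normalization array entry."""
    return round(t / z)

def denormalize(t_hat, z):
    """Decoder side: undo the scaling (the rounding loss stays)."""
    return t_hat * z

t_hat = normalize(-415, 16)
t_dot = denormalize(t_hat, 16)
```

The coefficient -415 comes back as -416; this rounding loss is the source of the reconstruction error in the baseline system.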
.
Inverse DCT of the denormalized array gives the
reconstructed subimage.
Level shift each inverse-transformed pixel by adding $2^7$.
Difference between original and reconstructed images.
JPEG 2000
Extends the initial JPEG standard to provide increased
flexibility.

Portions of a JPEG 2000 compressed image can be
extracted for retransmission, storage, display and editing.

Based on wavelet coding.

Quantized coefficients are arithmetically coded on a
bit-plane basis.
Wavelet Theory: Some Concepts
The wavelet transform is based on small waves of varying
frequency and limited duration, called wavelets.
1. The first step of the encoding process is to DC level shift the
samples of the n-bit image to be coded by subtracting $2^{n-1}$.
If the image has more than one component (R, G, B), each
component is individually shifted.
2. If there are exactly 3 components, they can be
decorrelated using a reversible or irreversible linear
combination of the components.
E.g., the irreversible component transform:

$$Y_0(x,y) = 0.299\,I_0(x,y) + 0.587\,I_1(x,y) + 0.114\,I_2(x,y)$$
$$Y_1(x,y) = -0.169\,I_0(x,y) - 0.331\,I_1(x,y) + 0.5\,I_2(x,y)$$
$$Y_2(x,y) = 0.5\,I_0(x,y) - 0.419\,I_1(x,y) - 0.081\,I_2(x,y)$$

$I_0$, $I_1$, $I_2$ are the level-shifted input components.
$Y_0$, $Y_1$, $Y_2$ are the corresponding decorrelated
components.
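
Steps 1 and 2 for a single pixel can be sketched as follows, using the coefficients given above. The function names and the sample pixel value are illustrative assumptions.

```python
def level_shift(sample, bits=8):
    """DC level shift: subtract 2**(bits - 1) from an unsigned sample."""
    return sample - (1 << (bits - 1))

def irreversible_component_transform(i0, i1, i2):
    """Decorrelate three level-shifted components (I0, I1, I2) into (Y0, Y1, Y2)."""
    y0 = 0.299 * i0 + 0.587 * i1 + 0.114 * i2
    y1 = -0.169 * i0 - 0.331 * i1 + 0.5 * i2
    y2 = 0.5 * i0 - 0.419 * i1 - 0.081 * i2
    return y0, y1, y2

# A neutral gray pixel: all three components equal.
shifted = [level_shift(c) for c in (200, 200, 200)]
y0, y1, y2 = irreversible_component_transform(*shifted)
```

For a gray pixel all of the signal ends up in $Y_0$, while $Y_1$ and $Y_2$ are zero up to rounding, which is the decorrelation the transform is meant to achieve.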
3. After the image has been level-shifted and decorrelated,
its components are divided into tiles, which are
rectangular arrays of pixels containing the same relative
portion of all components.
The resulting tile components can be extracted and reconstructed
independently.
4. The 1-D discrete wavelet transform (DWT) of the rows
and columns of each tile component is then computed.
A lifting-based approach involving 6 lifting and scaling
operations is used.
The resultant even-indexed values of Y are equivalent to the
fast wavelet transform lowpass-filtered output; the odd-indexed
values of Y correspond to the highpass-filtered
output.

This produces 4 components: a low-resolution
approximation of the tile component, and its horizontal,
vertical and diagonal frequency characteristics.

When each of the tile components has been processed,
the total number of transform coefficients is equal to the
number of samples in the original image.
5. Quantization: coefficient $a_b(u,v)$ of subband b is quantized to

$$q_b(u,v) = \mathrm{sign}[a_b(u,v)] \cdot \mathrm{floor}\left[\frac{|a_b(u,v)|}{\Delta_b}\right]$$

where $\Delta_b$ is the quantization step size.

6. Final step: coefficient bit modeling, arithmetic coding,
bit-stream layering and packetizing.

Coefficients of each transformed tile component are arranged
into rectangular blocks called code blocks, which are
individually coded one bit plane at a time.

Starting from the most significant bit plane with a nonzero
element, each bit plane is processed in 3 passes.
Each bit is coded in one of the 3 passes:
Significance propagation: coefficients that are
insignificant but have at least one significant coefficient in
their neighbourhood are coded.
Magnitude refinement: current bits of significant
coefficients are coded.
Cleanup: remaining insignificant coefficients are coded.

The outputs are arithmetically coded and grouped with similar
passes from other code blocks to form layers.

These layers are partitioned into packets, which are the
fundamental units of the encoded stream.

JPEG 2000 decoders simply invert these operations.

Decodes the bit-modeled, arithmetically coded, layered and
packetized code-stream.

A user-selected number of the original image's tile-component
subbands are reconstructed.

Coefficients are dequantized.
Dequantized coefficients are inverse transformed by column
and by row using inverse fast wavelet transform or inverse
lifting operations.

Final steps are the assembly of tile components, inverse
component transformation and DC level shifting.
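
The encoder's quantization (step 5) and the decoder's dequantization can be sketched together. The quantizer follows the sign and floor rule of step 5; the dequantizer's reconstruction offset `r = 0.5` (placing the value in the middle of its interval) is an assumption made for illustration, not something stated in these slides.

```python
import math

def quantize(a, step):
    """Sign of a times floor(|a| / step)."""
    sign = (a > 0) - (a < 0)
    return sign * math.floor(abs(a) / step)

def dequantize(q, step, r=0.5):
    """Assumed reconstruction rule: place the value r of the way into its interval."""
    if q == 0:
        return 0.0
    sign = 1 if q > 0 else -1
    return sign * (abs(q) + r) * step

indices = [quantize(a, 2.0) for a in (-7.9, -1.5, 0.0, 1.5, 7.9)]
restored = [dequantize(q, 2.0) for q in indices]
```

Every coefficient with magnitude below the step size maps to index 0, so the zero interval is twice as wide as the others.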
For Self-Study
Video Compression Standards
Thank You
