
COMPRESSION

CODEC
Digital files have to be compressed after analog-to-digital
conversion.
The software used is called a CODEC: Compression / Decompression
or Coder / Decoder.
Redundant information is discarded without affecting the
media quality.

Types of Compression
Lossless Vs Lossy Compression
Lossless
The original data is not changed permanently during compression.
Advantage : the original data stays intact without degradation of
quality.
Disadvantage : the compression achieved is not very high.

Lossy
Parts of the original data are discarded permanently to reduce file size.
After decompression, the original data cannot be recovered exactly.
The compression achieved is 10 to 50 times.

Types of Compression
Intraframe Vs Interframe
Intraframe
Applies within a still image or a single video
frame.
Spatial redundancies are detected and exploited to
reduce the file size.
Such redundancies occur when different portions of
an image are identical or similar to each other. Ex.
An area of flat color has pixels of identical values.

Interframe
Exploits the redundancies between adjacent frames
in a video sequence, referred to as temporal
redundancy.
Such redundancies occur when subsequent frames in
a video sequence are identical or very similar. Ex.
Video of a news reader, where most of the picture
does not change from frame to frame.
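
One simple way to illustrate temporal redundancy is to store only the pixels that changed since the previous frame. The sketch below is an assumption made for illustration, not a method given in these slides; real video codecs are considerably more elaborate. Frames are modelled as flat lists of pixel values, and the names `frame_delta` and `apply_delta` are hypothetical.

```python
def frame_delta(previous, current):
    """Return (index, new_value) pairs for the pixels that differ between frames."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


# Two nearly identical frames: only the last pixel changes.
frame1 = [10, 10, 10, 10, 200, 200]
frame2 = [10, 10, 10, 10, 200, 210]
delta = frame_delta(frame1, frame2)
restored = apply_delta(frame1, delta)
```

When consecutive frames are almost identical, the list of changes is much shorter than a full frame, which is the saving interframe compression relies on.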

Types of Redundancies
Statistical
Related to the actual information (ex.
Intensity ) or representation of the
information (ex. Frequently occurring
data)
statistical relationship within the media
data.

Psycho-visual
Originates from the characteristics of the
human visual system (HVS).
Ex. Some colors in the image cannot be

Lossless Compression
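Also known as entropy encoding.
Entropy
In thermodynamics, a measure of the energy in a physical system
that cannot be used to do useful work. In coding, the average
information content per symbol of the data.
$H(x) = -\sum_{i} p(i) \log_2 p(i)$, where p(i) is the probability of state i.
$\sum_{i}$ means the sum over all possible outcomes i of x.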
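
As a rough check of the formula, the sketch below estimates p(i) from symbol counts in a string and evaluates H(x). Estimating probabilities from counts is an assumption made for illustration, and the function name `entropy` is hypothetical.

```python
import math
from collections import Counter


def entropy(message):
    """Return H(x) = -sum p(i) log2 p(i) in bits per symbol, with p(i) taken from symbol counts."""
    counts = Counter(message)
    total = len(message)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# 12 As, 2 Bs, 1 C and 1 D: p(A) = 3/4, p(B) = 1/8, p(C) = p(D) = 1/16
h = entropy("AAAAAAAAAAAABBCD")
print(round(h, 3))  # about 1.186 bits per symbol
```

For this 16-character string the result is about 1.186 bits per symbol, far below the 7 bits per symbol of plain ASCII.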

Run Length Encoding

Any sequence of repetitive characters may be replaced by
a more compact form.
A series of n successive identical characters is replaced by a
flag (!), the count n and a single instance of the character.
Ex. Uncompressed data - RRRRSSSEETTTGHIM

Compressed data - !4R!3S!2E!3TGHIM
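
The scheme can be sketched in a few lines. This is a minimal illustration that assumes the input contains neither the flag character `!` nor digits, and that every run of two or more characters is flagged, as in the compressed string above. The function names are hypothetical.

```python
import re
from itertools import groupby

FLAG = "!"  # marks the start of a run, as in the compressed example


def rle_encode(text):
    """Replace each run of n >= 2 identical characters with FLAG, n and the character."""
    parts = []
    for char, run in groupby(text):
        n = len(list(run))
        parts.append(f"{FLAG}{n}{char}" if n > 1 else char)
    return "".join(parts)


def rle_decode(encoded):
    """Expand every FLAG + count + character group back into the original run."""
    pattern = re.escape(FLAG) + r"(\d+)(.)"
    return re.sub(pattern, lambda m: m.group(2) * int(m.group(1)), encoded)


encoded = rle_encode("RRRRSSSEETTTGHIM")
decoded = rle_decode(encoded)
```

With the input RRRRSSSEETTTGHIM the encoder yields !4R!3S!2E!3TGHIM. Flagging a run of only two characters costs three characters, so a practical encoder might only flag runs longer than three.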

Huffman Coding
Identifies the most frequent bit or byte
patterns.
Codes these patterns with fewer bits than
initially represented.
A table of correspondence, called the code-book,
between the initial patterns and their new
representations is generated by analyzing the
frequencies of the occurrences of each character
in the file.
The code-book must be available at
both the encoding and the decoding ends.
A binary tree is generated, with each branch
labelled 0 or 1.

Ex. AAAAAAAAAAAABBCD, i.e., 12 As,
2 Bs, 1 C and 1 D.
p(A) = 3/4, p(B) = 1/8, p(C) = p(D)
= 1/16.
If 7-bit ASCII is used for this string, it
takes 7 x 16 = 112 bits.
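
To compare against the 112-bit ASCII figure, the sketch below builds a Huffman code-book for the same string using Python's standard `heapq` module. It is one common way to construct the tree, not necessarily the procedure intended in these slides, and the name `huffman_codes` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a code-book {symbol: bit string} by repeatedly merging the two rarest nodes."""
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, partial code-book for that subtree).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Prefix 0 on one branch and 1 on the other.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


text = "AAAAAAAAAAAABBCD"
codes = huffman_codes(text)
total_bits = sum(len(codes[ch]) for ch in text)
print(total_bits)  # 22
```

A gets a 1-bit code, B a 2-bit code, and C and D 3-bit codes, so the string needs 12 x 1 + 2 x 2 + 1 x 3 + 1 x 3 = 22 bits instead of 112. The exact 0/1 labels depend on tie-breaking; only the code lengths are fixed.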

Arithmetic Encoding
Uses a single codeword for a string of characters.
First step - divide the range between 0 and 1 into a
number of segments based on the probability of each
character, starting from the highest value.
For Ex.
Characters A, B and C have prob. A = 0.3 , B = 0.5 and C = 0.2
B -> 0 to 0.5 (0.4999) , A -> 0.5 to 0.8 (0.7999) and
C -> 0.8 to 1.0
The first character of the string is A, so its range (0.5 to 0.8,
width 0.3) is subdivided into 3 segments
B -> 0.5 to (0.5 + 0.5 x 0.3) = 0.65
A -> 0.65 to (0.65 + 0.3 x 0.3) = 0.74
C -> 0.74 to (0.74 + 0.2 x 0.3) = 0.8

The next character is B, so its range (0.5 to 0.65, width 0.15)
is further subdivided into 3 segments
B -> 0.5 to (0.5 + 0.5 x 0.15) = 0.575
A -> 0.575 to (0.575 + 0.3 x 0.15) = 0.62
C -> 0.62 to (0.62 + 0.2 x 0.15) = 0.65

The next character in the string is C; its range, 0.62 to 0.65
(width 0.03), is subdivided into 3 segments
B -> 0.62 to (0.62 + 0.5 x 0.03) = 0.635
A -> 0.635 to (0.635 + 0.3 x 0.03) = 0.644
C -> 0.644 to (0.644 + 0.2 x 0.03) = 0.65

The process continues until the last character (C) has
been coded, which selects the segment 0.644 to 0.65.
The final codeword is any number between 0.644 and 0.65.
The decoder follows the same procedure as the encoder.

For Ex.
The received codeword is 0.645
The first char. is A, since 0.645 is within 0.5 to 0.8
Next is B, since it is within 0.5 to 0.65
Next is C, since it is within 0.62 to 0.65
Last is C, since it is within 0.644 to 0.65
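
The interval arithmetic above can be reproduced with exact fractions. This sketch assumes the message is A, B, C, C (the reading that gives a final range of 0.644 to 0.65) and that the decoder is told the message length. `MODEL`, `segments`, `encode` and `decode` are hypothetical names.

```python
from fractions import Fraction

# Characters ordered from highest probability, as in the example.
MODEL = [("B", Fraction(1, 2)), ("A", Fraction(3, 10)), ("C", Fraction(1, 5))]


def segments(low, high):
    """Split [low, high) into one segment per character, proportional to its probability."""
    width = high - low
    start = low
    for symbol, prob in MODEL:
        end = start + prob * width
        yield symbol, start, end
        start = end


def encode(message):
    """Return the final (low, high) range; any number inside it is a valid codeword."""
    low, high = Fraction(0), Fraction(1)
    for ch in message:
        for symbol, start, end in segments(low, high):
            if symbol == ch:
                low, high = start, end
                break
        else:
            raise ValueError(f"character {ch!r} is not in the model")
    return low, high


def decode(codeword, length):
    """Recover `length` characters by following the segment that contains the codeword."""
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        for symbol, start, end in segments(low, high):
            if start <= codeword < end:
                out.append(symbol)
                low, high = start, end
                break
    return "".join(out)


low, high = encode("ABCC")
message = decode(Fraction("0.645"), 4)
```

Here `encode("ABCC")` returns the range 0.644 to 0.65, and decoding 0.645 for four characters gives back ABCC. Exact fractions are used only to avoid floating-point rounding in the illustration; practical coders use fixed-precision integer arithmetic.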

Lempel-Ziv (LZ) Coding

Codes documents by considering a string of
characters at a time, rather than single characters.
For text compression, a table holds all possible words in
the documents. An index into the table is also maintained.
The table is used as a dictionary, so LZ is known as a
dictionary-based compression algorithm.

Lempel-Ziv-Welch (LZW) coding

Allows the dictionary to be built
dynamically.
Initially, the dictionary contains the ASCII chars.
Suppose the dictionary is an 8-bit table; it
therefore has 256 entries.
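
A minimal sketch of dynamic dictionary building follows. It assumes the dictionary starts with the 128 ASCII characters (codes 0 to 127), that new strings take codes 128 upward, and that no further entries are added once the 256-entry table is full. These details, and the names `lzw_encode` and `lzw_decode`, are assumptions made for illustration.

```python
def lzw_encode(text, max_entries=256):
    """Emit one code per longest already-known string, learning new strings as it goes."""
    table = {chr(i): i for i in range(128)}  # initial ASCII dictionary
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate  # keep extending the match
        else:
            codes.append(table[current])
            if len(table) < max_entries:  # stop growing when the table is full
                table[candidate] = len(table)
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes, max_entries=256):
    """Rebuild the same dictionary on the fly, so it never has to be transmitted."""
    table = {i: chr(i) for i in range(128)}
    if not codes:
        return ""
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):  # code refers to the entry being defined right now
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid code {code}")
        out.append(entry)
        if len(table) < max_entries:
            table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)


codes = lzw_encode("ABABABA")
print(codes)  # [65, 66, 128, 130]
```

For ABABABA the encoder learns AB (128), BA (129) and ABA (130) as it goes, and the seven characters are sent as four codes. The decoder reconstructs the same entries from the code stream alone.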

