Audio Coding: Quantization and Coding

S. R. M. Prasanna
Dept of ECE, IIT Guwahati
prasanna@iitg.ernet.in

Objectives of Quantization & Coding

- Analog sampled values to binary words
- Quantization: infinite to finite amplitude levels
- Coding: each finite amplitude level represented by a binary word
- Some distortion is inherent due to quantization
- Technical goal:
  - minimum possible distortion for a given bit rate, or
  - a given acceptable level of distortion at the least possible bit rate
- Bit Rate = Sampling Frequency × Bits/sample (e.g., 44.1 kHz × 16 bits/sample = 705.6 kbps per channel for CD audio)

Basis for Quantization & Coding

- Signals considered for compression are correlated in nature
- They contain perceptually redundant, and perceptually and statistically irrelevant, information
- In an image signal, pixel amplitude differences smaller than 1/256 of the intensity range are perceptually redundant
- When dominant and weak frequency components are present at a given instant of time, the dominant component masks perception of the weak component
- The most certain event does not carry much information for transmission

Basis (contd.)

- A compressed signal carries only perceptually non-redundant, and perceptually and statistically relevant, information
- Quantization eliminates perceptually redundant information
- Coding eliminates perceptually and statistically irrelevant information

Distortion Measurement

- Objective measure: Signal-to-Noise Ratio (SNR)
- Subjective measure: Mean Opinion Score (MOS)
- SNR measurement:
  - Let x(n), y(n) and e(n) be the input, the output and the reconstruction error for a given codec, respectively
  - e(n) = x(n) - y(n)
  - Let σ_x^2, σ_y^2 and σ_e^2 be the variances of x(n), y(n) and e(n), respectively
  - Assuming zero-mean signals of length M samples:
    σ_u^2 = (1/M) Σ_{n=1}^{M} u^2(n), where u = x, y or e

Distortion Measurement (contd.)

- SNR = signal power / noise power
- SNR = Σ_{n=1}^{M} x^2(n) / Σ_{n=1}^{M} e^2(n)
- SNR = signal variance / reconstruction error variance
- SNR (dB) = 10 log10(σ_x^2 / σ_e^2)
- σ_e^2 is commonly termed the Mean Square Error (MSE)
- The objective in coding is to minimize this MSE (MMSE criterion)
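The SNR definition above can be sketched in Python (NumPy assumed; the function name `snr_db` is illustrative, not from the slides):

```python
import numpy as np

def snr_db(x, y):
    """SNR in dB between input x(n) and reconstruction y(n)."""
    e = x - y                                   # e(n) = x(n) - y(n)
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(e ** 2))

# toy check: a sine reconstructed through a crude rounding quantizer
n = np.arange(1000)
x = np.sin(2 * np.pi * 50 * n / 1000)
y = np.round(x * 8) / 8                         # step size 1/8
print(round(snr_db(x, y), 1))
```

Since the signals are zero-mean, the power ratio equals the variance ratio, so both SNR forms above give the same value.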

Classification of Quantization Schemes

- Uniform v/s nonuniform
- Scalar v/s vector
- Parametric v/s nonparametric
- Memoryless v/s with memory

Memoryless v/s with memory:
- Memoryless quantization depends only on the current sample (PCM)
- Quantization with memory depends on past samples also (DPCM, DM, ADPCM)

Uniform v/s nonuniform:
- Step size is fixed in uniform nonadaptive quantization (Uniform PCM)
- Step size is variable in nonuniform quantization (Nonuniform PCM)

Classification (contd.)

Scalar v/s vector quantization:
- In scalar quantization, each sample value is quantized individually (PCM)
- In vector quantization, a group of sample values is quantized together (Vector PCM)

Parametric v/s nonparametric:
- In parametric quantization, the signal is processed to extract parameter or feature vectors, and these vectors are quantized
- In nonparametric quantization, groups of signal values themselves are quantized

Uniform Quantization

- Step size Δ is constant
- No. of quantization levels Q = 2^Rb, where Rb is the no. of bits
- Signal amplitude s has the range (-s_max, s_max)
- Step size Δ = 2 s_max / 2^Rb
- Quantization noise e_q is assumed to have a uniform PDF, i.e., -Δ/2 ≤ e_q ≤ Δ/2
- PDF: p(e_q) = 1/Δ, for |e_q| ≤ Δ/2
- Variance of quantization noise: σ_eq^2 = Δ^2/12 = s_max^2 · 2^(-2Rb) / 3
- An increase of 1 bit reduces the noise variance by a factor of four
- SNR (dB) = 6.02 Rb + k1, so an increase of 1 bit improves the SNR by about 6 dB
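The 6 dB/bit rule can be checked numerically with a minimal midrise uniform quantizer (a sketch, assuming NumPy; `uniform_quantize` is an illustrative name):

```python
import numpy as np

def uniform_quantize(s, rb, s_max=1.0):
    """Midrise uniform quantizer with Q = 2**rb levels over (-s_max, s_max)."""
    delta = 2 * s_max / (2 ** rb)                          # step size
    idx = np.clip(np.floor(s / delta), -(2 ** (rb - 1)), 2 ** (rb - 1) - 1)
    return (idx + 0.5) * delta                             # reconstruction levels

rng = np.random.default_rng(0)
s = rng.uniform(-1, 1, 100_000)                            # full-range test signal
for rb in (6, 7, 8):
    e = s - uniform_quantize(s, rb)
    snr = 10 * np.log10(np.mean(s ** 2) / np.mean(e ** 2))
    print(rb, round(snr, 1))
```

For a full-range uniformly distributed input, the printed SNR values grow by roughly 6.02 dB per added bit, matching the slide's formula.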

Nonuniform Quantization

- Uniform quantization does not exploit inherent signal properties
- Nonuniform quantization exploits the statistical structure of the signal
- Accordingly, it uses a nonuniform step size
- PDF-optimized nonuniform quantization uses fine step sizes for frequently occurring amplitudes and coarse step sizes for less frequently occurring amplitudes
- Log quantizers provide nonlinear compression of the input signal amplitudes
- The compression is approximately linear for low amplitudes and logarithmic for high amplitudes

μ-Law & A-Law Quantizers

- Log quantizers (μ-law & A-law) employ a nonlinear mapping function g(.) that maps the nonuniform step sizes to uniform ones, so that a simple linear quantizer can be used
- The decoder uses an expansion function g^(-1)(.) to recover the signal
- μ-law quantizer:
  |g(s)| = log(1 + μ|s/s_max|) / log(1 + μ)
- μ is the controlling parameter for the nonlinearity
- μ → 0 (no compression) leads to uniform quantization
- μ = 255 provides approximately linear mapping for small amplitudes and logarithmic mapping for larger amplitudes
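The μ-law compressor and its inverse expander can be written directly from the formula above (a sketch, assuming NumPy; `mu_compress`/`mu_expand` are illustrative names):

```python
import numpy as np

MU = 255.0

def mu_compress(s, s_max=1.0, mu=MU):
    """mu-law compressor: |g(s)| = log(1 + mu|s/s_max|) / log(1 + mu)."""
    return np.sign(s) * np.log1p(mu * np.abs(s) / s_max) / np.log1p(mu) * s_max

def mu_expand(g, s_max=1.0, mu=MU):
    """Decoder-side expansion g^{-1}(.): inverts the compressor exactly."""
    return np.sign(g) * (np.expm1(np.abs(g) / s_max * np.log1p(mu)) / mu) * s_max

s = np.linspace(-1, 1, 11)
g = mu_compress(s)
print(np.allclose(mu_expand(g), s))   # round trip recovers the signal
```

Note how small amplitudes are pushed toward full scale by the compressor, which is exactly what lets a subsequent uniform quantizer spend its levels where the speech/audio amplitudes actually lie.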

μ-Law & A-Law Quantizers (contd.)

- A-law quantizer:
  |g(s)| = A|s/s_max| / (1 + log(A)),              for 0 ≤ |s/s_max| < 1/A
  |g(s)| = (1 + log(A|s/s_max|)) / (1 + log(A)),   for 1/A ≤ |s/s_max| ≤ 1
- Log quantizers reduce bit rates without degradation by as much as 4 bits/sample relative to uniform PCM
- e.g., 64 kbps to 32 kbps without perceptual degradation

Vector Quantization

- Quantization of a block of data (a vector) at a time
- Each block of input data is allotted a unique binary code by comparing it with codebook entries
- The same codebook is present at the receiver
- The data is synthesized using the received binary words as indices into the codebook
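The encode/decode steps just described amount to a nearest-codeword search plus a table lookup (a sketch, assuming NumPy and squared-error distance; names are illustrative):

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Index of the nearest codebook entry (squared error) for each block."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Receiver side: look the transmitted indices up in the same codebook."""
    return codebook[indices]

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])  # toy 2-D codebook
blocks = np.array([[0.9, 1.1], [-0.2, 0.1]])
idx = vq_encode(blocks, codebook)
print(idx)                          # indices of nearest codewords
print(vq_decode(idx, codebook))     # reconstructed blocks
```

Only the indices are transmitted, so the bit cost per block is log2(codebook size) rather than bits-per-sample times the block length.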

Bit Allocation

- Number of bits required to quantize an audio frame with reduced audible distortion
- Based on perceptual or spectral characteristics
- In audio coding, the parameters to be quantized are usually transform-domain coefficients (e.g., DCT), because they provide better energy compaction
- Let x = [x_1, x_2, x_3, ..., x_Nf], where Nf is the number of transform coefficients per frame
- Let the total number of bits available be N
- Bit allocation = the optimum way of distributing N bits among the Nf coefficients

Uniform Bit Allocation

- Distortion measure:
  D = (1/Nf) Σ_{i=1}^{Nf} E[(x_i - x̂_i)^2] = (1/Nf) Σ_{i=1}^{Nf} d_i
  where x_i and x̂_i denote the i-th unquantized and quantized transform coefficients, respectively
- Let n_i be the number of bits assigned to coefficient x_i for quantization, such that Σ_{i=1}^{Nf} n_i ≤ N
- If the x_i are uniformly distributed for all i, then bits are allocated uniformly across all transform coefficients: n_i = N/Nf, for 1 ≤ i ≤ Nf
- In practice, transform coefficients may not have uniform distributions

Optimum Bit Allocation

- Employing an equal number of bits for both large and small amplitudes may result in spending extra bits on the smaller amplitudes
- Also, for a given N, the distortion D can be very high
- Seek a cost function that minimizes the distortion d_i while keeping Σ_{i=1}^{Nf} n_i ≤ N
- min_{n_i} {D} = min_{n_i} { (1/Nf) Σ_{i=1}^{Nf} E[(x_i - x̂_i)^2] } = min_{n_i} { (1/Nf) Σ_{i=1}^{Nf} ε_i^2 }, where ε_i^2 is the quantization error variance of the i-th coefficient
- If the quantization noise has a uniform PDF, then ε_i^2 = σ_xi^2 / (3 · 2^(2 n_i))
- Substituting for ε_i^2 and minimizing with respect to n_i, we get n_i = (1/2) log2(σ_xi^2) + K

Optimum Bit Allocation (contd.)

- Substituting for n_i in Σ_{i=1}^{Nf} n_i = N and simplifying for K, we get
  K = N/Nf - (1/(2 Nf)) log2( Π_{i=1}^{Nf} σ_xi^2 )
- Substituting for K in n_i = (1/2) log2(σ_xi^2) + K, we get
  n_i^opt = N/Nf + (1/2) log2( σ_xi^2 / (Π_{i=1}^{Nf} σ_xi^2)^(1/Nf) )
- Example: N = 64 bits, Nf = 16 coefficients
- Uniform allocation assigns 4 bits to each of the 16 coefficients; optimum bit allocation may give [5, 4, 5, 3, 4, 5, ...]
- Uniformly distributed coefficients: distortion D_u = 0.00024 and D_o = 0.00025
- Gaussian distributed coefficients: distortion D_u = 0.00042 and D_o = 0.00023
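The closed-form allocation n_i^opt = N/Nf + (1/2) log2(σ_xi^2 / GM) is a one-liner in code (a sketch, assuming NumPy; `optimal_bits` is an illustrative name, and the variances are made-up toy values):

```python
import numpy as np

def optimal_bits(var, n_total):
    """Real-valued optimal allocation: n_i = N/Nf + 0.5*log2(var_i / GM(var))."""
    nf = len(var)
    gm = np.exp(np.mean(np.log(var)))        # geometric mean of the variances
    return n_total / nf + 0.5 * np.log2(var / gm)

var = np.array([8.0, 4.0, 2.0, 1.0])         # toy coefficient variances
n = optimal_bits(var, 16)
print(n, n.sum())                            # allocations sum to N = 16
```

High-variance coefficients receive more bits than the N/Nf average and low-variance ones fewer; in a real coder these real-valued n_i must still be rounded to integers while keeping the total at N.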

Uniform v/s Optimum

- If the geometric mean of the σ_xi^2 is less than their arithmetic mean, then optimal bit allocation performs better
- Geometric mean: GM = ( Π_{i=1}^{Nf} σ_xi^2 )^(1/Nf)
- Arithmetic mean: AM = (1/Nf) Σ_{i=1}^{Nf} σ_xi^2
- Spectral Flatness Measure (SFM) = GM/AM, and 0 ≤ SFM ≤ 1
- If the SFM is small, then optimum bit allocation is preferred
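The SFM criterion is easy to verify on toy variance vectors (a sketch, assuming NumPy; `sfm` is an illustrative name):

```python
import numpy as np

def sfm(var):
    """Spectral Flatness Measure: geometric mean / arithmetic mean."""
    gm = np.exp(np.mean(np.log(var)))
    am = np.mean(var)
    return gm / am

flat = np.full(8, 2.0)                        # equal variances -> SFM = 1
peaky = np.array([100.0, 1, 1, 1, 1, 1, 1, 1])  # one dominant coefficient
print(sfm(flat), sfm(peaky))
```

A flat variance profile gives SFM = 1 (uniform allocation is already optimal), while a peaky profile drives the SFM toward 0, which is where optimum allocation pays off.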

Entropy Coding

- The minimum no. of bits required to represent a given audio frame
- For a given message X, according to Shannon, it is given by the entropy H_e(X)
- Entropy is a measure of the uncertainty of a random variable
- Let X = [x_1, x_2, ..., x_N] be the input vector of length N
- Let p_i be the probability that the i-th symbol of the symbol set V = [v_1, v_2, ..., v_K] is transmitted
- Entropy: H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i)

Illustration of Entropy Computation

- Let X = [4, 5, 6, 6, 2, 5, 4, 4, 5, 4, 4], so N = 11
- Symbol set V = [2, 4, 5, 6], with p_i = [1/11, 5/11, 3/11, 2/11] and K = 4
- H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i)
  = -{ (1/11) log2(1/11) + (5/11) log2(5/11) + (3/11) log2(3/11) + (2/11) log2(2/11) }
  = 1.7899
- On average, a minimum of 1.7899 bits is needed to transmit each symbol
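The worked example above can be reproduced with a few lines of standard-library Python (`entropy` is an illustrative name):

```python
from collections import Counter
from math import log2

def entropy(symbols):
    """H_e(X) = -sum p_i log2(p_i), with p_i taken as empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

X = [4, 5, 6, 6, 2, 5, 4, 4, 5, 4, 4]
print(round(entropy(X), 4))   # → 1.7899, matching the slide's hand computation
```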

Objective in Entropy Coding

- To construct an ensemble code for each message, such that the code is uniquely decodable, prefix-free, and optimum in the sense that it provides minimum-redundancy encoding
- A unique bit sequence for a given audio frame

Requirements for Entropy Codes

- Each message should be assigned a unique code
- The codes must be prefix-free, i.e., no code can be found entirely as the prefix of another code
- Ex: CS_1 = {00, 11, 10, 011, 001}: 00 is a prefix of 001, and hence difficult to decode
- CS_2 = {00, 11, 10, 011, 010}: no code is a prefix of another, and hence easy to decode
- Additional information regarding the beginning and end-point of a message source will not be available at the decoder
- A necessary condition for a code to be prefix-free is given by Kraft's inequality:
  KI = Σ_{i=1}^{N} 2^(-L_i) ≤ 1, where L_i is the codeword length of the i-th symbol
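Both example code sets can be checked against Kraft's inequality and the prefix-free property in a few lines (standard-library Python; function names are illustrative):

```python
def kraft(code_set):
    """Kraft sum: sum of 2**(-L_i) over the codeword lengths L_i."""
    return sum(2 ** -len(c) for c in code_set)

def is_prefix_free(code_set):
    """True if no codeword is a prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in code_set for b in code_set)

cs1 = {"00", "11", "10", "011", "001"}
cs2 = {"00", "11", "10", "011", "010"}
print(kraft(cs1), is_prefix_free(cs1))   # 1.0 False: 00 is a prefix of 001
print(kraft(cs2), is_prefix_free(cs2))   # 1.0 True
```

Note that both sets satisfy Kraft's inequality (KI = 1.0) yet only CS_2 is prefix-free, illustrating that the inequality is a necessary but not sufficient condition.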

Requirements (contd.)

- To obtain a minimum-redundancy code, the compression rate R = Σ_{i=1}^{K} p_i L_i must be minimized while Kraft's inequality is satisfied
- Illustration: X = [4, 5, 6, 6, 2, 5, 4, 4, 1, 4, 4]
- Symbol set V = [1, 2, 4, 5, 6] and probabilities p_i = [1/11, 1/11, 5/11, 2/11, 2/11], with K = 5
- Entropy: H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i) = 2.04
- Total symbols Nf = 11, total bits N = 33
- Uniform coding: 33/11 = 3 bits/symbol
- Shannon-Fano coding: 24/11 = 2.18 bits/symbol
- Huffman coding: 23/11 = 2.09 bits/symbol
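The Huffman figure of 23 bits can be reproduced with a small heap-based tree construction (a sketch using the standard library; `huffman_lengths` is an illustrative name):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_lengths(symbols):
    """Codeword length per symbol from a standard Huffman tree."""
    freq = Counter(symbols)
    tie = count()                                  # tie-breaker for equal weights
    heap = [(w, next(tie), {s: 0}) for s, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)             # two lightest subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**a, **b}.items()}  # one level deeper
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

X = [4, 5, 6, 6, 2, 5, 4, 4, 1, 4, 4]
lengths = huffman_lengths(X)
total = sum(lengths[s] for s in X)
print(total)   # → 23 bits for the sequence, i.e. 23/11 ≈ 2.09 bits/symbol
```

The most frequent symbol (4, appearing 5 times) gets a 1-bit codeword, and the resulting lengths satisfy Kraft's inequality with equality, as any full Huffman tree does.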

Expt. 2-AC-Log Quantizer

- Objective: implement and study the μ-law quantizer
- Take a music signal of 1 sec (44.1 kHz, 16 bits/sample)
- Resample the signal (44.1 kHz, 8 bits/sample)
- Pass it through a μ-law compressor with μ = 255
- Code the compressed data using a 4-bit uniform quantizer
- Uncompress using the inverse of the μ-law mapping
- Select a 50 ms segment and plot it in the time and frequency domains
- Compute the MSE between the original and reconstructed signals
- Listen to the original and reconstructed signals
- Comment on the MSE and the perceptual difference
- Repeat for μ values of 1, 50 and 100
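The compress/quantize/expand chain of this experiment can be sketched end to end; a synthetic tone stands in for the music file, and the 15-level rounding quantizer is one illustrative way to realize the 4-bit step (NumPy assumed):

```python
import numpy as np

fs, mu = 44_100, 255.0
t = np.arange(fs) / fs
x = 0.8 * np.sin(2 * np.pi * 440 * t)          # stand-in for the 1 s music clip

# mu-law compress, 4-bit uniform quantize (15-level midtread), inverse mu-law
g = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
q = np.round(g * 7) / 7
y = np.sign(q) * np.expm1(np.abs(q) * np.log1p(mu)) / mu

mse = np.mean((x - y) ** 2)                    # MSE between original and reconstruction
print(mse)
```

For the real experiment, replace the synthetic tone with the resampled music signal, repeat with μ = 1, 50 and 100, and compare the measured MSE against what you hear.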

Expt. 3-AC-Vector Quantizer

- Objective: implement and study a vector quantizer
- Take a music signal of 1 sec (44.1 kHz, 16 bits/sample)
- Resample the signal (8 kHz, 16 bits/sample)
- Write a vector quantization program using binary split and k-means clustering
- Consider the signal in non-overlapping blocks of 8 samples
- Each block of 8 samples is a vector of dimension 8
- Use these vectors to build a codebook of size 64
- Resynthesize the signal using these codebook entries
- Comment on the bit rate reduction and perceptual quality
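The binary-split (LBG-style) codebook training asked for above can be sketched as follows; a noisy synthetic sine stands in for the resampled music signal, and the additive ±eps split is one common way to perturb the centroids (NumPy assumed):

```python
import numpy as np

def kmeans(vecs, codebook, iters=10):
    """Lloyd iterations: assign each vector to its nearest codeword, then recenter."""
    for _ in range(iters):
        d = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        idx = d.argmin(axis=1)
        for k in range(len(codebook)):
            if np.any(idx == k):                    # keep old centroid if cluster empty
                codebook[k] = vecs[idx == k].mean(axis=0)
    return codebook

def train_codebook(vecs, size, eps=0.01):
    """Binary split: start from the global mean, double the codebook until target size."""
    codebook = vecs.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook + eps, codebook - eps])
        codebook = kmeans(vecs, codebook)
    return codebook

rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 0.03 * np.arange(4096)) + 0.05 * rng.standard_normal(4096)
vecs = signal[: 4096 // 8 * 8].reshape(-1, 8)       # non-overlapping 8-sample blocks
cb = train_codebook(vecs, 64)
print(cb.shape)                                     # (64, 8)
```

With 64 codewords, each 8-sample block costs log2(64) = 6 bits, i.e. 0.75 bits/sample instead of 16, which is the bit rate reduction the experiment asks you to assess against the perceptual quality.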

Expt. 4-AC-Entropy Coding

- Problem No. 3.12 (pp. 89) of Spanias' book on Audio Signal Processing
