Audio Coding: Quantization and Coding

S. R. M. Prasanna
Dept of ECE, IIT Guwahati
prasanna@iitg.ernet.in

Objectives of Quantization & Coding

- Analog sampled values to binary words
- Quantization: infinite to finite amplitude levels
- Coding: each finite amplitude level represented by a binary word
- Some distortion is inherent due to quantization
- Technical goal:
  - minimum possible distortion for a given bit rate, or
  - a given acceptable level of distortion at the least possible bit rate
- Bit Rate = Sampling Frequency × Bits/sample (e.g., 44.1 kHz × 16 bits/sample = 705.6 kbps per channel for CD audio)

Basis for Quantization & Coding

- Signals considered for compression are correlated in nature
- They contain perceptually redundant, and perceptually and statistically irrelevant, information
- In an image signal, pixel amplitude differences smaller than 1/256 of the intensity range are perceptually redundant
- When dominant and weak frequency components are present at a given instant of time, the dominant component masks perception of the weak component
- The most certain event does not carry much information for transmission

Basis (contd.)

- A compressed signal carries only perceptually non-redundant, and perceptually and statistically relevant, information
- Quantization eliminates perceptually redundant information
- Coding eliminates perceptually and statistically irrelevant information

Distortion Measurement

- Objective measure: Signal-to-Noise Ratio (SNR)
- Subjective measure: Mean Opinion Score (MOS)
- SNR measurement:
  - Let x(n), y(n) and e(n) be the input, the output and the reconstruction error for a given codec, respectively
  - e(n) = x(n) - y(n)
  - Let σ_x^2, σ_y^2 and σ_e^2 be the variances of x(n), y(n) and e(n), respectively
  - Assuming zero-mean signals of length M samples:
    σ_u^2 = (1/M) Σ_{n=1}^{M} u^2(n), where u = x, y or e

Distortion Measurement (contd.)

- SNR = signal power / noise power
- SNR = Σ_{n=1}^{M} x^2(n) / Σ_{n=1}^{M} e^2(n)
- SNR = signal variance / reconstruction error variance
- SNR (dB) = 10 log10(σ_x^2 / σ_e^2)
- σ_e^2 is commonly termed the Mean Square Error (MSE)
- The objective in coding is to minimize this MSE (MMSE criterion)
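The SNR definition above can be sketched in Python (NumPy assumed; the function name `snr_db` is illustrative, not from the slides):

```python
import numpy as np

def snr_db(x, y):
    """SNR in dB between input x(n) and reconstruction y(n)."""
    e = x - y                                   # e(n) = x(n) - y(n)
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(e ** 2))

# toy check: a sine reconstructed through a crude rounding quantizer
n = np.arange(1000)
x = np.sin(2 * np.pi * 50 * n / 1000)
y = np.round(x * 8) / 8                         # step size 1/8
print(round(snr_db(x, y), 1))
```

Since the signals are zero-mean, the power ratio equals the variance ratio, so both SNR forms above give the same value.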

Classification of Quantization Schemes

- Uniform v/s nonuniform
- Scalar v/s vector
- Parametric v/s nonparametric
- Memoryless v/s with memory

Memoryless v/s with memory:
- Memoryless quantization depends only on the current sample (PCM)
- Quantization with memory depends on past samples also (DPCM, DM, ADPCM)

Uniform v/s nonuniform:
- Step size is fixed in uniform nonadaptive quantization (Uniform PCM)
- Step size is variable in nonuniform quantization (Nonuniform PCM)

Classification (contd.)

Scalar v/s vector quantization:
- In scalar quantization, each sample value is quantized individually (PCM)
- In vector quantization, a group of sample values is quantized together (Vector PCM)

Parametric v/s nonparametric:
- In parametric quantization, the signal is processed to extract parameter or feature vectors, and these vectors are quantized
- In nonparametric quantization, groups of signal values themselves are quantized

Uniform Quantization

- Step size Δ is constant
- No. of quantization levels Q = 2^Rb, where Rb is the no. of bits
- Signal amplitude s has the range (-s_max, s_max)
- Step size Δ = 2 s_max / 2^Rb
- Quantization noise e_q is assumed to have a uniform PDF, i.e., -Δ/2 ≤ e_q ≤ Δ/2
- PDF: p(e_q) = 1/Δ, for |e_q| ≤ Δ/2
- Variance of quantization noise: σ_eq^2 = Δ^2/12 = s_max^2 · 2^(-2Rb) / 3
- An increase of 1 bit reduces the noise variance by a factor of four
- SNR (dB) = 6.02 Rb + k1, so an increase of 1 bit improves the SNR by about 6 dB
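The 6 dB/bit rule can be checked numerically with a minimal midrise uniform quantizer (a sketch, assuming NumPy; `uniform_quantize` is an illustrative name):

```python
import numpy as np

def uniform_quantize(s, rb, s_max=1.0):
    """Midrise uniform quantizer with Q = 2**rb levels over (-s_max, s_max)."""
    delta = 2 * s_max / (2 ** rb)                          # step size
    idx = np.clip(np.floor(s / delta), -(2 ** (rb - 1)), 2 ** (rb - 1) - 1)
    return (idx + 0.5) * delta                             # reconstruction levels

rng = np.random.default_rng(0)
s = rng.uniform(-1, 1, 100_000)                            # full-range test signal
for rb in (6, 7, 8):
    e = s - uniform_quantize(s, rb)
    snr = 10 * np.log10(np.mean(s ** 2) / np.mean(e ** 2))
    print(rb, round(snr, 1))
```

For a full-range uniformly distributed input, the printed SNR values grow by roughly 6.02 dB per added bit, matching the slide's formula.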

Nonuniform Quantization

- Uniform quantization does not exploit inherent signal properties
- Nonuniform quantization exploits the statistical structure of the signal
- Accordingly, it uses a nonuniform step size
- PDF-optimized nonuniform quantization uses fine step sizes for frequently occurring amplitudes and coarse step sizes for less frequently occurring amplitudes
- Log quantizers provide nonlinear compression of the input signal amplitudes
- The compression is approximately linear for low amplitudes and logarithmic for high amplitudes

μ-Law & A-Law Quantizers

- Log quantizers (μ-law & A-law) employ a nonlinear mapping function g(.) that maps the nonuniform step sizes to uniform ones, so that a simple linear quantizer can be used
- The decoder uses an expansion function g^(-1)(.) to recover the signal
- μ-law quantizer:
  |g(s)| = log(1 + μ|s/s_max|) / log(1 + μ)
- μ is the controlling parameter for the nonlinearity
- μ → 0 (no compression) leads to uniform quantization
- μ = 255 provides approximately linear mapping for small amplitudes and logarithmic mapping for larger amplitudes
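The μ-law compressor and its inverse expander can be written directly from the formula above (a sketch, assuming NumPy; `mu_compress`/`mu_expand` are illustrative names):

```python
import numpy as np

MU = 255.0

def mu_compress(s, s_max=1.0, mu=MU):
    """mu-law compressor: |g(s)| = log(1 + mu|s/s_max|) / log(1 + mu)."""
    return np.sign(s) * np.log1p(mu * np.abs(s) / s_max) / np.log1p(mu) * s_max

def mu_expand(g, s_max=1.0, mu=MU):
    """Decoder-side expansion g^{-1}(.): inverts the compressor exactly."""
    return np.sign(g) * (np.expm1(np.abs(g) / s_max * np.log1p(mu)) / mu) * s_max

s = np.linspace(-1, 1, 11)
g = mu_compress(s)
print(np.allclose(mu_expand(g), s))   # round trip recovers the signal
```

Note how small amplitudes are pushed toward full scale by the compressor, which is exactly what lets a subsequent uniform quantizer spend its levels where the speech/audio amplitudes actually lie.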

μ-Law & A-Law Quantizers (contd.)

- A-law quantizer:
  |g(s)| = A|s/s_max| / (1 + log(A)),              for 0 ≤ |s/s_max| < 1/A
  |g(s)| = (1 + log(A|s/s_max|)) / (1 + log(A)),   for 1/A ≤ |s/s_max| ≤ 1
- Log quantizers reduce bit rates without degradation by as much as 4 bits/sample relative to uniform PCM
- e.g., 64 kbps to 32 kbps without perceptual degradation

Vector Quantization

- Quantization of a block of data (a vector) at a time
- Each block of input data is allotted a unique binary code by comparing it with codebook entries
- The same codebook is present at the receiver
- The data is synthesized using the received binary words as indices into the codebook
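The encode/decode steps just described amount to a nearest-codeword search plus a table lookup (a sketch, assuming NumPy and squared-error distance; names are illustrative):

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Index of the nearest codebook entry (squared error) for each block."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Receiver side: look the transmitted indices up in the same codebook."""
    return codebook[indices]

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])  # toy 2-D codebook
blocks = np.array([[0.9, 1.1], [-0.2, 0.1]])
idx = vq_encode(blocks, codebook)
print(idx)                          # indices of nearest codewords
print(vq_decode(idx, codebook))     # reconstructed blocks
```

Only the indices are transmitted, so the bit cost per block is log2(codebook size) rather than bits-per-sample times the block length.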

Bit Allocation

- Number of bits required to quantize an audio frame with reduced audible distortion
- Based on perceptual or spectral characteristics
- In audio coding, the parameters to be quantized are usually transform-domain coefficients (e.g., DCT), because they provide better energy compaction
- Let x = [x_1, x_2, x_3, ..., x_Nf], where Nf is the number of transform coefficients per frame
- Let the total number of bits available be N
- Bit allocation = the optimum way of distributing N bits among the Nf coefficients

Uniform Bit Allocation

- Distortion measure:
  D = (1/Nf) Σ_{i=1}^{Nf} E[(x_i - x̂_i)^2] = (1/Nf) Σ_{i=1}^{Nf} d_i
  where x_i and x̂_i denote the i-th unquantized and quantized transform coefficients, respectively
- Let n_i be the number of bits assigned to coefficient x_i for quantization, such that Σ_{i=1}^{Nf} n_i ≤ N
- If the x_i are uniformly distributed for all i, then bits are allocated uniformly across all transform coefficients: n_i = N/Nf, for 1 ≤ i ≤ Nf
- In practice, transform coefficients may not have uniform distributions

Optimum Bit Allocation

- Employing an equal number of bits for both large and small amplitudes may result in spending extra bits on the smaller amplitudes
- Also, for a given N, the distortion D can be very high
- Seek a cost function that minimizes the distortion d_i while keeping Σ_{i=1}^{Nf} n_i ≤ N
- min_{n_i} {D} = min_{n_i} { (1/Nf) Σ_{i=1}^{Nf} E[(x_i - x̂_i)^2] } = min_{n_i} { (1/Nf) Σ_{i=1}^{Nf} ε_i^2 }, where ε_i^2 is the quantization error variance of the i-th coefficient
- If the quantization noise has a uniform PDF, then ε_i^2 = σ_xi^2 / (3 · 2^(2 n_i))
- Substituting for ε_i^2 and minimizing with respect to n_i, we get n_i = (1/2) log2(σ_xi^2) + K

Optimum Bit Allocation (contd.)

- Substituting for n_i in Σ_{i=1}^{Nf} n_i = N and simplifying for K, we get
  K = N/Nf - (1/(2 Nf)) log2( Π_{i=1}^{Nf} σ_xi^2 )
- Substituting for K in n_i = (1/2) log2(σ_xi^2) + K, we get
  n_i^opt = N/Nf + (1/2) log2( σ_xi^2 / (Π_{i=1}^{Nf} σ_xi^2)^(1/Nf) )
- Example: N = 64 bits, Nf = 16 coefficients
- Uniform allocation assigns 4 bits to each of the 16 coefficients; optimum bit allocation may give [5, 4, 5, 3, 4, 5, ...]
- Uniformly distributed coefficients: distortion D_u = 0.00024 and D_o = 0.00025
- Gaussian distributed coefficients: distortion D_u = 0.00042 and D_o = 0.00023
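The closed-form allocation n_i^opt = N/Nf + (1/2) log2(σ_xi^2 / GM) is a one-liner in code (a sketch, assuming NumPy; `optimal_bits` is an illustrative name, and the variances are made-up toy values):

```python
import numpy as np

def optimal_bits(var, n_total):
    """Real-valued optimal allocation: n_i = N/Nf + 0.5*log2(var_i / GM(var))."""
    nf = len(var)
    gm = np.exp(np.mean(np.log(var)))        # geometric mean of the variances
    return n_total / nf + 0.5 * np.log2(var / gm)

var = np.array([8.0, 4.0, 2.0, 1.0])         # toy coefficient variances
n = optimal_bits(var, 16)
print(n, n.sum())                            # allocations sum to N = 16
```

High-variance coefficients receive more bits than the N/Nf average and low-variance ones fewer; in a real coder these real-valued n_i must still be rounded to integers while keeping the total at N.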

Uniform v/s Optimum

- If the geometric mean of the σ_xi^2 is less than their arithmetic mean, then optimal bit allocation performs better
- Geometric mean: GM = ( Π_{i=1}^{Nf} σ_xi^2 )^(1/Nf)
- Arithmetic mean: AM = (1/Nf) Σ_{i=1}^{Nf} σ_xi^2
- Spectral Flatness Measure (SFM) = GM/AM, and 0 ≤ SFM ≤ 1
- If the SFM is small, then optimum bit allocation is preferred
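The SFM criterion is easy to verify on toy variance vectors (a sketch, assuming NumPy; `sfm` is an illustrative name):

```python
import numpy as np

def sfm(var):
    """Spectral Flatness Measure: geometric mean / arithmetic mean."""
    gm = np.exp(np.mean(np.log(var)))
    am = np.mean(var)
    return gm / am

flat = np.full(8, 2.0)                        # equal variances -> SFM = 1
peaky = np.array([100.0, 1, 1, 1, 1, 1, 1, 1])  # one dominant coefficient
print(sfm(flat), sfm(peaky))
```

A flat variance profile gives SFM = 1 (uniform allocation is already optimal), while a peaky profile drives the SFM toward 0, which is where optimum allocation pays off.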

Entropy Coding

- The minimum no. of bits required to represent a given audio frame
- For a given message X, according to Shannon, it is given by the entropy H_e(X)
- Entropy is a measure of the uncertainty of a random variable
- Let X = [x_1, x_2, ..., x_N] be the input vector of length N
- Let p_i be the probability that the i-th symbol of the symbol set V = [v_1, v_2, ..., v_K] is transmitted
- Entropy: H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i)

Illustration of Entropy Computation

- Let X = [4, 5, 6, 6, 2, 5, 4, 4, 5, 4, 4], so N = 11
- Symbol set V = [2, 4, 5, 6], with p_i = [1/11, 5/11, 3/11, 2/11] and K = 4
- H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i)
  = -{ (1/11) log2(1/11) + (5/11) log2(5/11) + (3/11) log2(3/11) + (2/11) log2(2/11) }
  = 1.7899
- On average, a minimum of 1.7899 bits is needed to transmit each symbol
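The worked example above can be reproduced with a few lines of standard-library Python (`entropy` is an illustrative name):

```python
from collections import Counter
from math import log2

def entropy(symbols):
    """H_e(X) = -sum p_i log2(p_i), with p_i taken as empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

X = [4, 5, 6, 6, 2, 5, 4, 4, 5, 4, 4]
print(round(entropy(X), 4))   # → 1.7899, matching the slide's hand computation
```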

Objective in Entropy Coding

- To construct an ensemble code for each message, such that the code is uniquely decodable, prefix-free, and optimum in the sense that it provides minimum-redundancy encoding
- A unique bit sequence for a given audio frame

Requirements for Entropy Codes

- Each message should be assigned a unique code
- The codes must be prefix-free, i.e., no code can be found entirely as the prefix of another code
- Ex: CS_1 = {00, 11, 10, 011, 001}: 00 is a prefix of 001, and hence difficult to decode
- CS_2 = {00, 11, 10, 011, 010}: no code is a prefix of another, and hence easy to decode
- Additional information regarding the beginning and end-point of a message source will not be available at the decoder
- A necessary condition for a code to be prefix-free is given by Kraft's inequality:
  KI = Σ_{i=1}^{N} 2^(-L_i) ≤ 1, where L_i is the codeword length of the i-th symbol
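Both example code sets can be checked against Kraft's inequality and the prefix-free property in a few lines (standard-library Python; function names are illustrative):

```python
def kraft(code_set):
    """Kraft sum: sum of 2**(-L_i) over the codeword lengths L_i."""
    return sum(2 ** -len(c) for c in code_set)

def is_prefix_free(code_set):
    """True if no codeword is a prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in code_set for b in code_set)

cs1 = {"00", "11", "10", "011", "001"}
cs2 = {"00", "11", "10", "011", "010"}
print(kraft(cs1), is_prefix_free(cs1))   # 1.0 False: 00 is a prefix of 001
print(kraft(cs2), is_prefix_free(cs2))   # 1.0 True
```

Note that both sets satisfy Kraft's inequality (KI = 1.0) yet only CS_2 is prefix-free, illustrating that the inequality is a necessary but not sufficient condition.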

Requirements (contd.)

- To obtain a minimum-redundancy code, the compression rate R = Σ_{i=1}^{K} p_i L_i must be minimized while Kraft's inequality is satisfied
- Illustration: X = [4, 5, 6, 6, 2, 5, 4, 4, 1, 4, 4]
- Symbol set V = [1, 2, 4, 5, 6] and probabilities p_i = [1/11, 1/11, 5/11, 2/11, 2/11], with K = 5
- Entropy: H_e(X) = -Σ_{i=1}^{K} p_i log2(p_i) = 2.04
- Total symbols Nf = 11, total bits N = 33
- Uniform coding: 33/11 = 3 bits/symbol
- Shannon-Fano coding: 24/11 = 2.18 bits/symbol
- Huffman coding: 23/11 = 2.09 bits/symbol
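The Huffman figure of 23 bits can be reproduced with a small heap-based tree construction (a sketch using the standard library; `huffman_lengths` is an illustrative name):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_lengths(symbols):
    """Codeword length per symbol from a standard Huffman tree."""
    freq = Counter(symbols)
    tie = count()                                  # tie-breaker for equal weights
    heap = [(w, next(tie), {s: 0}) for s, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)             # two lightest subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**a, **b}.items()}  # one level deeper
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

X = [4, 5, 6, 6, 2, 5, 4, 4, 1, 4, 4]
lengths = huffman_lengths(X)
total = sum(lengths[s] for s in X)
print(total)   # → 23 bits for the sequence, i.e. 23/11 ≈ 2.09 bits/symbol
```

The most frequent symbol (4, appearing 5 times) gets a 1-bit codeword, and the resulting lengths satisfy Kraft's inequality with equality, as any full Huffman tree does.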

Expt. 2-AC-Log Quantizer

- Objective: implement and study the μ-law quantizer
- Take a music signal of 1 sec (44.1 kHz, 16 bits/sample)
- Resample the signal (44.1 kHz, 8 bits/sample)
- Pass it through a μ-law compressor with μ = 255
- Code the compressed data using a 4-bit uniform quantizer
- Uncompress using the inverse of the μ-law mapping
- Select a 50 ms segment and plot it in the time and frequency domains
- Compute the MSE between the original and reconstructed signals
- Listen to the original and reconstructed signals
- Comment on the MSE and the perceptual difference
- Repeat for μ values of 1, 50 and 100
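The compress/quantize/expand chain of this experiment can be sketched end to end; a synthetic tone stands in for the music file, and the 15-level rounding quantizer is one illustrative way to realize the 4-bit step (NumPy assumed):

```python
import numpy as np

fs, mu = 44_100, 255.0
t = np.arange(fs) / fs
x = 0.8 * np.sin(2 * np.pi * 440 * t)          # stand-in for the 1 s music clip

# mu-law compress, 4-bit uniform quantize (15-level midtread), inverse mu-law
g = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
q = np.round(g * 7) / 7
y = np.sign(q) * np.expm1(np.abs(q) * np.log1p(mu)) / mu

mse = np.mean((x - y) ** 2)                    # MSE between original and reconstruction
print(mse)
```

For the real experiment, replace the synthetic tone with the resampled music signal, repeat with μ = 1, 50 and 100, and compare the measured MSE against what you hear.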

Expt. 3-AC-Vector Quantizer

- Objective: implement and study a vector quantizer
- Take a music signal of 1 sec (44.1 kHz, 16 bits/sample)
- Resample the signal (8 kHz, 16 bits/sample)
- Write a vector quantization program using binary split and k-means clustering
- Consider the signal in non-overlapping blocks of 8 samples
- Each block of 8 samples is a vector of dimension 8
- Use these vectors to build a codebook of size 64
- Resynthesize the signal using these codebook entries
- Comment on the bit rate reduction and perceptual quality
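The binary-split (LBG-style) codebook training asked for above can be sketched as follows; a noisy synthetic sine stands in for the resampled music signal, and the additive ±eps split is one common way to perturb the centroids (NumPy assumed):

```python
import numpy as np

def kmeans(vecs, codebook, iters=10):
    """Lloyd iterations: assign each vector to its nearest codeword, then recenter."""
    for _ in range(iters):
        d = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        idx = d.argmin(axis=1)
        for k in range(len(codebook)):
            if np.any(idx == k):                    # keep old centroid if cluster empty
                codebook[k] = vecs[idx == k].mean(axis=0)
    return codebook

def train_codebook(vecs, size, eps=0.01):
    """Binary split: start from the global mean, double the codebook until target size."""
    codebook = vecs.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook + eps, codebook - eps])
        codebook = kmeans(vecs, codebook)
    return codebook

rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 0.03 * np.arange(4096)) + 0.05 * rng.standard_normal(4096)
vecs = signal[: 4096 // 8 * 8].reshape(-1, 8)       # non-overlapping 8-sample blocks
cb = train_codebook(vecs, 64)
print(cb.shape)                                     # (64, 8)
```

With 64 codewords, each 8-sample block costs log2(64) = 6 bits, i.e. 0.75 bits/sample instead of 16, which is the bit rate reduction the experiment asks you to assess against the perceptual quality.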

Expt. 4-AC-Entropy Coding

- Problem No. 3.12 (pp. 89) of Spanias' book on Audio Signal Processing
