
Chapter-3

ESTIMATION AND SUPPRESSION OF CFO


Summary: This chapter analyzes the synchronization offsets that occur during OFDM
transmission and develops their mathematical description. During transmission, carrier
frequency offsets (CFO) produce ISI, ICI, and attenuation. Here we use a blind estimation
technique to reduce the effect of CFO. Prior to channel estimation, the CFO effect is
compensated by the ICA algorithm, and channel estimation is then performed on the received
symbols.
3.1 OFDM Synchronization Offsets
Multi-carrier modems, like any other digital communication modems, require a reliable
synchronization scheme. Parallel transmission of N symbols results in a longer symbol duration;
consequently, there is less sensitivity to timing offset. In other words, unlike in single-carrier
systems, where timing jitter can create inter-symbol interference, timing jitter does not violate
the orthogonality of the transmitted waveforms in a multi-carrier system. However, frequency offset
is detrimental to OFDM systems and plays an important role in system design. Phase noise is another
critical impairment in wireless OFDM.
In this section we analyze the synchronization offsets [21, 22]. This analysis will allow us to
individually identify the degradation effects produced by each of the synchronization offsets.
To start, we need a reference framework for our nomenclature. We have chosen the following
parameters, which can easily be mapped to any other standard:
T                           Elementary period
N                           FFT/IFFT size, or number of OFDM subcarriers
K                           Number of transmitted subcarriers
$T_U = N T$                 Useful OFDM symbol period
G                           Number of OFDM subcarriers in the guard band
$\Delta = G T$              Guard band period
$T_S = T_U + \Delta$        OFDM symbol period
$T_{NULL}$                  Null symbol period
L                           Number of OFDM symbols in a frame
$T_F = L T_S + T_{NULL}$    Frame duration
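The relations between these timing parameters can be checked with a few lines of Python. This is only an illustrative sketch: the class name `OfdmTiming` and the numerical values are assumptions chosen for the example and are not taken from any standard or from this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfdmTiming:
    T: float        # elementary period, seconds
    N: int          # FFT/IFFT size
    G: int          # guard length, in elementary periods
    L: int          # number of OFDM symbols in a frame
    T_null: float   # null symbol period, seconds

    @property
    def T_U(self) -> float:      # useful symbol period, N * T
        return self.N * self.T

    @property
    def guard(self) -> float:    # guard band period, G * T
        return self.G * self.T

    @property
    def T_S(self) -> float:      # OFDM symbol period, T_U + guard
        return self.T_U + self.guard

    @property
    def T_F(self) -> float:      # frame duration, L * T_S + T_null
        return self.L * self.T_S + self.T_null


# Hypothetical values, used only to exercise the formulas.
timing = OfdmTiming(T=1e-6, N=64, G=16, L=10, T_null=100e-6)
print(timing.T_U, timing.guard, timing.T_S, timing.T_F)
```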
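With the above parameters, we define one OFDM symbol transmitted at $t = t_s$ as follows:
$$ s(t) = \mathrm{Re}\left\{ e^{j 2\pi f_c (t - t_s)} \sum_{k=-K/2}^{K/2} C_{m,l,k}\, e^{j 2\pi k (t - t_s)/T_U} \right\}, \quad t_s \le t \le t_s + T_S \qquad (3.1) $$
where $f_c$ is the RF carrier frequency, $C_{m,l,k}$ is the complex data symbol, m is the frame
number, l is the OFDM symbol number, and k is the subcarrier number.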
3.1.1. Carrier Frequency Offset Analysis
OFDM does have its drawbacks relative to time domain modulation, most significantly its
extreme sensitivity to time varying multiplicative effects such as fast fading, Doppler shifts, and
oscillator jitter. The latter two effects lead to a mismatch between the carrier frequencies of the
received signal and the local oscillator, so that a frequency offset is created.
OFDM provides an efficient way to combat multipath fading by dividing one high bit rate data
stream into multiple low bit rate streams for simultaneous transmission on multiple subcarriers.
However, as a multicarrier transmission technique, OFDM is more susceptible to the carrier
frequency offset (CFO) than single carrier systems. A carrier offset at the receiver can destroy
the mutual orthogonality between subcarriers and thus introduce inter-channel interference and
cause severe degradation in system performance. Consequently accurate estimation and
compensation of CFO is necessary at the receiver before OFDM demodulation.
The OFDM subcarriers can lose their orthogonality if the transmitter and receiver do not use
exactly the same carrier frequencies. We define frequency offset as the difference between
transmitter and receiver carrier frequencies. This offset produces inter-carrier interference (ICI).
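To analyze this effect, we start by looking at the complex envelope of a received baseband
OFDM symbol with frequency offset, again at $t = t_s$, with $h(t)$ the channel impulse response
and $\otimes$ denoting convolution:
$$ r(t) = \left[ s(t) \otimes h(t) \right] e^{j\psi(t)} \qquad (3.2) $$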
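where $\psi(t) = 2\pi \Delta F\, t / T_U$ describes the deterministic effect of a frequency offset
$\Delta F = (f_{ct} - f_{cr})\, T_U$, expressed as a multiple of the subcarrier bandwidth $1/T_U$;
$f_{ct}$ and $f_{cr}$ are the transmitter and receiver carrier frequencies. We can separate
$\Delta F = \Delta F_I + \Delta F_F$ into an integer part $\Delta F_I$ and a fractional part
$\Delta F_F$. Considering for now a perfect channel, the received sampled signal is
$$ r(n) = s(n)\, e^{j 2\pi \Delta F\, n / N} \qquad (3.3) $$
where $s(n)$ is the discrete version of the transmitted symbol at $t = t_s$, represented by
$$ s(n) = \sum_{i=0}^{N-1} C_{m,l,i}\, e^{j 2\pi n i / N} \qquad (3.4) $$
Using this received signal as the input of the fast Fourier transform (FFT) operator, we obtain
$$ R_{m,l,k} = \frac{1}{N} \sum_{n=0}^{N-1} r(n)\, e^{-j 2\pi n k / N}
 = \sum_{i=0}^{N-1} C_{m,l,i}\, \frac{\sin\big(\pi (i + \Delta F - k)\big)}{N \sin\big(\pi (i + \Delta F - k)/N\big)}\, e^{j \pi (N-1)(i + \Delta F - k)/N} \qquad (3.5) $$
We can express the received complex symbol as follows:
$$ R_{m,l,k} = C_{m,l,k}\, \frac{\sin(\pi \Delta F)}{N \sin(\pi \Delta F / N)}\, e^{j \pi (N-1) \Delta F / N}
 + \sum_{i=0,\, i \ne k}^{N-1} C_{m,l,i}\, \frac{\sin\big(\pi (i + \Delta F - k)\big)}{N \sin\big(\pi (i + \Delta F - k)/N\big)}\, e^{j \pi (N-1)(i + \Delta F - k)/N} \qquad (3.6) $$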
The first term corresponds to an amplitude distortion and phase rotation. The second term
expresses the inter-carrier interference (ICI) produced.
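The split into a useful term and an ICI term can be checked numerically. The following NumPy sketch assumes a perfect channel, a normalized offset written `dF` (in units of the subcarrier spacing), and QPSK data; the function names and parameter values are illustrative assumptions. It compares the FFT of the rotated samples with the closed-form Dirichlet-kernel sum.

```python
import numpy as np


def dirichlet(x, N):
    """sin(pi x) / (N sin(pi x / N)) for |x| < N, using the limit value 1 at x = 0."""
    x = np.asarray(x, dtype=float)
    den = N * np.sin(np.pi * x / N)
    zero = np.abs(den) < 1e-12
    return np.where(zero, 1.0, np.sin(np.pi * x) / np.where(zero, 1.0, den))


def fft_output_with_cfo(C, dF):
    """FFT output of one OFDM symbol whose samples are rotated by a CFO of dF."""
    N = len(C)
    n = np.arange(N)
    s = N * np.fft.ifft(C)                    # s(n) = sum_i C_i exp(j 2 pi n i / N)
    r = s * np.exp(2j * np.pi * dF * n / N)   # progressive phase rotation caused by the CFO
    return np.fft.fft(r) / N                  # R_k = (1/N) sum_n r(n) exp(-j 2 pi n k / N)


def closed_form_with_cfo(C, dF):
    """Closed-form sum over all subcarriers i for every output index k."""
    N = len(C)
    k = np.arange(N)[:, None]
    i = np.arange(N)[None, :]
    x = i + dF - k
    kernel = dirichlet(x, N) * np.exp(1j * np.pi * (N - 1) * x / N)
    return kernel @ C


rng = np.random.default_rng(0)
N, dF = 64, 0.1                               # assumed example values
C = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)

R_num = fft_output_with_cfo(C, dF)
R_formula = closed_form_with_cfo(C, dF)

# Useful term: attenuation and common phase rotation of C_k; the rest is ICI.
useful_gain = float(dirichlet(dF, N))
useful = useful_gain * np.exp(1j * np.pi * (N - 1) * dF / N) * C
ici_power = float(np.mean(np.abs(R_num - useful) ** 2))
print(useful_gain, ici_power)
```

With a fractional offset the useful gain stays below one and the residual power is non-zero, which is the ICI described above.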
3.1.2. Sampling Frequency Offset Analysis
We define sampling frequency offset as the difference in sampling instants between the
transmitter and receiver clocks. We use (3.2) with $\psi(t) = 0$ (since we can ignore the frequency
offset for now) for the analysis. We define a sampling clock offset $\beta = (T' - T)/T$, where $T'$
is the receiver system time and $T$ is the transmitter system time. We usually express $\beta$ in
parts per million (ppm). The sampled time for the received signal is $nT'$, where
$T' = (1 + \beta) T$. The discrete received baseband signal becomes
$$ r(n) = \sum_{i=0}^{N-1} C_{m,l,i}\, e^{j 2\pi n i (1 + \beta)/N} \qquad (3.7) $$
The effect of $\beta$ is a function of the subcarrier index k, i.e., $\beta$ creates ICI. The signal
at the output of the FFT is
$$ \begin{aligned}
R_{m,l,k} &= \sum_{i=0}^{N-1} C_{m,l,i}\, e^{j \pi (i(1+\beta) - k)(N-1)/N}\, \frac{\sin\big(\pi (i(1+\beta) - k)\big)}{N \sin\big(\pi (i(1+\beta) - k)/N\big)} \\
&= C_{m,l,k}\, e^{j \pi \beta k (N-1)/N}\, \frac{\sin(\pi \beta k)}{N \sin(\pi \beta k / N)}
 + \sum_{i=0,\, i \ne k}^{N-1} C_{m,l,i}\, e^{j \pi (i(1+\beta) - k)(N-1)/N}\, \frac{\sin\big(\pi (i(1+\beta) - k)\big)}{N \sin\big(\pi (i(1+\beta) - k)/N\big)}
\end{aligned} \qquad (3.8) $$
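The subcarrier dependence of the sampling offset can be illustrated in the same way. The sketch below assumes a relative sampling clock offset written `beta`, received samples of the form sum_i C_i exp(j 2 pi n i (1 + beta) / N), and illustrative parameter values; it compares a direct FFT with the closed-form kernel and prints the useful-term phase on two subcarriers.

```python
import numpy as np


def dirichlet(x, N):
    """sin(pi x) / (N sin(pi x / N)) for |x| < N, using the limit value 1 at x = 0."""
    x = np.asarray(x, dtype=float)
    den = N * np.sin(np.pi * x / N)
    zero = np.abs(den) < 1e-12
    return np.where(zero, 1.0, np.sin(np.pi * x) / np.where(zero, 1.0, den))


def fft_output_with_sfo(C, beta):
    """FFT of samples taken at n (1 + beta) T instead of n T."""
    N = len(C)
    n = np.arange(N)[:, None]
    i = np.arange(N)[None, :]
    r = np.exp(2j * np.pi * n * i * (1 + beta) / N) @ C
    return np.fft.fft(r) / N


def closed_form_with_sfo(C, beta):
    """Closed-form kernel evaluated at x = i (1 + beta) - k."""
    N = len(C)
    k = np.arange(N)[:, None]
    i = np.arange(N)[None, :]
    x = i * (1 + beta) - k
    return (dirichlet(x, N) * np.exp(1j * np.pi * (N - 1) * x / N)) @ C


rng = np.random.default_rng(0)
N, beta = 64, 100e-6                          # assumed values: 64 subcarriers, 100 ppm
C = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)

R_num = fft_output_with_sfo(C, beta)
R_formula = closed_form_with_sfo(C, beta)

# Phase rotation of the useful term grows with the subcarrier index k.
phase_low = np.pi * beta * 1 * (N - 1) / N
phase_high = np.pi * beta * (N - 1) * (N - 1) / N
print(phase_low, phase_high)
```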

We need to consider the phase shift that the sampling offset creates through time. For expression
(3.8), we applied the FFT at time n = 0. We need to include the time delay that the difference in
sampling frequencies produces. The time difference between the sampling periods is $\beta T$;
therefore, a time delay accumulates from one OFDM symbol to the next, and we write the cumulative
time delay at the beginning of an OFDM symbol as $\delta T$. $\beta$ thus produces a double effect.
One effect is similar to the frequency offset effect, and the other one is a result of the
cumulative time shift expressed as an integer multiple, $\delta$, of the sampling period. This time
shift produces an error in the FFT window position. We now include the cumulative time delay
as follows:
$$ \begin{aligned}
R_{m,l,k} &= \frac{1}{N} \sum_{n=0}^{N-1} \sum_{i=0}^{N-1} C_{m,l,i}\, e^{j \frac{2\pi}{N} i \left( n(1+\beta) + \delta \right)}\, e^{-j \frac{2\pi}{N} n k} \\
&= \frac{1}{N} \sum_{i=0}^{N-1} \sum_{n=0}^{N-1} C_{m,l,i}\, e^{j \frac{2\pi}{N} n \left( i(1+\beta) - k \right)}\, e^{j \frac{2\pi}{N} i \delta} \\
&= C_{m,l,k}\, e^{j \pi k \left( 2\delta + \beta (N-1) \right)/N}\, \frac{\sin(\pi \beta k)}{N \sin(\pi \beta k / N)} \\
&\quad + \sum_{i=0,\, i \ne k}^{N-1} C_{m,l,i}\, e^{j 2\pi i \delta / N}\, e^{j \pi (i(1+\beta) - k)(N-1)/N}\, \frac{\sin\big(\pi (i(1+\beta) - k)\big)}{N \sin\big(\pi (i(1+\beta) - k)/N\big)} \quad \text{(ICI)}
\end{aligned} \qquad (3.9) $$
Once again, the first term corresponds to an amplitude distortion and phase rotation. The second
term expresses the inter-carrier interference (ICI) produced.
3.1.3 Symbol Timing Offsets Analysis
Symbol timing for an OFDM signal is significantly different from that of a single-carrier signal,
since there is no eye opening where a best sampling instant can be found. Rather, there are
hundreds or thousands of samples per OFDM symbol, since the number of samples necessary is
proportional to the number of subcarriers. Finding the symbol timing for OFDM means finding
an estimate of where the symbol starts. OFDM also requires timing synchronization to preserve
orthogonality. The receiver must obtain the timing information from the received OFDM signal,
but the information obtained includes offsets, estimation errors, and jitter. Because of the cyclic
prefix or guard band, OFDM is relatively more robust to these timing effects than to frequency
effects, but this creates a trade-off between timing error sensitivity and multipath tolerance.
We define timing offset as the propagation time from transmission to reception. We
denote the time offset by a propagation delay $D = \tau + \varepsilon$, with $\tau$ an integer and
$\varepsilon$ a fraction, and we reference both to the transmitter system time $T = T_U / N$. If an
OFDM symbol consists of N samples and L is the length of the cyclic prefix, then the maximum value
of the timing offset can be written as $N + L - 1$. We can show the overall effect of only the time
offset D in the received baseband signal as follows:
$$ r(t) = s(t) \otimes \delta(t - DT) = \sum_{k=-K/2}^{K/2} C_{m,l,k}\, e^{j 2\pi \frac{k}{T_U} (t - DT)} \qquad (3.10) $$

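If we consider just the fractional delay $\varepsilon$, we have the output of the FFT as follows:
$$ \begin{aligned}
R_{m,l,k} &= \frac{1}{N} \sum_{i=0}^{N-1} \sum_{n=0}^{N-1} C_{m,l,i}\, e^{j \frac{2\pi}{N} n (i - k)}\, e^{-j \frac{2\pi}{N} i \varepsilon} \\
&= \sum_{i=0}^{N-1} C_{m,l,i}\, e^{-j \frac{2\pi}{N} i \varepsilon}\, e^{j \pi (i - k)(N-1)/N}\, \frac{\sin\big(\pi (i - k)\big)}{N \sin\big(\pi (i - k)/N\big)} \\
&= e^{-j \frac{2\pi}{N} k \varepsilon}\, C_{m,l,k}
\end{aligned} \qquad (3.11) $$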
An important issue is that the phase shift produced is different for each subcarrier, since (3.11)
is a function of k. The phase difference between the first and last subcarrier is
$2\pi \varepsilon (N-1)/N$. For the fixed time offset derivation, we assumed a perfect knowledge of
$\tau$. We now describe its effect. To start, we assume a non-dispersive Gaussian channel and no
guard band. The received sampled signal during the time of a complete OFDM symbol is
$$ r_{m,l} = \{ r_{m,l,\tau},\, r_{m,l,\tau+1},\, \ldots,\, r_{m,l,N-1},\, r_{m,l+1,0},\, r_{m,l+1,1},\, \ldots,\, r_{m,l+1,\tau-1} \} $$
Performing the FFT on this vector yields
$$ R_{m,l,k} = \frac{1}{N} \sum_{n=\tau}^{N-1} \left( \sum_{i=0}^{N-1} C_{m,l,i}\, e^{j \frac{2\pi}{N} n i} \right) e^{-j \frac{2\pi}{N} (n - \tau) k}
 + \frac{1}{N} \sum_{n=0}^{\tau-1} \left( \sum_{i=0}^{N-1} C_{m,l+1,i}\, e^{j \frac{2\pi}{N} n i} \right) e^{-j \frac{2\pi}{N} (n + N - \tau) k} \qquad (3.12) $$
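$$ \begin{aligned}
R_{m,l,k} &= \frac{N - \tau}{N}\, e^{j \frac{2\pi}{N} k \tau}\, C_{m,l,k} \\
&\quad + e^{j \frac{2\pi}{N} k \tau} \sum_{i=0,\, i \ne k}^{N-1} C_{m,l,i}\, e^{j \pi (i - k)(N - 1 + \tau)/N}\, \frac{\sin\big(\pi (i - k)(N - \tau)/N\big)}{N \sin\big(\pi (i - k)/N\big)} \quad \text{(ICI)} \\
&\quad + e^{j \frac{2\pi}{N} k \tau} \sum_{i=0}^{N-1} C_{m,l+1,i}\, e^{j \pi (i - k)(\tau - 1)/N}\, \frac{\sin\big(\pi (i - k)\tau/N\big)}{N \sin\big(\pi (i - k)/N\big)} \quad \text{(ISI)}
\end{aligned} \qquad (3.13) $$
where the $i = k$ term of the ISI sum takes its limiting value $\tau / N$.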
The signal now consists of disturbances caused by ICI and inter-symbol interference (ISI), and a
useful portion attenuated and rotated by a phasor whose phase is a function of the subcarrier
index k and the fixed offset $\tau$.
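The three contributions (useful, ICI, ISI) can be reproduced numerically. The sketch below assumes an integer window offset written `tau`, no guard interval, a noise-free channel, and two consecutive QPSK symbols; all names and values are illustrative assumptions. The FFT window starts `tau` samples late, so it ends with the first `tau` samples of the next symbol.

```python
import numpy as np


def fft_with_timing_offset(C0, C1, tau):
    """FFT of a window that starts tau samples late and spills into the next symbol."""
    N = len(C0)
    s0 = N * np.fft.ifft(C0)       # current symbol, s(n) = sum_i C_i exp(j 2 pi n i / N)
    s1 = N * np.fft.ifft(C1)       # next symbol, sent directly afterwards (no guard)
    window = np.concatenate([s0[tau:], s1[:tau]])
    return np.fft.fft(window) / N


def closed_form_timing_offset(C0, C1, tau):
    """Useful + ICI + ISI terms for an integer offset tau."""
    N = len(C0)
    k = np.arange(N)[:, None]
    i = np.arange(N)[None, :]
    x = i - k
    with np.errstate(divide="ignore", invalid="ignore"):
        den = N * np.sin(np.pi * x / N)
        ici = np.exp(1j * np.pi * x * (N - 1 + tau) / N) * np.sin(np.pi * x * (N - tau) / N) / den
        isi = np.exp(1j * np.pi * x * (tau - 1) / N) * np.sin(np.pi * x * tau / N) / den
    on_diag = np.eye(N, dtype=bool)
    ici[on_diag] = 0.0             # the i = k term is the useful part, handled separately
    isi[on_diag] = tau / N         # limiting value of the ISI kernel at i = k
    rot = np.exp(2j * np.pi * np.arange(N) * tau / N)
    useful = (N - tau) / N * rot * C0
    return useful + rot * (ici @ C0) + rot * (isi @ C1), useful


rng = np.random.default_rng(0)
N, tau = 64, 5                     # assumed example values


def qpsk(size):
    return (rng.choice([-1, 1], size) + 1j * rng.choice([-1, 1], size)) / np.sqrt(2)


C0, C1 = qpsk(N), qpsk(N)
R_num = fft_with_timing_offset(C0, C1, tau)
R_formula, useful = closed_form_timing_offset(C0, C1, tau)
print(np.max(np.abs(R_num - R_formula)))
```

The useful term keeps the amplitude (N - tau)/N of the transmitted symbol and a phase that grows linearly with k; the remainder is interference.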
3.2 Independent Component Analysis
ICA is a statistical method for finding independent source random variables or signals from a
set of observed linear combinations of them. (In this dissertation we consider only linear ICA.)
One of the main applications of ICA is blind source separation (BSS), which has become an
attractive field of research in the statistical signal processing and neural network communities.
ICA operates in a purely blind manner, i.e., without any explicit knowledge of the original source
variables or the mixing transformation. It relies on the assumption of statistical independence of
the sources. Although independence is a very strong assumption from a theoretical point of view, it
is often quite realistic in practice. Consequently, ICA has drawn a lot of attention in
various application fields lately. The idea of ICA was first introduced in a neurophysiological
setting in the early 1980s by researchers (J. Herault, C. Jutten and B. Ans) who needed a blind
method to separate the neural impulses coming from different parts of the human body [23].
Later on, ICA has been studied and applied in several very different signal processing contexts,
such as audio and biomedical signal processing, feature extraction, finance, and seismology.
Telecommunications-related applications of ICA have also been found, e.g., in MIMO
systems.
3.2.1 ICA Algorithms
Basically, ICA algorithms can be characterized, in a few words, as optimization algorithms that
search for the extremum points of some suitable non-linear real-valued function of the
observed data. These functions, often called contrast functions, are designed such that
their extreme points coincide with the ICA basis. In some ICA algorithms, the objective is to
find the de-mixing matrix, B. In that case, contrast functions are defined on
$\mathbb{C}^{n \times n}$ and, ideally, B is their (global) extremum point. We refer to these
algorithms as multi-unit algorithms. Multi-unit contrast functions are based, e.g., on stochastic
concepts of likelihood, entropy, mutual information, or higher-order non-linear moments. An
extensive survey of multi-unit contrast functions can be found in [12]. One-unit algorithms, in
turn, search for the basis vectors or independent components one by one. Their contrast functions
are defined on $\mathbb{C}^n$. Basically, these contrast functions measure the non-Gaussianity of
the inner product $w^H x$ for $w \in \mathbb{C}^n$. Loosely speaking, by the central limit theorem a
mixture of independent sources is closer to Gaussian than the sources themselves, so an inner
product that equals some independent source component has, locally, the most non-Gaussian
distribution. One-unit contrast functions can be based, e.g., on negentropy, that is, the
difference between the differential entropies of a given random variable and a Gaussian random
variable with the same variance. One popular class of one-unit contrast functions is based on
higher-order cumulants, such as the kurtosis of a random variable, or on non-linear
generalizations of them. Again, [12] includes a good review of these functions. In the following,
we briefly discuss algorithms for optimizing ICA contrast functions.
3.2.1a. Pre-Whitening the data
Whitening, or sphering, is a common preprocessing task in ICA algorithms, since it simplifies
the remaining separation procedure. It is a linear transform V which de-correlates the observed
mixtures and normalizes the components of the observations to unit variance. Thus, the
whitened data, z := Vx, satisfies
$$ \mathrm{E}\{ z z^H \} = I \qquad (3.14) $$
Such a V always exists, but it is not unique [12]. One way to find a whitening
transformation is to use principal component analysis (PCA) [24], which gives the whitening
transform as
$$ V = \Lambda^{-1/2} \mathbf{E}^H \qquad (3.15) $$
in which the (unitary) matrix $\mathbf{E}$ has the principal eigenvectors of the observation
covariance matrix $C_x = \mathrm{E}\{ x x^H \}$ as columns, and the diagonal matrix $\Lambda$
contains the corresponding eigenvalues on its diagonal. Clearly, all matrices of the form UV, for
any unitary matrix U and whitening matrix V, also whiten the data. Hence, we get another whitening
matrix by multiplying (3.15) from the left by $\mathbf{E}$. The resulting matrix is, in fact, the
inverse square root of the covariance $C_x$, and it is denoted, in short, by $C_x^{-1/2}$ [12].
Assuming the basic ICA model, i.e., $x = As$, the whitening (3.14) implies directly that the new
(whitened) mixing matrix, W := VA, is unitary, since
$$ I = \mathrm{E}\{ z z^H \} = W\, \mathrm{E}\{ s s^H \}\, W^H = W W^H \qquad (3.16) $$
(We have assumed that $\mathrm{E}\{ s s^H \} = I$.) In other words, the new (whitened) ICA basis
vectors, i.e., the columns of the matrix W, are orthogonal vectors lying on the unit sphere. For
this reason, many ICA algorithms assume the observed data to be whitened first, and then constrain
the search for ICA basis vectors (or de-mixing vectors) to the unit sphere and assume them to be
mutually orthogonal.
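The PCA whitening step can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation used in this work: the function name `pca_whitening_matrix`, the QPSK sources, and the random mixing matrix are assumptions, and the covariance is replaced by its sample estimate.

```python
import numpy as np


def pca_whitening_matrix(X):
    """Return (V, E) with V = Lambda^{-1/2} E^H, from zero-mean observations X of shape (n, T)."""
    C = X @ X.conj().T / X.shape[1]        # sample covariance matrix
    eigvals, E = np.linalg.eigh(C)         # C = E diag(eigvals) E^H
    V = np.diag(eigvals ** -0.5) @ E.conj().T
    return V, E


rng = np.random.default_rng(0)
n, T = 3, 20000                            # assumed example sizes
S = (rng.choice([-1, 1], (n, T)) + 1j * rng.choice([-1, 1], (n, T))) / np.sqrt(2)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))   # hypothetical mixing matrix
X = A @ S
X = X - X.mean(axis=1, keepdims=True)      # centre the observations

V, E = pca_whitening_matrix(X)
Z = V @ X
cov_Z = Z @ Z.conj().T / T                 # identity, up to rounding
W = V @ A                                  # whitened mixing matrix, approximately unitary
C_inv_sqrt = E @ V                         # inverse square root of the sample covariance
print(np.round(np.abs(cov_Z), 3))
```

Because the whitening matrix is computed from the same samples it is applied to, `cov_Z` equals the identity up to rounding, while `W` is only approximately unitary for a finite number of samples.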
3.2.1b. Basic ICA Algorithms
The basic (noise-free) ICA model has attracted the most attention from researchers,
partly because it was the first model considered and partly because it is the simplest model, yet
adequate for many applications. Consequently, numerous algorithms based on different
criteria have been developed. A comprehensive survey of these algorithms can be found, e.g.,
in [12]. Here, we describe two of them: FastICA [25] and JADE (Joint Approximate
Diagonalization of Eigen-matrices) [26].
The FastICA algorithm [25] is a fixed-point algorithm which operates on a block of observed data
samples. Basically, FastICA searches for the extreme points of
$\mathrm{E}\{ F(|w^H z|^2) \}$ on the unit sphere. Here, $F: \mathbb{R} \to \mathbb{R}$ stands for
a one-unit contrast function of the kind discussed above. In effect, the extreme points are thus
the ICA basis vectors. The FastICA algorithm converges faster than typical stochastic gradient
descent algorithms [12], hence the name. This is also why FastICA has become popular in
applications. A one-unit version of the algorithm recovers one ICA basis vector in an iterative
manner. Assuming pre-whitened observations (3.14), the recursion step is given as [12, 25]
$$ w_{j+1} = \mathrm{E}\{ z\, (w_j^H z)^{*}\, f(|w_j^H z|^2) \} - \mathrm{E}\{ f(|w_j^H z|^2) + |w_j^H z|^2 f'(|w_j^H z|^2) \}\, w_j \qquad (3.17) $$
in which $f: \mathbb{R} \to \mathbb{R}$ is the derivative of a given contrast function F and $f'$
is the derivative of f. In addition, $w_{j+1}$ must be normalized to unit norm before proceeding to
the next step. Assuming a simple kurtosis-based contrast function, $F(y) = y^2 / 2$, we can
simplify (3.17) to [12]
$$ w_{j+1} = \mathrm{E}\{ z\, (w_j^H z)^{*}\, |w_j^H z|^2 \} - \gamma\, w_j \qquad (3.18) $$
where the scalar coefficient $\gamma = 3$ for real-valued data and $\gamma = 2$ for complex-valued
data. The iteration steps (3.17) or (3.18) are repeated until convergence, i.e., until
$|w_{j+1}^H w_j|$ is close enough to one. After convergence, the corresponding independent
component is $w_j^H z$, where j is the last index of the recursion. Several independent components
are recovered by repeating the one-unit algorithm successively, starting the iterations from
different, e.g., randomly selected, initial points $w_0$. However, without control, the recursions
can converge to the same basis vector several times. This is prevented by making the vector $w_j$
orthogonal to all basis vectors already estimated, between each iteration. Recall that the basis
vectors are orthonormal after the pre-whitening. The orthogonalization can be accomplished, for
instance, with the Gram-Schmidt algorithm [27]. Repeating the one-unit algorithm successively is
often referred to as deflation. Another way to estimate several ICA basis vectors is to use the
so-called symmetric FastICA [12, 25]. This algorithm runs several, say K, one-unit iterations in
parallel and, between each iteration step, orthogonalizes the vectors $w_k(j)$, k = 1, ..., K,
using symmetric orthogonalization. The matrix $W(j) = [w_1(j), \ldots, w_K(j)]$ can be
orthogonalized symmetrically, e.g., as
$$ W(j) \leftarrow W(j) \left( W(j)^H W(j) \right)^{-1/2} \qquad (3.19) $$
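The deflation scheme and the symmetric orthogonalization can be sketched as follows. This is a minimal illustration for real-valued data only (kurtosis-based update with coefficient 3), not the implementation used in this work; the function names, the uniform test sources, and the mixing matrix are assumptions made for the example.

```python
import numpy as np


def fastica_kurtosis_deflation(Z, n_components, max_iter=200, tol=1e-10, seed=0):
    """One-unit kurtosis FastICA with Gram-Schmidt deflation on real, whitened data Z (n, T)."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    W = np.zeros((n, n_components))            # columns hold the estimated basis vectors
    for p in range(n_components):
        w = rng.standard_normal(n)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            y = w @ Z
            w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w    # real-valued kurtosis update
            w_new -= W[:, :p] @ (W[:, :p].T @ w_new)       # remove already-found directions
            w_new /= np.linalg.norm(w_new)
            done = abs(w_new @ w) > 1.0 - tol              # sign flips are allowed
            w = w_new
            if done:
                break
        W[:, p] = w
    return W


def symmetric_orthogonalize(W):
    """Return W (W^H W)^{-1/2}, computed through an eigendecomposition."""
    d, U = np.linalg.eigh(W.conj().T @ W)
    return W @ U @ np.diag(d ** -0.5) @ U.conj().T


rng = np.random.default_rng(1)
T = 5000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))      # unit-variance, non-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                      # hypothetical mixing matrix
X = A @ S
X -= X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / T)
Z = np.diag(d ** -0.5) @ E.T @ X                            # PCA whitening

W = fastica_kurtosis_deflation(Z, 2)
Y = W.T @ Z                                                 # recovered components
corr = np.abs(np.corrcoef(np.vstack([S, Y]))[:2, 2:])       # |correlation| source vs. estimate
print(np.round(corr, 3))

W_sym = symmetric_orthogonalize(rng.standard_normal((3, 3)))
```

Each recovered component matches one source only up to sign and ordering, which is why the check above uses the absolute correlation.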
The original JADE algorithm/code deals with complex signals in white Gaussian noise and
exploits an underlying assumption that the model of independent components actually holds.
This is a reasonable assumption when dealing with some narrowband signals. In this context, we
may
i) seriously consider dealing precisely with the noise in the whitening process, and
ii) expect to use the small number of significant eigen-matrices to efficiently summarize
all the 4th-order information.
All this is done in the JADE algorithm.
In this implementation, we deal with real-valued signals and we do not expect the ICA
model to hold exactly. Therefore, it is pointless to try to deal precisely with the additive noise,
and it is very unlikely that the cumulant tensor can be accurately summarized by its first n
eigen-matrices. Therefore, we consider the joint diagonalization of the whole set of eigen-matrices.
However, in such a case, it is not necessary to compute the eigen-matrices at all, because one
may equivalently use `parallel slices' of the cumulant tensor. The computation of the
eigen-matrices can thus be saved: it suffices to jointly diagonalize a set of cumulant
matrices. Also, since we are dealing with real signals, it becomes easier to exploit the
symmetries of the cumulants to further reduce the number of matrices to be diagonalized. These
considerations, together with other cheap tricks, lead to this version of JADE, which is optimized
to deal with real mixtures and to work `outside the model'. Like the original JADE
algorithm, it works by minimizing a `good set' of cumulants. The rows of the separating matrix
B are re-sorted in such a way that the columns of the corresponding mixing matrix A = pinv(B) are
in decreasing order of (Euclidean) norm. This is a simple, `almost canonical' way of fixing the
indetermination of permutation. It has the effect that the first rows of the recovered signals
(i.e., the first rows of B*X) correspond to the most energetic components. Recall, however, that
the source signals in S = B*X have unit variance. Therefore, when we say that the observations are
unmixed in order of decreasing energy, the energetic signature is found directly as the norm of
the columns of A = pinv(B) [26].
In experiments where JADE is run as B = jadeR(X, m) with m varying over a range of values, it is
useful to be able to test the stability of the decomposition. To help in such a test, the rows of B
can be sorted as described above. We have also decided to fix the sign of each row in some
arbitrary but fixed way. The convention is that the first element of each row of B is positive
[26].
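The ordering and sign conventions described above are easy to state in code. The following NumPy sketch applies them to an arbitrary real separating matrix; it is not the jadeR code itself, and the function name `canonicalize_demixing` and the example matrix are assumptions.

```python
import numpy as np


def canonicalize_demixing(B):
    """Sort the rows of a real separating matrix B and fix their signs.

    Rows are ordered so that the columns of A = pinv(B) have decreasing
    Euclidean norm; each row is then scaled by +1 or -1 so that its first
    element is non-negative.
    """
    A = np.linalg.pinv(B)
    order = np.argsort(-np.linalg.norm(A, axis=0))   # most energetic component first
    B_sorted = B[order, :]
    signs = np.sign(B_sorted[:, 0])
    signs[signs == 0] = 1.0                           # leave rows with a zero first entry alone
    return B_sorted * signs[:, None]


B = np.array([[-2.0, 0.3, 0.1],
              [0.5, 1.0, -0.2],
              [-0.1, 0.4, 3.0]])                      # hypothetical separating matrix
B_canon = canonicalize_demixing(B)
col_norms = np.linalg.norm(np.linalg.pinv(B_canon), axis=0)
print(B_canon)
print(col_norms)
```

Applying the function twice returns the same matrix, which makes decompositions obtained with different settings directly comparable.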
Contrary to many other ICA algorithms, JADE does not operate on the data themselves but on a
statistic (the full set of 4th-order cumulants). This statistic is stored in a matrix CM whose
size grows as $m^2 \times m^2$, where m is the number of sources to be extracted (m could be much
smaller than n). As a consequence, JADE will probably choke on a `large' number of sources.
Here `large' depends mainly on the available memory and could be something like 40 or so.
3.2.2. ICA Model in OFDM Reception
In many fields there are mixtures (Y) of independent sources (S), in the form of equation (3.20),
which should be separated:
$$ Y = A S \qquad (3.20) $$
where A is the mixing matrix. Prior to the channel estimation algorithms, the received signal is
centred by subtracting the estimated mean from the signal:
$$ X = Y - \mathrm{E}[Y] \qquad (3.21) $$
Second-order dependences are removed by de-correlation, which is achieved by principal
component analysis (PCA) of the covariance matrix $C_X = \mathrm{E}\{ X X^H \}$.
Usually the observed data (X) is then whitened to get Z such that
$$ \mathrm{E}\{ Z Z^H \} = I \qquad (3.22) $$
Suppose $\mathbf{E} = [e_1, \ldots, e_n]$ is the matrix whose columns are the unit-norm
eigenvectors of the covariance matrix $C_X$, and $\Lambda = \mathrm{diag}[\lambda_1, \ldots, \lambda_n]$
is the diagonal matrix of the eigenvalues of $C_X$; then $C_X = \mathbf{E} \Lambda \mathbf{E}^H$.
This is called the eigenvector decomposition of the covariance matrix. The linear whitening
transform is expressed as [12]
$$ Z = V X, \quad V = \Lambda^{-1/2} \mathbf{E}^H \qquad (3.23) $$
After whitening, the effective mixing matrix $W = VA$ is orthogonal (unitary for complex data). So
if the estimated mixing matrix is denoted by $\hat{W}$, the independent components can be obtained by
$$ \hat{S} = \hat{W}^H Z \qquad (3.24) $$
3.3 Channel Estimation with CFO
3.3.1. System Model Considering CFO
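The block diagram of the OFDM system is shown in Figure 2.6. It is assumed that the channel
fading is slow enough that the channel remains essentially unchanged during CFO estimation. The
number of subcarriers and the number of training symbols used for channel estimation are denoted
by N and Q, respectively. At the transmitter, the q-th modulated training symbol
$x(q) = [x_0(q), x_1(q), \ldots, x_{N-1}(q)]^T$ is serial-to-parallel transformed. Then the IDFT
of the resulting symbol is taken to get $s(q) = [s_0(q), s_1(q), \ldots, s_{N-1}(q)]^T$. The
resulting symbol is parallel-to-serial transformed. In order to eliminate interference between
adjacent OFDM symbols, a cyclic prefix, whose length is longer than the overall channel impulse
response, is appended to each $s(q)$. At the receiver, after CP removal, the symbol is
serial-to-parallel transformed to get $r(q) = [r_0(q), r_1(q), \ldots, r_{N-1}(q)]^T$, which after
the DFT results in $y(q) = [y_0(q), y_1(q), \ldots, y_{N-1}(q)]^T$ [11].
Here it is assumed that the channel is represented by $h = [h_0, h_1, \ldots, h_{L-1}]^T$, where
the $h_l$ are zero-mean independent Gaussian random variables. The length of the channel (L) is
supposed to be shorter than N, such that $h_l = 0$ for $L \le l \le N-1$. Considering the channel
to be stationary during the estimation process, the received signal is expressed as
$$ r(q) = H_c\, s(q) + w(q) \qquad (3.25) $$
where $w(q)$ is the additive white Gaussian noise and $H_c$ is the following $N \times N$
circulant matrix [11], with entries $[H_c]_{n,m} = h_{(n-m) \bmod N}$:
$$ H_c = \begin{bmatrix} h_0 & h_{N-1} & \cdots & h_1 \\ h_1 & h_0 & \cdots & h_2 \\ \vdots & \vdots & \ddots & \vdots \\ h_{N-1} & h_{N-2} & \cdots & h_0 \end{bmatrix} $$
Considering CFO, the received symbol is given by
$$ r(q) = D(\varepsilon)\, H_c\, s(q) + w(q) \qquad (3.26) $$
where $D(\varepsilon) = \mathrm{diag}\{ 1, \exp(j 2\pi \varepsilon / N), \ldots, \exp(j 2\pi \varepsilon (N-1)/N) \}$
and $\varepsilon$ is the normalized frequency offset. So $y(q)$ is expressed by
$$ y(q) = F\, r(q) = F\, D(\varepsilon)\, H_c\, s(q) + F\, w(q) \qquad (3.27) $$
where F is the $N \times N$ unitary DFT matrix whose elements are given by
$[F]_{k,n} = \frac{1}{\sqrt{N}} e^{-j 2\pi k n / N}$.
Using the well-known property that every circulant matrix can be diagonalized by
post- (pre-) multiplication by (I)FFT matrices, it can be rewritten as [11]
$$ y(q) = \tilde{D}(\varepsilon)\, X(q)\, H + W(q) \qquad (3.28) $$
where $X(q) = \mathrm{diag}\{ x_0(q), x_1(q), \ldots, x_{N-1}(q) \}$,
$H = [H_0, H_1, \ldots, H_{N-1}]^T$ is the channel frequency response (the N-point DFT of h),
$W(q) = F\, w(q)$, and
$$ \tilde{D}(\varepsilon) = F\, D(\varepsilon)\, F^H \qquad (3.29) $$
It is seen that in the presence of CFO, $y_i(q)$ is affected by the transmitted symbols
and the channel coefficients on other subcarriers, as it is pre-multiplied by $\tilde{D}(\varepsilon)$.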
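The equivalence between the time-domain model (a CFO phase ramp applied after circular convolution with the channel) and the frequency-domain model (a full CFO matrix multiplying the per-subcarrier products of training symbol and channel response) can be verified numerically. The sketch below is noise-free, and the sizes, the offset `eps`, and all variable names are assumptions made for the example.

```python
import numpy as np

N, L_ch, eps = 16, 4, 0.15                     # assumed sizes and normalized CFO
rng = np.random.default_rng(1)
h = (rng.standard_normal(L_ch) + 1j * rng.standard_normal(L_ch)) / np.sqrt(2 * L_ch)
x = rng.choice(np.array([1, -1, 1j, -1j]), size=N)    # one QPSK training symbol

F = np.fft.fft(np.eye(N)) / np.sqrt(N)         # unitary DFT matrix
h_pad = np.concatenate([h, np.zeros(N - L_ch)])
# Circulant channel matrix: column m is the zero-padded channel shifted down by m.
H_c = np.array([np.roll(h_pad, m) for m in range(N)]).T
D = np.diag(np.exp(2j * np.pi * eps * np.arange(N) / N))

# Time-domain model: IDFT, circular convolution with the channel, CFO rotation, DFT.
s = F.conj().T @ x
r = D @ H_c @ s
y = F @ r

# Frequency-domain model: CFO matrix times (training symbol * channel frequency response).
H_freq = np.fft.fft(h_pad)
D_tilde = F @ D @ F.conj().T
y_model = D_tilde @ (np.diag(x) @ H_freq)

leak = np.abs(D_tilde[0, 1])                   # coupling between subcarriers 0 and 1
print(np.max(np.abs(y - y_model)), leak)
```

For a zero offset the CFO matrix reduces to the identity; for a non-zero offset its off-diagonal entries couple every subcarrier to all the others.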
3.3.2. Channel Estimation
3.3.2a. LMMSE Channel Estimation
Many estimation techniques need the LS estimate as an initial estimate. In LS estimation, pilot
symbols are inserted on each of the subcarriers. Denoting the known data on
subcarrier i at time q as $x_i(q)$, we can find a least squares (LS) channel estimate as
$$ \hat{h}_{LS,i}(q) = \frac{y_i(q)}{x_i(q)} \qquad (3.30) $$
where $y_i(q)$ is the received value on subcarrier i. Arranging the LS estimates in a vector
$\hat{h}_{LS}(q) = [\hat{h}_{LS,0}(q), \hat{h}_{LS,1}(q), \ldots, \hat{h}_{LS,N-1}(q)]^T$, the
corresponding vector of LMMSE estimates becomes
$$ \hat{h}_{LMMSE}(q) = R_{h \hat{h}_{LS}}\, R_{\hat{h}_{LS} \hat{h}_{LS}}^{-1}\, \hat{h}_{LS}(q) \qquad (3.31) $$
where $R_{h \hat{h}_{LS}}$ is the covariance matrix between the channel gains and the LS estimates
of the channel gains, and $R_{\hat{h}_{LS} \hat{h}_{LS}}$ is the autocovariance matrix of the LS
estimates, assuming AWGN with variance $\sigma_n^2$ on each subcarrier.
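A small Monte Carlo sketch shows how the LMMSE smoothing improves on the LS estimate when no CFO is present. It rests on several assumptions that are not taken from this chapter: unit-modulus pilots, channel taps that are i.i.d. complex Gaussian with equal power, and a known noise variance. Under these assumptions the cross-covariance equals the channel covariance `R_hh`, and the autocovariance of the LS estimates equals `R_hh` plus the noise variance times the identity.

```python
import numpy as np


def ls_estimate(y, x):
    """Per-subcarrier least-squares estimate: received value over known pilot."""
    return y / x


def lmmse_estimate(h_ls, R_hh, noise_var):
    """LMMSE smoothing of the LS estimate, assuming unit-modulus pilots."""
    N = len(h_ls)
    return R_hh @ np.linalg.solve(R_hh + noise_var * np.eye(N), h_ls)


rng = np.random.default_rng(0)
N, L_ch, noise_var, trials = 64, 4, 0.1, 200           # assumed example values
F_L = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L_ch)) / N)   # first L DFT columns
R_hh = F_L @ F_L.conj().T / L_ch                        # taps i.i.d. with variance 1 / L_ch

mse_ls = 0.0
mse_lmmse = 0.0
for _ in range(trials):
    h = (rng.standard_normal(L_ch) + 1j * rng.standard_normal(L_ch)) / np.sqrt(2 * L_ch)
    H = F_L @ h                                         # channel frequency response
    x = rng.choice(np.array([1, -1, 1j, -1j]), size=N)  # unit-modulus pilots
    w = np.sqrt(noise_var / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    y = x * H + w
    h_ls = ls_estimate(y, x)
    h_lmmse = lmmse_estimate(h_ls, R_hh, noise_var)
    mse_ls += np.mean(np.abs(h_ls - H) ** 2) / trials
    mse_lmmse += np.mean(np.abs(h_lmmse - H) ** 2) / trials

print(mse_ls, mse_lmmse)
```

The LS error stays close to the noise variance, while the LMMSE estimate exploits the short channel length to average the noise across subcarriers.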
It is shown in [11] that the estimation error can be expressed as the sum of a noise term and a
CFO term:
$$ e = e_{noise} + e_{CFO} \qquad (3.32) $$
The error due to noise can be reduced by using more training symbols, whereas the error
caused by the CFO must be reduced by other methods [11].
3.3.2b. ICA-LMMSE Channel Estimation
FastICA and JADE are among the most prominent ICA algorithms. In the
JADE algorithm [26], the de-mixing matrix is obtained by minimizing a contrast function
$\phi_{JADE}[\hat{s}]$, which is a real-valued function of the distribution of the source
estimates $\hat{s}$. Specifically, $\phi_{JADE}[\hat{s}]$ is a 4th-order approximation of a
mutual-information-based contrast function. It can be seen from equation (3.24) that ICA can be
used to estimate the mixing matrix, which under the assumption of X = I gives an estimate of the
CFO matrix. After performing ICA, the de-mixed sources have scaling and ordering ambiguities,
which necessitate some post-processing. The estimated signals are given by
$$ \hat{S} = Q P S \qquad (3.33) $$
where Q is a non-singular diagonal matrix accounting for the scaling indeterminacy and P is a
permutation matrix accounting for the order indeterminacy [13].
Applying ICA algorithms to equation (3.28) under the assumption X = I, an estimate of the
CFO matrix, as the mixing matrix in the ICA model, is obtained. After this processing and before
performing LMMSE channel estimation, the received vector of symbols is multiplied by the inverse
of the estimated CFO matrix.
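The compensation step itself is a single linear solve once an estimate of the CFO matrix is available. The sketch below does not implement the ICA stage: it assumes that an offset estimate `eps_hat` has already been obtained (here it is simply set to the true value), that the scaling and permutation ambiguities have been resolved, and that the link is noise-free. All names and values are illustrative.

```python
import numpy as np


def cfo_matrix(eps, N):
    """Frequency-domain CFO matrix F D(eps) F^H for a normalized offset eps."""
    F = np.fft.fft(np.eye(N)) / np.sqrt(N)
    D = np.diag(np.exp(2j * np.pi * eps * np.arange(N) / N))
    return F @ D @ F.conj().T


rng = np.random.default_rng(2)
N, L_ch, eps = 32, 4, 0.2                               # assumed example values
h = (rng.standard_normal(L_ch) + 1j * rng.standard_normal(L_ch)) / np.sqrt(2 * L_ch)
H = np.fft.fft(h, N)                                    # channel frequency response
x = rng.choice(np.array([1, -1, 1j, -1j]), size=N)      # known training symbol

y = cfo_matrix(eps, N) @ (x * H)                        # received symbol with CFO, no noise

eps_hat = eps                                           # stand-in for the ICA-based estimate
D_hat = cfo_matrix(eps_hat, N)
y_comp = np.linalg.solve(D_hat, y)                      # multiply by the inverse CFO matrix

err_raw = float(np.mean(np.abs(y / x - H) ** 2))        # LS error without compensation
err_comp = float(np.mean(np.abs(y_comp / x - H) ** 2))  # LS error after compensation
print(err_raw, err_comp)
```

With a perfect offset estimate the LS channel estimate after compensation matches the true response; in practice the residual error depends on the accuracy of the estimated CFO matrix.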
3.4. Conclusion
In this chapter we have studied the ICI and the various distortion effects caused by CFO
during transmission. These effects are then estimated and compensated by the blind
identification algorithm, after which channel estimation is performed to obtain better BER
performance of the system.
