
ORIGINAL ARTICLE

A new algorithm to rigid and non-rigid object tracking in complex environments
A. H. Mazinan & A. Amir-Latifi
Received: 3 October 2011 / Accepted: 2 April 2012 / Published online: 2 May 2012
© Springer-Verlag London Limited 2012
Abstract With a focus on complex environments, the present paper describes a new algorithm for rigid and non-rigid object tracking based on the color feature. Object tracking in these environments is considered for real-time applications, such as manufacturing, surveillance and monitoring, smart rooms, and so on, where partial or full occlusion frequently occurs. One of the best-known color-based object tracking algorithms is the mean shift (MS) iterative procedure, which finds the location of an object in image sequences. Unfortunately, its performance is not acceptable when objects must be tracked in complex environments. The main aim of the present research is to improve the MS tracking algorithm by proposing an improved convex kernel function, realized in association with the Kalman filter approach (KFA). In the proposed algorithm, the KFA is employed to solve the full occlusion problem under the assumption that the speed of the objects is constant. In addition, the proposed robust kernel function is designed to overcome the low saturation and partial occlusion problems.
Keywords MS tracker · Kalman filter approach · Color feature · Rigid and non-rigid objects · Bhattacharyya coefficient
1 Introduction
Object tracking in complex environments has long been a challenging task in the area of intelligence-based surveillance systems. Real-time applications, such as manufacturing, surveillance and monitoring, perceptual interfaces, smart rooms, and video compression, need efficient tracking performance. With a focus on manufacturing systems, a variety of studies have aimed to improve this process. Motavalli et al. suggest a part image reconstruction system for reverse engineering of design modifications; their methodology is designed to extract 3-D data from the part surface. Cheng-Jin et al. present recent developments in the applications of image processing techniques for food quality evaluation. Their publication reviews advances in image acquisition, which include charge-coupled device cameras, ultrasound, magnetic resonance imaging, computed tomography, and electrical tomography. In this regard, Demant et al. propose industrial image processing for visual quality control in manufacturing. Klein et al. use 3-D reconstruction of CT image series in pediatric craniofacial surgery to compare milling and stereolithography. The importance of visualization in manufacturing simulation has also been investigated by Rohrer, who surveys visualization, as a critical component of simulation technology, to help communicate results and gain a better understanding of a model's behavior. In the area of manufacturing systems, an image-based approach to designing and manufacturing patient-specific craniofacial biomaterial scaffolds from CT or MRI data has been proposed by Hollister et al. as well. In that work, the voxel density distribution is used to define the scaffold topology, which is created by using image processing techniques. The present research is applicable in many real-time academic and industrial domains. In particular, its results could be useful for finding a damaged component in a manufacturing system, since the chosen component can be treated as an object in video sequences. For this purpose, from the viewpoint of a manufacturing system manager, one of the important objects in the video sequences can be chosen at will by the operator and analyzed under real-time conditions. This helps the operator to follow all the stages from the beginning of production, which in turn makes the production control process more reliable.
A. H. Mazinan (*) · A. Amir-Latifi
Electrical Engineering Department, Islamic Azad University (IAU), South Tehran Branch, Tehran, Iran
e-mail: mazinan@azad.ac.ir
A. Amir-Latifi
e-mail: arash_amirlatifi@yahoo.com
Int J Adv Manuf Technol (2013) 64:1643–1651
DOI 10.1007/s00170-012-4129-9
To improve object tracking in manufacturing and other related domains, a robust tracker is needed that works well in many difficult situations, such as varying illumination, background clutter, and occlusion [1–6]. Object tracking can be viewed as a state estimation problem for dynamic systems. From this point of view, algorithms can be divided into two categories [7–9]. The first category comprises probabilistic methods. These view tracking as a dynamic state estimation problem under the Bayesian framework, given that the system model and its measurements are subject to uncertainty. Representative methods are the Kalman filter approach (KFA) and its derivatives, particle filters, and Monte Carlo tracking [10]. The second category comprises deterministic methods, which compare a model with the current frame to find the most promising region. The MS iterative procedure is a typical example [11]. Deterministic methods have difficulty dealing with full occlusion, since they rely on the result obtained in the previous frame. If the tracked object is lost or completely occluded, the deterministic search will fail accordingly. The algorithm proposed here draws on both categories simultaneously. The color feature is extensively used in tracking methods because it is easy to extract. One of the best methods for color-based object tracking is the MS iterative procedure. The MS tracker is a non-parametric method that climbs the density gradient to find the peak of a distribution, and it belongs to the category of deterministic methods. It has been applied to image segmentation, visual tracking, and so on [12, 13]. In each frame, it searches for the image region whose color histogram is closest to the reference color histogram of the object. The similarity between two histograms is measured by their Bhattacharyya coefficient, and the search for the object location is performed via the MS iterations, beginning from the object location estimated in the previous frame (outlined in Section 2). In the MS tracker, the color cue is easy to compute. However, the tracked region may include similarly colored background areas, which distract the tracker by adding heavy noise. The MS iterative procedure therefore needs to be improved. To address this problem, an improved kernel function is proposed here, and the KFA is used alongside it to overcome the remaining problems. The KFA has been widely used for tracking in many areas of control theory, signal processing, computer vision, and other related fields [14, 15]. It estimates the state of a dynamic system, even when the precise nature of the modeled system is unknown. This approach is powerful in the sense that it supports estimations of past, present, and even future states. As soon as the object is partially or fully overlapped by another one, the KFA is brought into the algorithm.
This paper is organized as follows. The principle of the MS iterative procedure is presented in Section 2. The KFA formulation is then explained in Section 3. Section 4 describes the proposed object tracking algorithm using the particular kernel function and the KFA. Finally, the experimental results and concluding remarks are given in Sections 5 and 6, respectively.
2 The MS iterative procedure
The principle of the MS iterative procedure and its governing relations are considered in the following subsections.
A. Histograms, equations, and their likeness measurements
In the MS iterative procedure, the desired object is first chosen by an operator or by a corresponding automatic method, and it is marked by a rectangle. The pixel locations inside the rectangle are denoted by $\{x_i^*\}_{i=1 \ldots n}$. The selected area is considered as the object model [1, 2]. The color histogram of the object model is then calculated by

$$q_u = C \sum_{i=1}^{n} k\left(\left\|x_i^*\right\|^2\right) \delta\left[b(x_i^*) - u\right] \tag{1}$$

where $b(x_i^*)$ is the bin number $(1, \ldots, m)$ associated with the color at the pixel of normalized location $x_i^*$, $\delta$ is the Kronecker delta function, and $C$ is the normalization constant. The pixel locations of the object candidate, centered at $y$ in the current frame, are denoted by $\{x_i\}_{i=1 \ldots n_h}$. By using the kernel profile $k$ with radius $h$, the probability of the color $u$ in the object candidate is given by

$$p_u(y) = C_h \sum_{i=1}^{n_h} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \delta\left[b(x_i) - u\right] \tag{2}$$

The radius of the kernel profile is determined by the number of pixels of the object candidate. The function $k$ is known as the Epanechnikov kernel, and its profile is given as

$$k(x) \propto \begin{cases} 1 - x, & 0 \le x \le 1 \\ 0, & x > 1 \end{cases} \tag{3}$$
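The metric used in the MS tracker to measure the likeness between the two histograms is based on the Bhattacharyya coefficient:

$$d_q(y) = \sqrt{1 - \rho\left[p(y), q\right]} \tag{4}$$

$$\rho(y) \equiv \rho\left[p(y), q\right] = \sum_{u=1}^{m} \sqrt{\hat{p}_u(y)\, q_u} \tag{5}$$

Here $\hat{p}_u(y)$ is the candidate color probability of Eq. (2). Therefore, if $d_q(y) = 0$ or, equivalently, $\rho[p(y), q] = 1$, the likeness between the object model and its candidate is at its maximum.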
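The quantities of Eqs. (1) to (5) can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the function names are hypothetical, bin indices run from 0 to m-1 instead of 1 to m, and the normalization constants $C$ and $C_h$ are obtained simply by dividing by the sum of the kernel weights.

```python
import numpy as np

def epanechnikov_profile(r2):
    """Profile of Eq. (3): proportional to 1 - x on [0, 1], zero outside."""
    r2 = np.asarray(r2, dtype=float)
    return np.where(r2 <= 1.0, 1.0 - r2, 0.0)

def weighted_histogram(bins, r2, m):
    """Kernel-weighted color histogram in the spirit of Eqs. (1) and (2).

    bins : integer bin index b(x_i) of every pixel (assumed 0 .. m-1)
    r2   : squared normalized distance of every pixel from the window center
    m    : number of histogram bins
    """
    weights = epanechnikov_profile(r2)
    hist = np.bincount(bins, weights=weights, minlength=m)
    total = hist.sum()
    # Dividing by the total plays the role of the normalization constant.
    return hist / total if total > 0 else hist

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of Eq. (5)."""
    return float(np.sum(np.sqrt(p * q)))

def distance(p, q):
    """Distance of Eq. (4); the clamp guards against rounding above 1."""
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya(p, q))))
```

Two identical histograms give a coefficient of 1 and a distance of 0, while histograms with no common bins give a coefficient of 0 and a distance of 1.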
B. Object location finding via Bhattacharyya coefficient
In accordance with the previous subsection, the most probable location $y$ of the object in the current frame is obtained by optimizing the Bhattacharyya coefficient $\rho(y)$. Hence, the main goal in each frame of the video sequence is to estimate the object translation $y$ that maximizes Eq. (5), which is equivalent to minimizing the distance in Eq. (4). The object location estimated in the previous frame is denoted by $y_0$. The Bhattacharyya coefficient (5) in the current frame is approximated by its first-order Taylor expansion around the values $\hat{p}_u(y_0)$; substituting Eq. (2) for $p(y)$ results in

$$\rho(y) \approx C_{q,y_0} + \frac{C_h}{2} \sum_{i=1}^{n_h} w_i\, k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \tag{6}$$

where

$$w_i = \sum_{u=1}^{m} \sqrt{\frac{q_u}{\hat{p}_u(y_0)}}\; \delta\left[b(x_i) - u\right] \tag{7}$$

and $C_{q,y_0}$ is independent of $y$ [7]. The search for the new object location $y_1$ in the current frame begins at the object location $y_0$ estimated in the previous frame. As mentioned in [5], if the weights $w_i$ are non-negative and the kernel profile $k(x)$ is monotonically non-increasing and convex, then a higher density value is reached by shifting from $y_0$ to $y_1$, the mean of the samples weighted by the kernel with profile $g(x) = -k'(x)$ centered at $y_0$:

$$y_1 = \frac{\sum_{i=1}^{n_h} x_i\, w_i\, g\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)}{\sum_{i=1}^{n_h} w_i\, g\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)} \tag{8}$$

Thus, by applying the convergence conditions in each frame, an improved estimate $y_1$ of the object location in the current frame is reached. Table 1 lists the steps of the MS iterations in object tracking.
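The core of the iteration (Eqs. (2), (7), and (8)) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the tracker evaluated in this paper: the function names, the tolerance `tol`, and the iteration cap are hypothetical, bins are indexed from 0, the Bhattacharyya check of step 9 in Table 1 is omitted, and the Epanechnikov profile is assumed, for which $g = -k'$ is constant inside the window.

```python
import numpy as np

def epanechnikov_profile(r2):
    """Profile of Eq. (3)."""
    return np.where(r2 <= 1.0, 1.0 - r2, 0.0)

def candidate_histogram(coords, bins, y, h, m):
    """Candidate color distribution of Eq. (2) for a window centered at y."""
    r2 = np.sum(((coords - y) / h) ** 2, axis=1)
    hist = np.bincount(bins, weights=epanechnikov_profile(r2), minlength=m)
    total = hist.sum()
    if total > 0:
        hist = hist / total
    return hist, r2

def mean_shift_step(coords, bins, q, y0, h, eps=1e-12):
    """One shift y0 -> y1 using the weights of Eq. (7) and the mean of Eq. (8)."""
    p, r2 = candidate_histogram(coords, bins, y0, h, len(q))
    w = np.sqrt(q[bins] / np.maximum(p[bins], eps))  # Eq. (7)
    g = (r2 < 1.0).astype(float)  # constant -k' inside the window, 0 outside
    denom = np.sum(w * g)
    if denom <= 0.0:
        return y0  # no usable pixels: keep the previous location
    return (coords * (w * g)[:, None]).sum(axis=0) / denom  # Eq. (8)

def mean_shift(coords, bins, q, y0, h, tol=0.5, max_iter=20):
    """Repeat the shift until the displacement falls below tol."""
    y = np.asarray(y0, dtype=float)
    for _ in range(max_iter):
        y_new = mean_shift_step(coords, bins, q, y, h)
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y
```

Here `coords` is an array of pixel coordinates with one row per pixel, `bins` holds the bin index of each pixel, and `q` is the object model histogram. On a synthetic image in which the object is the only region of a given color, the iteration moves the window center onto the centroid of that region.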
3 The KFA formulation
The KFA is used here to estimate and predict an object's location in the current frame from its location in the previous one. The KFA can be run once the initial state of the object in the image is available. The state vector consists of the position (location) and the speed of the object, and the initial values are selected according to the available frames. The initial location is the location of the object in the first frame, and the initial velocity can be determined from the motion and the time difference between two frames. The initial location of the object may be taken at an arbitrary point of its body, but it is usually taken as the object's center of gravity. An incorrect choice of initial conditions for the KFA leads to tracking failure. In deriving the equations for a moving object in the images, a constant velocity is assumed. The motion of the tracked object is modeled by the following [9]

$$X(t+1) = F\,X(t) + w \tag{9}$$

Measurements are linear combinations of the system state variables, corrupted by uncorrelated noise. The m-dimensional measurement vector is modeled as

$$Y(t) = H\,X(t) + v \tag{10}$$

The state transition matrix $F$ describes the system dynamics and is given by

$$F = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{11}$$

The measurement matrix $H$ is given by

$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \tag{12}$$

The position and the speed of a moving object are considered as the states of the object. Both quantities are two-dimensional. Thus, the state vector is formed as

$$X = \begin{bmatrix} x & y & V_x & V_y \end{bmatrix}^T \tag{13}$$
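Subsequently, the equations of the KFA in this research are formulated as

$$\begin{bmatrix} x(t+1) \\ y(t+1) \\ \dot{x}(t+1) \\ \dot{y}(t+1) \end{bmatrix} = F \begin{bmatrix} x(t) \\ y(t) \\ \dot{x}(t) \\ \dot{y}(t) \end{bmatrix} + w \tag{14}$$

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = H \begin{bmatrix} x(t) \\ y(t) \\ \dot{x}(t) \\ \dot{y}(t) \end{bmatrix} + v \tag{15}$$

where $\dot{x} = V_x$ and $\dot{y} = V_y$ are the velocity components of Eq. (13). The random variables $w$ and $v$ represent the state and measurement noise, respectively. They are zero-mean, white Gaussian noise with covariances $Q$ and $R$, respectively, which are assumed known:

$$p(w) \sim N(0, Q) \tag{16}$$

$$p(v) \sim N(0, R) \tag{17}$$

The KFA can be applied once the initial conditions of the state vector are set and appropriate $Q$ and $R$ are chosen.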
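The model of Eqs. (9) to (17) can be exercised with the standard Kalman predict and correct recursion, as given for example in the introduction by Welch and Bishop [15]. The Python sketch below is a minimal illustration under that assumption: the recursion itself is not spelled out in this paper, and the function names are hypothetical, as are any values chosen for $Q$, $R$, and the initial state.

```python
import numpy as np

# Constant-velocity model of Eqs. (11) and (12), with a time step of one frame.
F = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

def predict(x, P, Q):
    """Time update: propagate the state [x, y, Vx, Vy] and its covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def update(x_pred, P_pred, z, R):
    """Measurement update with an observed position z = [x, y]."""
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new
```

Calling `predict` repeatedly without `update` moves the position along a straight line at the stored velocity, which is the behavior relied upon when no usable measurement is available.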
4 The proposed object tracking algorithm
In this section, the proposed object tracking algorithm is explained. As mentioned earlier, the original MS tracker performs adequately in many object tracking tasks. However, it performs poorly when facing problems such as object rotation, partial or full occlusion, and so on. As a result, the object is lost under severe conditions. The reason is that the Bhattacharyya coefficient decreases under these problems. For example, Fig. 1 illustrates the results of the original MS tracker on a video sequence whose desired object is overlapped by another one. Figure 2 illustrates four frames that track a person from the moment he moves behind a tree until he emerges from it. In this video sequence, there is a full occlusion, and the object disappears for sixty frames. Thus, the object is lost. Figure 3 shows the curve of the Bhattacharyya coefficient for this last example, where its value drops because of the full occlusion. So, one of the best methods for detecting a tracking error is to monitor the changes of the Bhattacharyya coefficient during tracking.
In this research, an improved convex kernel function is proposed, while the KFA is correspondingly used to solve the existing problems. The flowchart of the proposed algorithm, which integrates the MS tracker with the KFA, is illustrated in Fig. 4. As the flowchart shows, after the object is detected, manually or automatically, two processes are performed simultaneously. The first process is to estimate $y_0$. The value of $y_0$ is equal to the center coordinate of the rectangle that surrounded the object in the previous frame.
Table 1 Algorithm of the original MS iterative procedure in object tracking
The original MS tracking
1. Calculating the probability of the color $u$ in the object model via Eq. (1)
2. Estimating $y_0$ in the previous frame
3. Calculating the probability of the color $u$ in the candidate model centered at $y_0$ via Eq. (2)
4. Calculating the Bhattacharyya coefficient between $\hat{p}(y_0)$ and $q$ via Eq. (5)
5. Calculating the weights $\{w_i\}_{i=1 \ldots n_h}$ via Eq. (7)
6. Finding the new location of the object $y_1$ according to Eq. (8)
7. Computing the distribution $\{\hat{p}_u(y_1)\}_{u=1 \ldots m}$
8. Calculating the Bhattacharyya coefficient between $\hat{p}(y_1)$ and $q$ via Eq. (5)
9. Applying the coefficient conditions until a suitable location is found: while $\rho[p(y_1), q] < \rho[p(y_0), q]$, do $y_1 \leftarrow \frac{1}{2}(y_0 + y_1)$; if $\|y_1 - y_0\| < \varepsilon$, stop the iteration; otherwise set $y_0 \leftarrow y_1$ and go to step 3

The second one is the color model initialization, referring to Eq. (1). The kernel function $k(\cdot)$ used in the original MS tracker is vulnerable. In this research, an improved convex kernel function is proposed as

$$k(u) = \exp\left(-\left[\frac{(x - x_c)^2}{h_y} + \frac{(y - y_c)^2}{h_x}\right]\right) \tag{18}$$

where $(x_c, y_c)$ denotes the center of the object. The kernel assigns a larger weight to locations near the center of the object and a smaller weight to locations farther from it. For this reason, using the proposed kernel function in Eqs. (1) and (2) is effective against partial occlusion. The values obtained from the color model initialization and from the estimation of $y_0$ are passed to the MS iterative procedure. In the next step, $y_1$ and $\rho[p(y_1), q]$ are calculated. If the Bhattacharyya coefficient between $p(y_1)$ and $q$ falls below a certain limit (due to the problems mentioned above), the tracking window drifts away. The MS iterative procedure then cannot calculate $y_1$ in the current frame, because it is initialized with the result obtained in the previous iteration [3].
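As a rough illustration of the center-weighted behavior attributed to the kernel of Eq. (18), the following Python sketch evaluates an exponential kernel of that form over a rectangular window. The function names, the center arguments `xc` and `yc`, and the example bandwidth values are assumptions introduced here for illustration. The placement of the two bandwidths follows the subscripts as they appear in Eq. (18), and how $h_x$ and $h_y$ are derived from the rectangle size is not specified beyond their dependence on it.

```python
import numpy as np

def improved_kernel(x, y, xc, yc, h_x, h_y):
    """Exponential weight that decays with distance from the center (xc, yc)."""
    return np.exp(-(((x - xc) ** 2) / h_y + ((y - yc) ** 2) / h_x))

def weight_map(width, height, h_x, h_y):
    """Weights for every pixel of a width x height window centered on the object."""
    xs, ys = np.meshgrid(np.arange(width), np.arange(height), indexing="xy")
    xc, yc = (width - 1) / 2.0, (height - 1) / 2.0
    return improved_kernel(xs, ys, xc, yc, h_x, h_y)

# Hypothetical window of 21 x 31 pixels with equal bandwidths.
weights = weight_map(width=21, height=31, h_x=100.0, h_y=100.0)
```

The weight equals 1 at the window center and decreases smoothly toward the border without ever reaching zero, so pixels near the edge, where an occluding object typically enters first, contribute little to the histograms.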
Fig. 1 Object tracking results of the original MS tracker in a sequence whose desired object is overlapped by another object (partial occlusion)
Fig. 2 Object tracking results of the original MS tracker, where the object disappeared for 60 frames (full occlusion)
At this point, the algorithm determines that the KFA must estimate $y_1$ for the new frame. The state vector is formed via Eq. (13), and $y_1$ is calculated via Eq. (14). The obtained value is taken as the object's location in the current frame. The KFA keeps estimating $y_1$ for as long as the Bhattacharyya coefficient remains below the limit, and the estimation by the KFA stops once the coefficient rises above the limit again. The proposed method works well in outdoor people and vehicle tracking, as the experiments in Section 5 show.
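The switching rule described in this section can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function name, the threshold value used in any call, and the manually supplied velocity are hypothetical, and the KFA is reduced to its noise-free prediction step of Eq. (14).

```python
import numpy as np

# Constant-velocity transition matrix of Eq. (11).
F = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

def track_with_fallback(ms_locations, rhos, velocity, threshold):
    """Choose between the MS location and the KFA prediction in every frame.

    ms_locations : (x, y) location returned by the MS tracker per frame
    rhos         : Bhattacharyya coefficient of that location per frame
    velocity     : (Vx, Vy), set manually
    threshold    : limit below which the MS result is not trusted
    """
    x0, y0 = ms_locations[0]
    state = np.array([x0, y0, velocity[0], velocity[1]], dtype=float)
    track = [state[:2].copy()]
    used_kfa = [False]
    for loc, rho in zip(ms_locations[1:], rhos[1:]):
        if rho < threshold:
            state = F @ state      # occlusion suspected: predict with the KFA
            used_kfa.append(True)
        else:
            state[:2] = loc        # MS result is reliable: accept it
            used_kfa.append(False)
        track.append(state[:2].copy())
    return np.array(track), used_kfa
```

While the coefficient stays below the threshold the location advances along the constant-velocity path, and the MS result is accepted again as soon as the coefficient recovers.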
Fig. 3 The curve of the Bhattacharyya coefficient for Fig. 2
Fig. 4 The flow chart of the proposed algorithm
5 Experimental results
The proposed algorithm has been applied to the task of tracking a human or a vehicle, marked by a rectangle automatically or manually. Experiments are carried out on the PETS data set. In all experiments, the RGB color space was used. Each color band was equally divided into 16 bins (16 × 16 × 16). The algorithms are tested on the video sequences whose information is shown in Table 2. All video sequences are 768 × 576 pixels. Table 3 shows the tracking results for eight video sequences, S1 through S8. According to Table 2, the objects appear under different conditions. In these sequences, the object's location is marked manually in the first frame. The values of $h_x$ and $h_y$ depend on the size of the rectangle surrounding the object. Also, the values of $V_x$ and $V_y$ in the KFA are set manually in the algorithm.
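The 16 × 16 × 16 quantization of the RGB space mentioned above can be sketched in Python as follows. The sketch assumes 8-bit color channels and a particular ordering of the three channel indices within the combined bin number; neither detail is stated in this paper, and the function name is hypothetical.

```python
import numpy as np

def rgb_to_bin(pixels, bins_per_channel=16):
    """Map 8-bit RGB pixels of shape (..., 3) to a single bin index.

    With 16 bins per channel the index runs from 0 to 16**3 - 1 = 4095.
    """
    pixels = np.asarray(pixels, dtype=np.uint8)
    step = 256 // bins_per_channel            # width of one bin, 16 levels here
    idx = pixels.astype(np.int64) // step     # per-channel bin, 0 .. 15
    r, g, b = idx[..., 0], idx[..., 1], idx[..., 2]
    return (r * bins_per_channel + g) * bins_per_channel + b
```

The resulting index can serve as the bin number $b(x_i)$ in Eqs. (1) and (2), shifted by one if bins are counted from 1.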
Figure 5 shows the tracking results obtained with the proposed method on video sequence S6. Here, the object is rigid, and its direction shifts. The color model is initialized on the car's window in frame 10 of the video sequence. No specific problem arises in this case.
Figure 6 presents six frames from S4. In this video sequence, there is a full occlusion, and the object disappears for 60 frames. The MS algorithm without the KFA fails in such a case. Therefore, the proposed method is applied for tracking the person. The red box indicates that the KFA is running for object tracking while the person is behind the tree.
6 Conclusion
A novel algorithm for rigid and non-rigid object tracking has been proposed by combining the color-based MS iterative procedure with the KFA. The proposed algorithm is intended to provide an effective solution to object tracking problems. The original MS tracker does not work well under severe conditions and is vulnerable to partial or full occlusion, object rotation, and so on. The reason is that, under such conditions, the Bhattacharyya coefficient between the initial color model of the object $q_u$ and the candidate model at the estimated location $\hat{p}_u(y_1)$ decreases. Here, a particular kernel function and the KFA are used together to overcome the existing problems. In the algorithm presented here, by assuming constant speed for the objects, the KFA is used to solve the full occlusion problem. Also, an improved robust kernel function has been employed to overcome the low saturation and partial occlusion problems. Experimental results demonstrate that the proposed algorithm is able to estimate the location of rigid and non-rigid objects in video sequences.
Table 2 The video sequences used in the experiments
Sequence  Frames  Type       Target  Sequence characteristics
S1        70      Rigid      Car     Rotation
S2        90      Non-rigid  Human   Partial occlusion
S3        112     Non-rigid  Human   Partial occlusion
S4        226     Non-rigid  Human   Full occlusion
S5        200     Non-rigid  Human   Partial occlusion
S6        80      Rigid      Car     Shift
S7        120     Non-rigid  Human   Stop and comeback
S8        220     Non-rigid  Human   Full occlusion
Table 3 The performance of the original MS tracker in comparison with the proposed algorithm (N: original MS tracker, Y: proposed algorithm)
Sequence  Proposed algorithm  Fail (nth frame)  Success rate (%)  Tracking ability
S1        N                   60                83                Normal
          Y                   Ok                100               Good
S2        N                   80                88                Normal
          Y                   Ok                100               Good
S3        N                   50                42                Bad
          Y                   Ok                100               Good
S4        N                   45                18                Bad
          Y                   Ok                100               Good
S5        N                   40                37                Bad
          Y                   Ok                100               Good
S6        N                   62                85                Normal
          Y                   Ok                100               Good
S7        N                   70                60                Normal
          Y                   110               90                Good
S8        N                   100               88                Normal
          Y                   Ok                100               Good
Fig. 5 Object tracking results of the proposed method in S6 (a frame 10, b frame 80). The object is rigid, and its direction shifts. The color model is initialized on the car's window in frame 10 of the video sequence. There is no special problem in this case
Fig. 6 Object tracking results in a full occlusion case from S4 (a frame 10, b frame 46, c frame 68, d frame 90, e frame 134, f frame 220). The frames show the tracking of a person from the moment he moves behind the tree until he emerges from it. The red box marks the frames in which the KFA is running for object tracking
Acknowledgments We are grateful to the Islamic Azad University (IAU), South Tehran Branch, for supporting the present research. This work was carried out under contract with the Research Department of the IAU, South Tehran Branch.
References
1. Motavalli S (1991) A part image reconstruction system for reverse engineering of design modifications. J Manuf Syst 10(5):383–395
2. Cheng-Jin D, Da-Wen S (2004) Recent developments in the applications of image processing techniques for food quality evaluation. Trends Food Sci Tech 15(5):230–249
3. Demant C, Streicher-Abdel B, Waszkewit P (1999) Industrial image processing: visual quality control in manufacturing. Springer, Berlin. ISBN 3-540-66410-6
4. Klein HM, Schneider W, Alzen G, Voy ED, Günther RW (1992) Pediatric craniofacial surgery: comparison of milling and stereolithography for 3D model manufacturing. Pediatr Radiol 22(6):458–460. doi:10.1007/BF02013512
5. Rohrer MW (2000) Seeing is believing: the importance of visualization in manufacturing simulation. IEEE Proceedings of Simulation Conference, USA, vol. 2, pp. 1211–1216. doi:10.1109/WSC.2000.899087
6. Hollister SJ, Levy RA, Chu T-M, Halloran JW, Feinberg SE (2000) An image-based approach for designing and manufacturing craniofacial scaffolds. Int J Oral Maxillofac Surg 29(1):67–71
7. Mazinan AH, Amir-Latifi A (2012) Improvement of mean shift tracking performance using a convex kernel function and extracting motion information. Comput Electr Eng (in press)
8. Comaniciu D, Ramesh V, Meer P (2000) Real-time tracking of non-rigid objects using mean shift. IEEE Conference on Computer Vision and Pattern Recognition 2:142–149
9. Liu H, Yu Z, Zha H, Zou Y, Zhang L (2009) Robust human tracking based on multi-cue integration and mean-shift. Pattern Recogn Lett 30(9):827–837
10. Sanjeev M, Maskell S, Gordon N, Clapp T (2002) A tutorial on particle filters for online nonlinear-non-Gaussian Bayesian tracking. IEEE Trans Signal Process 50(2):174–188
11. Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
12. Li S, Chang H, Zhu C (2010) Adaptive pyramid mean shift for global real-time visual tracking. Image Vision Comput 28(3):424–437
13. Leichter I, Lindenbaum M, Rivlin E (2010) Mean shift tracking with multiple reference color histograms. Comput Vis Image Understand 114:400–408
14. Maghami M, Zoroofi RA, Araabi BN, Shiva M, Vahedi E (2007) Kalman filter tracking for facial expression recognition using noticeable feature selection. International Conference on Intelligent and Advanced Systems, Kuala Lumpur, Malaysia, pp. 587–590
15. Welch G, Bishop G (2006) An introduction to the Kalman filter. Department of Computer Science, University of North Carolina at Chapel Hill, July