Delft University of Technology
Bachelor Thesis
UAV Camera System for Object Searching and Tracking
Harald Homulle 4032578
Jörn Zimmerling 4047621
June 2012
Faculty of Electrical Engineering,
Mathematics and Computer Science
UAV Thesis (June 22, 2012)
This Bachelor thesis was typeset with LaTeX in Nimbus Roman 11 pt.
Preface
The third year of the Bachelor Electrical Engineering is completed with the bachelor thesis. With the
bachelor thesis students show their academic capability and their ability to design an electronic system in a
structured manner. At the TU Delft the bachelor thesis is embedded in the final bachelor's project, where
students design and build a prototype for a client.
From this perspective Johan Melis and Thijs Durieux are our clients. The two master students from
Aerospace Engineering are setting up a competition for Unmanned Aerial Vehicles (UAV). The contestants
of the competition have to build a UAV with the ability to perform the following tasks:
track a ground vehicle;
track an aerial vehicle;
fly a track as fast as possible.
Our team, consisting of six Electrical Engineering students, developed an electronic system which performs
the tracking of a ground vehicle. At the same time a group of ten Aerospace Engineering students
designed and built a fixed wing UAV that can fly as fast as possible. Although our electronic system
was tested on a quadrocopter UAV, it was designed for the fixed wing UAV developed at the Aerospace
faculty.
Within a time span of seven weeks, the goal of this project was to implement a tracking system on a
UAV with the ability to send a picture of the tracked vehicle to a ground station. Our team was split up
into three groups: a transmission and visualization group, a position and control group, and a camera system group.
The authors of this thesis have developed and implemented the imaging part. A camera system was
developed which has the ability to detect an object and generate control signals for the UAV in order to
follow the object. Furthermore, formulas are derived for the estimation of the location and velocity of
the ground vehicle.
The design choices and decisions regarding the camera system are described in this thesis. An
overview of the hardware and software of the camera system is given. The choices made during the design
process are motivated and the performance of the implemented system is shown.
Harald Homulle & Jörn Zimmerling
Delft, June 2012
Acknowledgements
We would like to thank our supervisor Dr. Ir. C.J.M. Verhoeven for the financial support and his enthusiastic
way of supervising. We would like to acknowledge our clients B.Sc. J. Melis and B.Sc. T. Durieux
for their support and for giving us the opportunity to participate in the UAV Project. The Microelectronics
Department deserves special thanks for the provision of budget and two quadrocopters. We would like to
express our thanks to M.Sc. M. van Dongen for his support in producing a PCB for the Toshiba camera,
and to Ir. S. Engelen for the technical discussions and the provision of a Leopard Imaging Camera Module. We
are grateful to M. de Vlieger for handling our orders and her help with various administrative tasks.
Furthermore, we would like to thank the Raspberry Pi Foundation for their support with a Raspberry Pi. Thanks
also to our team members, whose team spirit, technical discussions and support were invaluable.
Summary
An embedded electronic camera system for unmanned aerial vehicles (UAV) was developed in the scope
of the final bachelor project at the TU Delft in 2012. The aim of the project was to develop a prototype
for a new UAV competition.
The electronic system of the prototype had to be able to track an object, take a picture of it and send
it to a ground station, where the picture and flight data are visualized. The whole project was divided into
three subgroups, each designing a specific part of the system. This thesis presents the design choices and
performance results regarding the image processing and object detection part of the whole system.
A literature study was carried out to support the choice of hardware and software, and was extended
with an overview of existing hardware for embedded video processing.
To perform embedded image processing, the hardware components were selected first. It was decided
that the system consists of a camera, a processing board and a power converter. A webcam was chosen
as the camera, because it is robust, fast, small and has the ability to buffer data, so no buffering on the
processing board was needed. The Beagleboard, an ARM based prototyping board, was picked as the processing
board, because it is the fastest of the compared boards. Furthermore, it does not use much power, is
compatible with the hardware used by the other project groups, and enough drivers for peripheral devices
are available. A buck converter was selected to step down the voltage of a battery to the lower input
voltage needed by the Beagleboard, as it had the highest efficiency of the considered converters.
The Beagleboard had to be capable of multiple sensor communication, real time image processing and
running location and velocity estimation algorithms. It was decided to run Linux on the Beagleboard,
because it allows high level programming.
For prototyping reasons, C++ was used to implement the algorithms, because it is easy to write,
fast and well supported on the Beagleboard. The Open Computer Vision library was selected to
make the readout of the webcam fast and the implementation of the algorithms
straightforward. Algorithms were implemented for colour detection, real-world location estimation of the object and
velocity estimation.
Different object detection approaches were considered, and colour detection by thresholding each
pixel of the captured video frame was selected as the most convenient algorithm, because it is the fastest
and least complex of the compared methods. The design brief placed explicit constraints on a
high frame rate, which further motivated the choice of colour thresholding as the fastest algorithm.
The pixel coordinate of the object in the frame was found by a centre of mass calculation of the
selected pixels. A moving average filter was implemented to remove noise from the object detection
algorithm.
Trigonometry-based algorithms were derived to estimate the real-world location from the pixel coordinates.
The velocity of the object was found by dividing the displacement of the object between two frames by
the time between those frames.
The designed prototype was tested in two different configurations: a test setup with a static camera and one
with a moving camera mounted on a quadrocopter. In both configurations it was shown that the built system
is capable of reliable object detection and of location and velocity estimation.
From the results of these experiments, it could be concluded that the system can achieve an accuracy
of up to 99 % in the static and 90 % in the moving camera setup for the estimation of both the location
and the velocity. Furthermore, around 14 frames per second at a resolution of 320 × 240 could be
processed on the Beagleboard with all algorithms described above running.
The prototype built within this bachelor project and documented in this thesis is able to fulfil the tasks
defined in the assignment. It therefore meets the specifications regarding the image processing and
object detection defined in the design brief of the clients. Some suggestions for future improvements
of the system are testing it on the fixed wing UAV and further enhancing and optimizing the various
algorithms.
Although the system is not ready for the UAV competition yet, the first milestone on the way to a
UAV with a camera system has been reached within this thesis. The work performed in the scope of this
thesis clearly contributes to the electronic system of the final fixed wing UAV demonstrating the tasks of
the new UAV competition.
Contents
Preface i
Acknowledgments ii
Summary iii
List of Figures vii
List of Tables ix
List of Abbreviations xi
1 Introduction 1
2 UAV Camera Systems in Literature 3
2.1 Object Detection Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Stabilization of the Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.3 Object Detection with a Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.4 Velocity and Motion Vector Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.5 Object Tracking with a Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Build Setup & System Overview 7
3.1 System Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.1 Design Brief . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.2 Camera Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1.3 Processing Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2 Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.1 Analogue (CMOS) Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 Digital (CMOS) Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.3 USB Webcam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3 Video Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3.1 CMUcam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.2 Leopardboard DM365 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.3 Beagleboard XM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.3.4 Raspberry Pi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3.5 FPGA Spartan-3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.4 Selection of the Hardware Components . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4.1 Selection of the Processing Unit . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4.2 Selection of the Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.5 Power Converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.6 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 Object Detection & Tracking Algorithms 19
4.1 Proposed Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.2 Colour Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2.1 RGB or HSV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2.2 Thresholding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3 Stabilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.4 Position Estimation of the Detected Object . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.5 Velocity Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.6 Optical Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.7 Generation of steering signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.7.1 Quadrocopter Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.7.2 Fixed Wing Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.8 Implementation and Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.8.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.8.2 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5 System Results 31
5.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.1.1 Static Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.1.2 Moving Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2.1 Statistical Experiment Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2.2 Static Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2.3 Moving Camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2.4 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2.5 Image Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.6 Power and Weight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.3 Experimental Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
6 Conclusion 41
6.1 Evaluation of the Prototype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.2 Further Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Bibliography 45
A Design Brief 49
B System 51
C C++ Code 53
D Results 63
List of Figures
3.1 Analogue CMOS camera 20B44P. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Digital CMOS camera Toshiba TCM8230MD & additional needed breakout board. . . . 10
3.3 Leopard digital camera module LI-VM34LP. . . . . . . . . . . . . . . . . . . . . . . . 11
3.4 Conrad mini webcam. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.5 CMUcam 4 (Board & Omnivision 9665 camera). . . . . . . . . . . . . . . . . . . . . . 12
3.6 Leopard board DM365. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.7 Beagleboard XM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.8 Raspberry Pi. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.9 Spartan-3 Development Board. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.10 Buck converter TPS5430 and AR.Drone battery. . . . . . . . . . . . . . . . . . . . . . . 17
3.11 Hardware setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.12 Overview of the full electronic UAV system. . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1 Comparison of the RGB and HSV colour spaces. . . . . . . . . . . . . . . . . . . . . . 21
4.2 Example of a set of pixels that are checked by the algorithm. . . . . . . . . . . . . . . . 22
4.3 3D sketch used to estimate the location of the target. . . . . . . . . . . . . . . . . . . . 23
4.4 Top view of the sketch given in Figure 4.3. . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.5 Steering signals generation; quadrocopter approach. . . . . . . . . . . . . . . . . . . . . 27
4.6 Image from a demo video of a red car driving on a road. . . . . . . . . . . . . . . . . . . 28
4.7 Image from a demo video; colour detection. . . . . . . . . . . . . . . . . . . . . . . . . 28
4.8 Image from a demo video; optical flow. . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.1 Semi-Outdoor testing setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2 Indoor testing setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.3 Moving camera testing setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.4 Two captured images from the semi-outdoor umbrella tracking experiment. . . . . . . . 34
5.5 Objects on the predened matrix as seen by the webcam. . . . . . . . . . . . . . . . . . 35
5.6 Indoor path of object. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.7 Indoor time line. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.8 Noisy and filtered pixel location in the static camera setup. . . . . . . . . . . . . . . . 36
5.9 Path of the red cap in the moving camera experiment. . . . . . . . . . . . . . . . . . . . 37
B.1 Functional block diagram of the UAV system. . . . . . . . . . . . . . . . . . . . . . . . 51
D.1 Projection points of pixels under different resolutions. . . . . . . . . . . . . . . . . . . . 66
List of Tables
2.1 Overview of object detection / recognition techniques . . . . . . . . . . . . . . . . . . . 3
3.1 Overview of cameras. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Overview of processing boards. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 Camera calibration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.4 Overview of power converters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.1 Overview of available object recognition algorithms. . . . . . . . . . . . . . . . . . . . 20
5.1 Performance overview on two test setups. . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Performance overview for software resizing. . . . . . . . . . . . . . . . . . . . . . . . . 38
5.3 Performance overview for hardware resizing. . . . . . . . . . . . . . . . . . . . . . . . 38
5.4 Size overview for different image storage formats. . . . . . . . . . . . . . . . . . . . . . 39
5.5 Overview of the measured power consumption and weight of the system. . . . . . . . . . 39
6.1 The various system requirements that are achieved in the present prototype. . . . . . . . 41
D.1 Measurement results accompanying Figure 5.7. . . . . . . . . . . . . . . . . . . . . . . 63
List of Abbreviations
AR.Drone Parrot AR.Drone quadrocopter (Prototyping)
ARM Advanced RISC Machine
B byte
b bit
BMP Bitmap
CCD Charge-Coupled Device
CMOS Complementary Metal Oxide Semiconductor
COCOA Moving Object Detection Framework
D/A Digital/Analogue conversion
DC Direct Current
FoV Field of View
FPGA Field-Programmable Gate Array
fps Frames per Second
GPIO General Purpose Input/Output port
GPS Global Positioning System
GPU Graphical processing unit
GV Ground Vehicle
HSV Hue, Saturation and Value
Hz Hertz
I2C Inter-Integrated Circuit
I/O Input/Output ports
IR Infra Red
JPEG Joint Photographic Experts Group
LDO Low-Dropout Regulator
LDR Laser Detection Radar
LVFG Lyapunov Vector Field Guidance
MODAT Moving Objects Detection and Tracking Framework
N.A. Not Applicable
OpenCV Open Computer Vision Library
OS Operating System
PCB Printed Circuit Board
PNG Portable Network Graphics
RGB Red, Green and Blue
RMS Root Mean Square
SD Secure Digital (Multimedia Flash Memory Card)
Sonar Sound Navigation and Ranging
TLD Track, Learn and Detect
TVFG Tangent Vector Field Guidance
UART Universal Asynchronous Receiver/Transmitter
UAV Unmanned Aerial Vehicle
USB Universal Serial Bus
V Voltage
VHDL VHSIC Hardware Description Language
VPSS Video Processing Subsystem
W Watt
YUV Luminance and two Colour Chrominances
Chapter 1
Introduction
Unmanned aerial vehicles, also known as drones, already perform a wide range of tasks in the military
sector [1]. These tasks include surveillance, territory exploration, reconnaissance and even attacking.
The history of many military inventions, like GPS or the internet, shows that the civil market often
adapts or enhances military products for civil use [2]. The same seems to happen with UAVs, as the
growing number of non-military applications shows. Applications like wildfire protection [3], dike or
pipeline surveillance or even filming have grown in recent years.
The new UAV competition is an initiative by Johan Melis and Thijs Durieux. The competition consists
of three tasks: first the contestants have to show the speed and manoeuvrability of their UAV, second a
ground object has to be found in a field, and finally two UAVs have to find each other. To show the
achievability of these goals, a prototype for the competition is desired.
Two teams will be working on this prototype: ten students of Aerospace Engineering and six students
of Electrical Engineering. The aerospace team will design and produce a fixed wing UAV¹; the goal of
the electrical team is to develop the electronics of the system.
The electronic system of the developed UAV is roughly divided into three parts. One team deals with
the connection of the UAV to the ground station. Furthermore, they handle and visualize the information
exchanged between the ground station and the UAV. The second team works on the control of the UAV, the
sensors and the merging of the subsystems. The last team, formed by the authors of this thesis, develops
a camera system that recognizes and tracks objects. The electronic system cannot be tested on the fixed
wing UAV of the Aerospace team and will therefore be tested on an AR.Drone quadrocopter².
It is the aim of this thesis to describe the design and implementation process of the camera system and
the design choices made.
An overview of existing object detection and tracking methods is given to investigate which algorithm,
camera and processing unit are most suitable for a UAV. Thereafter the system specifications are
given, as well as the experimental performance of the camera system. Furthermore, the chosen algorithm
used to detect and track objects is explained.
The requirements for the whole system are listed in the design brief in Appendix A. From those
requirements, a flow chart of the various tasks is given in Appendix B, where all tasks of the final
system are shown. According to Appendix B, the aims of the camera subproject are
to record a video in real time;
to adapt the video for further processing on an embedded platform;
to detect a ground vehicle;
to estimate the location of the ground vehicle;
and to estimate the velocity of the ground vehicle.
¹ A fixed wing UAV is essentially a small unmanned aeroplane with wings rather than rotors.
² A quadrocopter UAV is a helicopter-like aircraft with four horizontal rotors, often arranged in a square layout.
The studies of UAV systems were usefully supplemented with a literature study of related fields. A lot of
research has been carried out on these subjects. In order for the raw camera data to be processable with
common object detection and tracking algorithms, video stabilization may be required.
Therefore a wide range of stabilization algorithms, which differ in complexity and accuracy, have
been proposed by various research groups [4-7].
Object recognition for UAVs can be either based on colour [8-10] or based on motion [6, 11, 12].
Stabilization can also be done after the object is detected, by implementing a moving average filter [13]
on the detection data.
The motion of objects can be estimated by optical flow methods [14], so the motion
vector of an object can be extracted from a video stream.
All papers report computational limitations of their algorithms due to the embedded system and the
requirement of real time data analysis; therefore special attention will be paid to the speed and complexity
of the algorithms used in this thesis.
This thesis is organized as follows. In chapter 2 the current technological level of all related fields is
discussed. Design choices are made in chapter 3 regarding a camera, a processing board and a power
supply. The overview of the camera system and the total system is also given in that chapter. The software
and algorithms are reported in chapter 4; formulas for the calculation of the position and velocity of the
object are derived. The performance of the camera system is given in chapter 5. The conclusions of this
thesis and some suggestions for further work are provided in chapter 6.
Chapter 2
UAV Camera Systems in Literature
In this chapter a brief overview of the current technical level with respect to related fields is given. By
analysing the literature and research performed in the field of UAV camera systems and image processing,
this chapter seeks to point out the state of the art. Using the overview given in this literature study,
design, implementation and algorithm choices can be made in the following chapters.
In section 2.1 a discussion of various object detection methods shows that object detection with a
camera is desirable for the goals of this project. Subsequently section 2.2 gives an overview of software
and hardware methods in order to stabilize camera pictures. Several camera object detection algorithms
are a topic of discussion in section 2.3. Section 2.4 discusses techniques to extract the motion vector of
an object from a video stream. Finally, section 2.5 shows different options to track an object once its
location and motion are known.
2.1 Object Detection Methods
Depending on their application, camera, sonar, or laser systems are used to detect objects. Differences
between the named methods can be found in Table 2.1 where these methods are listed and compared.
The size and price range from small and cheap solutions up to large and expensive ones.
Table 2.1: Overview of object detection / recognition techniques
Attribute | Colour camera | IR camera | Sonar [15-17] | LDR [18]
Dimensions | Two dimensional | Two dimensional | Three dimensional | Three dimensional
Information | Three colour layers | One heat layer | Shape / distance | Shape / distance
Detection basis | Colour / motion | Heat | Shape | Shape
Size | Very small | Very small | Large | Large
Weight | Very small | Very small | Large | Large
Disadvantages | High data rate, good contrast needed, ambient light needed | Low resolution, high heat contrast needed | Range ~20 m, mainly used under water | Costly, mainly distance acquisition, large and heavy
Advantages | High detail | Works at night | Works at night | High resolution
It can be seen that camera systems have the advantage of being very small, cheap and having a high
resolution. However camera systems produce a much higher data rate than needed for object detection
and a colour contrast is needed for reliable detection.
For Sonar (Sound Navigation and Ranging) and LDR (Laser Detection Radar) systems, the detection is
based on the shape of the object. Sonar detection is done with sound and is normally used underwater;
LDR is based on the same principle, but uses lasers instead of sound. The reflection of either the sound
or the light waves on objects is used to create a 3D shape view of the area of interest. If the exact shape of
the object is known, this shape can then be detected in the 3D shape view.
The resolution of these systems is related to the wavelength used in both applications. The wavelength
of the light waves used for LDR is much smaller than the wavelength of the acoustic waves used for Sonar,
so the resolution of LDR is much higher. However, both systems are quite large and more expensive than
camera systems.
As described in the design brief (in Appendix A) a camera is required. Therefore optical object
recognition based on cameras is studied in more detail in the following sections.
2.2 Stabilization of the Camera
The camera can be stabilized both in hardware and in software. A (more) stable image results in better
tracking results and thus a higher accuracy of the algorithm. The hardware approach will increase the
weight of the system, but will result in a stable image. The software approach will increase the required
processing power, and its results are generally worse than using hardware to control the motion.
Li and Ding [7] suggest the use of servos to control and minimize the camera movement. Stabilization
inside the camera can be performed using gyroscopes to measure camera movement and shift the sensor
in the opposite direction to balance the movement [4].
In order to stabilize a video in software, the movement of the camera is computed by comparing multiple
frames with each other. The video is stabilized by shifting the frames to correct for the computed
movement. The approaches found in literature differ in the algorithm used to estimate the camera movement.
[19] suggests estimating the camera movement by a spatio-temporal approach, where blocks are
matched in subsequent frames.
Another approach, implemented in the COCOA framework for UAV object tracking proposed by [5, 6],
is ego motion compensation. [5, 6] combine a feature based approach similar to [19] with a gradient
based approach as proposed by [20]. The algorithm adapts itself during the process and learns to
discriminate shapes; stabilization is based on shifting the image according to where the shape is detected.
A combination of hardware and software video stabilization is described by [4]. Instead of estimating
the movement with an algorithm, the movement is measured with gyroscopes. The frame correction is
similar to the software approaches and is based on image shifting.
A totally different approach is not to stabilize the camera or the camera sensor, but to stabilize the
measurement data. This can be done by implementing a moving average filter as described by [13],
weighting the previously measured values with certain factors to establish a weighted average.
2.3 Object Detection with a Camera
In general two types of object recognition are described in literature: colour recognition [8-10] and
movement recognition [6, 11, 12]. Colour recognition is based on the detection of particular colour blobs
of the object. For recognition based on colour [8], the processing is quite simple.
In each single frame the pixels, with their Red, Green and Blue (RGB) components, are checked against a
particular colour condition. After this colour thresholding the centre of mass of all pixels satisfying the
condition is calculated. This algorithm is quite fast and needs just a few calculations per pixel. Detection
based on a brightness threshold is similar; in that case a black-and-white image is used. A
bright object can be tracked by thresholding the luminance of the pixels [9].
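As a concrete illustration of the brightness-based variant, the following sketch thresholds the luminance of a frame and locates the centre of mass of the selected pixels with OpenCV. The threshold value and the function structure are illustrative choices, not taken from the cited papers.

#include <opencv2/opencv.hpp>

// Minimal sketch: detect a bright object by thresholding the luminance of a
// frame and locating the centre of mass of the selected pixels.
// The threshold value (200) is an illustrative assumption.
cv::Point2f detectBrightObject(const cv::Mat& frameBGR)
{
    cv::Mat gray, mask;
    cv::cvtColor(frameBGR, gray, cv::COLOR_BGR2GRAY);   // luminance only
    cv::threshold(gray, mask, 200, 255, cv::THRESH_BINARY);

    cv::Moments m = cv::moments(mask, true);             // binary image moments
    if (m.m00 == 0)                                       // nothing detected
        return cv::Point2f(-1.f, -1.f);
    return cv::Point2f(static_cast<float>(m.m10 / m.m00),
                       static_cast<float>(m.m01 / m.m00));
}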
Depending on the tracked object, converting the video to other colour spaces, such as Hue, Saturation
and Value (HSV) or Luminance and two Chrominances (YUV), can be beneficial for
the reliability of the algorithm. This technique can be implemented on an FPGA for fast parallel processing [10].
To detect moving objects in a video stream, subsequent frames are compared to each other in order to
detect movement. This requires more processing power since multiple frames are correlated to each
other whereas colour recognition can be performed by thresholding single frames.
Recognition based on motion is more complex than recognition based on colour [11]. Several frames
are buffered to extract the motion of objects within those frames. Assuming that the targeted object is the
only moving object in the processed frames, detecting this motion amounts to recognizing the object. On steady cameras, a reference
frame is stored and pixels are checked for new objects appearing in the scene [12]. On moving cameras
(like on UAVs) this is even more difficult, since the camera itself is already moving.
The Moving Object Detection and Tracking (MODAT) framework stitches subsequent camera frames
to a map, on which motion is detected using a Gaussian Mixture Learning technique. This technique
learns to discriminate background movements from the real target objects. A fast parallel processing unit
is needed to calculate the motion in real time [6].
In both methods the camera and the processing add noise to the image, which leads to the detection of
undesired objects (noise). Various noise reduction methods have been suggested in literature. Noise
reduction can be achieved by applying some blur to the image, which removes detail and noise. Similarly,
a median filter can remove variations between pixels in the image and eliminate noise as well as large
differences [6, 9]. A method of noise compensation used after object detection is binary erosion,
which checks the pixels and their neighbours for connectivity. Assuming the object consists of more than
one pixel, only connected pixels are recognized as an object [10].
2.4 Velocity and Motion Vector Calculation
Besides the location of the object, its velocity is needed in order to track it. The estimation of the
object's location is discussed in section 2.3, whereas the estimation of the velocity and motion is covered
in this section.
The calculation of a motion vector field from a video stream is called optical flow. The method is
based on the assumption that the changes between two subsequent image frames are small and the colour of the
object stays the same. The displacement of each pixel between two subsequent frames is then calculated. The
resulting vector field contains the direction and magnitude of the movement of each pixel. On the UAV used
in this project, the camera itself moves, thus the motion vector of the object is given by the difference
between the background and the object motion.
In literature many different ways of calculating the motion vector field of video streams are discussed.
[14] discusses an optical flow algorithm to estimate the motion of objects. The proposed algorithm
focuses on the motion of edges, since the error rate of common optical flow algorithms is quite high at
the edges of objects.
The calculation of the motion field of a wavelet-transformed video is discussed by Lui et al. [21]
and [22]. The wavelet transform of an image consists of the contour lines of the image. By calculating
the movement of the contour lines of objects it is easier to assign a single motion vector to an object and
to separate different objects.
Another accurate but computationally intensive method of motion estimation is proposed by Farnebäck
[23]. This algorithm is based on intensive tensor computations and is not suited for embedded solutions.
However, Farnebäck suggests calculating only the displacement of pixels in the near neighbourhood of
the object to save computation time.
Overall, the methods to calculate the motion of objects discussed in literature are computationally intensive.
The development of motion detection algorithms suitable for embedded platforms is beyond the scope
of this thesis, so only simple optical flow methods have been studied here.
2.5 Object Tracking with a Camera
Object detection over multiple frames can lead to object tracking and is necessary for autonomous UAVs.
The generation of steering signals from a captured video in order to track an object is a complex task,
especially for large velocity differences between the object and UAV. The camera system has to calculate
the movement of the tracked object in order to fly in its direction.
[7, 9, 24] propose the use of servomotors to change the position of the camera relative to the UAV.
This minimizes the chance of losing the tracked object.
[24] describes the two main algorithms for path planning with UAVs: Tangent Vector Field Guidance
(TVFG) and Lyapunov Vector Field Guidance (LVFG). According to Chen et al. [24], a combination of
both algorithms is desired for UAVs with larger turning limit circles than the tracked object.
Common UAV autopilots, such as the Paparazzi project [25], implement such path planning algorithms.
By supplying GPS coordinates and the object speed to the autopilot, it is able to delineate a path
to the specified location. Even complex tasks, like flying a figure of eight above a location, can be
performed by Paparazzi.
Chapter 3
Build Setup & System Overview
The literature study in chapter 2 points out that the accuracy and reliability of real time object detection
and motion estimation are mainly limited by the computational power of the processing unit. It is
therefore not unexpected that the choice of hardware components, and especially the core processing
unit of the system, is one of the key points of this thesis. The algorithms presented in chapter 4 are directly
affected by the hardware setup presented in the following sections.
In this chapter an overview of the hardware setup and the overall system is given. Section 3.1 points
out the system requirements as defined in the design brief. Based on these requirements several cameras
and processing units are analysed. The hardware configuration is explained in section 3.2 for the camera
and in section 3.3 for the processing unit. Both sections give an overview of existing hardware in order
to rate each option with respect to the system requirements. The hardware choices made are motivated
in section 3.4. The selection of the power converter is discussed in section 3.5. In section 3.6 an overview
of the final system is provided.
3.1 System Requirements
In this section the requirements regarding the camera system are explained. First the explicit requirements
as noted in the design brief (Appendix A) are listed. Thereafter requirements from a technical
point of view based on findings in the literature study are stated.
3.1.1 Design Brief
In the design brief the following requirements regarding both the camera and the processing unit are
listed:
Speed
The speed of the system is high enough to track vehicles up to 55 km/h at a maximum UAV
speed of 100 km/h;
The image processing reaches at least 15 frames per second (fps), implying that a picture is
taken at least every 2 m (at 100 km/h ≈ 27.8 m/s, 27.8 m/s divided by 15 fps ≈ 1.9 m).
Object Tracking
The object can be detected and followed from at least 50 m.
Power
The system can fly for one hour on a separate battery;
Therefore it was decided that the system's power consumption must be below 5 W.
Weight
The maximum payload of the quadrocopter UAV is 100 g;
The maximum payload of the fixed wing UAV is 250 g;
The power source is excluded from this weight.
Finalizing
The system is documented in a manner which allows other persons to understand and expand
the system;
The system has enough computational power and free I/O ports in order to extend the system
after delivery.
Test platform
As a test platform the quadrocopter AR.Drone was used. Although the system is designed for a fixed
wing UAV, a quadrocopter is used for several reasons. First of all, the destination platform, a fixed wing
UAV, is developed at the aerospace faculty simultaneously with the camera system, so the fixed
wing UAV cannot be used as a test platform. Since a quadrocopter is able to hover, it can be used indoors,
in contrast to a fixed wing UAV, which needs to move forward to take off. This allows weather and wind
independent indoor testing. Finally, in case the control over a quadrocopter is lost, it simply keeps
hovering or crashes down, whereas a fixed wing UAV becomes a projectile, making it more likely that people
get hurt or that the UAV is damaged. For these reasons a quadrocopter is used as the testing
platform.
3.1.2 Camera Requirements
Besides the requirements mentioned in the design brief, extra requirements derived from literature, the
available timespan and the framework of the project are given:
Usable resolution;
Available drivers for a camera;
Field of View (FoV) < 45°;
Auto colour / light adjustment;
Maximum price of €250 (processing board, camera and power converter);
Delivery in less than one week.
In literature the difficulty of processing large resolutions in real time on embedded devices is described.
Therefore only cameras with a maximum resolution of around 640 × 480 are examined.
In the context of this thesis, real time is defined as a soft real time criterion. A system is a soft real
time system if the result of an algorithm loses its value after a certain deadline. In this context
the term real time is interpreted as directly processing the image data before capturing a new frame. Thus
the location of the target object is updated at the frame rate. A lower frame rate or delays
in the algorithm result in a lower quality of service and accuracy of the system. But the system can still
operate with a low frame rate, so the real time criterion is not a hard criterion.
The field of view should be in the range 30-45°. Lower than 30°
[Table 3.2 residue, processing-board comparison: weights of 54 g, 74 g and 44 g for three of the boards; advantages listed include cheap, low power, easy interface with LI-camera, fast GPU, camera interface, light, fast parallel processing and optimal use of hardware; disadvantages listed include slow processor, not enough I/O ports, heavy, expensive, unknown delivery time, slow prototyping and no drivers for peripherals.]
\[
\alpha = \frac{360}{\pi}\,\arctan\!\left(\frac{w}{2h}\right)
\quad\text{and}\quad
\beta = \frac{360}{\pi}\,\arctan\!\left(\frac{l}{2h}\right)
\tag{3.1}
\]
For the used Conrad mini webcam, the results are given in Table 3.3.

Table 3.3: Camera calibration.
Horizontal angle: 44.9°
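For illustration, a minimal sketch of the calibration computation of Equation 3.1 (as reconstructed above) is given below. The measured width, length and distance values are placeholders, not the values actually measured for the Conrad webcam.

#include <cmath>
#include <cstdio>

// Sketch of the field-of-view calibration of Equation 3.1: the camera looks
// perpendicularly at a plane at distance h; w and l are the measured width and
// length of the visible area. All numeric values are placeholders.
int main()
{
    const double pi = std::acos(-1.0);
    const double h = 1.00;   // distance to the plane [m] (placeholder)
    const double w = 0.83;   // visible width  [m] (placeholder)
    const double l = 0.62;   // visible length [m] (placeholder)

    const double alpha = (360.0 / pi) * std::atan(w / (2.0 * h)); // horizontal FoV [deg]
    const double beta  = (360.0 / pi) * std::atan(l / (2.0 * h)); // vertical FoV [deg]

    std::printf("horizontal FoV = %.1f deg, vertical FoV = %.1f deg\n", alpha, beta);
    return 0;
}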
\[
FT_{p+l,\,p+l+1} = \sum_{n=1}^{l} a_n \, FT_{p+n,\,p+n+1},
\tag{4.1}
\]
where $a_n$ is the $n$-th filter coefficient. The filter coefficients themselves have to satisfy the normalization criterion
\[
\sum_{n=1}^{l} a_n = 1.
\tag{4.2}
\]
A disadvantage of using a moving average filter is that the speed of the system is directly related to the
length of the moving average filter. The choice of the filter length and the weighting coefficients
is a trade-off between stability and speed of the system.
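A minimal sketch of such a weighted moving average filter over the estimated positions is given below. The filter length, coefficient handling and type names are illustrative choices, not the exact implementation used on the Beagleboard.

#include <deque>
#include <vector>

// Sketch of the moving average filter of Equations 4.1 and 4.2: the last l
// relative positions are weighted by coefficients a_n that sum to one.
struct Position { double x, y; };

class MovingAverageFilter {
public:
    explicit MovingAverageFilter(std::vector<double> coeffs)
        : a_(std::move(coeffs)) {}

    Position filter(const Position& measurement)
    {
        history_.push_back(measurement);
        if (history_.size() > a_.size())
            history_.pop_front();

        // Weight the oldest stored sample with the first used coefficient and
        // the newest with the last one; with fewer samples than coefficients,
        // renormalize so that the weights still sum to one (Equation 4.2).
        Position out{0.0, 0.0};
        double used = 0.0;
        for (std::size_t n = 0; n < history_.size(); ++n) {
            const double a = a_[a_.size() - history_.size() + n];
            out.x += a * history_[n].x;
            out.y += a * history_[n].y;
            used += a;
        }
        out.x /= used;
        out.y /= used;
        return out;
    }

private:
    std::vector<double> a_;            // filter coefficients, summing to 1
    std::deque<Position> history_;     // last l measurements
};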
4.4 Position Estimation of the Detected Object
After the detection of the object in a camera frame (section 4.2), the location of the target vehicle can be
estimated. This location is needed in order to calculate its velocity and to generate steering signals for
the UAV.
The aim of this section is to derive simple algebraic expressions to estimate the location of the target
Ground Vehicle (GV) relative to the position of the UAV. These expressions depend on given parameters
like the sensor outputs and the position of the target in the image frame.
Figure 4.3: 3D sketch used to estimate the location of the target.
Figure 4.4: Top view of the sketch given in Figure 4.3 (FT_x: vertical target distance, FT_y: horizontal target distance).
In Figure 4.3 a sketch of the configuration of the UAV and the GV is shown. A top view of the configuration is
depicted in Figure 4.4. The UAV is located at the point P; its rectangular projection on the ground, the
foot point, is called F, and the target GV is located at point T. The barometer measures the height h of
the UAV and the GPS sensor measures the location of point P, so the location of F is known as well.
Furthermore, the horizontal field of view angle α and the vertical field of view angle β are given by
calibrating the camera. The camera is mounted at an angle γ. The horizontal angle of the UAV to the
ground can be taken into account by redefining γ as the relative mounting angle without affecting the
following derivation. The relative mounting angle is defined as the sum of the absolute mounting angle
and the angle of the UAV to the horizontal ground. For simplicity the UAV is assumed to be parallel to
the ground at all times. However, for fixed wing UAVs the horizontal angle of the vehicle to the ground is
not negligible, since fixed wing UAVs can fly under large horizontal angles.
The trapezium ABCD is defined as the projection of the captured image A'B'C'D' on the ground.
The pixel coordinate of the detected object is denoted as
\[
\vec{X} = \begin{pmatrix} X_{obj,centre} \\ Y_{obj,centre} \end{pmatrix}.
\tag{4.4}
\]
This coordinate is the output of the object detection algorithm and denotes the centre of mass of the
object in the captured frame. The projection of this coordinate on the ground is the relative position of
the target with respect to the UAV. Thus the pixel coordinate (0, 0) corresponds to an object located at A
and a pixel coordinate of (N_x,pixels, N_y,pixels) to an object located at C.
According to this, the two-dimensional reference coordinate system xy is defined, where x is defined
along the flight direction FM and y is defined along BC. The relative coordinates of the target in the
defined coordinate system can be written as
\[
\vec{FT} = \begin{pmatrix} FT_x \\ FT_y \end{pmatrix}.
\tag{4.5}
\]
Since the pixels form a uniform grid in the camera image, one can assign a specific camera angle to
each pixel. The pixels are uniformly distributed over the total field of view, so the angle between
neighbouring pixels is defined as
\[
\Delta\alpha = \frac{\alpha}{N_{y,pixels} - 1}
\quad\text{and}\quad
\Delta\beta = \frac{\beta}{N_{x,pixels} - 1}
\tag{4.6}
\]
in the horizontal and vertical direction, respectively.
Since the exact angle of each vertical pixel is known, FT_x can be calculated similarly to Equation 4.3 as
\[
FT_x = h \tan\!\left(\gamma + \frac{\beta}{2} - \frac{\beta\, X_{obj,centre}}{N_{x,pixels} - 1}\right).
\tag{4.7}
\]
From the top view shown in Figure 4.4, it can be seen that the target angle δ with respect to the central
camera axis can be written as
\[
\delta = \Delta\alpha \left( Y_{obj,centre} - \frac{N_{y,pixels}}{2} \right)
       = \frac{\alpha}{N_{y,pixels} - 1} \left( Y_{obj,centre} - \frac{N_{y,pixels}}{2} \right),
\tag{4.8}
\]
where N_y,pixels denotes the total number of pixels in the y direction.
Thus the relative location FT_y can be written as
\[
FT_y = FT_x \tan(\delta)
     = FT_x \tan\!\left( \frac{\alpha}{N_{y,pixels} - 1} \left( Y_{obj,centre} - \frac{N_{y,pixels}}{2} \right) \right).
\tag{4.9}
\]
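A compact sketch of the position estimation of Equations 4.7-4.9 is given below. The symbol names (γ for the relative mounting angle, α and β for the horizontal and vertical field of view) follow the reconstruction above and are assumptions; the function is illustrative, not the implementation from Appendix C.

#include <cmath>

// Sketch of the position estimation of Equations 4.7-4.9.
// All angles are in radians here; pixel (0, 0) corresponds to corner A.
struct GroundPosition { double ft_x, ft_y; };

GroundPosition estimatePosition(double xObjCentre, double yObjCentre,
                                int nxPixels, int nyPixels,
                                double h,       // UAV height [m]
                                double gamma,   // relative mounting angle
                                double alpha,   // horizontal field of view
                                double beta)    // vertical field of view
{
    // Equation 4.7: distance along the flight direction.
    double ftX = h * std::tan(gamma + beta / 2.0 -
                              beta * xObjCentre / (nxPixels - 1));

    // Equation 4.8: horizontal angle of the target to the central camera axis.
    double delta = alpha / (nyPixels - 1) * (yObjCentre - nyPixels / 2.0);

    // Equation 4.9: lateral distance.
    double ftY = ftX * std::tan(delta);

    return GroundPosition{ftX, ftY};
}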
4.5 Velocity Estimation
Next to the position of the tracked ground vehicle, its velocity is required in order to generate steering signals.
A first approach, based on the time derivative of the position estimate, is presented. In section 4.6
a more advanced optical flow algorithm is given.
The velocity of the tracked object is extracted from the displacement of the object in two subsequent
frames. To determine the object's velocity, the displacement of the tracked object with respect to the UAV
and of the UAV itself is required.
The displacement of the object relative to the UAV can be calculated with the algorithm given in
section 4.4. The GPS coordinate is measured at each captured frame, so that the absolute position and its
displacement are known as well.
The relative positions of the object are denoted as
\[
\vec{FT}_1 = \begin{pmatrix} FT_{x,1} \\ FT_{y,1} \end{pmatrix}
\quad\text{and}\quad
\vec{FT}_2 = \begin{pmatrix} FT_{x,2} \\ FT_{y,2} \end{pmatrix},
\tag{4.10}
\]
in the first and second frame, respectively. The averaged positions as found by the moving average filter
are used. Furthermore, the location of the UAV is given by
\[
\vec{P}_1 = \begin{pmatrix} P_{x,1} \\ P_{y,1} \end{pmatrix}
\quad\text{and}\quad
\vec{P}_2 = \begin{pmatrix} P_{x,2} \\ P_{y,2} \end{pmatrix},
\tag{4.11}
\]
in the same two frames. The x-direction of the coordinate system is defined by the UAV's flight direction,
whereas the y-direction is defined orthogonal to the x-direction.
The displacement vector of the object with respect to the ground between frame one and two, $\vec{D}_{1,2}$,
can be calculated as
\[
\vec{D}_{1,2} = \left(\vec{FT}_2 + \vec{P}_2\right) - \left(\vec{FT}_1 + \vec{P}_1\right).
\tag{4.12}
\]
In the capturing algorithm the time between two frames is measured with a timer in order to be able to
calculate the frame rate. In the context of velocity estimation the time between two frames is denoted as
Δt. This time is needed to calculate the velocity of the object from its displacement.
The velocity, defined as the displacement per time interval, is calculated with
\[
\vec{V}_{1,2} = \frac{\vec{D}_{1,2}}{\Delta t}.
\tag{4.13}
\]
The absolute velocity is given by the length of $\vec{V}_{1,2}$, denoted as
\[
\left|\vec{V}_{1,2}\right| = \sqrt{V_{x,1,2}^{\,2} + V_{y,1,2}^{\,2}}.
\tag{4.14}
\]
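The velocity estimation of Equations 4.10-4.14 reduces to a few vector operations; a minimal sketch is given below. Names and structure are illustrative, not taken from the project code.

#include <cmath>

// Sketch of Equations 4.10-4.14: displacement of the object with respect to
// the ground between two frames, divided by the measured time between them.
struct Vec2 { double x, y; };
struct Velocity { Vec2 v; double magnitude; };

Velocity estimateVelocity(const Vec2& ft1, const Vec2& ft2,   // relative target positions
                          const Vec2& p1,  const Vec2& p2,    // UAV positions (from GPS)
                          double dt)                          // time between frames [s]
{
    // Equation 4.12: displacement with respect to the ground.
    Vec2 d{(ft2.x + p2.x) - (ft1.x + p1.x),
           (ft2.y + p2.y) - (ft1.y + p1.y)};

    // Equations 4.13 and 4.14: velocity vector and its magnitude.
    Vec2 v{d.x / dt, d.y / dt};
    double mag = std::sqrt(v.x * v.x + v.y * v.y);
    return Velocity{v, mag};
}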
4.6 Optical Flow
As stated by [14], the motion can be calculated with an optical flow algorithm. For all optical flow
methods the brightness of the object is assumed to be constant and the displacement is assumed to be small.
The basic idea of optical flow is finding the displacement of the object: the motion between two image
frames. This motion is between the first frame at time t and the second frame at time t + Δt. Because the
intensity I of the object is assumed to be constant, this gives
\[
I(x, y, t) = I(x + \Delta x,\; y + \Delta y,\; t + \Delta t).
\tag{4.15}
\]
With the assumption that the movement is small, the above equation can be expanded using a Taylor series;
this results in
\[
I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) = I(x, y, t)
  + \frac{\partial I}{\partial x}\Delta x
  + \frac{\partial I}{\partial y}\Delta y
  + \frac{\partial I}{\partial t}\Delta t.
\tag{4.16}
\]
Assuming the higher-order terms of the Taylor expansion are negligible and combining Equations 4.15 and 4.16, it
holds that
\[
\frac{\partial I}{\partial x}\Delta x
+ \frac{\partial I}{\partial y}\Delta y
+ \frac{\partial I}{\partial t}\Delta t = 0,
\tag{4.17}
\]
and dividing the above equation by Δt yields
\[
\frac{\partial I}{\partial x}\frac{\Delta x}{\Delta t}
+ \frac{\partial I}{\partial y}\frac{\Delta y}{\Delta t}
+ \frac{\partial I}{\partial t}\frac{\Delta t}{\Delta t}
= \frac{\partial I}{\partial x}V_x
+ \frac{\partial I}{\partial y}V_y
+ \frac{\partial I}{\partial t} = 0.
\tag{4.18}
\]
V_x and V_y are the components of the displacement over time of I(x, y, t), better known as the image
velocity or optical flow. This finally leads to its most simple form
\[
\frac{\partial I}{\partial x}V_x + \frac{\partial I}{\partial y}V_y = -\frac{\partial I}{\partial t}.
\tag{4.19}
\]
However, this single equation has two unknowns and thus cannot be solved on its own.
Various implementations of the basic algorithm exist, which introduce additional conditions so that
the system can be solved. The most used method is the differential method of calculating the optical
flow, for which partial derivatives of the image signal are used. The most common implementation
is the one given by Lucas and Kanade: the so-called Lucas-Kanade method [47]. It is beyond the scope
of this thesis to give a detailed description, but the algorithm basically works on a local area around the
point I(x, y, t) for which the flow needs to be calculated. In the second image the pixels of a certain local
area (with size ω_x and ω_y) around I(x, y, t) are investigated. Then V (V_x and V_y) is found to be the
vector that minimizes the residual function [47]
\[
\epsilon(V_x, V_y) = \sum_{i=x-\omega_x}^{x+\omega_x} \;\sum_{j=y-\omega_y}^{y+\omega_y}
\bigl( I(i, j, t) - I(i + \Delta x,\; j + \Delta y,\; t + \Delta t) \bigr)^2.
\tag{4.20}
\]
Now that the tracking is basically explained, it is still unclear what will or can be tracked. In an image
not all blocks of pixels are as traceable as others. This has to do with the surrounding features: on
a uniform road, for example, the old pixel can be anywhere in the new image. Therefore finding high
contrasts is the basis of finding good features. On those high contrast lines, the highest contrast points
can be chosen to be tracked in the algorithm.
The final result is a vector field of the selected pixels; each vector points in the direction of the displacement
between the two frames. This vector field can be used to estimate the motion of the object by
comparing its motion vector to that of the surroundings and subtracting them. This yields the speed
of the object relative to the ground. The advantage, compared to the simple velocity algorithm of section 4.5,
is that the UAV's own speed is not needed for this technique.
4.7 Generation of steering signals
With the algorithms described above, the position of the object in the image is found. Although two algorithms
for path planning were studied in section 2.5, only the generation of steering signals is discussed
below. For the quadrocopter a simple first order approach is given; for the fixed wing UAV this approach
is not usable and therefore only path planning is possible.
4.7.1 Quadrocopter Approach
As a first test approach the hovering capability of the quadrocopter is used. The steering signals are
generated according to Figure 4.5.
The picture is divided into nine quadrants. When the object is in one of those quadrants, the steering
signals belonging to that quadrant are generated. The goal is to get the object into the centre of the
frame, so steering signals are generated to fulfil this. When the object is in quadrant LF, the UAV has to
go left and forward; in this way the object will shift towards the centre. When the object is in the
centre, the UAV can hover at its current position until the object moves away (a sketch of this quadrant logic follows Figure 4.5).
This method is quite easy to implement and can give good results on the quadrocopter. On the fixed
wing UAV more complex algorithms should be applied to the known position of the object, because it
cannot hover on a spot.
L - Move Left; R - Move Right; F - Move Forward; B - Move Backward; S - No Movement; and logical combinations (LF, LB, RF, RB).
Figure 4.5: Steering signals generation; quadrocopter approach.
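A minimal sketch of the quadrant logic of Figure 4.5 is given below. The command strings and the mapping of image axes to flight directions are hypothetical and only serve to illustrate the approach.

#include <string>

// Sketch of the nine-quadrant steering approach: the frame is divided into a
// 3x3 grid and the quadrant containing the object's centre of mass determines
// the steering command.
std::string steeringCommand(double xCentre, double yCentre,
                            int frameWidth, int frameHeight)
{
    std::string cmd;

    // Left / right from the horizontal position of the object (assumed mapping).
    if (xCentre < frameWidth / 3.0)              cmd += "L";   // move left
    else if (xCentre > 2.0 * frameWidth / 3.0)   cmd += "R";   // move right

    // Forward / backward from the vertical position of the object (assumed mapping).
    if (yCentre < frameHeight / 3.0)             cmd += "F";   // move forward
    else if (yCentre > 2.0 * frameHeight / 3.0)  cmd += "B";   // move backward

    if (cmd.empty()) cmd = "S";                  // centre quadrant: hover
    return cmd;
}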
4.7.2 Fixed Wing Approach
On the fixed wing UAV the use of such steering signals is not useful; therefore path planning is needed to
derive a smooth flight path to the object. Path planning can be done by an autopilot, such as Paparazzi
[25], if it is given an estimated GPS position and velocity of the object. The autopilot calculates the optimal flight
path to its destination. This technique is therefore well suited for use on a fixed wing UAV, which cannot
hover on a spot.
4.8 Implementation and Simulation
The algorithms needed to be tested before they were implemented on the Beagleboard. Below both the
implementation and simulation are described.
4.8.1 Implementation
C++ was used for the implementation of the algorithms, because of its easy implementation, its high
programming level and primarily the availability of the Open Computer Vision Library (OpenCV). OpenCV
is an extensive C++ library that holds many functions regarding image processing, feature tracking, video
stabilization and more. Because OpenCV has many built-in functions, readout of a webcam and processing
of video data are simplified, which reduces the time required for writing video capture functions and
memory allocation.
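A minimal sketch of such a capture loop with the modern OpenCV API is given below. processFrame is a hypothetical placeholder for the detection pipeline and the resolution settings are illustrative.

#include <opencv2/opencv.hpp>

// Minimal sketch of the webcam readout with OpenCV: open the camera, grab
// frames in a loop and hand them to the processing functions.
int main()
{
    cv::VideoCapture cap(0);              // first USB camera
    if (!cap.isOpened())
        return 1;

    cap.set(cv::CAP_PROP_FRAME_WIDTH, 320);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, 240);

    cv::Mat frame;
    while (cap.read(frame)) {             // blocks until a new frame arrives
        // processFrame(frame);           // hypothetical: colour detection, estimation, ...
    }
    return 0;
}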
4.8.2 Simulation
The colour detection algorithm and optical flow have been tested on a demo video. The other algorithms
require unknown parameters, like height and camera angles. Those algorithms were only tested in the
experimental test setup, given and described in chapter 5.
A demo video of a red car driving on a road is used to test these two algorithms. The video is taken
from a helicopter, which is similar to a video from a quadrocopter. An image of the demo video is shown
in Figure 4.6.
Figure 4.6: Image from a demo video of a red car driving on a road.
Testing was done using C++ with OpenCV. The algorithms were directly implemented and tested on a
PC. Below, the simulation results using the demo video in OpenCV are given.
Colour Detection
The basic implementation of section 4.2 was followed and tested on the demo video. In the algorithm the
threshold values were chosen as H > 160 or H < 15. This gives the result depicted in Figure 4.7, in which
the pixels that satisfy the given threshold values are coloured black and the centre of mass is coloured white
(enlarged for better visibility). It can be seen that the car is the only object detected in the image and that
the centre of the object is coloured white; with this the operation of the colour detection algorithm is
demonstrated.
Figure 4.7: Image from a demo video; colour detection.
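A sketch of this colour detection step with OpenCV is given below. The hue bounds follow the values stated above (H > 160 or H < 15 on OpenCV's 0-180 hue scale); the saturation and value bounds are assumptions added to suppress dark and grey pixels.

#include <opencv2/opencv.hpp>

// Sketch: convert the frame to HSV, keep pixels whose hue lies above 160 or
// below 15 (red wraps around the hue axis), and compute the centre of mass of
// the selected pixels.
cv::Point2f detectRedObject(const cv::Mat& frameBGR)
{
    cv::Mat hsv, maskLow, maskHigh, mask;
    cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);

    cv::inRange(hsv, cv::Scalar(0, 50, 50),   cv::Scalar(15, 255, 255),  maskLow);
    cv::inRange(hsv, cv::Scalar(160, 50, 50), cv::Scalar(180, 255, 255), maskHigh);
    mask = maskLow | maskHigh;

    cv::Moments m = cv::moments(mask, true);
    if (m.m00 == 0)
        return cv::Point2f(-1.f, -1.f);       // no pixel satisfied the thresholds
    return cv::Point2f(static_cast<float>(m.m10 / m.m00),
                       static_cast<float>(m.m01 / m.m00));
}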
Optical Flow
Various implementations of the optical flow algorithm are predefined in OpenCV. But before the optical
flow can be calculated, some points in the image have to be selected. This is done with the find good
trackable features algorithm, also defined as an OpenCV function.
Figure 4.8: Image from a demo video; optical flow.
In Figure 4.8 the vector field of the tracked features is given. The red dots are the current positions
of the features the algorithm selected and is now tracking. The white lines are the vectors connecting
the current positions to where they were in the last frame. It can be seen that the vectors of the car
are pointing in the direction the car is driving in, while the surrounding vectors point in the opposite
direction. From this it follows that the car is being chased and the helicopter is flying in the direction the
car is driving in.
Chapter 5
System Results
In chapter 4 the algorithms were tested with a demo video. In this chapter the performance of the
hardware setup described in chapter 3 together with the algorithms explained in chapter 4 is tested. First
of all the experimental setup is described in section 5.1. Two setups are described: a static and a moving
camera setup. In the following section 5.2 the results of each setup are presented. The results are finally
discussed in section 5.3.
5.1 Experimental Setup
The system was tested in two setups: first using a static camera and second using a moving camera
on the quadrocopter. The static setup was split up into an indoor and an outdoor experiment to test the
system under various light conditions.
5.1.1 Static Camera
The outdoor setup is described first, followed by the indoor setup. The outdoor setup is called semi-outdoor,
as the camera was placed inside a building looking outside, since the setup was not mobile at
the time of testing.
Semi-Outdoor
In the semi-outdoor setup, the camera was placed inside a building, behind a window, looking outside.
The height was set to 6.5 m and the camera angle γ to 72°.
Figure 5.1: Semi-Outdoor testing setup (target T: umbrella; height h = 6.5 m; camera angle γ = 72°; FT_x: vertical target distance, FT_y: horizontal target distance).
5.2.1 Statistical Experiment Evaluation
The accuracy of the location estimation is defined as
\[
\text{Accuracy} = \left( 1 - \sqrt{ \frac{1}{P} \sum_{p=1}^{P} \frac{\left( \hat{X}_p - X_p \right)^2}{X_p^2} } \,\right) \cdot 100\%,
\tag{5.1}
\]
where $\hat{X}_p$ is the location measured in the $p$-th measurement and $X_p$ is the actual location. Furthermore, $P$
is the number of measured locations.
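A small sketch of Equation 5.1, as reconstructed above, is given below; it computes the accuracy from logged estimated and actual locations. Names are illustrative.

#include <cmath>
#include <vector>

// Sketch of Equation 5.1: one minus the root-mean-square relative error
// between the estimated and the actual locations, expressed as a percentage.
double accuracyPercent(const std::vector<double>& estimated,
                       const std::vector<double>& actual)
{
    double sum = 0.0;
    const std::size_t P = actual.size();
    for (std::size_t p = 0; p < P; ++p) {
        const double rel = (estimated[p] - actual[p]) / actual[p];
        sum += rel * rel;
    }
    return (1.0 - std::sqrt(sum / P)) * 100.0;
}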
5.2.2 Static Camera
The results of the two static camera settings are given below. The results of the semi-outdoor approach
are given first, followed by the indoor experiment.
Semi-Outdoor
The semi-outdoor approach showed that the algorithms can detect objects in various outdoor light conditions.
The umbrella used as target object was detected in bright sunlight as well as in the shadow of a
tree. It may thus be concluded that the object detection algorithm works under outdoor light conditions.
This experiment was mainly used as a qualitative experiment, since it is hard to measure the camera
height and object location in the given setup. However, some quantitative results are given as well.
The real distance between the camera and the umbrella was measured and compared to the distance
estimated by the algorithm. Captured images are shown in Figure 5.4. In 5.4(a) the measured distance
was 16 m and the estimated distance 19 m. In 5.4(b) the real distance was set to 21 m and the estimated
distance was 25 m. The difference of a few metres is caused by the inaccuracy in the camera height, in the
camera angle γ, and by the difficulty of measuring the real distance.
Figure 5.4: Two captured images from the semi-outdoor umbrella tracking experiment; (a) tracking close, (b) tracking far.
Indoor
In the indoor setup the accuracy of the algorithm was determined, because distances and angles could be measured more precisely than in the semi-outdoor setup. A red cap was used as target, as depicted in Figure 5.5, which shows the matrix as seen by the camera. The red cap was moved along a path to test the reliability of the system on moving objects. The path of the red cap is visualized in Figure 5.6(a).
During the experiment, the location and velocity as estimated by the system were stored in a log file and plotted with Matlab afterwards. The estimated positions can be seen in Figure 5.6(b). The estimated points on the matrix match the real points on the ground. The path between two matrix spots is not straight, as the red cap was moved by hand without the use of a ruler.
To show the path over time, a Matlab plot was made of the object's estimated position versus the frame counter, shown in Figure 5.7(a). The measurement data from the experiment covered in this section is listed in Appendix D Table D.1. The estimated velocity of the object can be found in Figure 5.7(b). It can be seen that the object is first found at location (x, y) = (3 m, -1 m), which is in accordance with the starting point set in Figure 5.6(a). Around frame 60 the object starts moving towards the point (x, y) = (1 m, 0 m). In the same time the x-coordinate changes by 2 m, while the y-coordinate changes by only 1 m. Looking at the speed, the velocity in the x-direction is twice the velocity in the y-direction, which is consistent with the double displacement over the same time found in the distance estimation. The second movement is only in the x-direction, which can be seen in both the velocity and distance plots. Some noise can be seen in the velocity plot; it is caused by very small movements over multiple frames that are not smoothed by the moving average filter.
The accuracy of the location estimation was calculated from the log file (Table D.1) according to Equation 5.1. In the horizontal direction an accuracy of 99.3 % and in the vertical direction an accuracy
Figure 5.5: Objects on the predefined matrix as seen by the webcam.
Figure 5.6: Indoor path of object. (a) Drawing of the path of the object (scale marks: 1 m); (b) the object's estimated distance (Matlab plot).
of 98.6 % was reached in the static indoor setup. The errors made are mainly due to measurement errors in the camera angle and the alignment of the camera with respect to the matrix. The discretisation error due to the projection of the discrete pixels on the ground plays a minor role.
The accuracy in the horizontal direction is higher than in the vertical direction, as the projections of the image pixels on the ground have a larger spacing in the vertical direction for large distances. The projection of the image pixels on the ground is shown in Figure D.1 in Appendix D. One can clearly see that the distance between neighbouring pixel projections increases with increasing distance between the target and the camera.
The accuracy is strongly dependent on the measurement error of the camera angle. Suppose a target has been detected in the middle of the image at a camera angle of 66°; an error of 1° in the measured camera angle then causes an error in the vertical position estimation of 4.2 %. This error is even higher for greater camera angles or for targets positioned in the top row of the image. This shows the high sensitivity of the position estimation algorithm with respect to the camera angle. This sensitivity can be reduced by choosing lower camera angles.
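This sensitivity can be made plausible with a simple flat-ground model; it is an approximation and not necessarily the exact projection used in chapter 4. If the ground distance of a target seen on the centre row of the image is modelled as d = h tan(alpha), with the camera angle alpha measured from the vertical, then

\[
d = h\tan\alpha
\;\Rightarrow\;
\frac{\partial d}{\partial \alpha} = \frac{h}{\cos^{2}\alpha}
\;\Rightarrow\;
\frac{\Delta d}{d} \approx \frac{\Delta\alpha}{\sin\alpha\cos\alpha} = \frac{2\,\Delta\alpha}{\sin 2\alpha}.
\]

For alpha = 66° and an angle error of 1° (about 0.0175 rad) this gives a relative error of roughly 4.7 %, the same order of magnitude as the 4.2 % found above. The factor 2/sin(2 alpha) is smallest around 45°, consistent with the recommendation to use lower camera angles.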
Figure 5.7: Indoor time line. (a) Object's estimated location; (b) object's estimated velocity.
In the case noise is added to the static camera system by shaking the camera, the impact of the moving average filter can be analysed. In Figure 5.8 the horizontal pixel location of the target is shown before and after the moving average filter. The averaged pixel location does not vary by more than 3 pixels, even though the noisy pixel location varies by more than 14 pixels. One can clearly see that the noise applied by shaking the camera is removed by averaging over multiple samples. In this example a moving average filter of length 10 has been used. The drawback, however, is the speed of the system: the system reacts more slowly as the filter becomes longer, since changes in the coordinate are averaged with previous coordinates. The accuracy of the algorithm drops by 1.1 % in the x- and y-direction when noise is applied.
Figure 5.8: Horizontal pixel location of the target as detected by the camera system before (blue) and after (black) the moving average filter. Noise has been applied to the system by shaking the camera. The length of the moving average filter is 10.
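A moving average filter of this kind can be implemented as a small sliding window; the following sketch is illustrative only, and the class and variable names are not taken from the project code.

#include <deque>

// Moving average over the last `length` samples (length 10 in the experiment).
// A longer filter suppresses more noise but makes the output lag behind.
class MovingAverage {
public:
    explicit MovingAverage(std::size_t length) : length_(length) {}

    double filter(double sample)
    {
        window_.push_back(sample);
        if (window_.size() > length_)
            window_.pop_front();            // discard the oldest sample

        double sum = 0.0;
        for (std::size_t i = 0; i < window_.size(); ++i)
            sum += window_[i];
        return sum / window_.size();
    }

private:
    std::size_t length_;
    std::deque<double> window_;
};

A filter instance would be applied independently to the horizontal and vertical pixel coordinates, e.g. MovingAverage xFilter(10); followed by xFilter.filter(pixelX) for every new frame.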
5.2.3 Moving Camera
As described in the setup of the moving camera, the webcam was mounted on the AR.Drone. In the experiment the AR.Drone was hovering around a spot. The estimated location of the target object with respect to the UAV is depicted in Figure 5.9. The red cap was placed 1 m in front of the AR.Drone, thus at (x, y) = (1 m, 0 m). Thereafter it was moved to (x, y) = (2 m, 1 m), and the last point can be found at (x, y) = (2 m, -1 m).
The AR.Drone's position and camera angle were estimated; however, the real values vary due to drifting of the AR.Drone. Due to this drifting the object seems to shift around its location, which can be seen as noise and is partially corrected by the moving average filter. A second problem discovered in the experiment is that the AR.Drone cannot maintain its height, which implies that the object also seems to shift forward and backward with respect to its real position. This problem can of course not be fixed by the moving average filter.
One can clearly see that the results of the moving camera system are less accurate than those of the static camera. The accuracy drops to 85.7 % in the x-direction and 89.1 % in the y-direction.
Figure 5.9: Path of the red cap in the moving camera experiment on the ground as estimated by the camera system.
The camera was mounted on the AR.Drone, such that camera angle and height cannot be assumed constant.
5.2.4 Performance
The performance is independent of the experimental setup, as it depends only on the algorithm, the operating system (OS) and the platform used.
The same algorithm was benchmarked on the Beagleboard XM running Linux Ubuntu and on a laptop using the same camera and OS. For this comparison, a resolution of 640 × 480 was used. The results of this performance benchmark are given in Table 5.1.
Table 5.1: Performance overview on two test setups; resolution of 640 × 480.
Platform Beagleboard XM Dell Vostro 1710
Processor ARM Cortex A8 @ 1 GHz Intel Core2Duo @ 2.6 GHz
RAM 512 MB 3 GB
OS Ubuntu ARMEL (11.10) Ubuntu 32 Bit (12.04)
Capturing [fps] 13 28
HSV tracking [fps] 4.5 25
Optical Flow [fps] 3 18
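The frame rates in Table 5.1 can be reproduced with a simple timing loop around capture and processing. The sketch below uses OpenCV's tick counter and only illustrates the method; it is not the original benchmark code, and the number of frames is an arbitrary choice.

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::VideoCapture cap(0);                // first connected camera
    if (!cap.isOpened()) return 1;

    cv::Mat frame;
    int frames = 0;
    const double t0 = static_cast<double>(cv::getTickCount());

    while (frames < 200) {
        cap >> frame;                       // capture only; insert HSV tracking or
        if (frame.empty()) break;           // optical flow here to benchmark them
        ++frames;
    }

    const double seconds =
        (static_cast<double>(cv::getTickCount()) - t0) / cv::getTickFrequency();
    std::cout << "average fps: " << frames / seconds << std::endl;
    return 0;
}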
Because the number of frames that can be handled at full resolution on the Beagleboard is not sufficient according to the design brief, a second test with images scaled in software was performed. Results
for those smaller resolutions are given in Table 5.2. It can be seen that the capture rate is fixed at 13-14 fps. Under this condition the processor is fully loaded: about 80 % is used by the capturing and tracking software and the remaining 20 % by the operating system.
Table 5.2: Performance overview for software resizing.
Resolution Capturing [fps] HSV tracking [fps] Optical Flow* [fps]
Full 640 × 480 13 4.5 4-2
Half 320 × 240 13 8.5 9-6
Third 213 × 160 13.5 10 10-8
Quarter 160 × 120 14 11 10-8
* Optical Flow is divided into no movement (upper bound) and movement (lower bound).
With HSV colour tracking enabled, the number of frames per second drops to 4.5 fps at full resolution, whereas it is possible to do the calculations at 8.5 fps at half resolution (320 × 240). The frame rate can be increased further by scaling to lower resolutions, but it is limited to around 13-14 fps (the capture rate).
The main problem with this technique is that the scaling is done after capturing a frame: a frame is captured at 640 × 480 and only afterwards scaled down. Therefore, after testing software scaling, hardware scaling was implemented as well. The camera is set to capture images at a resolution lower than 640 × 480, so that no software scaling is needed. In Table 5.3 the performance results for hardware scaling are given. It can be seen that the number of frames that can be captured is no longer limited to 13 fps. Furthermore, the performance increase for HSV tracking at half resolution is almost 50 %; at a quarter of the original resolution it is even almost 120 %.
Table 5.3: Performance overview for hardware resizing.
Resolution Capturing [fps] HSV tracking [fps]
Full 640 × 480 13 4.5
352 × 288 24 12
Half 320 × 240 26 14
176 × 144 29 18
Quarter 160 × 120 31 24
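The difference between the two approaches can be illustrated as follows: with software scaling the frame is still captured at 640 × 480 and resized afterwards, whereas with hardware scaling the driver is asked to deliver a lower resolution directly. The sketch below is illustrative; the property constants are the OpenCV 2.x names (newer versions use cv::CAP_PROP_...), and whether a given camera honours them depends on its driver.

#include <opencv2/opencv.hpp>

void captureExamples()
{
    cv::VideoCapture cap(0);

    // Software scaling: capture at full resolution, then shrink every frame.
    cv::Mat full, half;
    cap >> full;                                    // 640 x 480 frame
    cv::resize(full, half, cv::Size(320, 240));     // extra CPU work per frame

    // Hardware scaling: request the lower resolution from the camera itself,
    // so frames already arrive at 320 x 240 and no resize step is needed.
    cap.set(CV_CAP_PROP_FRAME_WIDTH, 320);
    cap.set(CV_CAP_PROP_FRAME_HEIGHT, 240);
    cv::Mat small;
    cap >> small;
}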
The optical flow algorithm becomes more computationally intensive with increasing movement; therefore the number of frames per second drops when the camera or object movement increases. Furthermore, the optical flow algorithm is too slow to be used in this real-time application. This could be improved by either decreasing the resolution the optical flow works on or by decreasing the number of points to track. However, an algorithm with varying computational time is not desirable in real-time applications. Thus the velocity estimation is based on trigonometric functions instead.
The fps benchmarks listed in Table 5.2 and Table 5.3 indicate the maximum possible frame rate of the setup. In the final prototype, the object detection algorithm is not the only program running on the microprocessor: the sensor readout, transmission of data and steering of the UAV are tasks of the processing unit as well. The programs performing those tasks had not been provided on the day of testing, so the benchmarking was performed with object detection, location calculation and velocity estimation only. However, the object detection is the most computationally intensive application running on the Beagleboard.
Although a resolution of 320 × 240 is suggested in order to detect even small objects, a resolution of 176 × 144 has to be used to reach the system requirement of 15 fps. In case a lower frame rate does not affect the application significantly, a resolution of 320 × 240 should be used.
5.2.5 Image Size
The camera system has the ability to capture, compress and transmit an image from the UAV to the ground station. Therefore the data size of different image formats is a topic of interest.
In Table 5.4 an overview of the size of stored image frames is given. The following three formats were compared: JPEG, PNG and BMP, the most common image formats. OpenCV has a standard function to write a captured frame to any of these formats. The frames were stored with their default compression settings. The JPEG images clearly have the smallest file size and are therefore preferred as the storage format for sending pictures through the XBees. The compression of an image takes the same time for all image formats, so the choice of format is based only on data size.
Table 5.4: Size overview for different image storage formats; default compression.
Resolution JPEG [kB] PNG [kB] BMP [kB]
Full 640 × 480 60 335 900
Half 320 × 240 18 93 225
Quarter 160 × 120 6 27 57
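Writing a captured frame in the three formats compared in Table 5.4 is a single OpenCV call per format; without a parameter list the default compression settings are used, as in the experiment. The file names here are of course only examples.

#include <opencv2/opencv.hpp>

void storeFrame(const cv::Mat& frame)
{
    // Default compression settings for each container format.
    cv::imwrite("frame.jpg", frame);   // JPEG: lossy, smallest files
    cv::imwrite("frame.png", frame);   // PNG: lossless, medium size
    cv::imwrite("frame.bmp", frame);   // BMP: uncompressed, largest files
}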
5.2.6 Power and Weight
The final system consumes 4.5 W and weighs 175.5 g, as shown in Table 5.5. A maximum power consumption was not specified in the design brief, but a limit of 5 W was added as a requirement for the system. The system consumes less than 5 W and therefore meets this design criterion. In the short time span of the project it was not possible to design a system under 100 g, such that only subsystems could be tested on the AR.Drone. The design criterion of 250 g was met, as the system weighs only 175.5 g.
Table 5.5: Overview of the measured power consumption and weight of the system.
Device Power Weight
Beagleboard 2.5 W 72 g
Camera 1.5 W 25 g
GPS Sensor 75 mW 10 g
Barometer 25 µW 0.5 g
XBee 200 mW 20 g
USB to UART N.A. 25 g
Buck converter N.A. 23 g
Efficiency buck converter 95 % N.A.
Total 4.5 W 175.5 g
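The total of 4.5 W is consistent with the component figures if the listed consumers are assumed to be supplied through the buck converter at its 95 % efficiency (and the barometer consumption is treated as negligible); a rough check:

\[
P_{\text{total}} \approx \frac{2.5 + 1.5 + 0.075 + 0.2}{0.95}\ \text{W} = \frac{4.275}{0.95}\ \text{W} \approx 4.5\ \text{W}.
\]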
5.3 Experimental Discussion
In section 5.2 the experimental results were listed: the accuracy, performance, image size, power consumption and weight were calculated or measured. In this section the outcomes of the experiments are discussed.
Firstly, the accuracy was determined for both the static camera and the moving camera. As expected, the accuracy for the static camera is higher than for the moving camera. Whereas the static setup reaches up to 99 %, the moving setup is limited to around 90 %. This is mainly caused by the disturbance and drift
of the AR.Drone. On the fixed wing UAV it is expected that a higher accuracy can be achieved, because the fixed wing UAV is more stable when flying at its cruise speed.
In the final prototype the camera angle and flight height are measured by the Position and Control system developed by another subgroup of the project. Thus the position estimation is based on actual flight data and not on a static height and a fixed camera angle. This has a positive effect on the accuracy as well.
Furthermore, it was found that the accuracy is highly dependent on the camera angle. Changing the camera angle from 66° to 67° already causes an error of 4.2 % in the vertical position estimation. For a high accuracy a smaller camera angle (e.g. 44°) should be chosen, but in order to cover a large area the opposite holds true. So the choice of the camera angle is a trade-off between accuracy and view area. For the prototype on the AR.Drone a good trade-off between view area and accuracy has been found empirically with a camera angle of 45°.
On the fixed wing UAV the use of smaller camera angles is suggested, as a fixed wing UAV operates at greater heights such that the covered area is larger than on the AR.Drone.
The performance was measured in the second step. As shown, changing the resolution of the captured frames directly affects the frame rate. As stated in the design brief, around 15 fps need to be captured. By halving the resolution to 320 × 240 a frame rate of 14 fps has been achieved. It has been shown that hardware resizing is more effective than software resizing.
Changing the resolution, however, also directly affects the size of the smallest object that can be detected, as can be seen in Appendix D Figure D.1, which shows the projection of the pixels on the ground for a height h of 0.77 m and a camera angle of 66°.
Figure D.1 (Appendix D): Projection points of pixels under different resolutions, including (b) 320 × 240 and (c) 176 × 144.