System
NILESH GOEL, ADITYA AGARWAL, SUBHAYAN BANERJEE, CHANDRA VEER SINGH
Department of Electronics and Communication Engineering,
MNNIT, Allahabad- 211 004, India
Email: nilesh.goel@gmail.com
Abstract: In this paper, a system for reliable detection of the direction of the human voice and efficient recognition of voice commands is proposed for use in an autonomous humanoid robot. After detection and recognition, the system performs tasks according to the given commands. Compared with previous research, this system comprises simpler, faster and more accurate algorithms. The system consists of a microphone assembly of three microphones for sound detection and one separate microphone for voice recognition, a band pass filter, a microcontroller as the processing unit for sound detection, a PC (Personal Computer) as the processing unit for speech recognition using MATLAB 7.0.1, a motor controller unit and a mechanical assembly. The robot senses the human voice in the frequency range 300 Hz-3400 Hz, detects its direction using the delay of arrival (DOA) mechanism and then recognizes the voice commands. To show the viability of the proposed algorithms, they are being implemented in an experimental autonomous robot named 'ACFRO' (Autonomous Command Following RObot).

I. INTRODUCTION

[...] an algorithm to track sound direction and in section IV an algorithm for faithful recognition of voice commands is proposed. The experimental results are given in section V to show the improvement of the proposed algorithm as compared to previously reported algorithms.

II. PROPOSED SYSTEMS

The block diagrams of the proposed systems for reliable detection of the direction of the voice commands and for their recognition are shown in Fig. 1 and Fig. 2 respectively.
The power spectral density of noise shows that maximum power is concentrated at low frequencies as compared to higher frequencies. Therefore, a band pass filter is used to eliminate environmental noise (<300 Hz) while passing the audio signal. Here it is an active second-order Butterworth band pass filter designed using operational amplifiers [4] (IC LM324). The pass band of this filter lies within the audio range (300 Hz - 3400 Hz).

The processing and decision making is done by ATMEL's AVR family ATMEGA32L microcontroller [3], [9]. This microcontroller has 32 KB of programmable flash memory and a maximum clock frequency of 8 MHz. It has an inbuilt 8-channel, 10-bit A/D converter. Here the A/D converter converts the analog signal at the output of the band pass filter to a digital signal, which is processed by the processing unit of the microcontroller; the microcontroller then generates appropriate control signals to drive the motors used in the mechanical assembly.

MOTOR CONTROLLER UNIT

Figure 4: Bottom view of mechanical assembly

Apart from the processing unit, all the modules used in the speech recognition system are the same as in the sound detection system and need no further description. For the voice recognition system, a PC (using MATLAB 7.0.1) acts as the processing and decision making unit, as explained in detail in section IV.

The Delay of Arrival (DOA) [1] mechanism is used for efficient detection of the direction of sound. This mechanism uses the time delay from the sound source to each microphone. The microphones are placed at the vertices of an equilateral triangle. The microcontroller samples the analog electrical signal from each microphone one by one in a predefined cyclic manner (... M1, M2, M3, M1, M2, M3, ...) and converts it to a digital signal with the help of the inbuilt A/D converter.

First, the microcontroller takes a predefined number of samples from each microphone to set the threshold level, which depends on the amplitude of local disturbances. This makes the system immune to local disturbances, as the microcontroller now recognizes only those signals that have an amplitude higher than the set threshold level.
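The threshold-setting step can be illustrated with a minimal Python sketch. The paper does not state the exact rule used to derive the threshold from the ambient samples, so the rule here (loudest ambient reading plus a fixed margin) is an assumption, as are the function names, the `margin` parameter and the sample values. The 0..1023 range follows from the 10-bit A/D converter.

```python
ADC_MAX = 1023  # full-scale value of a 10-bit A/D converter


def calibrate_threshold(ambient_samples, margin=8):
    """Return a detection threshold from ambient ADC readings.

    ambient_samples: ADC values (0..1023) taken while no command is spoken.
    margin: hypothetical safety offset in ADC counts (not from the paper).
    """
    samples = list(ambient_samples)
    if not samples:
        raise ValueError("need at least one ambient sample")
    # Assumed rule: loudest ambient reading plus a margin, clamped to ADC range.
    return min(max(samples) + margin, ADC_MAX)


def exceeds_threshold(sample, threshold):
    """True only for readings strictly above the calibrated threshold."""
    return sample > threshold


ambient = [510, 515, 498, 520, 505]   # made-up ambient readings
threshold = calibrate_threshold(ambient)
print(threshold, exceeds_threshold(600, threshold))
```

Any rule that places the threshold above the observed ambient level would serve the same purpose; the margin only trades sensitivity against false triggers.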
The microcontroller then samples the three microphones continuously and detects which microphone the sound reaches first (with amplitude higher than the threshold level).
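As a rough illustration of this detection step, the following Python sketch scans a stream of round-robin samples and records the order in which the microphones first exceed their thresholds; the first entry is the microphone that heard the sound first. The function name, the data layout and the sample values are assumptions made for the example, not the firmware used on the microcontroller.

```python
def arrival_order(samples, thresholds):
    """Return microphone names in the order they first exceed their threshold.

    samples: iterable of (mic_name, adc_value) pairs in sampling order,
             e.g. the cyclic sequence M1, M2, M3, M1, M2, M3, ...
    thresholds: dict mapping mic_name to its calibrated threshold.
    """
    order = []
    for mic, value in samples:
        # A microphone that has already triggered is ignored from then on.
        if mic not in order and value > thresholds[mic]:
            order.append(mic)
            if len(order) == len(thresholds):
                break
    return order


thresholds = {"M1": 528, "M2": 528, "M3": 528}   # made-up thresholds
stream = [
    ("M1", 500), ("M2", 505), ("M3", 498),   # ambient level only
    ("M1", 700), ("M2", 510), ("M3", 500),   # sound reaches M1
    ("M1", 720), ("M2", 690), ("M3", 505),   # then M2
    ("M1", 710), ("M2", 680), ("M3", 650),   # finally M3
]
print(arrival_order(stream, thresholds))
```

Ignoring a microphone once it has triggered mirrors the idea of no longer sampling it, so later echoes on that channel cannot change the recorded order.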
After determining the first microphone that receives the sound, the microcontroller sets the offset angle (0 degrees for M1, 120 degrees for M2 and 240 degrees for M3) according to the orientations of microphones M1, M2 and M3, and from then on samples only the remaining two microphones so that problems due to echo are avoided.

The remaining two microphones (say M2 and M3) are then monitored continuously. The microcontroller determines which of them the sound reaches next, decides accordingly whether the robot has to take a clockwise or anticlockwise turn, and generates the control signals for the motor controller IC. After this, it samples only the last remaining microphone and calculates the delay in the arrival of sound between the two microphones (M2 and M3) using the inbuilt timer. This delay Td determines the angle of deviation from the offset angle, which gives the final angle through which the robot has to rotate to reach the operator. A conversion formula is derived to convert the final angle to the corresponding time duration for which the robot has to turn. To explain the algorithm, consider the following example.

Suppose the three microphones are sampled in the sequence M1, M2, M3 and this sequence is repeated continuously. The sequence depends entirely on the programming of the microcontroller.

There are six possible sequences in which sound reaches the microphones.
1. M1, M2, M3
2. M1, M3, M2
3. M2, M1, M3
4. M2, M3, M1
5. M3, M1, M2
6. M3, M2, M1

Here sequence 1 (M1, M2, M3) means that the sound first reaches microphone M1, then M2 and then M3. In this case M1 detects the sound first, which means that the sound source must be in the angular range θ1, as shown in Fig. 5.

Figure 5: Angular range of microphone

As the sound is detected by M2 next, the angular region from which the sound is coming is limited to θ2. The DOA of sound Td between M2 and M3 determines the deviation from the offset angle, which is 0° in this case as M1 is the first microphone to detect the sound.

As M2 is the second microphone after M1 to detect the sound, the microcontroller generates control signals for the motor controller IC to turn the robot in the clockwise direction for the time duration determined by the conversion formula.

For reliable calculation of Td, it is necessary that

$$ T_d > \frac{1}{f} \tag{1} $$

where f is the frequency at which the microcontroller samples the digital data received from the microphones.

If condition (1) is not satisfied, i.e., the time taken by the microcontroller to sample one microphone (1/f) is more than the DOA Td, the microcontroller will not be able to judge which microphone received the sound first and the proposed algorithm will not be efficient. To avoid this situation, the sampling frequency f is kept as high as possible.

The maximum delay of arrival is observed when the sound source is exactly between two microphones, for example when the angle between the source and M1 is 60°. In this situation, the maximum delay of arrival between M2 and M3 obtained from Fig. 6 is given by [...]
$$ N = \sqrt{3}\, L \sin\theta \times \frac{f_T}{V_s} \tag{6} $$

where use has been made of (4) and (5). Equation (6) can be rewritten as

$$ \theta = \sin^{-1}\left( \frac{N V_s}{\sqrt{3}\, f_T L} \right). \tag{7} $$
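Relations (6) and (7) can be checked numerically with a short Python sketch. It assumes that N is a timer count accumulated at timer frequency f_T, that V_s is the speed of sound and that L is the microphone-geometry length used in the equations; the function names and all numeric values (L = 0.2 m, f_T = 1 MHz, V_s = 343 m/s) are hypothetical and chosen only for illustration.

```python
import math

SOUND_SPEED = 343.0  # m/s, assumed speed of sound in air at room temperature


def count_for_angle(theta_deg, timer_hz, length_m, sound_speed=SOUND_SPEED):
    """Eq. (6): N = sqrt(3) * L * sin(theta) * f_T / V_s."""
    theta = math.radians(theta_deg)
    return math.sqrt(3) * length_m * math.sin(theta) * timer_hz / sound_speed


def angle_for_count(n, timer_hz, length_m, sound_speed=SOUND_SPEED):
    """Eq. (7): theta = asin(N * V_s / (sqrt(3) * f_T * L)), in degrees."""
    x = n * sound_speed / (math.sqrt(3) * timer_hz * length_m)
    if not 0.0 <= x <= 1.0:
        # A count this large cannot come from a real delay for this geometry.
        raise ValueError("timer count outside the physically possible range")
    return math.degrees(math.asin(x))


# Hypothetical setup: L = 0.2 m, timer running at 1 MHz.
n = count_for_angle(30.0, timer_hz=1_000_000, length_m=0.2)
theta = angle_for_count(n, timer_hz=1_000_000, length_m=0.2)
print(round(n), round(theta, 3))
```

The range check reflects that the argument of the inverse sine must lie between 0 and 1, so a count above sqrt(3) * L * f_T / V_s indicates a measurement error rather than a valid angle.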
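$$ a = 2L\cos 30^\circ \sin\theta = \sqrt{3}\, L \sin\theta \tag{9} $$

$$ \text{Time Delay} = \frac{a}{V_s} = \frac{\sqrt{3}\, L \sin\theta}{V_s} = \frac{1}{f} \quad \text{(for one sampling time period)} \tag{10} $$

Fig. 10: Plot of θ error vs. n

IV. ALGORITHM TO RECOGNIZE VOICE COMMANDS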
EXPERIMENTAL OBSERVATIONS FOR SOUND DIRECTION DETECTION

Location: Living Room

Distance    Angle    Angle Detected    Error
            0°       6°                +6°

Algorithms for the proposed interface have been developed, and the experimental results show that the proposed algorithms fulfill the objective of interfacing a robot with its operator through voice direction detection and voice recognition. Compared to previous research [1], this system is more accurate, simpler and faster.