1. RAMIN ANUSHIRAVANI
ECE 551
FALL 2014
Sound Source Localization
with Microphone Arrays
2. Outline
Background
Application
Human Sound Localization
Time Delay Model
Beamforming
Signal Model
Criteria
Microphone Arrays
Uniform Linear Array (ULA)
Beampattern
Spatial Aliasing
Sound Source Localization
Conventional Beamforming
MUSIC algorithm
Results
3. Application
Why is localizing a sound source useful?
Improving Speech
recognition
Speech Enhancement
Hearing aids
Audio Surveillance
Teleconferencing
Spatial Audio
Background
4. How do we localize sound?
Interaural time
difference - ITD
Interaural level
difference - ILD
Spectral information
5. A Time Delay Model
Far Field Assumption
τ = d sin(θ) / c
where
τ : time delay between the two sensors
d : distance between the two sensors
c : speed of sound
θ : angle of arrival
A delay by τ in the time domain, x(t − τ), corresponds to multiplication by exp(−jωτ) in the frequency domain.
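As a quick sanity check, the far-field delay formula can be evaluated directly; the spacing and angle below are illustrative, not from the slides:

```python
import numpy as np

# Far-field time delay between two sensors: tau = d * sin(theta) / c.
c = 343.0                  # speed of sound in air (m/s)
d = 0.22                   # sensor spacing (m), like two ears ~22 cm apart
theta = np.deg2rad(30.0)   # angle of arrival

tau = d * np.sin(theta) / c
print(f"delay: {tau * 1e6:.1f} microseconds")
```

At broadside (θ = 0) the delay vanishes; it is largest for endfire arrival (θ = ±90°).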
6. Delay the reference signal until the energy of the sum of the two signals is at its maximum (or undo the delay from the delayed signal).
Example
For x(n) = ref(n) and y(n) = delayed(n):
true delay = arg max_m || ref(n + m) + delayed(n) || = arg max_m C_xy[m] = arg max_m Σ_n delayed*(n) ref(n + m)
m (samples) -> τ (seconds) = d sin(θ) / c
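The arg-max of the cross-correlation C_xy[m] can be sketched in a few lines of NumPy; the signal, sample rate, and true delay below are made up for illustration:

```python
import numpy as np

# Estimate the integer-sample delay between a reference signal and a
# delayed copy by locating the peak of the cross-correlation C_xy[m].
fs = 16000                         # sample rate (Hz), illustrative
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
true_delay = 25                    # samples
delayed = np.roll(ref, true_delay) # circularly delayed copy

cxy = np.correlate(delayed, ref, mode="full")   # lags -(N-1) .. (N-1)
lags = np.arange(-len(ref) + 1, len(ref))
m = lags[np.argmax(cxy)]           # peak lag = estimated delay in samples

tau = m / fs                       # convert samples -> seconds
print("estimated delay:", m, "samples =", tau, "s")
```

The estimated τ then maps back to an angle via τ = d sin(θ)/c.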
8. Beamforming
Spatial Filtering
Detect and estimate signals from the output of a sensor array.
Types
Fixed vs Adaptive Beamformer
Delay and Sum (Filter and Sum)
MVDR (Capon)
Narrowband vs Broadband Beamformer
Z(k) = W^H Y(k)
(Block diagram: a source recorded at the mics as Y1(k) and Y2(k), filtered and summed into the output Z(k).)
Beamforming [S P. Boyd]
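The narrowband relation Z(k) = W^H Y(k) can be sketched for a delay-and-sum beamformer; the geometry, frequency, and look direction below are assumptions for illustration:

```python
import numpy as np

# Frequency-domain beamformer output Z(k) = w^H Y(k).
# For delay-and-sum, w is the (normalized) steering vector itself.
c, d, M = 343.0, 0.05, 4          # speed of sound, spacing (m), mics
f = 1000.0                        # one narrowband frequency bin (Hz)
theta = np.deg2rad(20.0)          # look direction

taus = np.arange(M) * d * np.sin(theta) / c      # per-mic delays
w = np.exp(-1j * 2 * np.pi * f * taus) / M       # delay-and-sum weights

# A unit source arriving exactly from the look direction:
Y = np.exp(-1j * 2 * np.pi * f * taus)           # mic spectra at bin f
Z = w.conj().T @ Y                               # beamformer output
print(abs(Z))                                    # unit (distortionless) gain
```

Signals from other directions are attenuated because their phase terms no longer cancel against the weights.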
9. Signal Model
y_n(t) = g_n(t) ∗ s(t) + v_n(t) = x_n(t) + v_n(t)
y : received signal at each microphone
g : spatial response corresponding to the source location
s : source signal
v : noise
In the frequency domain,
Y_n(f) = G_n(f) S(f) + V_n(f) = X_n(f) + V_n(f) = d(f) X_1(f) + V(f)
where
d : steering (direction) vector
X_1 : recorded signal at the first (reference) microphone
For simplicity we assume x and v are uncorrelated.
11. Beamforming Criteria
Signal to Noise Ratio (SNR)
Array Gain
Output SNR over the input SNR.
Noise Rejection
Amount of noise rejected by the beamformer.
Beampattern
Represents the response of the beamformer to an arbitrary input signal as a function of the steering vectors (microphone array impulse response).
Beamforming [Benesty et al.]
14. ULA
Collecting signal from a source with a microphone array where the spacing between each element is Δ.
Signal received at the m-th microphone:
x_m(t) = Σ_{i=1}^{d} s_i(t) e^{j(m−1)μ_i} + n_m(t)
where μ_i = −(2π/λ) Δ sin(θ_i) is the spatial frequency.
In matrix form: x(t) = A s(t) + n(t)
The columns of A are steering vectors based on a time delay model for one frequency (narrowband).
Microphone Array [Bhuiy et al.]
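A minimal sketch of building the steering matrix A from the spatial frequencies μ_i; the spacing, wavelength, and source angles are chosen for illustration:

```python
import numpy as np

# Build the ULA steering matrix A: column i is the array response
# e^{j(m-1)mu_i} for source angle theta_i, with
# mu_i = -(2*pi/lambda) * Delta * sin(theta_i).
M = 4                                # microphones
Delta = 0.04                         # element spacing (m)
lam = 343.0 / 2000.0                 # wavelength at 2 kHz (m)
thetas = np.deg2rad([-25.0, 15.0])   # source angles

m = np.arange(M)[:, None]            # sensor index 0..M-1 as a column
mu = -(2 * np.pi / lam) * Delta * np.sin(thetas)   # spatial frequencies
A = np.exp(1j * m * mu[None, :])     # shape (M, num_sources)
print(A.shape)                       # (4, 2)
```

The first row is all ones because the first sensor is the phase reference (m − 1 = 0).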
15. Beampattern
e^{−iωt} = e^{−i2πft} = e^{−i2π (kS/p) d sin(θ)/c}
where
k : discrete frequency bin
S : sampling rate
p : number of DFT samples
We can visualize the steering vectors by plotting them over all angles, for an arbitrary input and any number of microphones.
Steering vectors: [e^{−iω·0} ⋯ e^{−iω·0}] for the reference, [e^{−iωτ_1} ⋯ e^{−iωτ_n}] delayed by τ_i, spanning all angles and frequencies.
Add and normalize by the number of microphones for some arbitrary input.
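One way to visualize this: evaluate |w^H a(θ)| over all angles for delay-and-sum weights steered to broadside. The array geometry and frequency here are illustrative assumptions:

```python
import numpy as np

# Beampattern of a delay-and-sum beamformer steered to broadside,
# evaluated over all arrival angles at a single frequency.
c, d, M, f = 343.0, 0.05, 4, 2000.0
look = 0.0                               # steering angle (rad), broadside

def a(theta):                            # steering vector at angle theta
    taus = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-1j * 2 * np.pi * f * taus)

w = a(look) / M                          # normalized delay-and-sum weights
thetas = np.deg2rad(np.linspace(-90, 90, 181))
B = np.array([abs(w.conj() @ a(t)) for t in thetas])
print(B.max())                           # 1.0, at the look direction
```

Plotting B against angle shows the main lobe at the look direction and sidelobes elsewhere; repeating this at several frequencies exposes grating lobes.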
16. ITD Polar Pattern
(Polar plots at 1000 Hz and 4000 Hz for two mics 22 cm apart and two mics 2 cm apart, showing the main lobe and the grating lobes caused by spatial aliasing.)
18. Spatial Aliasing
Aliasing
“If the bandwidth of the signal exceeds half of the
sampling frequency, the spectral replicas overlap,
leading to a distortion in the observed spectrum.”
Spatial Aliasing
“The spacing between adjacent microphone elements should
be less than half of the wavelength corresponding to the
highest temporal frequency of interest.”
Microphone Arrays [J. P. Dmochowski et al.]
19. Spatial aliasing occurs when the distance between adjacent microphones exceeds λ/2, where λ = speed of sound / frequency.
Spatial aliasing frequency > 1600 Hz.
(Beampattern: omni response at very low frequencies, a main lobe at mid frequencies, and grating lobes once aliasing sets in.)
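The spatial-Nyquist threshold c/(2d) can be computed for the two spacings used in the polar-pattern examples; c is taken as 343 m/s here, and the exact threshold depends on the assumed spacing and speed of sound:

```python
import numpy as np

# Spatial Nyquist condition: spacing d must be < lambda/2, so grating
# lobes (spatial aliasing) appear above f = c / (2 * d).
c = 343.0
for d in (0.22, 0.02):                 # 22 cm and 2 cm mic spacings
    f_alias = c / (2 * d)
    print(f"d = {d * 100:.0f} cm -> grating lobes above {f_alias:.0f} Hz")
```

The widely spaced pair aliases well inside the speech band, while the 2 cm pair stays alias-free up to several kHz.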
20. Sound Source Localization
Using beamforming and subspace methods to
localize a sound source.
Delay and Sum - Classical Beamformer
Capon – Minimum Variance Distortionless Response (MVDR)
beamformer
Multiple signal classification (MUSIC) - A Subspace Algorithm
22. Minimum Variance Distortionless Response (MVDR)
A delay and sum beamformer with an additional constraint on the output power:
w_MVDR = arg min_w (w^H R_xx w)  s.t.  w^H A(θ) = 1
Constrain the look-direction gain to be g(φ_i) = 1 and minimize the output power of the beamformer.
Source Localization [J. Capon]
23. Minimum Variance Distortionless Response
w_MVDR = arg min_w (w^H R_xx w)  s.t.  w^H A(θ) = 1
This leads to the Lagrangian
J(w, λ) = w^H R_xx w + λ (w^H A(θ) − 1)(A(θ)^H w − 1)
After having lots of fun it turns out that
w_MVDR(θ) = R_xx^{−1} A(θ) / (A(θ)^H R_xx^{−1} A(θ))
P_MVDR(θ) = 1 / (A(θ)^H R_xx^{−1} A(θ))
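A minimal sketch of scanning P_MVDR(θ) = 1 / (A(θ)^H R_xx^{−1} A(θ)) over a grid of angles, using synthetic narrowband snapshots with one source at 15 degrees; all parameters here are illustrative:

```python
import numpy as np

# MVDR spatial spectrum scanned over candidate angles.
rng = np.random.default_rng(1)
c, d, M, f = 343.0, 0.04, 6, 2000.0

def steer(theta):                       # narrowband steering vector
    taus = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-1j * 2 * np.pi * f * taus)

snaps = 200                             # number of snapshots
s = rng.standard_normal(snaps) + 1j * rng.standard_normal(snaps)
X = np.outer(steer(np.deg2rad(15.0)), s)          # source at 15 degrees
X += 0.1 * (rng.standard_normal((M, snaps))
            + 1j * rng.standard_normal((M, snaps)))  # sensor noise
R = X @ X.conj().T / snaps              # sample covariance matrix

grid = np.deg2rad(np.arange(-90, 91))
Rinv = np.linalg.inv(R)
P = np.array([1.0 / np.real(steer(t).conj() @ Rinv @ steer(t))
              for t in grid])
print("peak at", np.degrees(grid[np.argmax(P)]), "deg")  # near 15
```

Note that R must be invertible, which is why MVDR needs at least as many snapshots as sensors.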
25. MUSIC
R_xx = [U_s U_n] diag(λ_1, …, λ_M) [U_s U_n]^H
where
U_s : signal subspace
U_n : noise subspace
and λ_1 > λ_2 > ⋯ > λ_M.
span(U_s) = span(A(θ))
MUSIC uses the orthogonality between the noise subspace and the steering vectors:
U_n ⊥ A(θ)  =>  U_n^H A(θ) = 0.
You need to know how many sources you have.
26. MUSIC
The MUSIC pseudo-spectrum is defined as
P_MUSIC(θ) = 1 / ||U_n^H A(θ)||² = 1 / (A(θ)^H U_n U_n^H A(θ))
The MUSIC spatial spectrum is defined as
P_MUSIC(θ) = (A(θ)^H A(θ)) / (A(θ)^H U_n U_n^H A(θ))
=> MUSIC measures the orthogonality between the steering vectors of the array and the noise subspace. The poles of this expression point to the direction of the signal source.
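A sketch of the MUSIC pseudo-spectrum on synthetic data matching the experiment's setup (two sources at −25 and 15 degrees, four microphones); the spacing, frequency, and noise level are assumptions:

```python
import numpy as np

# MUSIC pseudo-spectrum P(theta) = 1 / ||U_n^H a(theta)||^2, with the
# noise subspace U_n taken from the eigendecomposition of R_xx.
rng = np.random.default_rng(2)
c, d, M, f = 343.0, 0.04, 4, 2000.0

def steer(theta):                       # narrowband steering vector
    taus = np.arange(M) * d * np.sin(theta) / c
    return np.exp(-1j * 2 * np.pi * f * taus)

angles_true = np.deg2rad([-25.0, 15.0])
snaps = 400
A = np.column_stack([steer(t) for t in angles_true])
S = rng.standard_normal((2, snaps)) + 1j * rng.standard_normal((2, snaps))
X = A @ S + 0.1 * (rng.standard_normal((M, snaps))
                   + 1j * rng.standard_normal((M, snaps)))
R = X @ X.conj().T / snaps              # sample covariance

w, V = np.linalg.eigh(R)                # eigenvalues in ascending order
Un = V[:, : M - 2]                      # noise subspace: M - (num sources)
grid = np.deg2rad(np.arange(-90, 91))
P = np.array([1.0 / np.linalg.norm(Un.conj().T @ steer(t)) ** 2
              for t in grid])
# The two sharp peaks of P should sit near -25 and 15 degrees.
```

Note that taking M − 2 noise eigenvectors hard-codes the known number of sources, which is exactly the prior MUSIC requires.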
29. Results
2 sources at 15 and −25 degrees
4 microphones
(Figure: 2 sources, 4 microphones)
30. Results
Localization accuracy [Bhuiya et al.]:

RMSE                 Delay and Sum   MVDR     MUSIC
Accuracy, 2 sources  0.7035          0.1012   0.0851
Accuracy, 4 sources  0.4992          0.4990   0.1903

Which one is more robust to noise?
Which one is more robust to reverberation?
Which one gives a higher SNR for enhancing speech?
…etc.

RMSE = sqrt( (1/K) Σ_{k=1}^{K} (Θ_est,k − Θ_true,k)² )
K : number of audio blocks (groups of frames)
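The RMSE formula can be evaluated directly; the per-block estimates below are made up for illustration:

```python
import numpy as np

# RMSE over K audio blocks: sqrt(mean((theta_est - theta_true)^2)).
theta_true = np.array([15.0, 15.0, -25.0, -25.0])   # degrees, per block
theta_est  = np.array([14.8, 15.3, -24.6, -25.2])   # hypothetical estimates
rmse = np.sqrt(np.mean((theta_est - theta_true) ** 2))
print(f"RMSE = {rmse:.3f} deg")
```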
31. Citations
J. Benesty and J. P. Dmochowski, Microphone Arrays: Fundamental Concepts, Springer.
F. Bhuiya and M. Islam, "Analysis of Direction of Arrival Techniques Using Uniform Linear Array," International Journal of Computer Theory and Engineering.
J. Capon, "High-resolution frequency-wavenumber spectrum analysis," Proc. IEEE, 57(8), 1408–1418 (1969).
R. Kawitkar, "Performance of Different Types of Array Structures Based on Multiple Signal Classification (MUSIC) Algorithm," International Conference on MEMS, NANO, and Smart Systems.
I. Richter, "Spatial Filtering and DoA Estimation: MVDR Beamformer and MUSIC Algorithm," Sensor Array Signal Processing.
S. P. Boyd et al., "Robust Minimum Variance Beamforming."
Speaker notes
Need at least two microphones to localize sounds; two ears, 22 cm apart.
Simulate your head with two microphones and no head!
appendix
Narrowband beamformer.
Fixed. Brain and detection. Remove.
We're trying to find filters that would increase this SNR, say in a speech enhancement application. A measure of beamformer goodness.
A series of microphones whose geometry is given. Derive the spatial responses for the microphones and do fun applications like speech enhancement, noise reduction, dereverberation, signal estimation, source localization, etc. Beamforming is a strong tool in array signal processing.
The idea is to "steer" the array in one direction at a time and measure the output power. The steering direction which coincides with the DOA of a signal and results in a maximum output power yields the DOA estimate.
You can also calibrate a microphone array by playing an MLS sequence.
Formally, the beampattern is defined as the ratio of the variance of the beamformer output when the source impinges with a steering vector d(f) to the variance of the desired signal x1(t).
Too many; octave frequencies 1, 2, 4, 8 kHz.
Produced by delay-and-sum beamforming.
With two mics you can't distinguish back from front, since the time delays are the same.
J. P. Dmochowski and J. Benesty
In order to reconstruct a spatial sinusoid from a set of uniformly-spaced discrete spatial samples, the spatial sampling period must be less than half of the sinusoid’s wavelength.
Super low frequencies: omni response.
Narrow beam at zero degrees => good.
But high energy at all these other angles; cannot distinguish the difference.
Of course we almost never hear one frequency, but a wide range of frequencies.
So the conclusion here is that there is more to sound localization besides ITD.
Also the Bartlett method. A(θ) is defined as the steering vector with a scanning angle θ.
The idea is to scan across the angular region of interest. In speech enhancement you can fix the angle, form a beam toward it, and capture the desired signals; in sound source localization, you scan over all angles and look for where the power is maximum.
Take the gradient of J with respect to w and λ and use the constraint on power. MVDR requires a good estimate of the covariance matrix. There have to be at least as many observations as sensors in the array.
Eigenvalue decomposition.
For the true covariance matrix this is approximately zero; it corresponds to the smallest eigenvalues.
We need at least one noise-subspace dimension, which requires one more sensor than the number of sources: we can resolve M−1 sources with M sensors.
For this first experiment I used the far most right and left channels; I recorded at about 75 degrees.
Some problems with using all 4 mics.
Speech enhancement and reverberant conditions would be better with MUSIC: a much narrower beam.
I used 4 microphones and two sources, one speech and one a loud fan. Only MUSIC was able to identify both sources; the others smear the two sources into one. One is at −25 and the other at 15 degrees.
I'm only looking at the angle location and how narrow the beam is; one should also look at noise and reverberation when localizing sound sources.