Koyama ASA ASJ joint meeting 2016

Super-resolution in sound field recording and
reproduction based on sparse representation
Shoichi Koyama1,2, Naoki Murata1, and Hiroshi Saruwatari1
1The University of Tokyo
2Paris Diderot University / Institute Langevin

November 29, 2016
Sound field reproduction for audio system
[Figure: microphone array in recording area; loudspeaker array in target area]
• Large listening area can be achieved
• Listeners can perceive source distance
• Real-time recording and reproduction can be achieved without recording engineers

Applications
• Telecommunication system (via network)
• Home theatre
• Live broadcasting

Improve reproduction accuracy when # of array elements is small
• # of microphones > # of loudspeakers
– Higher reproduction accuracy within local region of target area
• # of microphones < # of loudspeakers
– Higher reproduction accuracy of sources in local region of recording area
[Koyama+ IEEE JSTSP2015], [Koyama+ ICASSP 2014, 2015]
[Ahrens+ AES Conv. 2010], [Ueno+ ICASSP 2017 (submitted)]

Sound Field Recording and Reproduction
Obtain driving signals of secondary sources (= loudspeakers) arranged on the boundary of the target region to reconstruct desired sound field inside it
• Inherently, sound pressure and its gradient on the boundary are required to obtain the driving signals, but usually only the sound pressure is known
• Signal conversion for sound field recording and reproduction with ordinary acoustic sensors and transducers is necessary
[Figure: primary sources]

Conventional: WFR filtering method [Koyama+ IEEE TASLP 2013]
[Figure: primary sources, receiving plane (received signals), signal conversion, secondary source plane (driving signals); plane waves on both sides]
Each plane wave determines entire sound field
Signal conversion can be achieved in spatial frequency domain

Conventional: WFR filtering method [Koyama+ IEEE TASLP 2013]
[Figure: received signals in recording area converted into driving signals in target area via plane waves]
Each plane wave determines entire sound field
Spatial aliasing artifacts due to plane wave decomposition
Significant error at high freq even when # of microphones < # of loudspeakers
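The spatial aliasing noted for plane wave decomposition can be illustrated with a minimal Python sketch. It is not the WFR filter itself: it only samples a single plane wave with a uniform line array and locates the peak of its spatial spectrum. The 32-element, 6 cm array matches the microphone array of the simulation experiment; the sound speed of 343 m/s, the incidence angles, and the function name `dominant_kx` are assumptions made for illustration.

```python
import numpy as np

def dominant_kx(freq_hz, angle_deg, n_mics=32, spacing=0.06, c=343.0):
    """Return (true kx, kx of the strongest spatial-FFT bin) in rad/m."""
    k = 2.0 * np.pi * freq_hz / c                    # wavenumber (c is assumed)
    kx_true = k * np.sin(np.deg2rad(angle_deg))      # trace wavenumber along the array
    x = np.arange(n_mics) * spacing                  # microphone positions
    pressure = np.exp(1j * kx_true * x)              # sampled plane wave
    spectrum = np.fft.fft(pressure)                  # spatial frequency domain
    kx_bins = 2.0 * np.pi * np.fft.fftfreq(n_mics, d=spacing)
    return kx_true, kx_bins[np.argmax(np.abs(spectrum))]

true_lo, est_lo = dominant_kx(1000.0, 30.0)   # below the spatial Nyquist frequency
true_hi, est_hi = dominant_kx(4000.0, 60.0)   # above it
print(f"1 kHz: true kx = {true_lo:.1f}, estimated kx = {est_lo:.1f} rad/m")
print(f"4 kHz: true kx = {true_hi:.1f}, estimated kx = {est_hi:.1f} rad/m")
```

At 1 kHz the estimated wavenumber stays within one FFT bin of the true one. At 4 kHz the true wavenumber exceeds π/spacing, so the peak appears at a negative wavenumber and the plane wave is attributed to the wrong direction.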

Sound field representation for super-resolution
• Plane wave decomposition suffers from spatial aliasing artifacts because many basis functions are used
• Observed signals should be represented by a few basis functions for accurate interpolation of sound field
• Appropriate basis function may be close to pressure distribution originating from sound sources
• To obtain driving signals of loudspeakers, basis functions must be fundamental solutions of Helmholtz equation (e.g. Green’s functions)
[Figure: sound field decomposition of received signals into basis functions]
Sound field decomposition into fundamental solutions of Helmholtz equation is necessary

Generative model of sound field [Koyama+ ICASSP 2014]
• Inhomogeneous and homogeneous Helmholtz eq.
– Sound field is divided into two regions: one containing the distribution of source components (inhomogeneous eq.) and a source-free one (homogeneous eq.)
• Sound field = inhomogeneous + homogeneous terms
– Inhomogeneous term: source components, represented with Green’s function
– Homogeneous term: represented with plane waves
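The equations of this slide are not legible in the source. As a hedged reconstruction in standard notation, the model can be written as follows. The symbols (source region Ω, source distribution Q, homogeneous term p_h) and the sign convention of the free-field Green's function are assumptions, not taken from the slides.

```latex
\begin{align}
  (\nabla^2 + k^2)\, p(\mathbf{r}) &= -Q(\mathbf{r}), && \mathbf{r} \in \Omega, \\
  (\nabla^2 + k^2)\, p(\mathbf{r}) &= 0, && \mathbf{r} \notin \Omega, \\
  p(\mathbf{r}) &= \underbrace{\int_{\Omega} Q(\mathbf{r}')\, G(\mathbf{r}|\mathbf{r}')\, \mathrm{d}\mathbf{r}'}_{\text{inhomogeneous term}}
                 + \underbrace{p_{\mathrm{h}}(\mathbf{r})}_{\text{homogeneous term}}, \\
  G(\mathbf{r}|\mathbf{r}') &= \frac{e^{jk\|\mathbf{r}-\mathbf{r}'\|}}{4\pi\|\mathbf{r}-\mathbf{r}'\|}.
\end{align}
```

Here the homogeneous term satisfies the homogeneous Helmholtz equation everywhere and can therefore be expanded into plane waves.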

• Observe sound pressure distribution on plane
• Conversion into driving signals
– Direct source components: synthesize monopole sources [Spors+ AES Conv. 2008]
– Ambient components: apply WFR filtering method [Koyama+ IEEE TASLP 2013]
Decomposition into two components can lead to higher reproduction accuracy above spatial Nyquist freq

Sparse sound field representation
[Figure: microphone array, source components on grid points (discretization), and ambient components]
Sparsity-based signal decomposition
• Observed signal = (dictionary matrix of Green’s functions) × (distribution of source components) + ambient components
• Only a few elements of the distribution of source components have non-zero values under the assumption of spatially sparse source distribution
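A minimal Python sketch of how such a dictionary matrix could be assembled. It assumes free-field 3D Green's functions with the exp(+jkr)/(4πr) convention and a sound speed of 343 m/s. The array size and grid intervals follow the simulation experiment (32 microphones at 6 cm; 10 cm and 20 cm grid intervals over 2.4 × 2.4 m), but the placement of the region relative to the array, the active grid index, and the function name `green_dictionary` are hypothetical.

```python
import numpy as np

def green_dictionary(mic_pos, grid_pos, freq_hz, c=343.0):
    """Dictionary whose (m, n) entry is the free-field Green's function
    from grid point n to microphone m (assumed form exp(+jkr) / (4*pi*r))."""
    k = 2.0 * np.pi * freq_hz / c
    diff = mic_pos[:, None, :] - grid_pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)             # shape (M, N)
    return np.exp(1j * k * dist) / (4.0 * np.pi * dist)

# Linear microphone array along x (32 elements, 6 cm spacing) at y = 0.
mic_x = (np.arange(32) - 15.5) * 0.06
mic_pos = np.stack([mic_x, np.zeros(32), np.zeros(32)], axis=1)

# Hypothetical 2.4 m x 2.4 m source region behind the array,
# discretized at 10 cm (x) and 20 cm (y) intervals.
gx, gy = np.meshgrid(np.linspace(-1.2, 1.2, 25), np.linspace(-2.5, -0.1, 13))
grid_pos = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)

D = green_dictionary(mic_pos, grid_pos, freq_hz=1000.0)

# A spatially sparse source distribution: a single active grid point.
x = np.zeros(grid_pos.shape[0], dtype=complex)
x[100] = 1.0
y = D @ x            # observed signal without ambient components
print(D.shape, np.count_nonzero(x))
```

With far more grid points than microphones the system is underdetermined, which is why the sparsity assumption on the source distribution is needed to recover it.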

Sparse signal decomposition
• Sparse signal representation in vector form
• Signal decomposition based on sparsity of the distribution of source components
– Minimize a sparsity-inducing norm of the distribution of source components
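The norm being minimized is not legible in the source. The talk names FOCUSS as the basis of its decomposition algorithm, and FOCUSS-type methods conventionally target an ℓp-norm with 0 < p ≤ 1, so the sketch below assumes that form. It is a plain regularized FOCUSS iteration, not the authors' algorithm; the random dictionary, the parameter values, and the name `focuss` are illustrative assumptions.

```python
import numpy as np

def focuss(D, y, p=1.0, lam=1e-3, n_iter=30, eps=1e-12):
    """Regularized FOCUSS iteration seeking a sparse x with y ~= D x
    (lp-type penalty, 0 < p <= 1; all parameter values are assumptions)."""
    M = D.shape[0]
    DH = D.conj().T
    # Initialize with the regularized minimum-norm solution.
    x = DH @ np.linalg.solve(D @ DH + lam * np.eye(M), y)
    for _ in range(n_iter):
        w2 = np.abs(x) ** (2.0 - p) + eps            # squared reweighting factors
        gram = (D * w2) @ DH + lam * np.eye(M)       # D W^2 D^H + lam I
        x = w2 * (DH @ np.linalg.solve(gram, y))     # x = W^2 D^H (...)^-1 y
    return x

# Synthetic check with a random complex dictionary (a stand-in, not Green's functions).
rng = np.random.default_rng(0)
M, N = 30, 60
D = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0)
support = [7, 41]
x_true = np.zeros(N, dtype=complex)
x_true[support] = [1.0, 1.0j]
y = D @ x_true

x_hat = focuss(D, y)
print(sorted(np.argsort(np.abs(x_hat))[-2:].tolist()))
```

Each iteration solves a weighted least-squares problem whose weights come from the previous estimate, so small entries are driven further toward zero while a few dominant entries explain the observation.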

Group sparsity based on physical properties
• Sparse signal representation in vector form
[Figure: structure of sparsity induced by physical properties]
Group sparse signal models for robust decomposition
• Multiple time frames
• Temporal frequencies
• Multipole components
Decomposition algorithm extending FOCUSS
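One common way to extend FOCUSS to group sparsity is the M-FOCUSS scheme, in which all coefficients of a grid point (for example its values over multiple time frames) share a single weight. The sketch below shows that scheme under stated assumptions; it is not claimed to be the algorithm used in the talk, and the random dictionary, problem sizes, and the name `group_focuss` are hypothetical.

```python
import numpy as np

def group_focuss(D, Y, p=1.0, lam=1e-3, n_iter=30, eps=1e-12):
    """M-FOCUSS-style iteration: each row of X (one grid point) has one
    weight, so it is active or inactive jointly over all columns of Y."""
    M = D.shape[0]
    DH = D.conj().T
    X = DH @ np.linalg.solve(D @ DH + lam * np.eye(M), Y)
    for _ in range(n_iter):
        row_norm = np.linalg.norm(X, axis=1)         # one norm per group (row)
        w2 = row_norm ** (2.0 - p) + eps
        gram = (D * w2) @ DH + lam * np.eye(M)
        X = w2[:, None] * (DH @ np.linalg.solve(gram, Y))
    return X

# Synthetic check: 3 active grid points observed over 8 time frames.
rng = np.random.default_rng(1)
M, N, T = 24, 60, 8
D = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0)
support = [5, 22, 48]
X_true = np.zeros((N, T), dtype=complex)
X_true[support] = rng.standard_normal((3, T)) + 1j * rng.standard_normal((3, T))
Y = D @ X_true

X_hat = group_focuss(D, Y)
active = sorted(np.argsort(np.linalg.norm(X_hat, axis=1))[-3:].tolist())
print(active)
```

Sharing one weight per row encodes the physical assumption that a source position does not change between frames, which makes the decomposition more robust than treating every frame independently.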

Block diagram of signal conversion
• Decomposition stage
– Group sparse decomposition of the observed signal into source and ambient components
• Reconstruction stage
– Source and ambient components are respectively converted into driving signals
– Driving signal is obtained as sum of the two components

Simulation Experiment
• Proposed method (Proposed), WFR filtering method (WFR), and Sound Pressure Control method (SPC) were compared
• 32 microphones (6 cm intervals) and 48 loudspeakers (4 cm intervals)
• Source region: rectangular region of 2.4 × 2.4 m; grid points: (10 cm, 20 cm) intervals
• Source directivity: unidirectional
• Source signal: single-frequency sine wave
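For orientation, the spatial Nyquist frequencies implied by the two array spacings can be worked out with the usual half-wavelength rule f = c / (2d). The sound speed of 343 m/s and the helper name `spatial_nyquist_hz` are assumptions; the slides do not state either.

```python
def spatial_nyquist_hz(spacing_m: float, c: float = 343.0) -> float:
    """Frequency at which the element spacing equals half a wavelength."""
    if spacing_m <= 0.0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)

mic_limit_hz = spatial_nyquist_hz(0.06)   # 32 microphones, 6 cm intervals
spk_limit_hz = spatial_nyquist_hz(0.04)   # 48 loudspeakers, 4 cm intervals
print(f"microphone array:  {mic_limit_hz:.0f} Hz")
print(f"loudspeaker array: {spk_limit_hz:.0f} Hz")
```

Under these assumptions the microphone array's limit is roughly 2.9 kHz and the loudspeaker array's roughly 4.3 kHz, so the 1.0 kHz result lies below the microphone limit and the 4.0 kHz result above it.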

Simulation Experiment
• Signal-to-distortion ratio of reproduction (SDRR)
[Equation: SDRR defined from the original pressure distribution and the reproduced pressure distribution]
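The defining formula for SDRR is not legible in the source. The sketch below assumes the common energy-ratio form: the energy of the original pressure distribution divided by the energy of the difference between the reproduced and original distributions, expressed in dB. The function name `sdrr_db` and the toy data are hypothetical.

```python
import numpy as np

def sdrr_db(p_org, p_rep):
    """SDRR in dB under the assumed definition
    10 * log10( sum |p_org|^2 / sum |p_rep - p_org|^2 )."""
    p_org = np.asarray(p_org)
    p_rep = np.asarray(p_rep)
    signal = np.sum(np.abs(p_org) ** 2)
    distortion = np.sum(np.abs(p_rep - p_org) ** 2)
    return 10.0 * np.log10(signal / distortion)

# Toy example: a reproduction with a uniform 10 % amplitude error.
p_org = np.ones(100)
p_rep = 1.1 * p_org
value = sdrr_db(p_org, p_rep)
print(f"{value:.1f} dB")
```

The sums run over all evaluation points of the pressure distributions (and over time samples, if time-domain signals are used), so a single number summarizes the error over the whole evaluated region.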

Frequency vs. SDRR
• Source location: (-0.32, -0.84, 0.0) m
SDRRs above spatial Nyquist frequency were improved

Reproduced sound pressure distribution (1.0 kHz)
[Figure: pressure and error distributions for Proposed, WFR, and SPC]
SDRR: Proposed 18.1 dB, WFR 18.0 dB, SPC 19.4 dB

Reproduced sound pressure distribution (4.0 kHz)
[Figure: pressure and error distributions for Proposed, WFR, and SPC]
SDRR: Proposed 19.7 dB, WFR 6.8 dB, SPC 7.8 dB

Frequency response of reproduced sound field
• Frequency response at (0.0, 1.0, 0.0) m
Reproduced frequency response was improved

Conclusion
• Super-resolution sound field recording and reproduction based on sparse representation
– Conventional plane wave decomposition suffers from spatial aliasing artifacts
– Sound field representation using source and plane wave components
– Sound field decomposition based on spatial sparsity of source components
– Group sparsity based on physical properties of sound field
– Experimental results indicated that reproduction accuracy above spatial Nyquist frequency can be improved
Thank you for your attention!

Related publications
• S. Koyama and H. Saruwatari, “Sound field decomposition in reverberant environment using sparse and low-rank signal models,” Proc. IEEE ICASSP, 2016.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with multichannel extension of complex NMF,” Proc. IEEE ICASSP, 2016.
• S. Koyama, et al., “Sparse sound field decomposition using group sparse Bayesian learning,” Proc. APSIPA ASC, 2015.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction,” Proc. IEEE CAMSAP, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction with spherical arrays,” Proc. IEEE WASPAA, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 5, pp. 881-894, 2015.
• S. Koyama, et al., “Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction,” Proc. IEEE ICASSP, 2015.
• S. Koyama, et al., “Sparse sound field representation in recording and reproduction for reducing spatial aliasing artifacts,” Proc. IEEE ICASSP, 2014.