Super-resolution in sound field recording and
reproduction based on sparse representation
Shoichi Koyama1,2, Naoki Murata1, and Hiroshi Saruwatari1
1The University of Tokyo
2Paris Diderot University / Institute Langevin
November 29, 2016
Sound field reproduction for audio system
[Figure: microphone array in the recording area, loudspeaker array in the target area]
• A large listening area can be achieved
• Listeners can perceive source distance
• Real-time recording and reproduction can be achieved without recording engineers
Applications
• Telecommunication system
• Home theatre
• Live broadcasting
[Figure: microphone array in the recording area linked via NW to the loudspeaker array in the target area]
Improve reproduction accuracy when the number of array elements is small
• Number of microphones > number of loudspeakers
– Higher reproduction accuracy within a local region of the target area
• Number of microphones < number of loudspeakers
– Higher reproduction accuracy of sources in a local region of the recording area
[Koyama+ IEEE JSTSP2015], [Koyama+ ICASSP 2014, 2015]
[Ahrens+ AES Conv. 2010], [Ueno+ ICASSP 2017 (submitted)]
Sound Field Recording and Reproduction
[Figure: primary sources in the recording area; target area bounded by secondary sources]
Obtain the driving signals of the secondary sources (= loudspeakers) arranged on the boundary of the target area so as to reconstruct the desired sound field inside it
• Inherently, both the sound pressure and its gradient on the boundary are required to obtain the driving signals, but usually only the sound pressure is known
• Signal conversion is therefore necessary for sound field recording and reproduction with ordinary acoustic sensors and transducers
Conventional: WFR filtering method
[Figure: primary sources in the recording area are captured on the receiving plane; the received signals pass through signal conversion to give the driving signals of the secondary source plane in the target area; both sides are described by plane waves] [Koyama+ IEEE TASLP 2013]
• Each plane wave determines the entire sound field
• Signal conversion can be achieved in the spatial frequency domain
• Spatial aliasing artifacts arise from the plane wave decomposition
• Significant error occurs at high frequencies even when the number of microphones is smaller than the number of loudspeakers
Sound field representation for super-resolution
• Plane wave decomposition suffers from spatial aliasing artifacts because many basis functions are used
• Observed signals should be represented by a few basis functions for accurate interpolation of the sound field
• An appropriate basis function may be one close to the pressure distribution originating from the sound sources
• To obtain the driving signals of the loudspeakers, the basis functions must be fundamental solutions of the Helmholtz equation (e.g., Green's functions)
[Figure: sound field decomposition of the received signals into basis functions]
Sound field decomposition into fundamental solutions of the Helmholtz equation is necessary
Generative model of sound field [Koyama+ ICASSP 2014]
• The sound field is divided into two regions, described by the inhomogeneous and homogeneous Helmholtz equations
• The inhomogeneous equation contains the distribution of source components
• The sound pressure is the sum of inhomogeneous and homogeneous terms: the inhomogeneous term is expressed with Green's function, the homogeneous term with plane waves
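The equations on these slides are not present in the text, so the following is only a sketch of a standard form such a generative model could take. All symbols are assumptions introduced here for illustration: \Omega is the region containing the sources, Q the distribution of source components, k the wavenumber, G the free-field Green's function (time convention e^{-j\omega t}), and p_h a homogeneous term that can be written as a superposition of plane waves.

```latex
% Assumed notation, not taken from the slides.
\begin{align}
(\nabla^2 + k^2)\, p(\mathbf{r}) &=
  \begin{cases}
    -Q(\mathbf{r}), & \mathbf{r} \in \Omega \\
    0,              & \mathbf{r} \notin \Omega
  \end{cases} \\
p(\mathbf{r}) &= \int_{\Omega} Q(\mathbf{r}')\, G(\mathbf{r}|\mathbf{r}')\, \mathrm{d}\mathbf{r}' + p_{\mathrm{h}}(\mathbf{r}) \\
G(\mathbf{r}|\mathbf{r}') &= \frac{e^{jk\|\mathbf{r}-\mathbf{r}'\|}}{4\pi\|\mathbf{r}-\mathbf{r}'\|}
\end{align}
```

With this choice G satisfies (\nabla^2 + k^2) G = -\delta(\mathbf{r}-\mathbf{r}'), so the integral term accounts for the sources inside \Omega and p_h for everything that obeys the homogeneous equation.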
• Observe the sound pressure distribution on a plane
• Conversion into driving signals
– Direct source components: synthesize monopole sources [Spors+ AES Conv. 2008]
– Ambient components: apply the WFR filtering method [Koyama+ IEEE TASLP 2013]
Decomposition into two components can lead to higher reproduction accuracy above the spatial Nyquist frequency
Sparse sound field representation
[Figure: microphone array observing source components located on grid points, plus ambient components; discretization followed by sparsity-based signal decomposition]
• After discretization, the observed signal is modeled as the dictionary matrix of Green's functions multiplied by the distribution of source components, plus the ambient components
• Only a few elements of the source distribution have non-zero values under the assumption of a spatially sparse source distribution
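As a rough illustration of what a dictionary matrix of Green's functions could look like, the sketch below fills a matrix with free-field Green's functions between microphone positions and candidate grid points. The function name `green_dictionary`, the array geometry, the speed of sound (343 m/s) and the sign convention of the exponent are all assumptions made for this example; they are not taken from the talk.

```python
import numpy as np

def green_dictionary(mic_pos, grid_pos, freq, c=343.0):
    """Free-field Green's functions from every grid point to every microphone.

    mic_pos: (M, 3) microphone positions in metres
    grid_pos: (N, 3) candidate source positions in metres
    Returns a complex (M, N) matrix. Assumes the e^{-j*omega*t} time convention.
    """
    k = 2.0 * np.pi * freq / c  # wavenumber
    # Pairwise distances between microphones and grid points
    dist = np.linalg.norm(mic_pos[:, None, :] - grid_pos[None, :, :], axis=-1)
    return np.exp(1j * k * dist) / (4.0 * np.pi * dist)

# Hypothetical geometry: 32 microphones on the x-axis at 6 cm spacing,
# grid points behind the array at 10 cm (x) and 20 cm (y) spacing.
mics = np.stack([(np.arange(32) - 15.5) * 0.06,
                 np.zeros(32), np.zeros(32)], axis=1)
gx, gy = np.meshgrid(np.linspace(-1.2, 1.2, 25), np.linspace(-2.5, -0.1, 13))
grid = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)

D = green_dictionary(mics, grid, freq=1000.0)
print(D.shape)
```

Each column of `D` is the pressure pattern that a unit monopole at one grid point would produce on the array, so a spatially sparse source distribution selects only a few columns.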
Sparse signal decomposition
• Sparse signal representation in vector form
• Signal decomposition based on the sparsity of the distribution of source components: minimize a sparsity-inducing norm of that distribution
Group sparsity based on physical properties
Group sparse signal models for robust decomposition
• Multiple time frames
• Temporal frequencies
• Multipole components
Decomposition algorithm extending FOCUSS [Koyama+ ICASSP 2015]
• Sparse signal representation in vector form
[Figure: structure of sparsity induced by physical properties]
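To make the idea of group sparsity over multiple time frames concrete, here is a minimal sketch of regularized M-FOCUSS, a well-known multiple-measurement variant of FOCUSS in which all time frames share one set of active grid points. It is not the algorithm of the cited paper, only the kind of iteration that such an extension builds on. The function name `m_focuss`, the parameters `p`, `lam`, `n_iter`, and the random test dictionary are assumptions for illustration.

```python
import numpy as np

def m_focuss(D, Y, p=1.0, lam=1e-6, n_iter=50):
    """Regularized M-FOCUSS sketch.

    D: (M, N) dictionary, Y: (M, T) observations over T frames.
    Returns X of shape (N, T) whose rows tend to be jointly sparse.
    """
    m = D.shape[0]
    DH = D.conj().T
    # Minimum-norm solution as the starting point
    X = DH @ np.linalg.solve(D @ DH + lam * np.eye(m), Y)
    for _ in range(n_iter):
        # Squared weights from row norms: rows that are already small shrink further
        w2 = np.linalg.norm(X, axis=1) ** (2.0 - p)
        DW = D * w2  # same as D @ diag(w2)
        X = w2[:, None] * (DH @ np.linalg.solve(DW @ DH + lam * np.eye(m), Y))
    return X

# Hypothetical toy problem: 2 active rows out of 40, observed over 4 frames
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 40))
X_true = np.zeros((40, 4))
X_true[5] = [1.0, -0.8, 0.6, 1.2]
X_true[17] = [-0.9, 1.1, 0.7, -1.0]
Y = D @ X_true

X_hat = m_focuss(D, Y)
found = np.argsort(np.linalg.norm(X_hat, axis=1))[-2:]
print(sorted(found.tolist()))
```

Tying the weight of each row to its norm across all frames is what makes the decomposition group sparse: a grid point is either active in every frame or suppressed in every frame. The same pattern would apply to groups formed over temporal frequencies or multipole components.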
Block diagram of signal conversion
• Decomposition stage
– Group sparse decomposition of the observed signals
• Reconstruction stage
– The source components and ambient components are respectively converted into driving signals
– The final driving signal is obtained as the sum of the two components
Simulation Experiment
• The proposed method (Proposed), the WFR filtering method (WFR), and the sound pressure control method (SPC) were compared
• 32 microphones (6 cm intervals) and 48 loudspeakers (4 cm intervals)
• Gridded region: rectangular region of 2.4 × 2.4 m; grid points at (10 cm, 20 cm) intervals
• Source directivity: unidirectional
• Source signal: single-frequency sine wave
[Figure: recording area and target area]
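The spatial Nyquist frequency of a uniformly spaced array follows from its element spacing d as c / (2d). The sketch below evaluates this for the 6 cm and 4 cm spacings given in the setup and shows how a plane wave above that limit appears at a wrong (aliased) wavenumber. The speed of sound of 343 m/s, the function names, and the 60 degree incidence angle are assumptions for illustration.

```python
import math

C = 343.0  # assumed speed of sound in m/s (not stated in the slides)

def spatial_nyquist_hz(spacing_m, c=C):
    """Frequency at which half a wavelength equals the element spacing."""
    return c / (2.0 * spacing_m)

def aliased_wavenumber(kx, spacing_m):
    """Wrap a trace wavenumber into [-pi/d, pi/d), as seen by a sampled array."""
    ks = 2.0 * math.pi / spacing_m  # spatial sampling wavenumber
    return (kx + ks / 2.0) % ks - ks / 2.0

f_mic = spatial_nyquist_hz(0.06)  # microphone array, 6 cm spacing
f_spk = spatial_nyquist_hz(0.04)  # loudspeaker array, 4 cm spacing

# Hypothetical example: 4 kHz plane wave arriving 60 degrees off broadside
k = 2.0 * math.pi * 4000.0 / C
kx = k * math.sin(math.radians(60.0))
kx_seen = aliased_wavenumber(kx, 0.06)

print(f"Nyquist (mics): {f_mic:.0f} Hz, Nyquist (loudspeakers): {f_spk:.0f} Hz")
print(f"true kx = {kx:.1f} rad/m, observed kx = {kx_seen:.1f} rad/m")
```

Under these assumptions the microphone array limit falls between 1.0 kHz and 4.0 kHz, the two frequencies at which reproduced sound fields are reported.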
• Evaluation measure: signal-to-distortion ratio of reproduction (SDRR), computed from the original pressure distribution and the reproduced pressure distribution
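The exact SDRR formula is not given in the text. A common way to define a signal-to-distortion ratio between an original and a reproduced pressure distribution is the ratio of the original energy to the error energy, in decibels; the sketch below assumes that definition and a hypothetical function name `sdrr_db`, with both fields sampled on the same evaluation points.

```python
import numpy as np

def sdrr_db(p_org, p_rep):
    """Signal-to-distortion ratio in dB between original and reproduced pressure.

    Assumed definition: 10*log10(sum |p_org|^2 / sum |p_rep - p_org|^2).
    """
    p_org = np.asarray(p_org)
    p_rep = np.asarray(p_rep)
    signal = np.sum(np.abs(p_org) ** 2)
    distortion = np.sum(np.abs(p_rep - p_org) ** 2)
    return 10.0 * np.log10(signal / distortion)

# Toy check: a reproduction that is uniformly 10 % too loud
p_original = np.ones(100, dtype=complex)
p_reproduced = 1.1 * p_original
value = sdrr_db(p_original, p_reproduced)
print(f"{value:.1f} dB")
```

A 10 % amplitude error corresponds to an error energy 100 times below the signal energy, which gives a sense of scale for the dB values reported for the three methods.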
Frequency vs. SDRR
• Source location: (-0.32, -0.84, 0.0) m
[Figure: SDRR versus frequency]
SDRRs above the spatial Nyquist frequency were improved
Reproduced sound pressure distribution
• Source location: (-0.32, -0.84, 0.0) m
[Figure: pressure and error distributions for Proposed, WFR, and SPC at 1.0 kHz and 4.0 kHz]
SDRR:
Frequency   Proposed   WFR       SPC
1.0 kHz     18.1 dB    18.0 dB   19.4 dB
4.0 kHz     19.7 dB    6.8 dB    7.8 dB
Frequency response of reproduced sound field
• Frequency response at (0.0, 1.0, 0.0) m
Reproduced frequency response was improved
Conclusion
• Super-resolution sound field recording and reproduction based on sparse representation
– Conventional plane wave decomposition suffers from spatial aliasing artifacts
– Sound field representation using source and plane wave components
– Sound field decomposition based on spatial sparsity of source components
– Group sparsity based on physical properties of the sound field
– Experimental results indicated that reproduction accuracy above the spatial Nyquist frequency can be improved
Thank you for your attention!
Related publications
• S. Koyama and H. Saruwatari, “Sound field decomposition in reverberant environment using sparse and low-rank signal models,” Proc. IEEE ICASSP, 2016.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with multichannel extension of complex NMF,” Proc. IEEE ICASSP, 2016.
• S. Koyama, et al., “Sparse sound field decomposition using group sparse Bayesian learning,” Proc. APSIPA ASC, 2015.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction,” Proc. IEEE CAMSAP, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction with spherical arrays,” Proc. IEEE WASPAA, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 5, pp. 881-894, 2015.
• S. Koyama, et al., “Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction,” Proc. IEEE ICASSP, 2015.
• S. Koyama, et al., “Sparse sound field representation in recording and reproduction for reducing spatial aliasing artifacts,” Proc. IEEE ICASSP, 2014.

Koyama ASA ASJ joint meeting 2016

  • 1. Super-resolution in sound field recording and reproduction based on sparse representation Shoichi Koyama1,2, Naoki Murata1, and Hiroshi Saruwatari1 1The University of Tokyo 2Paris Diderot University / Institute Langevin
  • 2. November 29, 2016 Sound field reproduction for audio system Microphone array Loudspeaker array  Large listening area can be achieved  Listeners can perceive source distance  Real-time recording and reproduction can be achieved without recording engineers Recording area Target area
  • 3. Sound field reproduction for audio system. Applications: telecommunication system (NW), home theatre, live broadcasting. [Figure labels: microphone array, loudspeaker array, recording area, target area]
  • 4. Sound field reproduction for audio system. Goal: improve reproduction accuracy when the number of array elements is small. If # of microphones > # of loudspeakers: higher reproduction accuracy within a local region of the target area. If # of microphones < # of loudspeakers: higher reproduction accuracy for sources in a local region of the recording area. [Koyama+ IEEE JSTSP 2015], [Koyama+ ICASSP 2014, 2015], [Ahrens+ AES Conv. 2010], [Ueno+ ICASSP 2017 (submitted)] [Figure labels: microphone array, loudspeaker array, recording area, target area]
  • 5. Sound Field Recording and Reproduction. Obtain driving signals of secondary sources (= loudspeakers) arranged on the boundary of the target area to reconstruct the desired sound field inside it. Inherently, both the sound pressure and its gradient on the boundary are required to obtain the driving signals, but usually only the sound pressure is known. Signal conversion for sound field recording and reproduction with ordinary acoustic sensors and transducers is therefore necessary. [Figure labels: primary sources, recording area, target area]
  • 6. Conventional: WFR filtering method [Koyama+ IEEE TASLP 2013]. Each plane wave determines the entire sound field, so signal conversion can be achieved in the spatial frequency domain. [Figure labels: primary sources, recording area, receiving plane, received signals, plane wave, signal conversion, driving signals, secondary source plane, target area]
  • 7. Conventional: WFR filtering method [Koyama+ IEEE TASLP 2013]. Each plane wave determines the entire sound field. Plane wave decomposition causes spatial aliasing artifacts: the error is significant at high frequencies even when # of microphones < # of loudspeakers. [Figure labels: primary sources, recording area, receiving plane, received signals, plane wave, signal conversion, driving signals, secondary source plane, target area]
  • 8. Sound field representation for super-resolution. Plane wave decomposition suffers from spatial aliasing artifacts because many basis functions are used. Observed signals should be represented by a few basis functions for accurate interpolation of the sound field. An appropriate basis function may be close to the pressure distribution originating from the sound sources. To obtain driving signals of loudspeakers, the basis functions must be fundamental solutions of the Helmholtz equation (e.g., Green's functions). Sound field decomposition into fundamental solutions of the Helmholtz equation is therefore necessary. [Figure labels: received signals, sound field decomposition, basis function]
  • 9. Generative model of sound field [Koyama+ ICASSP 2014]. Inhomogeneous and homogeneous Helmholtz equations: the sound field is divided into two regions. [Equation label: distribution of source components]
  • 10. Generative model of sound field [Koyama+ ICASSP 2014]. Inhomogeneous and homogeneous Helmholtz equations: the sound field is the sum of inhomogeneous and homogeneous terms. [Equation labels: Green's function, plane wave]
  • 11. Generative model of sound field. Observe the sound pressure distribution on a plane and convert it into driving signals. Direct source components: synthesize monopole sources [Spors+ AES Conv. 2008]. Ambient components: apply the WFR filtering method [Koyama+ IEEE TASLP 2013]. Decomposition into these two components can lead to higher reproduction accuracy above the spatial Nyquist frequency.
  • 12. Sparse sound field representation. Discretization: the source region is discretized into grid points, and the signal observed by the microphone array is modeled as a dictionary matrix of Green's functions applied to the distribution of source components, plus ambient components. Only a few elements of the distribution of source components have non-zero values under the assumption of a spatially sparse source distribution, which enables sparsity-based signal decomposition.
  • 13. Sparse signal decomposition. Sparse signal representation in vector form. Signal decomposition is based on the sparsity of the distribution of source components: minimize a sparsity-promoting norm of that distribution.
  • 14. Group sparsity based on physical properties. Sparse signal representation in vector form, with the structure of sparsity induced by physical properties. Group sparse signal models for robust decomposition: multiple time frames, temporal frequencies, multipole components. Decomposition algorithm extending FOCUSS [Koyama+ ICASSP 2015].
  • 15. Block diagram of signal conversion. Decomposition stage: group sparse decomposition of the observed signal. Reconstruction stage: the source components and the ambient components are respectively converted into driving signals, and the final driving signal is obtained as the sum of the two.
  • 16. Simulation Experiment. The proposed method (Proposed), the WFR filtering method (WFR), and the sound pressure control method (SPC) were compared. 32 microphones (6 cm intervals) and 48 loudspeakers (4 cm intervals). Rectangular region of 2.4 × 2.4 m; grid points at (10 cm, 20 cm) intervals. Source directivity: unidirectional. Source signal: single-frequency sine wave. [Figure labels: recording area, target area]
  • 17. Simulation Experiment. Evaluation measure: signal-to-distortion ratio of reproduction (SDRR), computed from the original pressure distribution and the reproduced pressure distribution. [Figure labels: recording area, target area]
  • 18. Frequency vs. SDRR. Source location: (-0.32, -0.84, 0.0) m. SDRRs above the spatial Nyquist frequency were improved.
  • 19. Reproduced sound pressure distribution (1.0 kHz). Panels: pressure and error. Source location: (-0.32, -0.84, 0.0) m. SDRR: Proposed 18.1 dB, WFR 18.0 dB, SPC 19.4 dB.
  • 20. Reproduced sound pressure distribution (4.0 kHz). Panels: pressure and error. Source location: (-0.32, -0.84, 0.0) m. SDRR: Proposed 19.7 dB, WFR 6.8 dB, SPC 7.8 dB.
  • 21. Frequency response of reproduced sound field. Frequency response at (0.0, 1.0, 0.0) m. The reproduced frequency response was improved.
  • 22. Conclusion. Super-resolution sound field recording and reproduction based on sparse representation: conventional plane wave decomposition suffers from spatial aliasing artifacts; the sound field is represented using source and plane wave components; the sound field is decomposed based on the spatial sparsity of source components; group sparsity is based on physical properties of the sound field; experimental results indicated that reproduction accuracy above the spatial Nyquist frequency can be improved. Thank you for your attention!
  • 23. Related publications: • S. Koyama and H. Saruwatari, “Sound field decomposition in reverberant environment using sparse and low-rank signal models,” Proc. IEEE ICASSP, 2016. • N. Murata, S. Koyama, et al., “Sparse sound field decomposition with multichannel extension of complex NMF,” Proc. IEEE ICASSP, 2016. • S. Koyama, et al., “Sparse sound field decomposition using group sparse Bayesian learning,” Proc. APSIPA ASC, 2015. • N. Murata, S. Koyama, et al., “Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction,” Proc. IEEE CAMSAP, 2015. • S. Koyama, et al., “Source-location-informed sound field recording and reproduction with spherical arrays,” Proc. IEEE WASPAA, 2015. • S. Koyama, et al., “Source-location-informed sound field recording and reproduction,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 5, pp. 881-894, 2015. • S. Koyama, et al., “Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction,” Proc. IEEE ICASSP, 2015. • S. Koyama, et al., “Sparse sound field representation in recording and reproduction for reducing spatial aliasing artifacts,” Proc. IEEE ICASSP, 2014.
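
The sparse representation in slides 12 and 13 can be illustrated with a small numerical sketch. It is not the authors' algorithm: the slides minimize a norm of the source distribution and cite an algorithm extending FOCUSS, whereas this sketch swaps in plain orthogonal matching pursuit (OMP) because it is short and easy to verify. The speed of sound, the sign convention of the free-field Green's function, the grid placement behind the array, the on-grid source position, and the function names `green_dictionary` and `omp` are all assumptions made for illustration.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s (not stated on the slides)


def green_dictionary(mic_pos, grid_pos, freq):
    """Free-field Green's functions from every grid point to every microphone.

    Assumes the exp(+jkr) sign convention. Returns a complex matrix of
    shape (num_mics, num_grid_points).
    """
    k = 2.0 * np.pi * freq / C  # wavenumber
    # Pairwise distances: microphones along rows, grid points along columns
    dist = np.linalg.norm(mic_pos[:, None, :] - grid_pos[None, :, :], axis=-1)
    return np.exp(1j * k * dist) / (4.0 * np.pi * dist)


def omp(D, y, num_atoms):
    """Orthogonal matching pursuit, a greedy stand-in for the sparse solver."""
    col_norms = np.linalg.norm(D, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(num_atoms):
        # Pick the dictionary column most correlated with the residual
        corr = np.abs(D.conj().T @ residual) / col_norms
        if support:
            corr[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit on all selected columns
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1], dtype=complex)
    x[support] = coef
    return x


# 32 microphones at 6 cm intervals on the x-axis, as in the simulation slide
mic_x = (np.arange(32) - 15.5) * 0.06
mic_pos = np.stack([mic_x, np.zeros(32), np.zeros(32)], axis=1)

# Hypothetical 2.4 x 2.4 m grid behind the array: 10 cm steps in x, 20 cm in y
gx, gy = np.meshgrid(np.linspace(-1.2, 1.2, 25),
                     np.linspace(-2.6, -0.2, 13), indexing="ij")
grid_pos = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)

D = green_dictionary(mic_pos, grid_pos, freq=1000.0)

# One unit-amplitude source placed exactly on a grid point (an idealization)
target = np.array([-0.3, -0.8, 0.0])
true_idx = int(np.argmin(np.linalg.norm(grid_pos - target, axis=1)))
x_true = np.zeros(D.shape[1], dtype=complex)
x_true[true_idx] = 1.0
y = D @ x_true  # noiseless observation without ambient components

x_hat = omp(D, y, num_atoms=1)
print("true index:", true_idx, "estimated index:", int(np.argmax(np.abs(x_hat))))
```

In this idealized, noiseless, on-grid case a single greedy step is enough to find the source. Off-grid sources, ambient components, and noise are where the group sparse models of slide 14 would matter, and this sketch does not cover them.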

Editor's notes

  1. These are the reproduced sound pressure and time-averaged squared error distributions of the three methods at 1 kHz. The white region indicates the region of high reproduction accuracy. Because 1.0 kHz is below the spatial Nyquist frequency, the sound field was accurately reproduced using these three methods.
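
As a rough check of the statement that 1.0 kHz lies below the spatial Nyquist frequency, the limit can be estimated from the element spacings given on the simulation slide. This is a sketch under assumptions: it uses the common half-wavelength rule f = c / (2d) and a speed of sound of 343 m/s, neither of which is stated in the slides, and the function name `spatial_nyquist_hz` is hypothetical.

```python
C = 343.0  # assumed speed of sound in m/s (not stated on the slides)


def spatial_nyquist_hz(spacing_m, c=C):
    """Frequency at which the element spacing equals half a wavelength."""
    return c / (2.0 * spacing_m)


mic_limit = spatial_nyquist_hz(0.06)  # microphones at 6 cm intervals
spk_limit = spatial_nyquist_hz(0.04)  # loudspeakers at 4 cm intervals
print(f"microphone array: {mic_limit:.1f} Hz")
print(f"loudspeaker array: {spk_limit:.1f} Hz")
```

Under these assumptions the microphone array limit is a little under 2.9 kHz, so 1.0 kHz falls below it and 4.0 kHz falls above it, which is consistent with the SDRR values reported for the two frequencies.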