SlideShare una empresa de Scribd logo
1 de 32
Descargar para leer sin conexión
Convolutional Neural Network to Model
Articulation Impairments in Patients with
Parkinson’s Disease
Juan Camilo V´asquez-Correa1,2
Juan Rafael Orozco-Arroyave1,2, Elmar N¨oth2
1GITA research group, University of Antioquia UdeA.
2Pattern recognition Lab. Friedrich Alexander Universit¨at. Erlangen-N¨urnberg.
jcamilo.vasquez@udea.edu.co
18th INTERSPEECH, 2017
1 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
2 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
3 / 32
Introduction: Parkinson’s Disease
Second most prevalent neurologi-
cal disorder worldwide.
Patients develop several motor
and non-motor impairments. (O.
Hornykiewicz 1998).
Speech impairments are one of
the earliest manifestations.
4 / 32
Introduction: Speech impairments
Speech impairments in PD patients: hypokinetic dysarthria
Phonation
Articulation
Prosody Intelligibility
pataka pataka
5 / 32
Introduction: Imprecise articulation
One of the most deviant speech dimensions in PD.
Reduced velocity of lip, tongue, and jaw movements.
Strong indication of the literature statement: imprecise con-
sonants caused by reduced range of movements of ar-
ticulators
pa ta ka
6 / 32
Introduction: Hypothesis
PD patients have difficulties to begin and to stop the vocal fold
vibration, and such difficulties can be observed on speech sig-
nals by modeling the transitions between voiced and unvoiced
sounds
Voiced
Voiced UnvoicedVoicedUnvoiced
Unvoiced
Onset transition Offset transition
7 / 32
Introduction: Aims
To model the time-frequency (TF) information provided by
the onset and offset transitions: short-time Fourier trans-
form (STFT) and continuous wavelet transform (CWT).
To “learn” features from time-frequency representations: con-
volutional neural network (CNN).
Why TF and feature-learning? both have been successfully
used in several paralinguistics tasks: emotion, deception,
depression, and others.
8 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
9 / 32
Methods
Transitions
detection
Time frequency
representations
Convolutional
neural network
10 / 32
Methods: Transitions detection
Transitions
detection
Time frequency
representations
Convolutional
neural network
Onset transition Offset transition
Onset and offset are detected according to the presence
of the fundamental frequency.
11 / 32
Methods: Time-frequency representation
Transitions
detection
Time frequency
representations
Convolutional
neural network
0.000.020.040.060.080.100.120.14
Time (s)
500
1000
1500
2000
2500
3000
3500
4000
Frequency(Hz)
0.000.020.040.060.080.100.120.14
Time (s)
STFT of onset for a PD patient (left) and a HC subject (right)
Play PD Play HC
12 / 32
Methods: Time-frequency representation
Transitions
detection
Time frequency
representations
Convolutional
neural network
0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14
Time (s)
100
200
300
400
500
Scale
0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14
Time (s)
CWT of onset for a PD patient (left) and a HC subject (right)
13 / 32
Methods: Convolutional neural network
Transitions
detection
Time frequency
representations
Convolutional
neural network
Convolution layer I Convolution layer IIMax-pool. layer 1 Max-pool layer 2 Fully conected MLP
Input layer
PD
vs.
HC
Feature maps 1
Feature maps 2
CNN learns high–level representations from the low–level raw data
14 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
15 / 32
Data
Three databases with recordings in three languages: Span-
ish, German, and Czech.
Diadochokinetic exercises, isolated sentences, read texts,
and monologues.
16 / 32
Data
Language Description
Spanish 50 Patients and 50 Healthy controls.
Balanced in age (60 years old) and gender.
Patients in middle state of the disease.
German 88 Patients and 88 Healthy controls.
Balanced in age (64 years old).
patients in low and middle state of the disease.
Czech 20 Patients and 15 Healthy controls.
All male speakers.
Patients diagnosed during recording session.
Table: Databases
17 / 32
Experiments and validation
Classification of PD patients vs. HC subjects in the same
language.
10 fold cross-validation: 8 for training, 1 to optimize
hyper-parameters, and 1 for test.
Cross-language classification.
One language used for train and validation and other
language used for test.
18 / 32
Experiments and validation
Results are compared respect to previous studies1.
Support vector machine
1Juan Camilo V´asquez-Correa et al. “Effect of acoustic conditions on algorithms to
detect Parkinson’s disease from speech”. In: International Conference on Acoustics,
Speech and Signal Processing (ICASSP),. 2017, pp. 5065–5069.
19 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
20 / 32
Results: same language for train and test
TFR Onset Offset Onset+Offset
Spanish
CNN-STFT 85.3 81.6 85.9
CNN-CWT 84.2 81.8 85.2
Baseline 69.3 69.6 71.6
German
CNN-STFT 70.3 68.0 75.0
CNN-CWT 68.0 66.9 70.5
Baseline 72.7 70.9 74.0
Czech
CNN-STFT 77.9 80.4 84.4
CNN-CWT 89.2 87.7 89.4
Baseline 75.3 74.4 78.8
21 / 32
Results: same language for train and test
TFR Onset Offset Onset+Offset
Spanish
CNN-STFT 85.3 81.6 85.9
CNN-CWT 84.2 81.8 85.2
Baseline 69.3 69.6 71.6
German
CNN-STFT 70.3 68.0 75.0
CNN-CWT 68.0 66.9 70.5
Baseline 72.7 70.9 74.0
Czech
CNN-STFT 77.9 80.4 84.4
CNN-CWT 89.2 87.7 89.4
Baseline 75.3 74.4 78.8
22 / 32
Results: same language for train and test
TFR Onset Offset Onset+Offset
Spanish
CNN-STFT 85.3 81.6 85.9
CNN-CWT 84.2 81.8 85.2
Baseline 69.3 69.6 71.6
German
CNN-STFT 70.3 68.0 75.0
CNN-CWT 68.0 66.9 70.5
Baseline 72.7 70.9 74.0
Czech
CNN-STFT 77.9 80.4 84.4
CNN-CWT 89.2 87.7 89.4
Baseline 75.3 74.4 78.8
23 / 32
Results: same language for train and test
50 100 150
Time (ms)
0
1000
2000
3000
4000Frequency(Hz)
50 100 150
Time (ms)
Low Energy High Energy
Figure: Output of the CNN after the last max–pool layer: PD patient
(left) and a HC speaker (right)
24 / 32
Results: same language for train and test
Speech tasks Spanish German Czech
read text 85.0 70.3 88.5
monologue 85.6 70.3 89.1
/pa-ta-ka/ 85.4 70.7 89.2
25 / 32
Results: different language for train and test
Test Lang. TFR onset offset onset+offset
Train with Spanish
German CNN-STFT 51.7 50.2 54.7
German Baseline 53.7 55.0 54.1
Czech CNN-CWT 55.2 55.4 57.9
Czech Baseline 60.3 57.4 60.4
Train with German
Spanish CNN-STFT 58.0 55.7 55.8
Spanish Baseline 53.5 53.5 53.6
Czech CNN-STFT 53.0 52.4 53.0
Czech Baseline 50.9 51.7 52.6
Train with Czech
Spanish CNN-CWT 53.8 56.3 56.7
Spanish Baseline 53.4 51.6 52.4
German CNN-STFT 54.0 51.8 54.0
German Baseline 51.2 51.0 50.7
26 / 32
Results: different language for train and test
Test Lang. TFR onset offset onset+offset
Train with Spanish
German CNN-STFT 51.7 50.2 54.7
German Baseline 53.7 55.0 54.1
Czech CNN-CWT 55.2 55.4 57.9
Czech Baseline 60.3 57.4 60.4
Train with German
Spanish CNN-STFT 58.0 55.7 55.8
Spanish Baseline 53.5 53.5 53.6
Czech CNN-STFT 53.0 52.4 53.0
Czech Baseline 50.9 51.7 52.6
Train with Czech
Spanish CNN-CWT 53.8 56.3 56.7
Spanish Baseline 53.4 51.6 52.4
German CNN-STFT 54.0 51.8 54.0
German Baseline 51.2 51.0 50.7
27 / 32
Results: different language for train and test
Test Lang. TFR onset offset onset+offset
Train with Spanish
German CNN-STFT 51.7 50.2 54.7
German Baseline 53.7 55.0 54.1
Czech CNN-CWT 55.2 55.4 57.9
Czech Baseline 60.3 57.4 60.4
Train with German
Spanish CNN-STFT 58.0 55.7 55.8
Spanish Baseline 53.5 53.5 53.6
Czech CNN-STFT 53.0 52.4 53.0
Czech Baseline 50.9 51.7 52.6
Train with Czech
Spanish CNN-CWT 53.8 56.3 56.7
Spanish Baseline 53.4 51.6 52.4
German CNN-STFT 54.0 51.8 54.0
German Baseline 51.2 51.0 50.7
28 / 32
Outline
Introduction
Methods
Experimental framework
Results
Conclusion
29 / 32
Conclusion
A deep learning approach is proposed to model articulation
impairments of PD patients.
Voiced-Unvoiced transitions are modeled with CNNs using
STFT and CWT.
30 / 32
Conclusion
The proposed method is able to classify PD patients and
HC subjects and improves the baseline when the language
used for train and test is the same.
Additional approaches should be proposed when the train
and test language are different.
Recurrent neural networks and other architectures may be
considered to assess co-articulation.
Deep learning approaches trained with phonation, articula-
tion, and prosody information may be addressed to evaluate
specific speech impairments.
31 / 32
Convolutional Neural Network to Model
Articulation Impairments in Patients with
Parkinson’s Disease
Juan Camilo V´asquez-Correa1,2
Juan Rafael Orozco-Arroyave1,2, Elmar N¨oth2
1GITA research group, University of Antioquia UdeA.
2Pattern recognition Lab. Friedrich Alexander Universit¨at. Erlangen-N¨urnberg.
jcamilo.vasquez@udea.edu.co
18th INTERSPEECH, 2017
32 / 32

Más contenido relacionado

Similar a Convolutional Neural Network to Model Articulation Impairments in Patients with Parkinson’s Disease

ICT-GroupProject-Report2-NguyenDangHoa_2
ICT-GroupProject-Report2-NguyenDangHoa_2ICT-GroupProject-Report2-NguyenDangHoa_2
ICT-GroupProject-Report2-NguyenDangHoa_2Minh Tuan Nguyen
 
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014multimediaeval
 
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...niranjan kumar
 
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_finalLawrence Hwang
 
SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...Chester Chen
 
OHDSI Symposium 2016 - ATLAS Demo loop
OHDSI Symposium 2016 - ATLAS Demo loopOHDSI Symposium 2016 - ATLAS Demo loop
OHDSI Symposium 2016 - ATLAS Demo loopjmbanda
 
2016 bioinformatics i_wim_vancriekinge_vupload
2016 bioinformatics i_wim_vancriekinge_vupload2016 bioinformatics i_wim_vancriekinge_vupload
2016 bioinformatics i_wim_vancriekinge_vuploadProf. Wim Van Criekinge
 
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)AI.academy
 
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐCHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐlykhnh386525
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...Servio Fernando Lima Reina
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030GenomeInABottle
 
Multisite UTE 31P Rosette MRSI(PETALUTE)
Multisite UTE 31P Rosette MRSI(PETALUTE)Multisite UTE 31P Rosette MRSI(PETALUTE)
Multisite UTE 31P Rosette MRSI(PETALUTE)Uzay Emir
 
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Databricks
 
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講AI.academy
 
Development and verification of an Ion AmpliSeq TP53 Panel
Development and verification of an Ion AmpliSeq TP53 PanelDevelopment and verification of an Ion AmpliSeq TP53 Panel
Development and verification of an Ion AmpliSeq TP53 PanelThermo Fisher Scientific
 
A rapid library preparation method with custom assay designs for detection of...
A rapid library preparation method with custom assay designs for detection of...A rapid library preparation method with custom assay designs for detection of...
A rapid library preparation method with custom assay designs for detection of...Thermo Fisher Scientific
 

Similar a Convolutional Neural Network to Model Articulation Impairments in Patients with Parkinson’s Disease (20)

ICT-GroupProject-Report2-NguyenDangHoa_2
ICT-GroupProject-Report2-NguyenDangHoa_2ICT-GroupProject-Report2-NguyenDangHoa_2
ICT-GroupProject-Report2-NguyenDangHoa_2
 
Telemonitoring of Four Characteristic Parameters of Acoustic Vocal Signal in...
Telemonitoring of Four Characteristic Parameters of Acoustic  Vocal Signal in...Telemonitoring of Four Characteristic Parameters of Acoustic  Vocal Signal in...
Telemonitoring of Four Characteristic Parameters of Acoustic Vocal Signal in...
 
transformers_multimodal_ehr.pdf
transformers_multimodal_ehr.pdftransformers_multimodal_ehr.pdf
transformers_multimodal_ehr.pdf
 
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
 
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
 
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final
6-8-2015 AACC Poster HIV p24 S-PLEX - Stengelin_final
 
SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...
 
BioSB meeting 2015
BioSB meeting 2015BioSB meeting 2015
BioSB meeting 2015
 
OHDSI Symposium 2016 - ATLAS Demo loop
OHDSI Symposium 2016 - ATLAS Demo loopOHDSI Symposium 2016 - ATLAS Demo loop
OHDSI Symposium 2016 - ATLAS Demo loop
 
2016 bioinformatics i_wim_vancriekinge_vupload
2016 bioinformatics i_wim_vancriekinge_vupload2016 bioinformatics i_wim_vancriekinge_vupload
2016 bioinformatics i_wim_vancriekinge_vupload
 
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)
台灣人工智慧學校北部智慧醫療專班開學典禮 - 主題演講:邁向智慧醫療新時代(陳昇瑋執行長)
 
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐCHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030
 
Multisite UTE 31P Rosette MRSI(PETALUTE)
Multisite UTE 31P Rosette MRSI(PETALUTE)Multisite UTE 31P Rosette MRSI(PETALUTE)
Multisite UTE 31P Rosette MRSI(PETALUTE)
 
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
 
NGS and the molecular basis of disease: a practical view
NGS and the molecular basis of disease: a practical viewNGS and the molecular basis of disease: a practical view
NGS and the molecular basis of disease: a practical view
 
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講
邁向智慧醫療新時代_台灣人工智慧學校中部智慧醫療專班開學主題演講
 
Development and verification of an Ion AmpliSeq TP53 Panel
Development and verification of an Ion AmpliSeq TP53 PanelDevelopment and verification of an Ion AmpliSeq TP53 Panel
Development and verification of an Ion AmpliSeq TP53 Panel
 
A rapid library preparation method with custom assay designs for detection of...
A rapid library preparation method with custom assay designs for detection of...A rapid library preparation method with custom assay designs for detection of...
A rapid library preparation method with custom assay designs for detection of...
 

Último

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxpritamlangde
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilVinayVitekari
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdfAldoGarca30
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksMagic Marks
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxmaisarahman1
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...drmkjayanthikannan
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...jabtakhaidam7
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfsumitt6_25730773
 

Último (20)

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 

Convolutional Neural Network to Model Articulation Impairments in Patients with Parkinson’s Disease

  • 1. Convolutional Neural Network to Model Articulation Impairments in Patients with Parkinson’s Disease Juan Camilo V´asquez-Correa1,2 Juan Rafael Orozco-Arroyave1,2, Elmar N¨oth2 1GITA research group, University of Antioquia UdeA. 2Pattern recognition Lab. Friedrich Alexander Universit¨at. Erlangen-N¨urnberg. jcamilo.vasquez@udea.edu.co 18th INTERSPEECH, 2017 1 / 32
  • 4. Introduction: Parkinson’s Disease Second most prevalent neurologi- cal disorder worldwide. Patients develop several motor and non-motor impairments. (O. Hornykiewicz 1998). Speech impairments are one of the earliest manifestations. 4 / 32
  • 5. Introduction: Speech impairments Speech impairments in PD patients: hypokinetic dysarthria Phonation Articulation Prosody Intelligibility pataka pataka 5 / 32
  • 6. Introduction: Imprecise articulation One of the most deviant speech dimensions in PD. Reduced velocity of lip, tongue, and jaw movements. Strong indication of the literature statement: imprecise con- sonants caused by reduced range of movements of ar- ticulators pa ta ka 6 / 32
  • 7. Introduction: Hypothesis PD patients have difficulties to begin and to stop the vocal fold vibration, and such difficulties can be observed on speech sig- nals by modeling the transitions between voiced and unvoiced sounds Voiced Voiced UnvoicedVoicedUnvoiced Unvoiced Onset transition Offset transition 7 / 32
  • 8. Introduction: Aims To model the time-frequency (TF) information provided by the onset and offset transitions: short-time Fourier trans- form (STFT) and continuous wavelet transform (CWT). To “learn” features from time-frequency representations: con- volutional neural network (CNN). Why TF and feature-learning? both have been successfully used in several paralinguistics tasks: emotion, deception, depression, and others. 8 / 32
  • 11. Methods: Transitions detection Transitions detection Time frequency representations Convolutional neural network Onset transition Offset transition Onset and offset are detected according to the presence of the fundamental frequency. 11 / 32
  • 12. Methods: Time-frequency representation Transitions detection Time frequency representations Convolutional neural network 0.000.020.040.060.080.100.120.14 Time (s) 500 1000 1500 2000 2500 3000 3500 4000 Frequency(Hz) 0.000.020.040.060.080.100.120.14 Time (s) STFT of onset for a PD patient (left) and a HC subject (right) Play PD Play HC 12 / 32
  • 13. Methods: Time-frequency representation Transitions detection Time frequency representations Convolutional neural network 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 Time (s) 100 200 300 400 500 Scale 0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 Time (s) CWT of onset for a PD patient (left) and a HC subject (right) 13 / 32
  • 14. Methods: Convolutional neural network Transitions detection Time frequency representations Convolutional neural network Convolution layer I Convolution layer IIMax-pool. layer 1 Max-pool layer 2 Fully conected MLP Input layer PD vs. HC Feature maps 1 Feature maps 2 CNN learns high–level representations from the low–level raw data 14 / 32
  • 16. Data Three databases with recordings in three languages: Span- ish, German, and Czech. Diadochokinetic exercises, isolated sentences, read texts, and monologues. 16 / 32
  • 17. Data Language Description Spanish 50 Patients and 50 Healthy controls. Balanced in age (60 years old) and gender. Patients in middle state of the disease. German 88 Patients and 88 Healthy controls. Balanced in age (64 years old). patients in low and middle state of the disease. Czech 20 Patients and 15 Healthy controls. All male speakers. Patients diagnosed during recording session. Table: Databases 17 / 32
  • 18. Experiments and validation Classification of PD patients vs. HC subjects in the same language. 10 fold cross-validation: 8 for training, 1 to optimize hyper-parameters, and 1 for test. Cross-language classification. One language used for train and validation and other language used for test. 18 / 32
  • 19. Experiments and validation Results are compared respect to previous studies1. Support vector machine 1Juan Camilo V´asquez-Correa et al. “Effect of acoustic conditions on algorithms to detect Parkinson’s disease from speech”. In: International Conference on Acoustics, Speech and Signal Processing (ICASSP),. 2017, pp. 5065–5069. 19 / 32
  • 21. Results: same language for train and test TFR Onset Offset Onset+Offset Spanish CNN-STFT 85.3 81.6 85.9 CNN-CWT 84.2 81.8 85.2 Baseline 69.3 69.6 71.6 German CNN-STFT 70.3 68.0 75.0 CNN-CWT 68.0 66.9 70.5 Baseline 72.7 70.9 74.0 Czech CNN-STFT 77.9 80.4 84.4 CNN-CWT 89.2 87.7 89.4 Baseline 75.3 74.4 78.8 21 / 32
  • 22. Results: same language for train and test TFR Onset Offset Onset+Offset Spanish CNN-STFT 85.3 81.6 85.9 CNN-CWT 84.2 81.8 85.2 Baseline 69.3 69.6 71.6 German CNN-STFT 70.3 68.0 75.0 CNN-CWT 68.0 66.9 70.5 Baseline 72.7 70.9 74.0 Czech CNN-STFT 77.9 80.4 84.4 CNN-CWT 89.2 87.7 89.4 Baseline 75.3 74.4 78.8 22 / 32
  • 23. Results: same language for train and test TFR Onset Offset Onset+Offset Spanish CNN-STFT 85.3 81.6 85.9 CNN-CWT 84.2 81.8 85.2 Baseline 69.3 69.6 71.6 German CNN-STFT 70.3 68.0 75.0 CNN-CWT 68.0 66.9 70.5 Baseline 72.7 70.9 74.0 Czech CNN-STFT 77.9 80.4 84.4 CNN-CWT 89.2 87.7 89.4 Baseline 75.3 74.4 78.8 23 / 32
  • 24. Results: same language for train and test 50 100 150 Time (ms) 0 1000 2000 3000 4000Frequency(Hz) 50 100 150 Time (ms) Low Energy High Energy Figure: Output of the CNN after the last max–pool layer: PD patient (left) and a HC speaker (right) 24 / 32
  • 25. Results: same language for train and test Speech tasks Spanish German Czech read text 85.0 70.3 88.5 monologue 85.6 70.3 89.1 /pa-ta-ka/ 85.4 70.7 89.2 25 / 32
  • 26. Results: different language for train and test Test Lang. TFR onset offset onset+offset Train with Spanish German CNN-STFT 51.7 50.2 54.7 German Baseline 53.7 55.0 54.1 Czech CNN-CWT 55.2 55.4 57.9 Czech Baseline 60.3 57.4 60.4 Train with German Spanish CNN-STFT 58.0 55.7 55.8 Spanish Baseline 53.5 53.5 53.6 Czech CNN-STFT 53.0 52.4 53.0 Czech Baseline 50.9 51.7 52.6 Train with Czech Spanish CNN-CWT 53.8 56.3 56.7 Spanish Baseline 53.4 51.6 52.4 German CNN-STFT 54.0 51.8 54.0 German Baseline 51.2 51.0 50.7 26 / 32
  • 27. Results: different language for train and test Test Lang. TFR onset offset onset+offset Train with Spanish German CNN-STFT 51.7 50.2 54.7 German Baseline 53.7 55.0 54.1 Czech CNN-CWT 55.2 55.4 57.9 Czech Baseline 60.3 57.4 60.4 Train with German Spanish CNN-STFT 58.0 55.7 55.8 Spanish Baseline 53.5 53.5 53.6 Czech CNN-STFT 53.0 52.4 53.0 Czech Baseline 50.9 51.7 52.6 Train with Czech Spanish CNN-CWT 53.8 56.3 56.7 Spanish Baseline 53.4 51.6 52.4 German CNN-STFT 54.0 51.8 54.0 German Baseline 51.2 51.0 50.7 27 / 32
  • 28. Results: different language for train and test Test Lang. TFR onset offset onset+offset Train with Spanish German CNN-STFT 51.7 50.2 54.7 German Baseline 53.7 55.0 54.1 Czech CNN-CWT 55.2 55.4 57.9 Czech Baseline 60.3 57.4 60.4 Train with German Spanish CNN-STFT 58.0 55.7 55.8 Spanish Baseline 53.5 53.5 53.6 Czech CNN-STFT 53.0 52.4 53.0 Czech Baseline 50.9 51.7 52.6 Train with Czech Spanish CNN-CWT 53.8 56.3 56.7 Spanish Baseline 53.4 51.6 52.4 German CNN-STFT 54.0 51.8 54.0 German Baseline 51.2 51.0 50.7 28 / 32
  • 30. Conclusion A deep learning approach is proposed to model articulation impairments of PD patients. Voiced-Unvoiced transitions are modeled with CNNs using STFT and CWT. 30 / 32
  • 31. Conclusion The proposed method is able to classify PD patients and HC subjects and improves the baseline when the language used for train and test is the same. Additional approaches should be proposed when the train and test language are different. Recurrent neural networks and other architectures may be considered to assess co-articulation. Deep learning approaches trained with phonation, articula- tion, and prosody information may be addressed to evaluate specific speech impairments. 31 / 32
  • 32. Convolutional Neural Network to Model Articulation Impairments in Patients with Parkinson’s Disease Juan Camilo V´asquez-Correa1,2 Juan Rafael Orozco-Arroyave1,2, Elmar N¨oth2 1GITA research group, University of Antioquia UdeA. 2Pattern recognition Lab. Friedrich Alexander Universit¨at. Erlangen-N¨urnberg. jcamilo.vasquez@udea.edu.co 18th INTERSPEECH, 2017 32 / 32