Automatic Coherence Analysis of Dutch
Henk van den Heuvel
Jet Hoek, Micha Hulsbosch, Erwin Komen, Ted Sanders, Wilbert Spooren
Student assistants: Iris Hofstra, Patrick Sonsma
Coherence relations in discourse
A coherence relation (CR) consists of two discourse segments and, optionally, a connective:
[S1 The temperature rose] [connective because] [S2 the sun was shining]
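The bracket notation above maps naturally onto a small data structure. The following is a minimal sketch, not part of ACAD: the class name `CoherenceRelation`, the function `parse_relation`, and the regular expression are illustrative assumptions. It shows that the connective is the only optional part of a relation.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoherenceRelation:
    """Hypothetical container for one coherence relation."""
    segment1: str
    segment2: str
    connective: Optional[str] = None  # a relation may lack a connective


# Matches the slide notation: [S1 ...] [connective ...] [S2 ...]
# The connective bracket is optional.
_PATTERN = re.compile(
    r"\[S1 (?P<s1>[^\]]+)\]\s*"
    r"(?:\[connective (?P<conn>[^\]]+)\]\s*)?"
    r"\[S2 (?P<s2>[^\]]+)\]"
)


def parse_relation(text: str) -> CoherenceRelation:
    """Parse one relation written in the bracket notation."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in bracket notation: {text!r}")
    # match["conn"] is None when no connective bracket is present
    return CoherenceRelation(match["s1"], match["s2"], match["conn"])


relation = parse_relation(
    "[S1 The temperature rose] [connective because] [S2 the sun was shining]"
)
implicit = parse_relation("[S1 It rained] [S2 the streets got wet]")
```

Here `relation.connective` holds the explicit connective, while `implicit.connective` is `None` for a relation that is not marked by a connective.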
Common research questions about coherence relations and connectives
The advantage of using automatic analyses
• Less dependent on manual analyses
- higher reliability
- larger samples
- larger number of genres
ACAD: Automatic Coherence Analysis of Dutch
Goals of ACAD
• Build a search interface on the basis of existing CLARIAH
components
- corpora like SoNaR, VU-DNC, CGN
- parsers like Alpino
- formats like FoLiA
- search facilities like CorpusStudio
• Make it possible for computationally uninitiated discourse analysts
to formulate sophisticated search queries
- translated into XQuery in the backend
• Make analyses reproducible (and consequently more
transparent)
• Extend the available corpora
- newspaper texts (NRC and NRC.nl) from different genres (hard news,
opinion, background stories) on related topics
- WhatsApp data from different age groups (13/14, 20-25)
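To illustrate the idea of translating a simple search request into XQuery, here is a minimal sketch. It is not the query that Cesar or CorpusStudio actually generates: the function `build_xquery` is hypothetical, and the query assumes FoLiA documents in which sentences are `s` elements, words are `w` elements, and word text sits in `t` elements.

```python
FOLIA_NS = "http://ilk.uvt.nl/folia"  # FoLiA XML namespace


def build_xquery(connective: str) -> str:
    """Return an XQuery selecting sentences that contain the connective.

    Hypothetical helper: the real ACAD backend may build queries differently.
    """
    if '"' in connective:
        raise ValueError("connective must not contain double quotes")
    return "\n".join([
        f'declare namespace folia = "{FOLIA_NS}";',
        "for $s in //folia:s",
        # case-insensitive match on the word text
        f'where $s//folia:w/folia:t[lower-case(.) = "{connective.lower()}"]',
        "return $s",
    ])


query = build_xquery("Omdat")
print(query)
```

The analyst only supplies the connective; the namespace declaration and the FLWOR expression stay hidden in the backend, which is the division of labour the goals above describe.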
The search interface: Cesar
How does it work?
Editing the search: specification of variables
Controlling the output
Carrying out the search: choose the corpus
Results
ACAD: where do we go from here?
• Need for manuals to instruct the computationally
uninitiated discourse analyst
• The potential of ACAD
- investigate other connectives (contrastive, conditional, additive)
- investigate other issues, e.g.,
- prototypical positioning of various connectives (i.e., before both
segments or between the two segments)
- which omdat-segments have verb-second word order?
- investigate constructions rather than words
- investigate other languages
• Resulting corpora with CMDI metadata released for VLO
- Newspaper texts (NRC and NRC.nl) from different genres (hard news,
opinion, background stories) on related topics (2011)
- Two WhatsApp datasets of different age groups (13/14, 20-25)
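The question about the prototypical positioning of connectives can be sketched as a simple classification. This is a naive token-based heuristic under stated assumptions, not ACAD's method: the function `connective_position` and the labels are hypothetical, only single-word connectives are handled, and only the first occurrence is considered. A connective in sentence-initial position is taken to precede both segments; any later position is taken to fall between them.

```python
from typing import List


def connective_position(tokens: List[str], connective: str) -> str:
    """Classify where a single-word connective occurs in a tokenised sentence.

    Returns "initial" (before both segments), "medial" (between the
    segments), or "absent". Purely positional; no syntactic analysis.
    """
    lowered = [token.lower() for token in tokens]
    target = connective.lower()
    if target not in lowered:
        return "absent"
    return "initial" if lowered.index(target) == 0 else "medial"


before_both = connective_position(
    "Omdat de zon scheen , steeg de temperatuur".split(), "omdat"
)
between = connective_position(
    "De temperatuur steeg omdat de zon scheen".split(), "omdat"
)
```

A corpus study would replace the whitespace split with the tokenisation and parse information already present in the annotated corpora.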
