Retrieving Correct Semantic Boundaries in Dependency Structure
The 4th Linguistic Annotation Workshop at ACL’10
July 15th, 2010

Jinho D. Choi (University of Colorado at Boulder)
Martha Palmer (University of Colorado at Boulder)

Dependency Structure for SRL
•   What is dependency?
    -   Syntactic or semantic relation between a pair of words.
    [Figure: dependency arcs labeled TMP/LOC, PMOD, and NMOD over the overlaid examples "events/places in this city/year".]

•   Why dependency structure for semantic role labeling?
    -   Dependency relations often correlate with semantic roles.

    -   Simpler structure → faster annotation → more gold-standard data.
    -   Simpler structure → faster parsing → more applications.
        •   Parsing time, Dep (Choi) vs. Phrase (Charniak): 0.0025 vs. 0.5 (sec).

Phrase vs. Dependency Structure
•   Constituent vs. Dependency
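    [Figure: a constituent tree (with function tags -SBJ and -LOC) next to the dependency tree for "The results appear in today's news". Dependency arcs: appear -SBJ-> results, results -NMOD-> The, appear -LOC-> in, in -PMOD-> news, news -NMOD-> today, today -NMOD-> 's.]

•   10 of 15 (66.67%) parsing papers at ACL’10 are on dependency parsing.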
PropBank in Phrase Structure
 •   A corpus annotated with verbal propositions and arguments.

 •   Arguments are annotated on phrases.



     [Figure: phrase-structure tree in which ARG0 and ARGM-LOC are annotated on phrases.]

 •   But there is no phrase in dependency structure.


PropBank in Dependency Structure
•   Arguments are annotated on head words instead.
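    -   Phrase = subtree of the head word.

    [Figure: dependency tree for "root The results appear in today 's news" with arcs ROOT, SBJ, LOC, PMOD, and NMOD; ARG0 is annotated on the head word "results" and ARGM-LOC on the head word "in".]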
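
The "phrase = subtree of the head word" idea can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `heads` array encodes the arcs as read from the slide figure, and the names `subtree` and `phrase` are hypothetical helpers introduced here.

```python
from collections import defaultdict

# Tokens of the slide's example; index 0 is the artificial root.
tokens = ["root", "The", "results", "appear", "in", "today", "'s", "news"]
# heads[i] is the index of token i's head (None for the root).
# Assumption: arcs are transcribed from the slide figure.
heads = [None, 2, 3, 0, 3, 7, 5, 4]

def subtree(head_index, heads):
    """Return the sorted token indices of the subtree rooted at head_index."""
    children = defaultdict(list)
    for child, head in enumerate(heads):
        if head is not None:
            children[head].append(child)
    found, stack = [], [head_index]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children[node])
    return sorted(found)

def phrase(head_index):
    """Join the words of the head word's subtree in sentence order."""
    return " ".join(tokens[i] for i in subtree(head_index, heads))

arg0 = phrase(2)      # head word "results"
argm_loc = phrase(4)  # head word "in"
print(arg0)
print(argm_loc)
```

Here the subtree of "results" yields the ARG0 phrase and the subtree of "in" yields the ARGM-LOC phrase, which is the case where the subtree and the phrase coincide.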


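PropBank in Dependency Structure
•   Phrase ≠ Subtree of head-word.
    -   The subtree of the head word includes the predicate.

    [Figure: "The plant owned by Mark" with arcs NMOD, NMOD, LGS, PMOD; ARG1 is annotated on the head word "plant", whose subtree also contains the predicate "owned".]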



Tasks
•   Tasks
    -   Convert phrase structure (PS) to dependency structure (DS).

    -   Find correct head words in DS.

    -   Retrieve correct semantic boundaries from DS.

•   Conversion
    -   Pennconverter, by Richard Johansson
        •   Used for CoNLL 2007 - 2009.

    -   Penn Treebank (Wall Street Journal)
        •   49,208 trees were converted.

        •   292,073 PropBank arguments exist.


System Overview
•   Penn Treebank → Pennconverter → Dependency trees
•   PropBank → Heuristics → Head words
•   Dependency trees + Head words → Automatic SRL System → Set of head words
•   Set of head words → Heuristics → Set of chunks (phrases)
Finding correct head words
•   Get the word-set Sp of each argument in PS.
•   For each word in Sp, find the word wmax with the maximum subtree in DS.
•   Add wmax to the head-list Sd.
•   Remove the subtree of wmax from Sp.
•   Repeat the search until Sp becomes empty.

•   Example: "Yields on mutual funds continued to slide"
    -   Sp = { Yields, on, mutual, funds, to, slide }
    -   Sd = [ Yields, to ]

    [Figure: dependency tree for "root Yields on mutual funds continued to slide" with arcs ROOT, SBJ, NMOD, PMOD, NMOD, OPRD, IM.]
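
The search above can be sketched as a greedy loop. This is a minimal sketch under stated assumptions, not the authors' code: the `heads` array is transcribed from the slide figure, ties between equally large subtrees go to the leftmost word (the slide does not say how ties are broken), and `find_head_words` is a hypothetical name.

```python
from collections import defaultdict

def build_children(heads):
    """Map each head index to the list of its dependents."""
    children = defaultdict(list)
    for child, head in enumerate(heads):
        if head is not None:
            children[head].append(child)
    return children

def subtree(node, children):
    """Return the set of indices in the subtree rooted at node."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(children[current])
    return found

def find_head_words(word_set, heads):
    """Greedily pick the word with the largest subtree until Sp is empty."""
    children = build_children(heads)
    remaining = set(word_set)   # Sp
    head_list = []              # Sd
    while remaining:
        # Assumption: ties are broken in favor of the leftmost word.
        w_max = max(sorted(remaining), key=lambda w: len(subtree(w, children)))
        head_list.append(w_max)
        remaining -= subtree(w_max, children)
    return head_list

tokens = ["root", "Yields", "on", "mutual", "funds", "continued", "to", "slide"]
heads = [None, 5, 1, 4, 2, 0, 5, 6]   # arcs as read from the slide figure
sp = {1, 2, 3, 4, 6, 7}               # Yields, on, mutual, funds, to, slide
head_words = [tokens[i] for i in find_head_words(sp, heads)]
print(head_words)
```

Each pass removes at least the chosen word from Sp, so the loop always terminates. On the slide's example the first pass picks "Yields" (subtree of four words) and the second picks "to".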
Retrieving correct semantic boundaries
•   Retrieving the subtrees of head-words
    -   100% recall, 92.51% precision, 96.11% F1-score.

    -   What does this mean?
        •   The state-of-the-art SRL system using DS achieves about 86%.

        •   If your application requires actual argument phrases instead of
            head words, the performance drops below 86%.

•   Improve the precision by applying heuristics on:
    -   Modals, negations

    -   Verb chain, relative clauses

    -   Gerunds, past-participles


Verb Predicates whose Semantic
Arguments are their Syntactic Heads
•   Semantic arguments of verb predicates can be the
    syntactic heads of the verbs.
•   General solution
    -   For each head word, retrieve the subtree of the head word
        excluding the subtree of the verb predicate.

                  NMOD        NMOD        LGS        PMOD

              The     plant      owned          by     Mark
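
The general solution can be sketched as a subtree traversal that prunes the predicate. This is an illustrative sketch only: the `heads` array is transcribed from the slide figure, "plant" is treated as the top of this fragment, and `argument_span` is a hypothetical name rather than anything from the authors' system.

```python
from collections import defaultdict

def argument_span(head, predicate, heads):
    """Subtree of the head word, skipping the predicate and its subtree."""
    children = defaultdict(list)
    for child, parent in enumerate(heads):
        if parent is not None:
            children[parent].append(child)
    found, stack = [], [head]
    while stack:
        node = stack.pop()
        if node == predicate:
            continue  # prune the predicate and everything below it
        found.append(node)
        stack.extend(children[node])
    return sorted(found)

tokens = ["The", "plant", "owned", "by", "Mark"]
# Assumption: arcs read from the slide; "plant" is the top of this fragment.
heads = [1, None, 1, 2, 3]
span = " ".join(tokens[i] for i in argument_span(head=1, predicate=2, heads=heads))
print(span)
```

Without the pruning step, the subtree of "plant" would cover the whole fragment, including "owned by Mark".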




Examples
•   Modals are the heads of the main verbs in DS.
            ROOT                                      COORD           OBJ
                 SBJ     COORD           CONJ         ADV             NMOD

       root He         may          or          may     not read     the    book

•   Conjunctions
                  NMOD                            OBJ
                       DEP          COORD       CONJ          NMOD

           people who        meet           or exceed the expectation

•   Past-participles
                        NMOD                            PMOD
                        NMOD                                NMOD

         correspondence mailed about incomplete 8300s



Evaluations
•   Models
    -   Model I: retrieving all words in the subtrees (baseline).

    -   Model II: using all heuristics.

    -   Model III: II + excluding punctuation.

•   Measurements
    -   Accuracy: exact match

    -   Precision

    -   Recall

    -   F1-score
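
One way these four measurements could be computed is sketched below. The slides only define accuracy (exact match); this sketch assumes precision and recall are counted over words pooled across all arguments, which may differ from the authors' exact definition. The spans in the usage lines are made-up toy data, not results from the talk.

```python
def evaluate(gold_spans, system_spans):
    """Score retrieved word sets against gold word sets, one pair per argument."""
    exact = sum(g == s for g, s in zip(gold_spans, system_spans))
    correct = sum(len(g & s) for g, s in zip(gold_spans, system_spans))
    n_gold = sum(len(g) for g in gold_spans)
    n_system = sum(len(s) for s in system_spans)
    accuracy = exact / len(gold_spans) if gold_spans else 0.0
    precision = correct / n_system if n_system else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy data (hypothetical): the second argument picks up one extra word.
gold = [{0, 1}, {3, 4, 5}]
system = [{0, 1}, {2, 3, 4, 5}]
scores = evaluate(gold, system)
print(scores)
```

The toy case mirrors the baseline's behavior: retrieving whole subtrees never misses a gold word (full recall) but can add extra words, which lowers precision and exact-match accuracy.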


Evaluations
•   Results
    -   Baseline: 88.00% accuracy, 92.51% precision, 100% recall, 96.11% F1
    -   Final model: 98.20% accuracy, 99.14% precision, 99.95% recall, 99.54% F1
        •   Statistically significant (t = 149, p < .0001)
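
As a quick consistency check, the reported F1-scores follow from the reported precision and recall, assuming the standard harmonic-mean definition of F1 (the slides do not spell out the formula):

```latex
F_1 = \frac{2PR}{P + R}

\text{Baseline: } \frac{2 \times 0.9251 \times 1.0000}{0.9251 + 1.0000}
  = \frac{1.8502}{1.9251} \approx 0.9611

\text{Final model: } \frac{2 \times 0.9914 \times 0.9995}{0.9914 + 0.9995}
  = \frac{1.9818}{1.9909} \approx 0.9954
```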
[Figure: accuracy, precision, recall, and F1 (y-axis from 88 to 100) for Models I, II, and III.]


Error Analysis
•    Overlapping arguments

    [Figure: two annotations of "share burdens in the region" over dependency arcs labeled OBJ, LOC, PMOD, and NMOD. Left: separate ARG1 and ARGM-LOC arguments. Right: a single ARG1 argument.]




Error Analysis
•   PP attachment

    [Figure: two dependency analyses of "the enthusiasm investors showed for stocks" with arcs labeled NMOD, NMOD, SBJ, ADV, and PMOD, and ARG1 marked under "enthusiasm"; the two trees differ in where the ADV arc is drawn.]



Conclusion
•   Conclusion
    -   Find correct head words (min-set with max-coverage).

    -   Find correct semantic boundaries (99.54% F1-score).

    -   Suggest ways of reconstructing dependency structure so that
        it can fit better with semantic roles.

    -   Can be used to fix some of the inconsistencies in both
        Treebank and PropBank annotations.

•   Future work
    -   Apply to different corpora.

    -   Find ways of automatically adding empty categories.


Acknowledgements
•   Special thanks are due to Professor Joakim Nivre of
    Uppsala University (Sweden) for his helpful insights.
•   National Science Foundation CISE-CRI-0551615, Towards a
    Comprehensive Linguistic Annotation, and CISE-CRI 0709167,
    Collaborative: A Multi-Representational and Multi-Layered
    Treebank for Hindi/Urdu
•   Defense Advanced Research Projects Agency (DARPA/IPTO)
    under the GALE program, DARPA/CMO Contract No.
    HR0011-06-C-0022, subcontract from BBN, Inc.


                             18

Más contenido relacionado

Destacado

Multi-layer Annotation in Dependency Structure
Multi-layer Annotation in Dependency StructureMulti-layer Annotation in Dependency Structure
Multi-layer Annotation in Dependency StructureJinho Choi
 
Startup Workshop - Pitching
Startup Workshop - PitchingStartup Workshop - Pitching
Startup Workshop - PitchingOliver Hanisch
 
An elementary navigation simulated in Java
An elementary navigation simulated in JavaAn elementary navigation simulated in Java
An elementary navigation simulated in JavaJinho Choi
 
Real-time, Automatic Alert System by using GPS
Real-time, Automatic Alert System by using GPSReal-time, Automatic Alert System by using GPS
Real-time, Automatic Alert System by using GPSJinho Choi
 
Vericenter Summary
Vericenter SummaryVericenter Summary
Vericenter Summarydeyoepw
 
Using Parallel Propbanks to Enhance Word-alignments
Using Parallel Propbanks to Enhance Word-alignmentsUsing Parallel Propbanks to Enhance Word-alignments
Using Parallel Propbanks to Enhance Word-alignmentsJinho Choi
 
The CLEAR Dependency
The CLEAR DependencyThe CLEAR Dependency
The CLEAR DependencyJinho Choi
 
Transition-based Semantic Role Labeling Using Predicate Argument Clustering
Transition-based Semantic Role Labeling Using Predicate Argument ClusteringTransition-based Semantic Role Labeling Using Predicate Argument Clustering
Transition-based Semantic Role Labeling Using Predicate Argument ClusteringJinho Choi
 
Startup Workshop - US Business Culture
Startup Workshop - US Business CultureStartup Workshop - US Business Culture
Startup Workshop - US Business CultureOliver Hanisch
 
Ch 1 language-Presented by Mr. Kak Sovanna
Ch 1 language-Presented by Mr. Kak SovannaCh 1 language-Presented by Mr. Kak Sovanna
Ch 1 language-Presented by Mr. Kak SovannaSovanna Kakk
 
Constraining the Theory - Prof. Fredreck J. Newmeyer
Constraining the Theory - Prof. Fredreck J. NewmeyerConstraining the Theory - Prof. Fredreck J. Newmeyer
Constraining the Theory - Prof. Fredreck J. NewmeyerPhoenix Tree Publishing Inc
 
Syntax by George Yule
Syntax by George YuleSyntax by George Yule
Syntax by George YuleAsif Ali Raza
 
Principles And Parameters Of Universal Grammar
Principles And Parameters Of Universal GrammarPrinciples And Parameters Of Universal Grammar
Principles And Parameters Of Universal GrammarDr. Cupid Lucid
 
Transformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam ChomskyTransformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam ChomskyShiela May Claro
 
Communicative Competence -Final PPT.
Communicative Competence -Final PPT.Communicative Competence -Final PPT.
Communicative Competence -Final PPT.Bilal Yaseen
 

Destacado (18)

Multi-layer Annotation in Dependency Structure
Multi-layer Annotation in Dependency StructureMulti-layer Annotation in Dependency Structure
Multi-layer Annotation in Dependency Structure
 
Startup Workshop - Pitching
Startup Workshop - PitchingStartup Workshop - Pitching
Startup Workshop - Pitching
 
An elementary navigation simulated in Java
An elementary navigation simulated in JavaAn elementary navigation simulated in Java
An elementary navigation simulated in Java
 
Real-time, Automatic Alert System by using GPS
Real-time, Automatic Alert System by using GPSReal-time, Automatic Alert System by using GPS
Real-time, Automatic Alert System by using GPS
 
Vericenter Summary
Vericenter SummaryVericenter Summary
Vericenter Summary
 
Using Parallel Propbanks to Enhance Word-alignments
Using Parallel Propbanks to Enhance Word-alignmentsUsing Parallel Propbanks to Enhance Word-alignments
Using Parallel Propbanks to Enhance Word-alignments
 
The CLEAR Dependency
The CLEAR DependencyThe CLEAR Dependency
The CLEAR Dependency
 
Transition-based Semantic Role Labeling Using Predicate Argument Clustering
Transition-based Semantic Role Labeling Using Predicate Argument ClusteringTransition-based Semantic Role Labeling Using Predicate Argument Clustering
Transition-based Semantic Role Labeling Using Predicate Argument Clustering
 
Startup Workshop - US Business Culture
Startup Workshop - US Business CultureStartup Workshop - US Business Culture
Startup Workshop - US Business Culture
 
Ch 1 language-Presented by Mr. Kak Sovanna
Ch 1 language-Presented by Mr. Kak SovannaCh 1 language-Presented by Mr. Kak Sovanna
Ch 1 language-Presented by Mr. Kak Sovanna
 
Syntax
SyntaxSyntax
Syntax
 
Constraining the Theory - Prof. Fredreck J. Newmeyer
Constraining the Theory - Prof. Fredreck J. NewmeyerConstraining the Theory - Prof. Fredreck J. Newmeyer
Constraining the Theory - Prof. Fredreck J. Newmeyer
 
Syntax by George Yule
Syntax by George YuleSyntax by George Yule
Syntax by George Yule
 
Principles And Parameters Of Universal Grammar
Principles And Parameters Of Universal GrammarPrinciples And Parameters Of Universal Grammar
Principles And Parameters Of Universal Grammar
 
八德
八德八德
八德
 
八德
八德八德
八德
 
Transformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam ChomskyTransformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam Chomsky
 
Communicative Competence -Final PPT.
Communicative Competence -Final PPT.Communicative Competence -Final PPT.
Communicative Competence -Final PPT.
 

Más de Jinho Choi

Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...Jinho Choi
 
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...Jinho Choi
 
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...Jinho Choi
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...Jinho Choi
 
The Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference ResolutionThe Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference ResolutionJinho Choi
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Jinho Choi
 
Abstract Meaning Representation
Abstract Meaning RepresentationAbstract Meaning Representation
Abstract Meaning RepresentationJinho Choi
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role LabelingJinho Choi
 
CS329 - WordNet Similarities
CS329 - WordNet SimilaritiesCS329 - WordNet Similarities
CS329 - WordNet SimilaritiesJinho Choi
 
CS329 - Lexical Relations
CS329 - Lexical RelationsCS329 - Lexical Relations
CS329 - Lexical RelationsJinho Choi
 
Automatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue ManagementAutomatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue ManagementJinho Choi
 
Attention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingAttention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingJinho Choi
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueJinho Choi
 
Real-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingReal-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingJinho Choi
 
Topological Sort
Topological SortTopological Sort
Topological SortJinho Choi
 
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseMulti-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseJinho Choi
 
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsBuilding Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsJinho Choi
 
How to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyHow to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyJinho Choi
 

Más de Jinho Choi (20)

Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal ...
 
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
Analysis of Hierarchical Multi-Content Text Classification Model on B-SHARP D...
 
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...Competence-Level Prediction and Resume & Job Description Matching Using Conte...
Competence-Level Prediction and Resume & Job Description Matching Using Conte...
 
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b...
 
The Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference ResolutionThe Myth of Higher-Order Inference in Coreference Resolution
The Myth of Higher-Order Inference in Coreference Resolution
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
 
Abstract Meaning Representation
Abstract Meaning RepresentationAbstract Meaning Representation
Abstract Meaning Representation
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role Labeling
 
CKY Parsing
CKY ParsingCKY Parsing
CKY Parsing
 
CS329 - WordNet Similarities
CS329 - WordNet SimilaritiesCS329 - WordNet Similarities
CS329 - WordNet Similarities
 
CS329 - Lexical Relations
CS329 - Lexical RelationsCS329 - Lexical Relations
CS329 - Lexical Relations
 
Automatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue ManagementAutomatic Knowledge Base Expansion for Dialogue Management
Automatic Knowledge Base Expansion for Dialogue Management
 
Attention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingAttention is All You Need for AMR Parsing
Attention is All You Need for AMR Parsing
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to Dialogue
 
Real-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingReal-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue Understanding
 
Topological Sort
Topological SortTopological Sort
Topological Sort
 
Tries - Put
Tries - PutTries - Put
Tries - Put
 
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseMulti-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
 
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsBuilding Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
 
How to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyHow to make Emora talk about Sports Intelligently
How to make Emora talk about Sports Intelligently
 

Último

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024

Retrieving Correct Semantic Boundaries in Dependency Structures

  • 1. Retrieving Correct Semantic Boundaries in Dependency Structure. The 4th Linguistic Annotation Workshop at ACL’10, July 15th, 2010. Jinho D. Choi (University of Colorado at Boulder), Martha Palmer (University of Colorado at Boulder).
  • 2. Dependency Structure for SRL
      • What is dependency?
          - Syntactic or semantic relation between a pair of words.
          - [Figure: dependency arcs labeled TMP/LOC, PMOD, and NMOD over the phrases "events in this year" and "places in this city"]
      • Why dependency structure for semantic role labeling?
          - Dependency relations often correlate with semantic roles.
          - Simpler structure → faster annotation → more gold-standard data.
          - Simpler structure → faster parsing → more applications.
          - Parsing time per sentence, Dep (Choi) vs. Phrase (Charniak): 0.0025 vs. 0.5 sec.
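  • 3. Phrase vs. Dependency Structure
      • Constituent vs. Dependency
          - [Figure: phrase-structure tree with -SBJ and -LOC function tags, beside the dependency tree for "The results appear in today's news" (appear → results: SBJ; appear → in: LOC; results → The: NMOD; in → news: PMOD; news → today: NMOD; today → 's: NMOD)]
      • 10/15 (66.67%) of the parsing papers at ACL’10 are on dependency parsing.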
  • 4. PropBank in Phrase Structure
      • A corpus annotated with verbal propositions and arguments.
      • Arguments are annotated on phrases.
          - [Figure: phrase-structure tree with ARG0 and ARGM-LOC marked on phrases]
      • But there is no phrase in dependency structure.
  • 5. PropBank in Dependency Structure
      • Arguments are annotated on head words instead.
      • Phrase = subtree of the head word.
          - [Figure: dependency tree for "The results appear in today's news" (arc labels ROOT, SBJ, LOC, PMOD, NMOD) with ARG0 and ARGM-LOC marked on head words]
  • 6. PropBank in Dependency Structure
      • Phrase ≠ subtree of the head word.
          - The subtree of the head word includes the predicate.
          - [Figure: dependency tree for "The plant owned by Mark" (arc labels NMOD, NMOD, LGS, PMOD) with ARG1 marked]
  • 7. Tasks
      • Tasks
          - Convert phrase structure (PS) to dependency structure (DS).
          - Find correct head words in DS.
          - Retrieve correct semantic boundaries from DS.
      • Conversion
          - Pennconverter, by Richard Johansson
              • Used for CoNLL 2007 - 2009.
          - Penn Treebank (Wall Street Journal)
              • 49,208 trees were converted.
              • 292,073 PropBank arguments exist.
  • 8. System Overview
      • [Figure: flow diagram with the components Penn Treebank, PropBank, Pennconverter, Heuristics, Dependency trees, Head words, Automatic SRL System, Set of head words, Heuristics, Set of chunks (phrases)]
  • 9. Finding correct head words
      • Get the word-set S_p of each argument in PS.
      • For each word in S_p, find the word w_max with the maximum subtree in DS.
      • Add the word to the head-list S_d.
      • Remove the subtree of w_max from S_p.
      • Repeat the search until S_p becomes empty.
      • Example: S_p = {Yields, on, mutual, funds, to, slide} gives S_d = [Yields, to].
          - [Figure: dependency tree for "Yields on mutual funds continued to slide" (arc labels ROOT, SBJ, NMOD, PMOD, OPRD, IM)]
  • 10. Retrieving correct semantic boundaries
      • Retrieving the subtrees of head words
          - 100% recall, 87.62% precision, 96.11% F1-score.
          - What does this mean?
              • The state-of-the-art SRL system using DS performs at about 86%.
              • If your application requires actual argument phrases instead of head words, the performance becomes lower than 86%.
      • Improve the precision by applying heuristics on:
          - Modals, negations
          - Verb chains, relative clauses
          - Gerunds, past participles
  • 11. Verb Predicates whose Semantic Arguments are their Syntactic Heads
      • Semantic arguments of verb predicates can be the syntactic heads of the verbs.
      • General solution
          - For each head word, retrieve the subtree of the head word excluding the subtree of the verb predicate.
          - [Figure: dependency tree for "The plant owned by Mark" (arc labels NMOD, NMOD, LGS, PMOD)]
  • 12. Examples
      • Modals are the heads of the main verbs in DS.
          - [Figure: dependency tree for "He may or may not read the book" (arc labels ROOT, SBJ, COORD, CONJ, ADV, OBJ, NMOD)]
      • Conjunctions
          - [Figure: dependency tree for "people who meet or exceed the expectation" (arc labels NMOD, DEP, COORD, CONJ, OBJ, NMOD)]
      • Past participles
          - [Figure: dependency tree for "correspondence mailed about incomplete 8300s" (arc labels NMOD, PMOD, NMOD, NMOD)]
  • 13. Evaluations
      • Models
          - Model I: retrieving all words in the subtrees (baseline).
          - Model II: using all heuristics.
          - Model III: II + excluding punctuation.
      • Measurements
          - Accuracy: exact match
          - Precision
          - Recall
          - F1-score
  • 14. Evaluations
      • Results
          - Baseline: 88.00% accuracy, 92.51% precision, 100% recall, 96.11% F1-score
          - Final model: 98.20% accuracy, 99.14% precision, 99.95% recall, 99.54% F1-score
          - Statistically significant (t = 149, p < .0001)
      • [Figure: chart of accuracy, precision, recall, and F1 for Models I, II, and III; vertical axis from 88 to 100]
  • 15. Error Analysis
      • Overlapping arguments
          - [Figure: two dependency analyses of "share burdens in the region" with ARG1 and ARGM-LOC marked (arc labels OBJ, LOC, PMOD, NMOD)]
  • 16. Error Analysis
      • PP attachment
          - [Figure: two dependency analyses of "the enthusiasm investors showed for stocks" with ARG1 marked (arc labels NMOD, SBJ, ADV, PMOD)]
  • 17. Conclusion
      • Conclusion
          - Find correct head words (min-set with max-coverage).
          - Find correct semantic boundaries (99.54% F1-score).
          - Suggest ways of reconstructing dependency structure so that it can fit better with semantic roles.
          - Can be used to fix some of the inconsistencies in both Treebank and PropBank annotations.
      • Future work
          - Apply to different corpora.
          - Find ways of automatically adding empty categories.
  • 18. Acknowledgements
      • Special thanks are due to Professor Joakim Nivre of Uppsala University (Sweden) for his helpful insights.
      • National Science Foundation CISE-CRI-0551615, Towards a Comprehensive Linguistic Annotation, and CISE-CRI 0709167, Collaborative: A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu.
      • Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc.
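
The head-word search on slide 9 and the general solution on slide 11 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the head-index encoding of the trees, the leftmost tie-break, and the choice of "plant" as the root of the fragment are all hypothetical and were read off the slide figures.

```python
from collections import defaultdict


def children_map(heads):
    """Map each token index to its dependents; heads[i] is token i's head (-1 = root)."""
    kids = defaultdict(list)
    for child, head in enumerate(heads):
        if head >= 0:
            kids[head].append(child)
    return kids


def subtree(node, kids, exclude=None):
    """Token indices in node's subtree, skipping the subtree rooted at `exclude`."""
    if node == exclude:
        return set()
    found = {node}
    for child in kids.get(node, []):
        found |= subtree(child, kids, exclude)
    return found


def find_head_words(arg_words, heads):
    """Slide 9: repeatedly pick the word with the largest subtree until S_p is empty."""
    kids = children_map(heads)
    s_p = set(arg_words)
    s_d = []
    while s_p:
        # Ties go to the leftmost word (an assumption; the slide does not say).
        w_max = max(sorted(s_p), key=lambda w: len(subtree(w, kids)))
        s_d.append(w_max)
        s_p -= subtree(w_max, kids)
    return s_d


def retrieve_boundary(head_words, heads, predicate=None):
    """Slide 11: union of the head-word subtrees, minus the predicate's subtree."""
    kids = children_map(heads)
    words = set()
    for head in head_words:
        words |= subtree(head, kids, exclude=predicate)
    return words


def precision_recall(predicted, gold):
    """Word-level precision and recall of a retrieved boundary."""
    true_pos = len(predicted & gold)
    return true_pos / len(predicted), true_pos / len(gold)


# "Yields on mutual funds continued to slide" (head indices assumed from the figure)
tokens = ["Yields", "on", "mutual", "funds", "continued", "to", "slide"]
heads = [4, 0, 3, 1, -1, 4, 5]
s_d = find_head_words({0, 1, 2, 3, 5, 6}, heads)
print([tokens[i] for i in s_d])

# "The plant owned by Mark": ARG1 of "owned" is "The plant"
heads2 = [1, -1, 1, 2, 3]
gold = {0, 1}
baseline = retrieve_boundary([1], heads2)            # whole subtree of "plant"
fixed = retrieve_boundary([1], heads2, predicate=2)  # subtree of "owned" excluded
print(precision_recall(baseline, gold))
print(precision_recall(fixed, gold))
```

On the second example the plain subtree of "plant" covers all five words, so recall is perfect but precision drops; excluding the predicate's subtree recovers exactly "The plant". This mirrors the pattern reported on slide 10, where the baseline has 100% recall and lower precision.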

Editor's notes

  1. Many SRL systems use phrase structure, but for 4M sentences: 2.7 hours vs. 23 days
  2. Visualize the difference between phrase and dependency. -SBJ still doesn’t show relations between ‘The results’ and ‘appear’
  3. Dependency relations vs. semantic roles
  4. Reduced relative clauses
  5. head word = superset increase precision
  6. 400bell ringers show out of 100
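
The timing note (2.7 hours vs. 23 days for 4M sentences) follows from the per-sentence parsing times on slide 2, 0.0025 sec for the dependency parser and 0.5 sec for the phrase-structure parser. A small check of that arithmetic, with constant and function names chosen here purely for illustration:

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def total_seconds(num_sentences, sec_per_sentence):
    """Total parsing time, assuming a constant cost per sentence."""
    return num_sentences * sec_per_sentence


num_sentences = 4_000_000
dep_hours = total_seconds(num_sentences, 0.0025) / SECONDS_PER_HOUR
phrase_days = total_seconds(num_sentences, 0.5) / SECONDS_PER_DAY
print(f"{dep_hours:.2f} hours vs. {phrase_days:.2f} days")
```

The dependency figure comes out just under 2.8 hours, which the note gives as 2.7, and the phrase-structure figure is a little over 23 days.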