Introduction            Temporal links        Temporal signals     Improving annotation              Summary




                  A Corpus-based Study of Temporal Signals

                                           Leon Derczynski

                                           University of Sheffield


                                             20 July, 2011




Leon Derczynski                                                                           University of Sheffield
A Corpus-based Study of Temporal Signals




Outline


       1 Introduction

       2 Temporal links

       3 Temporal signals

       4 Improving annotation

       5 Summary








Motivation


       Language for time helps us describe:
               changes
               planning
               history
       Time is not always explicit in natural language – we don’t include
       a timestamp with every action
       Goals:
       Try to automatically extract temporal information from
       documents, so that we can build a model that connects
       information in a text with time






Temporal Entities

       What elements can we try to extract from discourse?
       Each document might contain:
       Basic primitives:
               Events – occurrences, states, reports
               Times – dates and times, durations, sets
       Linkages between primitives:
               general temporal link
               aspectual links and subordination
       We can use the basic primitives as nodes on a graph, and links as
       its arcs.
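
The graph view above can be sketched as a small data structure. This is only an illustration: the class names, the identifiers (`e1`, `t1`) and the example relations are assumptions made for the sketch, not part of TimeML or of the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str  # hypothetical identifier, e.g. "e1" or "t1"
    kind: str     # "EVENT" or "TIME"
    text: str     # surface text of the event or time expression


@dataclass
class TemporalGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_link(self, source_id, relation, target_id):
        # A link is only meaningful between primitives already in the graph.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before linking")
        self.arcs.append((source_id, relation, target_id))

    def links_from(self, source_id):
        return [(rel, tgt) for src, rel, tgt in self.arcs if src == source_id]


graph = TemporalGraph()
graph.add_node(Node("e1", "EVENT", "ate"))
graph.add_node(Node("e2", "EVENT", "went"))
graph.add_node(Node("t1", "TIME", "9 o'clock"))
graph.add_link("e1", "BEFORE", "e2")
graph.add_link("e2", "IS_INCLUDED", "t1")
```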






Temporal link labelling



               How do we label the links between temporal entities?
               First, choose a relation set: TimeML gives us 13, including
               before, simultaneous, includes, ...
               Some relations have transitive or symmetric properties:
               If “a before b” and “b before c” then we can infer “a before c”
               Inferred links can contradict annotated ones, so consistency
               is important
               Develop a gold-standard corpus – TimeBank
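
The transitivity rule and the consistency concern can be illustrated with a short sketch. It handles only the "before" relation, represented as pairs of hypothetical labels; the function names are invented for this example, and a full temporal closure over all TimeML relations would need more rules than shown here.

```python
from itertools import product


def before_closure(pairs):
    """Return the transitive closure of (x, y) pairs meaning 'x before y'."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_consistent(pairs):
    """A set of 'before' links is inconsistent if it implies x before y and y before x."""
    closure = before_closure(pairs)
    return not any((y, x) in closure for x, y in closure)


links = {("a", "b"), ("b", "c")}
inferred = before_closure(links) - links
```

Here `inferred` holds the single new link `("a", "c")`; adding `("c", "a")` to `links` would make `is_consistent` return `False`.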








Automated temporal link labelling


                   How can we automatically label links?
                   Machine learning approaches: train a system to label a
                   link based on the times and events it connects
                   Use TimeBank and other corpora as training examples
                   A difficult task: notable research efforts, including various
                   evaluation exercises, have attempted it
                   Overall accuracy remains around 60–70%: too low¹



               ¹ See Chambers & Jurafsky, 2008; Mirroshandel et al., 2010;
                 TempEval-2010




Source of temporal linking information


               What information can we use to label links?
               If a human can manage to understand temporal relations, the
               information must be somewhere
               Possible sources:
               – tense and aspect
               – world knowledge
               – discourse structure
               – specific time information (at 9 o’clock)
               – explicit signals: temporal conjunctions






Temporal conjunctions



               Are these words/phrases useful for automatic understanding?
               A baseline system could learn to label links with 62% accuracy
               With a simple modification, links in TimeBank that had
               associated signals could be labelled with 83% accuracy
               Clear indication that signals are an accessible source of
               temporal information








Temporal conjunctions in newswire



               What do temporal conjunctions look like in TimeBank?
               11.2% of temporal links are annotated as having one (718
               instances)
               Top words:
               – prepositions (in, for, on)
               – conjunctions (after, before, since)
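
A figure such as the share of links with a signal can be computed by counting TLINK elements that carry a `signalID` attribute. The sketch below uses Python's standard `xml.etree.ElementTree` on a tiny hand-written TimeML-style snippet; the snippet is an assumption for illustration and is not taken from TimeBank.

```python
import xml.etree.ElementTree as ET

# Hand-written toy document, not a TimeBank file.
SAMPLE = """<TimeML>
He <EVENT eid="e1" class="OCCURRENCE">left</EVENT>
<SIGNAL sid="s1">after</SIGNAL> the
<EVENT eid="e2" class="OCCURRENCE">meeting</EVENT>.
<TLINK lid="l1" eventInstanceID="ei1" relatedToEventInstance="ei2"
       relType="AFTER" signalID="s1"/>
<TLINK lid="l2" eventInstanceID="ei2" relatedToTime="t0" relType="BEFORE"/>
</TimeML>"""


def signal_usage(timeml_text):
    """Return (links with a signal, all links) for one TimeML document."""
    root = ET.fromstring(timeml_text)
    tlinks = list(root.iter("TLINK"))
    with_signal = [t for t in tlinks if t.get("signalID")]
    return len(with_signal), len(tlinks)


with_signal, total = signal_usage(SAMPLE)
share = with_signal / total
```

Summing both counts over every document in a corpus, rather than a single snippet, gives the corpus-level proportion.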








Temporal conjunctions in newswire


                  Phrase          Corpus freq.   Occurrences as signal   Likelihood of being a signal
                  subsequently               3                       3                           100%
                  after                     72                      67                            93%
                  follows                    4                       3                            75%
                  before                    33                      23                            70%
                  until                     36                      25                            69%
                  during                    19                      13                            68%
                  as soon as                 3                       2                            67%

       Table: A sample of the phrases most likely to be annotated as a signal
       in TimeBank, restricted to phrases that occur more than once in the corpus.
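
The likelihood column is the number of occurrences as a signal divided by the corpus frequency. A minimal sketch that recomputes it from the counts in the table (the variable and function names are made up for the example):

```python
# phrase -> (corpus frequency, occurrences as signal), copied from the table
counts = {
    "subsequently": (3, 3),
    "after": (72, 67),
    "follows": (4, 3),
    "before": (33, 23),
    "until": (36, 25),
    "during": (19, 13),
    "as soon as": (3, 2),
}


def signal_likelihood(corpus_freq, as_signal):
    """Fraction of a phrase's occurrences that are annotated as a signal."""
    return as_signal / corpus_freq


percentages = {
    phrase: round(100 * signal_likelihood(freq, sig))
    for phrase, (freq, sig) in counts.items()
}
```

Rounded to whole percentages, the values match the table, e.g. 67 / 72 gives 93 for after.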








Discrimination of temporal signal words



               What else are these temporal signal words used for?
               Some words are very likely to have a temporal sense:
               subsequently – 3 instances, all temporal
               after – 72 instances, 93% temporal
               Other words are versatile:
               from – 366 instances, 5% temporal
               between – 33 instances, 1 temporal








Signal-to-link relations

               What temporal relations do these words signify?
               after doesn’t always signify a temporal AFTER relation
               Word order is important:
               “After I ate, I went to bed”
               “I ate after I went to bed”

               Signal phrase   TimeML relation   Frequency
               after           AFTER                    56
               after           ENDS                      6
               after           BEGINS                    4
               after           IAFTER                    1
               already         BEFORE                    6
               already         INCLUDES                  4
               already         IS_INCLUDED               3
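
The frequencies above can be read as a distribution over relations for each signal phrase. The sketch below derives that distribution from the rows shown; for already only the listed rows are counted, so its proportions are relative to this sample rather than to the whole corpus. Function names are hypothetical.

```python
from collections import Counter

# signal phrase -> relation counts, copied from the rows in the table
observed = {
    "after": Counter({"AFTER": 56, "ENDS": 6, "BEGINS": 4, "IAFTER": 1}),
    "already": Counter({"BEFORE": 6, "INCLUDES": 4, "IS_INCLUDED": 3}),
}


def relation_distribution(signal):
    """Proportion of each relation among the listed uses of a signal."""
    rel_counts = observed[signal]
    total = sum(rel_counts.values())
    return {rel: n / total for rel, n in rel_counts.items()}


def most_likely_relation(signal):
    """The relation most often associated with the signal in the table."""
    return observed[signal].most_common(1)[0][0]


after_dist = relation_distribution("after")
```

For after, the four rows sum to 67 uses, of which 56 are AFTER, so guessing AFTER for every use of after would still mislabel 11 of them.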






Signal class



               How can we characterise temporal signals?
               Signals are likely to belong to a closed class of words
               Common prepositions as seen earlier
               Some adverbs – previously, subsequently
               Set phrases – as soon as, so far








Spatial/Temporal overlap


               Time and space are related and events are constrained in
               terms of both
               Language for space and time has some similarities
               before has both temporal and spatial senses
               Spatially annotated corpora – SpatialML
               Relative spatial links in this corpus are much more likely to
               employ a signal (97.5%)
               Possible explanation – temporal language is more diverse
               (tense, auxiliaries)







Re-annotation



               Are these signals correctly annotated in TimeBank?
               Manual examination: start with words that are likely to be
               temporal signals
               before: found 33 times in the corpus, 23 of them annotated
               as signals
               Many under-annotated cases:
               “before the war began”
               “was scheduled to return to port before hostilities erupted”
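
One way to support this kind of manual examination is to list occurrences of likely signal words that carry no signal annotation yet. The sketch below is an assumption about how such a search could look, not the procedure used in the study; the candidate list and the span-based bookkeeping are invented for the example.

```python
import re

# A few high-likelihood phrases from the earlier table (illustrative subset).
CANDIDATES = ("before", "after", "until", "during")
PATTERN = re.compile(r"\b(" + "|".join(CANDIDATES) + r")\b", re.IGNORECASE)


def unannotated_candidates(text, annotated_spans):
    """Return (start, end, word) for candidate words not already annotated.

    annotated_spans is a set of (start, end) character offsets of existing
    signal annotations in the same text.
    """
    return [
        (m.start(), m.end(), m.group(0).lower())
        for m in PATTERN.finditer(text)
        if (m.start(), m.end()) not in annotated_spans
    ]


sentence = "was scheduled to return to port before hostilities erupted"
hits = unannotated_candidates(sentence, annotated_spans=set())
```

Each hit still needs a human decision, since a matched word may have a non-temporal sense.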








Re-annotation




               How could we improve signal annotation?
               Linguistic description of temporal conjunctions may be weak
               Annotation guidelines may be insufficient
               Solution: provide an enhanced signal description, and revise
               TimeBank accordingly








Formal signal description



               A temporal signal is a word that indicates the type of
               temporal relation between two intervals
               Signal surface forms have a head and an optional quantifier
               shortly after – quantified temporal signal
               Temporal signals have exactly two arguments (events and/or
               times)
               One argument may be implicit (e.g. for “later”)
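
The description above maps onto a small record type: a head, an optional quantifier, and two argument slots of which at most one may be empty. The class and field names below are assumptions made for this sketch, and the argument identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemporalSignal:
    head: str                          # e.g. "after"
    quantifier: Optional[str] = None   # e.g. "shortly"
    first_arg: Optional[str] = None    # id of an event or time, None if implicit
    second_arg: Optional[str] = None   # id of an event or time, None if implicit

    def __post_init__(self):
        # Exactly two arguments, of which at most one may be implicit.
        if self.first_arg is None and self.second_arg is None:
            raise ValueError("at most one argument may be implicit")

    @property
    def surface_form(self):
        return f"{self.quantifier} {self.head}" if self.quantifier else self.head

    @property
    def has_implicit_argument(self):
        return self.first_arg is None or self.second_arg is None


shortly_after = TemporalSignal("after", quantifier="shortly",
                               first_arg="e1", second_arg="e2")
later = TemporalSignal("later", first_arg="e3")  # second argument left implicit
```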








Augmented TimeBank



               We examined 30 of the most frequent signal words and
               phrases that were not annotated as temporal
               This comprised around 1 000 instances in text
               We annotated any missed temporal signals, including EVENT
               and TLINK annotations where required
               This raised the share of TLINKs using a signal from 11.2%
               to 15.8%








Conclusion




               Temporal signals are a usable and important source of
               information
               We have provided a definition for temporal signals
               Existing corpora have been upgraded with better annotation








Future work




               Automatic signal discrimination
               Signal association
               Applying findings to spatial language








                              Thank you. Are there any questions?




How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Último (20)

(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

A Corpus-based Study of Temporal Signals

  • 1. Introduction Temporal links Temporal signals Improving annotation Summary A Corpus-based Study of Temporal Signals Leon Derczynski University of Sheffield 20 July, 2011 Leon Derczynski University of Sheffield A Corpus-based Study of Temporal Signals
  • 2. Outline: 1 Introduction; 2 Temporal links; 3 Temporal signals; 4 Improving annotation; 5 Summary
  • 3. Motivation. Language for time helps us describe changes, planning, and history. Time is not always explicit in natural language: we don’t include a timestamp with every action. Goal: automatically extract temporal information from documents, so that we can build a model that connects information in a text with time.
  • 4. Temporal Entities. What elements can we try to extract from discourse? Each document might contain basic primitives: events (occurrences, states, reports) and times (dates and times, durations, sets). It might also contain linkages between primitives: general temporal links, aspectual links, and subordination. We can use the basic primitives as nodes of a graph, and the links as its arcs.
  • 5. Outline: 1 Introduction; 2 Temporal links; 3 Temporal signals; 4 Improving annotation; 5 Summary
  • 6. Temporal link labelling. How do we label the links between temporal entities? First, choose a relation set: TimeML gives us 13, including before, simultaneous, and includes. Some relations have transitive and commutative properties: if “a before b” and “b before c”, then we can infer “a before c”. This means that consistency can be important. Then develop a gold-standard corpus: TimeBank.
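The transitivity argument on slide 6 can be illustrated with a minimal sketch. It assumes links are stored as (earlier, later) pairs for the single relation "before"; the function names are hypothetical and this is not the tooling used with TimeBank.

```python
from itertools import product


def before_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_consistent(pairs):
    """A set of 'before' links is inconsistent if any entity ends up before itself."""
    return all(a != b for a, b in before_closure(pairs))


links = {("a", "b"), ("b", "c")}
inferred = before_closure(links) - links  # links not stated but implied
```

Here `inferred` holds the single implied link from a to c, and a contradictory pair such as a before b together with b before a is reported as inconsistent, which is why consistency matters when labelling.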
  • 7. Automated temporal link labelling. How can we automatically label links? Machine learning approaches: learn how to label a link based on the times and events it may connect, using TimeBank and other corpora as examples. A difficult task: notable research efforts, including various evaluation exercises, have attempted it. Overall accuracy remains around 60% to 70%, which is too low (see Chambers & Jurafsky, 2008; Mirroshandel et al., 2010; TempEval-2010).
  • 8. Source of temporal linking information. What information can we use to label links? If a human can manage to understand temporal relations, the information must be somewhere. Possible sources: tense and aspect; world knowledge; discourse structure; specific time information (at 9 o’clock); explicit signals (temporal conjunctions).
  • 9. Outline: 1 Introduction; 2 Temporal links; 3 Temporal signals; 4 Improving annotation; 5 Summary
  • 10. Temporal conjunctions. Are these words/phrases useful for automatic understanding? A baseline system could learn to label links with 62% accuracy. With simple modification, links in TimeBank that had associated signals could be annotated with 83% accuracy. This is a clear indication that signals are an accessible source of temporal information.
  • 11. Temporal conjunctions in newswire. What do temporal conjunctions look like in TimeBank? 11.2% of temporal links are annotated as having one (718 instances). Top words: prepositions (in, for, on) and conjunctions (after, before, since).
  • 12. Temporal conjunctions in newswire.
      Phrase         Corpus freq.   Occurrences as signal   Likelihood of being a signal
      subsequently    3              3                      100%
      after          72             67                       93%
      follows         4              3                       75%
      before         33             23                       70%
      until          36             25                       69%
      during         19             13                       68%
      as soon as      3              2                       67%
      Table: A sample of the phrases most likely to be annotated as a signal when they occur in TimeBank, restricted to phrases that occur more than once in the corpus.
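The likelihood column on slide 12 is the number of occurrences annotated as a signal divided by the phrase's corpus frequency. A short sketch that recomputes it from the counts on the slide follows; the variable and function names are illustrative only.

```python
# (corpus frequency, occurrences annotated as a signal), copied from slide 12.
counts = {
    "subsequently": (3, 3),
    "after": (72, 67),
    "follows": (4, 3),
    "before": (33, 23),
    "until": (36, 25),
    "during": (19, 13),
    "as soon as": (3, 2),
}


def signal_likelihood_percent(corpus_freq, as_signal):
    """Share of a phrase's occurrences that are annotated as a temporal signal."""
    return round(100 * as_signal / corpus_freq)


likelihoods = {
    phrase: signal_likelihood_percent(freq, as_signal)
    for phrase, (freq, as_signal) in counts.items()
}

# List phrases from most to least likely to be a signal.
for phrase, pct in sorted(likelihoods.items(), key=lambda kv: -kv[1]):
    print(f"{phrase:<14}{pct:>4}%")
```

Rounded to whole percentages, the recomputed values agree with the figures in the table.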
  • 13. Discrimination of temporal signal words. What else are these temporal signal words used for? Some words are very likely to have a temporal sense: subsequently (3 instances, all temporal); after (72 instances, 93% temporal). Other words are versatile: from (366 instances, 5% temporal); between (33 instances, 1 temporal).
  • 14. Signal-to-link relations. What temporal relations do these words signify? The word after doesn’t always signify a temporal AFTER relation. Word order is important: compare “After I ate, I went to bed” with “I ate after I went to bed”.
      Signal phrase   TimeML relation   Frequency
      after           AFTER             56
      after           ENDS               6
      after           BEGINS             4
      after           IAFTER             1
      already         BEFORE             6
      already         INCLUDES           4
      already         IS_INCLUDED        3
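One way to read the table on slide 14 is as a per-signal distribution over TimeML relations. The sketch below normalises the listed frequencies and picks the most frequent relation for a signal. It assumes the rows shown are the complete set of observations for each signal, which the slide does not state, and the function names are hypothetical.

```python
from collections import defaultdict

# (signal phrase, TimeML relation, frequency), copied from slide 14.
observations = [
    ("after", "AFTER", 56),
    ("after", "ENDS", 6),
    ("after", "BEGINS", 4),
    ("after", "IAFTER", 1),
    ("already", "BEFORE", 6),
    ("already", "INCLUDES", 4),
    ("already", "IS_INCLUDED", 3),
]


def relation_distribution(rows):
    """Map each signal to {relation: relative frequency} over the given rows."""
    totals = defaultdict(int)
    for signal, _, n in rows:
        totals[signal] += n
    dist = defaultdict(dict)
    for signal, relation, n in rows:
        dist[signal][relation] = n / totals[signal]
    return dist


def most_likely_relation(rows, signal):
    """Return the relation most often associated with a signal."""
    relations = relation_distribution(rows)[signal]
    return max(relations, key=relations.get)


dist = relation_distribution(observations)
```

Under that assumption, after maps to AFTER in 56 of 67 cases, while already is spread much more evenly across three relations, so the signal word alone does not determine the relation.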
  • 15. Signal class. How can we characterise temporal signals? Signals are likely to belong to a closed class of words: common prepositions, as seen earlier; some adverbs (previously, subsequently); set phrases (as soon as, so far).
  • 16. Spatial/Temporal overlap. Time and space are related, and events are constrained in terms of both. Language for space and time has some similarities: before has both temporal and spatial senses. Spatially annotated corpora: SpatialML. Relative spatial links in this corpus are much more likely to employ a signal (97.5%). Possible explanation: temporal language is more diverse (tense, auxiliaries).
  • 17. Outline: 1 Introduction; 2 Temporal links; 3 Temporal signals; 4 Improving annotation; 5 Summary
  • 18. Re-annotation. Are these signals correctly annotated in TimeBank? Manual examination: start with words that are likely to be temporal signals. The word before is found 33 times in the corpus, and 23 of those are annotated as signals. Many cases are under-annotated: “before the war began”; “was scheduled to return to port before hostilities erupted”.
  • 19. Re-annotation. How could we improve signal annotation? Linguistic description of temporal conjunctions may be weak. Annotation guidelines may be insufficient. Solution: provide an enhanced signal description, and revise TimeBank accordingly.
  • 20. Formal signal description. A temporal signal is a word that indicates the type of temporal relation between two intervals. Signal surface forms have a head and an optional quantifier: shortly after is a quantified temporal signal. Temporal signals have exactly two arguments (events and/or times). One argument may be implicit (e.g. for Later).
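The description on slide 20 maps naturally onto a small record type. The following is a sketch only: the class name, field names, and identifiers such as "e1" are hypothetical and are not taken from TimeML or from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemporalSignal:
    """A signal with a head, an optional quantifier, and two arguments."""

    head: str                         # e.g. "after"
    arg1: Optional[str]               # event or time id; None means implicit
    arg2: Optional[str]               # event or time id; None means implicit
    quantifier: Optional[str] = None  # e.g. "shortly"

    def __post_init__(self):
        # The slide allows one implicit argument, so both cannot be missing.
        if self.arg1 is None and self.arg2 is None:
            raise ValueError("at most one argument may be implicit")

    @property
    def surface_form(self) -> str:
        """Quantifier followed by head, or the head alone."""
        return f"{self.quantifier} {self.head}" if self.quantifier else self.head

    @property
    def is_quantified(self) -> bool:
        return self.quantifier is not None


shortly_after = TemporalSignal(head="after", arg1="e1", arg2="e2", quantifier="shortly")
later = TemporalSignal(head="later", arg1=None, arg2="e3")  # one implicit argument
```

Treating an implicit argument as `None` is a design choice made for this sketch; an annotation scheme could equally point the missing argument at a document-level reference time.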
  • 21. Augmented TimeBank. We examined 30 of the most frequent signal words and phrases that were not annotated as temporal. This comprised around 1,000 instances in text. We annotated any missed temporal signals, including EVENT and TLINK annotations where required. This resulted in 15.8% of TLINKs using a signal.
  • 22. Outline: 1 Introduction; 2 Temporal links; 3 Temporal signals; 4 Improving annotation; 5 Summary
  • 23. Conclusion. Temporal signals are a usable and important source of information. We have provided a definition for temporal signals. Existing corpora have been upgraded with better annotation.
  • 24. Future work: automatic signal discrimination; signal association; applying findings to spatial language.
  • 25. Thank you. Are there any questions?