The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and
African-Americans are not LOCATIONS:
An Analysis of the Performance of Named-Entity Recognition
Robert Krovetz (Lexicalresearch.com), Paul Deane, Nitin Madnani (ETS)

A Review by Richard Littauer (UdS)
The Background
 Named-Entity Recognition (NER) is
  normally judged in the context of
  Information Extraction (IE)
 Various competitions
 Recently:
    ◦ non-English languages
    ◦ improving unsupervised learning methods
The Background
   “There are no well-established
    standards for evaluation of NER.”
    ◦ Criteria for NER systems change between
      competitions
    ◦ Proprietary software
The Background
   KDM wanted to identify MWEs…
      … but false positives, tagging
      inconsistencies stopped this.

 IE derives Recall and Precision from
  Information Retrieval
 NER is just a small part of this, so is
  rarely evaluated independently
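
As a rough illustration of the two borrowed metrics, the sketch below computes precision and recall for entity annotations. It assumes annotations are sets of (start, end, label) tuples; that representation, the function name, and the toy data are assumptions for illustration and are not taken from the paper.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two sets of (start, end, label) tuples."""
    true_positives = len(predicted & gold)
    # Precision: share of predicted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy annotations, invented for this example.
gold = {(0, 2, "PERSON"), (5, 6, "LOCATION"), (9, 11, "ORGANIZATION")}
predicted = {(0, 2, "PERSON"), (5, 6, "ORGANIZATION")}

p, r = precision_recall(predicted, gold)
print(p, r)
```

Only an exact match of span and label counts as correct here; other matching schemes are possible.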
The Background
   So, they want to test NER systems,
    and provide a unit test based on the
    problems encountered
Evaluation
Compared three NER taggers:
 Stanford:
    ◦ CRF, 100m training corpus;
   University of Illinois (LBJ):
    ◦ Regularized averaged perceptron, Reuters
      1996 News Corpus;
   BBN IdentiFinder (IdentiFinder):
    ◦ HMMs, commercial
Evaluation
 Agreement on Classification
 Ambiguity in Discourse


 Stanford vs. LBJ on internal ETS
  425m corpus
 All three on American National Corpus
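
A minimal sketch of what "agreement on classification" could look like in code, assuming each tagger's output has been reduced to a mapping from entity string to label. The function name, the generic tagger names, and the toy data are hypothetical; the paper's actual procedure may differ.

```python
def agreement_rate(tags_a, tags_b):
    """Share of entities recognized by both taggers that receive the same label."""
    shared = tags_a.keys() & tags_b.keys()
    if not shared:
        return 0.0
    same = sum(1 for entity in shared if tags_a[entity] == tags_b[entity])
    return same / len(shared)


# Toy outputs for two unnamed taggers, invented for this example.
tagger_a = {"Berners-Lee": "PERSON", "Web": "PERSON", "Amherst": "LOCATION"}
tagger_b = {
    "Berners-Lee": "ORGANIZATION",
    "Web": "MISC",
    "Amherst": "LOCATION",
    "ETS": "ORGANIZATION",
}

rate = agreement_rate(tagger_a, tagger_b)
print(f"{rate:.2f}")
```

Entities found by only one tagger are ignored here; counting them as disagreements would be an equally defensible choice.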
Stanford vs. LBJ
 NER reported as 85-95% accurate.
 Similar number of entities for both: 1.95m for
  Stanford, 1.8m for LBJ (7.6%
  difference)
 However, errors:
Stanford vs. LBJ
   Agreement:
Stanford vs. LBJ
   Ambiguity:
Stanford vs. LBJ vs.
IdentiFinder
   Agreement:
Stanford vs. LBJ vs.
IdentiFinder
   Differences:
    ◦ How they are tokenized
    ◦ Number of entities recognized overall
Stanford vs. LBJ vs.
IdentiFinder
   Ambiguity:
Unit Test
   Created two documents that can be
    used as tests
    ◦ Different cases for true positives of
      PERSON, LOCATION, ORGANIZATION
    ◦ Entirely upper case not NE (Ex.
      AAARGH)
    ◦ Punctuated terms not NE
    ◦ Terms with Initials
    ◦ Acronyms (some expanded, some not)
    ◦ Last names in close proximity to first
      names
Unit Test
   Created two documents that can be
    used as tests
    ◦ Terms with prepositions (Mass. Inst. of
      Tech.)
    ◦ Terms with location and organization
      (Amherst College)

   Provided freely online.
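
The idea of a unit test for a tagger can be sketched as a list of (text, expected label) cases run against any tagging function. Everything below is hypothetical: the case list, the expected labels (including ORGANIZATION for "Amherst College"), and the deliberately naive tagger are illustrative assumptions, not the contents of the authors' test documents.

```python
# Each case pairs a text with the label we expect; None means "not a named entity".
CASES = [
    ("AAARGH", None),                       # entirely upper case, not an NE
    ("Amherst College", "ORGANIZATION"),    # location name inside an organization (assumed label)
    ("hello", None),                        # ordinary lower-case word
]


def run_unit_test(tagger, cases):
    """Return a list of (text, expected, actual) for every failing case."""
    failures = []
    for text, expected in cases:
        actual = tagger(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def naive_tagger(text):
    """Toy stand-in for a real tagger: anything capitalized is a PERSON."""
    return "PERSON" if text[:1].isupper() else None


failures = run_unit_test(naive_tagger, CASES)
for text, expected, actual in failures:
    print(f"{text!r}: expected {expected}, got {actual}")
```

A real tagger would be wrapped in a function with the same shape as `naive_tagger` so the same cases can be reused across systems.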
One NE Tag per Discourse
 Unusual for multiple occurrences of a
  token in a document to be different
  entities
 True for homonyms
 An exception: Location + sports team
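
One way to check the "one tag per discourse" assumption is to collect, per document, every token that received more than one label. The sketch below assumes a document is a list of (token, label) pairs; the function name and the toy document are assumptions for illustration.

```python
from collections import defaultdict


def ambiguous_entities(tagged_doc):
    """Map each token that received more than one label to its set of labels."""
    labels = defaultdict(set)
    for token, label in tagged_doc:
        labels[token].add(label)
    return {token: found for token, found in labels.items() if len(found) > 1}


# Toy document, invented for this example.
doc = [
    ("Clinton", "PERSON"),
    ("Clinton", "LOCATION"),
    ("Paris", "LOCATION"),
    ("Paris", "LOCATION"),
]

conflicts = ambiguous_entities(doc)
print(conflicts)
```

Legitimate exceptions, such as a city name that also denotes its sports team, would show up in the same report and need manual review.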
One NE Tag per Discourse
 Stanford, LBJ have features for non-
  local dependencies to help with this.
 KDM: Two other uses for NLD:
    ◦ Source of error in evaluation
    ◦ A way to identify semantically related
      entities

   These should be treated as
    exceptions
Discussion
 There are guidelines for NER – but we
  need standards.
 The community should focus on
  PERSON, ORGANISATION,
  LOCATION, and MISC.
    ◦   Harder to deal with than Dates, Times.
    ◦   Disagreement between taggers.
    ◦   MISC is necessary.
    ◦   These have important value elsewhere.
Discussion
   To improve intrinsic evaluation for
    NER:
    1. Create test sets for diverse domains.
    2. Use standardized sets for different
       phenomena.
    3. Report accuracy for PERSON, ORGANIZATION,
       and LOCATION (POL) separately.
    4. Establish uncertainty in the tagging
       system.
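
Point 3 can be sketched as follows, assuming evaluation data is available as (gold label, predicted label) pairs. The function name and the toy pairs are hypothetical and do not reproduce any figures from the paper.

```python
from collections import Counter

POL = ("PERSON", "ORGANIZATION", "LOCATION")


def per_class_accuracy(pairs, classes=POL):
    """Return {class: accuracy} computed separately for each gold class."""
    total = Counter()
    correct = Counter()
    for gold, predicted in pairs:
        if gold in classes:
            total[gold] += 1
            if predicted == gold:
                correct[gold] += 1
    # Classes with no gold examples are left out rather than reported as 0.
    return {c: correct[c] / total[c] for c in classes if total[c]}


# Toy (gold, predicted) pairs, invented for this example.
pairs = [
    ("PERSON", "PERSON"),
    ("PERSON", "ORGANIZATION"),
    ("LOCATION", "LOCATION"),
    ("ORGANIZATION", "ORGANIZATION"),
    ("ORGANIZATION", "LOCATION"),
    ("ORGANIZATION", "ORGANIZATION"),
    ("ORGANIZATION", "PERSON"),
]

scores = per_class_accuracy(pairs)
print(scores)
```

Reporting the three numbers separately keeps a strong class from hiding a weak one behind a single aggregate figure.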
Conclusion
 Reported 90% accuracy is not realistic.
 We need to use only entities that are
  agreed on by multiple taggers.
 Even in cases where the taggers
  disagree (Hint: Future work.)

   Unit test downloadable.
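
Keeping only entities that several taggers agree on could be sketched like this, assuming each tagger's output is a mapping from entity string to label. The function name, the generic tagger names, and the toy data are assumptions for illustration.

```python
def agreed_entities(*taggings):
    """Keep entities that every tagger recognized and labeled identically."""
    if not taggings:
        return {}
    first = taggings[0]
    shared = set(first)
    for tagging in taggings[1:]:
        shared &= set(tagging)
    return {
        entity: first[entity]
        for entity in shared
        if all(tagging[entity] == first[entity] for tagging in taggings)
    }


# Toy outputs for three unnamed taggers, invented for this example.
tagger_a = {"Berners-Lee": "PERSON", "Amherst": "LOCATION", "Web": "PERSON"}
tagger_b = {"Berners-Lee": "PERSON", "Amherst": "LOCATION", "Web": "MISC"}
tagger_c = {"Berners-Lee": "PERSON", "Amherst": "ORGANIZATION"}

consensus = agreed_entities(tagger_a, tagger_b, tagger_c)
print(consensus)
```

Requiring unanimity is the strictest option; a majority vote would retain more entities at the cost of more noise.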
Cheers/PERSON


Richard/ORGANISATION thanks the
Mword Class/LOCATION for listening to
his talk about Berners-Lee/MISC

Named Entity Recognition - ACL 2011 Presentation

Editor's notes

  1. NER: The aim is to recognize and classify different types of entities (names, organizations, locations, dates, etc.)
  2. Not sure why they focused on competitions, to be honest. But they mention the Message Understanding Conference, and CoNLL.
  3. They give two possible reasons for this:
  4. Part of the problem is that
  5. No Gold Standards for any of these. So, they compared on two levels
  6. How well do they work on PERSON, ORGANIZATION, and LOCATION? How much do they agree? What mistakes do they make?
  7. How frequently does each tagger produce multiple classifications for the same entity in a single document? Clinton as a person, and place, for instance.
  8. The ANC is already tagged by IdentiFinder.
  9. However, this was often not consistent
  10. IdentiFinder found many more ORGANISATION entities than the others. It also uses an extra class, Geo-Political Entity.
  11. Existing taggers treat the non-local dependencies as a way of dealing with the sparse data problem, and as a way to resolve tagging differences by looking at how often one token is classified as one type versus another.
  12. 1. They didn’t do this. 2. And actually use them, not just one of them. 3. Report accuracy rates separately for the three major classes. Accuracy rates should be further broken down according to the items in the unit test that are designed to assess mistakes: orthography, acronym processing, frequent false positives, and knowledge-based classification. They go on to say that ANC is doing it right, but is too small, hence their ETS corpus.