600.465 Connecting the dots - I (NLP in Practice). Delip Rao, delip@jhu.edu
What is “Text”?
“Real” World: Tons of data on the web. A lot of it is text, in many languages, in many genres. Language by itself is complex. The Web further complicates language.
But we have 600.465
1. Formalize some insights
2. Study the formalism mathematically
3. Develop & implement algorithms
4. Test on real data
Forward-Backward, Gradient Descent, L-BFGS, Simulated Annealing, Contrastive Estimation, … feature functions! f(w_i = off, w_{i+1} = the), f(w_i = obama, y_i = NP). Adapted from: Jason Eisner
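The two feature functions named on this slide can be read as binary indicators over a sentence position. The following is a minimal sketch of that reading, not the course's implementation: the function names, the `(words, labels, i)` signature, and the toy sentence with its labels are all assumptions made for illustration.

```python
def f_off_the(words, labels, i):
    """Fires (returns 1) when w_i = 'off' and w_{i+1} = 'the'."""
    return int(i + 1 < len(words) and words[i] == "off" and words[i + 1] == "the")


def f_obama_np(words, labels, i):
    """Fires (returns 1) when w_i = 'obama' and y_i = 'NP'."""
    return int(words[i] == "obama" and labels[i] == "NP")


def feature_counts(words, labels, feature_fns):
    """Sum each indicator over every position of the sentence."""
    return [sum(f(words, labels, i) for i in range(len(words))) for f in feature_fns]


# Toy sentence and labels, invented for this example only.
words = ["obama", "took", "off", "the", "jacket"]
labels = ["NP", "VP", "PRT", "NP", "NP"]
counts = feature_counts(words, labels, [f_off_the, f_obama_np])
print(counts)
```

Each function fires exactly once on the toy sentence, so both counts are 1. A learner then assigns one weight per feature function.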
NLP for fun and profit: Making NLP more accessible. Provide APIs for common NLP tasks: var text = document.get(…); var entities = agent.markNE(text); Big $$$$. Backend to intelligent processing of text.
Desideratum: Multilinguality. Except for feature extraction, systems should be language agnostic.
In this lecture: Understand how to solve and ace NLP tasks. Learn general methodology or approaches. End-to-end development using an example task. Overview of (un)common NLP tasks.
Case study: Named Entity Recognition. Demo: http://viewer.opencalais.com
How do we find out how well we are doing?
How can we improve?
Case study: Named Entity Recognition. Collect data to learn from: sentences with words marked as PER, ORG, LOC, NONE. How do we get this data?
Pay the experts
Wisdom of the crowds
Getting the data: Annotation. Time consuming; costs $$$; need for quality control: inter-annotator agreement, kappa score (Krippendorff, 1980). Smarter ways to annotate: get fewer annotations with Active Learning; Rationales (Zaidan, Eisner & Piatko, 2007).
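As one concrete way to quantify inter-annotator agreement, here is a minimal sketch of Cohen's kappa for two annotators. Cohen's kappa is a common two-annotator statistic chosen here for illustration; it is not necessarily the exact measure the slide cites. The annotator label lists are invented for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same tokens."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of tokens given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled independently
    # according to their own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of six tokens by two annotators.
annotator_1 = ["PER", "NONE", "LOC", "NONE", "ORG", "NONE"]
annotator_2 = ["PER", "NONE", "LOC", "NONE", "NONE", "NONE"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on 5 of 6 tokens, but because NONE dominates, much of that agreement is expected by chance, so kappa (8/11) is noticeably lower than the raw agreement rate.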
Input x: Only France and Great Britain backed Fischler 's proposal .
Labels y: one label per token (PER, ORG, LOC, NONE)
Our recipe …
1. Formalize some insights
2. Study the formalism mathematically
3. Develop & implement algorithms
4. Test on real data
NER: Designing features. Preprocessing: need to segment sentences and tokenize the sentences. Not as trivial as you think: the original text itself might be in ugly HTML. Cleaneval!
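To see why segmentation and tokenization are "not as trivial as you think", consider this deliberately naive sketch built on regular expressions. Both rules and the example text are assumptions made for illustration, not the preprocessing used in the course.

```python
import re


def split_sentences(text):
    """Naive rule: split after '.', '!' or '?' when followed by whitespace and a capital."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())


def tokenize(sentence):
    """Naive rule: a token is a run of word characters or one non-space symbol."""
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "Dr. Smith visited Baltimore. He liked it."
sentences = split_sentences(text)
print(sentences)  # the abbreviation "Dr." is wrongly treated as a sentence end

tokens = tokenize("Fischler's proposal.")
print(tokens)  # the possessive is split into "'" and "s"
```

The splitter produces three "sentences" instead of two, and the tokenizer separates the possessive clitic differently from the `Fischler 's` convention in the annotated example, which is exactly the kind of mismatch that has to be resolved before features are extracted.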
NER: Designing features These are extracted during preprocessing!

NLP in Practice: Case Study of Named Entity Recognition

  • 36. NER: Designing features. Can you think of other features? HAS_DIGITS, IS_HYPHENATED, IS_ALLCAPS, FREQ_WORD, RARE_WORD, USEFUL_UNIGRAM_PER, USEFUL_BIGRAM_PER, USEFUL_UNIGRAM_LOC, USEFUL_BIGRAM_LOC, USEFUL_UNIGRAM_ORG, USEFUL_BIGRAM_ORG, USEFUL_SUFFIX_PER, USEFUL_SUFFIX_LOC, USEFUL_SUFFIX_ORG, WORD, PREV_WORD, NEXT_WORD, PREV_BIGRAM, NEXT_BIGRAM, POS, PREV_POS, NEXT_POS, PREV_POS_BIGRAM, NEXT_POS_BIGRAM, IN_LEXICON_PER, IN_LEXICON_LOC, IN_LEXICON_ORG, IS_CAPITALIZED
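A few of the surface features listed on this slide can be computed directly from the token sequence. The sketch below covers only those that need no external resources (no POS tagger, lexicon, or corpus counts). The function name, the lowercasing of word features, and the `<S>` / `</S>` boundary markers are assumptions made for this example.

```python
def token_features(tokens, i):
    """Surface features for the token at position i (a subset of the slide's list)."""
    word = tokens[i]
    return {
        "WORD": word.lower(),
        # Boundary markers are an assumed convention for sentence edges.
        "PREV_WORD": tokens[i - 1].lower() if i > 0 else "<S>",
        "NEXT_WORD": tokens[i + 1].lower() if i + 1 < len(tokens) else "</S>",
        "IS_CAPITALIZED": word[:1].isupper(),
        "IS_ALLCAPS": word.isupper(),
        "HAS_DIGITS": any(ch.isdigit() for ch in word),
        "IS_HYPHENATED": "-" in word,
    }


tokens = ["Only", "France", "and", "Great", "Britain", "backed",
          "Fischler", "'s", "proposal", "."]
features = token_features(tokens, 1)  # features for "France"
print(features)
```

Features such as POS, IN_LEXICON_PER, or FREQ_WORD would be added to the same dictionary once a tagger, a name list, or corpus frequency counts are available.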
  • 37. Case: Named Entity Recognition. Evaluation metrics. Token accuracy: what percent of the tokens got labeled correctly. Problem with accuracy. Precision-Recall-F. Example: president/O Barack/B-PER Obama/O
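The gap between token accuracy and entity-level precision, recall, and F can be sketched with the slide's three-token example. The code assumes BIO tagging, treats the slide's labeling (Obama tagged O) as the system prediction, and assumes a gold labeling in which Obama is I-PER; that gold sequence does not appear on the slide. Scoring by exact span match is one common convention, used here for illustration.

```python
def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def bio_spans(tags):
    """Set of (start, end, type) entity spans encoded by a BIO tag sequence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not inside:
            if start is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
            # An I- tag with no open span of that type is ignored in this sketch.
    return spans


def precision_recall_f1(gold, pred):
    """Entity-level scores: a predicted span counts only if it matches exactly."""
    gold_spans, pred_spans = bio_spans(gold), bio_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Tokens: president Barack Obama
gold = ["O", "B-PER", "I-PER"]   # assumed gold labels
pred = ["O", "B-PER", "O"]       # labeling shown on the slide
accuracy = token_accuracy(gold, pred)
scores = precision_recall_f1(gold, pred)
print(accuracy, scores)
```

Two of three tokens are correct, so token accuracy looks respectable, yet the only entity is not recovered with the right boundaries, so precision, recall, and F are all zero under exact matching.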
  • 38. NER: How can we improve? Engineer better features. Design better models: Conditional Random Fields (labels Y1 Y2 Y3 Y4 over inputs x1 x2 x3 x4).
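For reference, the standard linear-chain CRF defines the probability of a label sequence given the input as a globally normalized product of feature scores. The notation below (weights \theta_k, feature functions f_k, sequence length T, and a start symbol y_0) is the usual textbook convention and is assumed here rather than taken from the slide; for the slide's picture, T = 4.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \theta_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \theta_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
```

Each f_k may inspect the whole input x but only adjacent labels, which is what keeps inference (for example with Forward-Backward) tractable.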
  • 39. NER: How else can we improve? Unlabeled data! (Example from Jerry Zhu.)
  • 40. NER: Challenges. Domain transfer: WSJ → NYT; WSJ → Blogs ??; WSJ → Twitter ??!? Tough nut: Organizations. Non-textual data? Entity Extraction is a Boring Solved Problem – or is it? (Vilain, Su and Lubar, 2007)
  • 41. NER: Related application. Extracting real estate information from Craigslist ads: Our oversized one, two and three bedroom apartment homes with floor plans featuring 1 and 2 baths offer space unlike any competition. Relax and enjoy the views from your own private balcony or patio, or feel free to entertain, with plenty of space in your large living room, dining area and eat-in kitchen. The lovely pool and sun deck make summer fun a splash. Our location makes commuting a breeze – Near MTA bus lines, the Metro station, major shopping areas, and for the little ones, an elementary school is right next door.
  • 42. NER: Related application. BioNLP: annotation of chemical entities (Corbett, Batchelor & Teufel, 2007).
  • 43. Shared Tasks: NLP in practice. Shared task: everybody works on a (mostly) common dataset; evaluation measures are defined; participants get ranked on the evaluation measures. Advance the state of the art; set benchmarks. Tasks involve common hard problems or new interesting problems.