Giuseppe Rizzo, Bianca Pereira, Andrea Varga, Marieke van Erp, Amparo Elizabeth Cano Basave
Presented on Wednesday 10 October at the 17th International Semantic Web Conference (ISWC 2018)
Paper: http://www.semantic-web-journal.net/content/lessons-learnt-named-entity-recognition-and-linking-neel-challenge-series
Conference: http://iswc2018.semanticweb.org/
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series
1. Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge Series
Giuseppe Rizzo
Bianca Pereira
Andrea Varga
Marieke van Erp
Amparo Elizabeth Cano Basave
By Piet Mondrian - Gemeentemuseum Den Haag, Public Domain, https://commons.wikimedia.org/w/index.php?curid=37614350
2. NEEL Challenge Overview
• Microposts are challenging because of:
• brevity (140 characters)
• (domain-specific) abbreviations and typos
• ‘grammar free’ writing
• The NEEL challenge aims to foster research into novel, more accurate entity recognition and linking approaches tailored to microposts
• NEEL ran from 2013 to 2016
3. NEEL Evolution
• 2013: Information Extraction
• named entity recognition (4 types)
• 2014: Named Entity Extraction and Linking (NEEL)
• named entity linking to DBpedia 3.9
• 2015: Named Entity rEcognition and Linking (NEEL)
• named entity recognition (7 types) and linking to DBpedia 2014
• 2016: Named Entity rEcognition and Linking (NEEL)
• named entity recognition (7 types) and linking to DBpedia 2015-04, plus NIL clustering
Image source: https://c1.staticflickr.com/8/7020/6405801675_efd6d09977_b.jpg
4. Cross-domain task
• Named Entity and Event Linking is a task shared between NLP and the Semantic Web
• Machine learning approaches need data
• Data curation is expensive and hard
• Knowledge bases can reduce some of this data bottleneck
• The result: hybrid approaches (see the sketch below)
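A minimal sketch of what such a hybrid approach can look like: a gazetteer derived from a knowledge base proposes candidate mentions, so a learned tagger needs less curated training data. The gazetteer entries, links, and example tweet below are hypothetical.

# Hybrid mention detection sketch: a KB-derived gazetteer proposes
# candidate mentions; a statistical tagger (not shown) would handle
# the mentions the gazetteer misses. All data here is illustrative.
GAZETTEER = {
    "barack obama": "dbpedia:Barack_Obama",
    "star wars": "dbpedia:Star_Wars",
}

def gazetteer_mentions(tokens):
    """Return (start, end, link) spans whose surface form is in the gazetteer."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            surface = " ".join(tokens[i:j]).lower()
            if surface in GAZETTEER:
                spans.append((i, j, GAZETTEER[surface]))
    return spans

tokens = "Barack Obama discusses Star Wars".split()
print(gazetteer_mentions(tokens))
# [(0, 2, 'dbpedia:Barack_Obama'), (3, 5, 'dbpedia:Star_Wars')]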
6. Evaluating Entity Linking
• end-to-end: evaluates a system on the aggregated output of all steps
• error propagation harms results
• step-by-step: a robust benchmark that evaluates each step of the process individually
• time-consuming to set up
• penalises systems that do not follow the standard workflow
• partial end-to-end: evaluates particular steps in the process individually, e.g. NER, NIL & linking (see the scoring sketch below)
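A rough illustration of why end-to-end scoring propagates errors: the whole annotation tuple must match the gold standard, so an upstream mistake also costs every downstream step. The annotation tuples below are hypothetical.

def precision_recall_f1(gold, predicted):
    """Micro precision/recall/F1 over exact-match annotation tuples."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# End-to-end: the full (start, end, type, link) tuple must match, so a
# correct span with a wrong link earns no credit at all.
gold = {(0, 12, "Person", "dbpedia:Barack_Obama")}
pred = {(0, 12, "Person", "dbpedia:Obama_(surname)")}
print(precision_recall_f1(gold, pred))  # (0.0, 0.0, 0.0)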
7. Named Entity Recognition and Linking challenges since 2013
• TAC-KBP (2014-2016): newswire, web sites, discussion forum posts; KB: Wikipedia (2014), Freebase (2015-2016); entities given by type; file-based, partial end-to-end evaluation; target conference: TAC
• ERD (2014): web sites, search queries; KB: Freebase; entities given by KB; API-based, end-to-end evaluation; target conference: SIGIR
• SemEval (2015): technical manuals, reports, formal discussion; KB: BabelNet; entities given by KB; file-based, end-to-end evaluation; target conference: NAACL-HLT
• W-NUT (2015-2017): tweets (2015); tweets, Reddit, YouTube, StackExchange (2016-2017); no KB; entities given by type; file-based, end-to-end evaluation; target conferences: ACL-IJCNLP, COLING, EMNLP
• NEEL (2013-2016): tweets; KB: none (2013), DBpedia (2014-2016); entities given by type; file- and API-based submission (varying by edition), end-to-end (2013-2014) and partial end-to-end (2015-2016) evaluation; target conference: WWW
11. NEEL Datasets
Image source: https://www.maxpixel.net/Word-Data-Data-Deluge-Binary-System-Binary-Dataset-2728117
• 2013: 4,265 tweets from the end of 2010 and start of 2011. No explicit hashtag search; 66% train, 33% test.
• 2014: 3,505 tweets from 15 July 2011 - 15 August 2011. A First Story Detection algorithm identified tweet clusters representing events; 70% train, 30% test.
• 2015: 6,025 tweets, an extension of the 2014 dataset with tweets from 2013 and November 2014. Train: the 2014 dataset; 8% development, 34% test.
• 2016: 9,289 tweets, an extension of the 2014 & 2015 datasets via a selection of hashtags. 65% train (the 2015 dataset), 1% development, 34% test.
12. NEEL Datasets (ctd)
• Entity types are not distributed equally
• Difficult to balance entity types over different dataset slices
• Confusability: a measure of the number of surface forms an entity can have (i.e. how many different ‘terms’ can refer to the same entity)
• Dominance: a measure of the number of resources that can be associated with a single surface form (i.e. how many entities share the same ‘name’); see the sketch below
[Figure: confusability and dominance distributions for the 2013 and 2016 datasets]
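Both measures reduce to simple counts over an entity-to-surface-form dictionary. A minimal sketch with hypothetical data:

from collections import defaultdict

# Toy entity -> surface forms mapping (hypothetical values).
entity_forms = {
    "dbpedia:Barack_Obama": {"Obama", "Barack Obama", "POTUS"},
    "dbpedia:Michelle_Obama": {"Obama", "Michelle Obama"},
}

# Confusability: how many surface forms can refer to one entity.
confusability = {entity: len(forms) for entity, forms in entity_forms.items()}

# Dominance: how many entities share one surface form.
form_entities = defaultdict(set)
for entity, forms in entity_forms.items():
    for form in forms:
        form_entities[form].add(entity)
dominance = {form: len(ents) for form, ents in form_entities.items()}

print(confusability["dbpedia:Barack_Obama"])  # 3
print(dominance["Obama"])                     # 2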
13. Results
• NEEL Challenge more difficult every year (from 4 entity types to 7, plus linking and NIL clustering)
• Systems more complex every year
• 2016 task more difficult, probably due to the domain specificity of the test dataset (US Primary Elections and Star Wars)

2013-2014 (end-to-end):
Year   Precision   Recall   F1
2013   0.764       0.604    0.67
2014   0.771       0.642    0.701

2015-2016 (partial end-to-end):
Year   Tagging   Clustering   Linking   Overall
2015   0.807     0.84         0.762     0.8067
2016   0.473     0.641        0.501     0.5486
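As a quick sanity check, the 2013-2014 F1 scores are the harmonic mean of the reported precision and recall:

# F1 as the harmonic mean of precision and recall.
for year, p, r in [(2013, 0.764, 0.604), (2014, 0.771, 0.642)]:
    print(year, round(2 * p * r / (p + r), 3))
# 2013 0.675  (reported as 0.67)
# 2014 0.701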
14. Emerging Trends
• Tweet normalisation is common (see the sketch below)
• Use of KBs for mention detection and typing
• End-to-end systems and pruning for candidate selection
• Hierarchical clustering for aggregating mentions of the same entity/event
• Decrease in the use of off-the-shelf systems (which were popular in the first editions)
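A minimal sketch of the kind of tweet normalisation many systems apply before recognition; the rules and example are illustrative, not any particular participant's pipeline.

import re

def normalise_tweet(text):
    """Strip URLs and unpack @-mentions and #-hashtags so the underlying
    surface forms reach the recogniser."""
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"@(\w+)", r"\1", text)     # @BarackObama -> BarackObama
    text = re.sub(r"#(\w+)", r"\1", text)     # #StarWars -> StarWars
    return re.sub(r"\s+", " ", text).strip()

print(normalise_tweet("RT @BarackObama: #StarWars trailer! https://t.co/abc"))
# RT BarackObama: StarWars trailer!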
15. Lessons Learnt
• Creating balanced challenge datasets is hard!
• You are invited to expand and improve our datasets!
• The datasets are available for evaluation of new systems: http://microposts2016.seas.upenn.edu/challenge.html
• NEEL provides an opportunity to compare results against other systems
• Multilingual or other-language challenges? (2016 also had an Italian variant)
• New popular micropost platforms require different analyses
17. Are you a Master’s or PhD student?
Do you want to learn how to do this type of research yourself?
Join us in Italy next summer!
http://semanticwebsummerschool.org