www.ucomp.eu | www.chistera.eu @uCompEU
uComp Objectives
• Develop a generic and reusable Human
Computation (HC) framework
• Address challenges of noisy data
• Embed human computation into
knowledge extraction workflows
• Factual Knowledge
• Affective Knowledge
• Evaluate EHC performance
(EHC = Embedded Human Computation)
Work Package Overview
System Architecture
Content Repository (WP1)
• Extensible Web Retrieval Toolkit (eWRT)
• Open Source Library
www.weblyzard.com/ewrt
• Media Watch on Climate Change
• English Version
• www.ecoresearch.net/climate
• News Media Articles: 1,275,000
• Social Media Postings: 20,000,000
• German Version
• www.ecoresearch.net/climate/de
• News Media Articles: 650,000
• Social Media Postings: 565,000
• French Version
• www.ecoresearch.net/climate/de
• News Media Articles: 720,000
• Social Media Postings: 410,000
HC Framework (WP2)
• Application Framework. Facilitate developing GWAPs
to engage users and generate valuable information.
• Mechanism. Players score if their inputs match (i) system-generated
values, (ii) real-time input from other players, or (iii) stored
records from previous users.
• If a certain number of players agree, the task is
assumed complete and taken out of the game.
• Progress
• Cross-platform HTML5 application framework. Complete.
• Application Programming Interface (API). Complete.
• Integration of GWAPs with CrowdFlower. Complete.
• Support of Prediction Tasks. Complete.
• Framework for Social Logins. Complete.
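The scoring and completion rule described under "Mechanism" can be sketched as follows. This is a minimal illustration, not the uComp implementation: the `Task` class, the `submit` function and the threshold of three agreeing players are hypothetical, since the slides only say "a certain number". For simplicity the sketch treats real-time input and stored records as one list of earlier answers.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed value; the slides do not state the actual agreement threshold.
AGREEMENT_THRESHOLD = 3


@dataclass
class Task:
    task_id: str
    system_value: Optional[str] = None      # system-generated value, if any
    answers: List[str] = field(default_factory=list)  # earlier player inputs
    complete: bool = False


def submit(task: Task, answer: str, threshold: int = AGREEMENT_THRESHOLD) -> bool:
    """Record a player's answer and return True if the player scores.

    A player scores when the answer matches the system-generated value
    or any answer already given by another player.
    """
    scores = answer == task.system_value or answer in task.answers
    task.answers.append(answer)

    # Retire the task once enough players have given the same answer.
    _, count = Counter(task.answers).most_common(1)[0]
    if count >= threshold:
        task.complete = True
    return scores
```

A game loop built on this sketch would stop serving a task as soon as `task.complete` is set.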
GWAP Use Case
Launch – 25 Mar 2015
www.twitter.com/uCompEU
HC + Text Mining (WP3)
• Open-source, released as part of GATE
gate.ac.uk/wiki/crowdsourcing.html
• Two types of tasks: (i) classification, e.g. entity/word
disambiguation, sentiment; (ii) sequence selection, e.g.
named entity annotation
• Tasks commissioned from the GATE Developer UI
• Automatic mapping from sentences to HC tasks
• Annotation provenance & contributor reliability tracked
• Collected data mapped back onto corpora and
documents automatically
• Several knowledge aggregation and corpus distribution
methods implemented (T3.3)
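As an illustration of how tracked contributor reliability could feed into aggregation, here is a minimal sketch of a reliability-weighted vote. It is an assumption for illustration only: the slides do not say which aggregation methods the GATE plugin implements, and the names `aggregate`, `reliability` and `default_weight` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def aggregate(
    judgments: List[Tuple[str, str]],
    reliability: Dict[str, float],
    default_weight: float = 0.5,
) -> Optional[str]:
    """Return the label with the largest reliability-weighted support.

    judgments: (worker_id, label) pairs for a single task unit.
    reliability: per-worker weight; workers without a recorded score
    get default_weight (an assumed fallback).
    """
    if not judgments:
        return None
    totals: Dict[str, float] = defaultdict(float)
    for worker_id, label in judgments:
        totals[label] += reliability.get(worker_id, default_weight)
    return max(totals, key=totals.get)


judgments = [("w1", "PER"), ("w2", "ORG"), ("w3", "ORG")]
print(aggregate(judgments, {"w1": 0.9, "w2": 0.3, "w3": 0.4}))  # → PER
print(aggregate(judgments, {}))                                 # → ORG
```

With equal weights this reduces to a plain majority vote; with reliability scores, one trusted worker can outweigh two less reliable ones, which is one way to handle entities recognised by only a minority of workers.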
Crowdsourced NE Corpora
• One entity class per crowdsourcing task; better
than simultaneous annotation of entity types
Result Aggregation
• Automatic adjudication/aggregation strategies
implemented
• Challenges encountered
• Worker agreement not always representative of quality
• Many entities are recognised by only a minority of
workers
• Regional knowledge is required: #mufc, the bulls
• Span mismatch: King of England vs King of England
• Quality evaluation
• PER: P 68.7, R 56.2, F1 61.8
• LOC: P 15.3, R 91.7, F1 26.2
• ORG: P 53.2, R 67.1, F1 59.3
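The F1 values in the quality evaluation follow from the reported precision (P) and recall (R) as their harmonic mean, F1 = 2PR / (P + R). The short check below recomputes them; the function name `f1` is chosen for illustration only.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall per entity type, as reported on the slide.
reported = {"PER": (68.7, 56.2), "LOC": (15.3, 91.7), "ORG": (53.2, 67.1)}

for label, (p, r) in reported.items():
    print(f"{label}: F1 = {f1(p, r):.1f}")
# → PER: F1 = 61.8
# → LOC: F1 = 26.2
# → ORG: F1 = 59.3
```

The LOC row shows how the harmonic mean penalises imbalance: recall above 90 cannot compensate for precision of 15.3.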
Factual Knowledge (WP4)
• Ontologies create shared meaning and are
a cornerstone of the Semantic Web
• Manual construction of ontologies is
cumbersome and expensive
• Ontology learning is a (semi-)automatic
process to assist the ontology engineer
• uComp builds on an existing ontology
learning framework
Protégé Plugin
• Goal: Apply the uComp HC framework to
ontology learning and other ontology
construction tasks
• How: A plugin implemented for Protégé, a
popular ontology engineering platform,
using the uComp HC API to validate
ontological entities
Knowledge Creation Lifecycle
Knowledge Quality Evaluation
• Feasibility Study
• Cost: Reduction of 40% to 83% depending on
design used
• Quality: Comparable with that of tasks performed
by ontology engineers
• Large-Scale Evaluation in Medical Domain
• Result Quality: Accuracy of 89% / 99%
• Completion Time: Similar to domain experts
• Cost Reduction of 75% to 81%
Affective Knowledge (WP5)
• Use HC to produce affective resources that
are difficult to obtain automatically and too
costly to produce manually, for multiple
languages (EN, FR, DE).
• Assess HC-produced resources by
evaluating the performance impact of using
them instead of traditional resources for
opinion mining and sentiment analysis
(quantitative black-box methodology).
• Assess whether static gold-standard
resources can be replaced by dynamic HC
Affective Model
Multilingual Twitter Data
Crowdsourcing lexicon validation experiment
• French Affective Lexicon (9,939 Entries)
• Task Design
• Results
• Feasibility depends on workers’ motivation
• Good quality/cost ratio
• Ethical and legal issues
[Figure: Evaluation. Percentage of crowdsourced validated terms per affective class]
Evaluation
• Data Annotation
• Expert Annotation: 30,000 tweets (50% French, 50%
German); French: complete, German: in progress
• Annotation Guide
• 7 Entities: Opinion Holder, Opinion Target, Opinion
/ Sentiment / Emotion Expression, Negation,
Modifier, Global OSE Recipient
• 6 Relations: SAYS, ABOUT, NEG, MOD and
RECEIVER
• Evaluation Campaign – DEFT2015
• 22 participants registered
• Polarity, emotion, and opinion holder/target detection
• DEFT Workshop at TALN 2015
Dissemination & Impact (WP6)
• Web Site: www.ucomp.eu; Twitter Presence: @uCompEU
• Deliverables: 17
• Y1: D1.1, D1.2, D2.1, D3.1, D5.1, D6.1, D6.2, D7.1, D7.2, D7.3
• Y2: D1.3, D3.2, D3.4, D4.2, D5.2, D5.3, D7.4
• Scientific Publications: 24
• Open-Source Toolkits: 4
• eWRT, TwitIE, GATE HC Plugin, Protégé Plugin
• Collaboration: DecarboNet (Climate Challenge), PHEME
(Evaluation), Member of the European Center for Social Media
• Training and Teaching
• Two week-long courses on Mining and Crowdsourcing Social Media
Corpora. GATE Summer School (8-12 June 2015; 9-13 June 2014)
• Tutorial: Knowledge Extraction from Social Media with GATE.
12th Extended Semantic Web Conference (ESWC-2015)
• Tutorial: NLP for Social Media. 14th Conference of the European Chapter
of the Association for Computational Linguistics (EACL-2014)
Project Management (WP7)
• Project duration extended by six months
until 14 May 2016 (key staff leaving at MOD and
USFD; recruitment delays at WU)
• Changes to Work Plan
• D2.2 - Postpone to M30 (matching completion of T2.3
and T2.4);
• D2.3 - Postpone to M40 (matching T2.5);
• D3.3 - Postpone to M42 (matching completion of T3.4);
• D5.2 v2 and D5.3 v2 - Postpone to M36 (to allow prior
completion of D2.2 at M30);
• D5.4 - Postpone to M42;
• D6.3 - Postpone to M42 (as this needs to report on all
the work done until the end of the project).

Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Embedded Human Computation for Knowledge Extraction and Evaluation

  • 1.
  • 2. www.ucomp.eu | www.chistera.eu @uCompEU uComp Objectives • Develop a generic and reusable Human Computation (HC) framework • Address challenges of noisy data • Embed human computation into knowledge extraction workflows • Factual Knowledge • Affective Knowledge • Evaluate EHC performance (EHC = Embedded Human Computation)
  • 3. Work Package Overview
  • 4. System Architecture
  • 5. Content Repository (WP1) • Extensible Web Retrieval Toolkit (eWRT) • Open Source Library www.weblyzard.com/ewrt • Media Watch on Climate Change • English Version • www.ecoresearch.net/climate • News Media Articles: 1,275,000 • Social Media Postings: 20,000,000 • German Version • www.ecoresearch.net/climate/de • News Media Articles: 650,000 • Social Media Postings: 565,000 • French Version • www.ecoresearch.net/climate/de • News Media Articles: 720,000 • Social Media Postings: 410,000
  • 6. HC Framework (WP2) • Application Framework. Facilitate developing GWAPs to engage users and generate valuable information. • Mechanism. Players score if inputs match: (i) system-generated values; (ii) real-time input from other players; (iii) stored records from previous users. • If a certain number of players agree, the task is assumed complete and taken out of the game. • Progress • Cross-platform HTML5 application framework. Complete. • Application Programming Interface (API). Complete. • Integration of GWAPs with CrowdFlower. Complete. • Support of Prediction Tasks. Complete. • Framework for Social Logins. Complete.
  • 7. GWAP Use Case Launch – 25 Mar 2015 www.twitter.com/uCompEU
  • 8. GWAP Use Case
  • 9. HC + Text Mining (WP3) • Open-source, released as part of GATE gate.ac.uk/wiki/crowdsourcing.html • Two types of tasks: (i) classification, e.g. entity/word disambiguation, sentiment; (ii) sequence selection, e.g. named entity annotation • Tasks commissioned from the GATE Developer UI • Automatic mapping from sentences to HC tasks • Annotation provenance & contributor reliability tracked • Collected data mapped back onto corpora and documents automatically • Several knowledge aggregation and corpus distribution methods implemented (T3.3)
  • 10. Crowdsourced NE Corpora • One entity class per crowdsourcing task; better than simultaneous annotation of entity types
  • 11. Result Aggregation • Automatic adjudication/aggregation strategies implemented • Challenges encountered • Worker agreement not always representative of quality • Many entities are recognised by only a minority of workers • Regional knowledge is required: #mufc, the bulls • Span mismatch: King of England vs King of England • Quality evaluation • PER: P 68.7, R 56.2, F1 61.8 • LOC: P 15.3, R 91.7, F1 26.2 • ORG: P 53.2, R 67.1, F1 59.3
  • 12. Factual Knowledge (WP4) • Ontologies create shared meaning and are a cornerstone of the Semantic Web • Manual construction of ontologies is cumbersome and expensive • Ontology learning is a (semi-)automatic process to assist the ontology engineer • uComp builds on an existing ontology learning framework
  • 13. Protégé Plugin • Goal: Apply the uComp HC framework to ontology learning and other ontology construction tasks • How: A plugin implemented for Protégé, a popular ontology engineering platform, using the uComp HC API to validate ontological entities
  • 14. Knowledge Creation Lifecycle
  • 15.
  • 16. Knowledge Quality Evaluation • Feasibility Study • Cost: Reduction of 40% to 83% depending on design used • Quality: Comparable with that of tasks performed by ontology engineers • Large-Scale Evaluation in Medical Domain • Result Quality: Accuracy of 89% / 99% • Completion Time: Similar to domain experts • Cost Reduction of 75% to 81%
  • 17. Affective Knowledge (WP5) • Use HC to produce affective resources that are difficult to obtain automatically and too costly to produce manually, for multiple languages (EN, FR, DE). • Assess HC-produced resources by evaluating the performance impact of using them instead of traditional resources for opinion mining and sentiment analysis (quantitative black-box methodology). • Assess the possibility to replace static gold standard resources by dynamic HC
  • 18. Affective Model
  • 19. Multilingual Twitter Data
  • 20. Crowdsourcing lexicon validation experiment • French Affective Lexicon (9,939 Entries) • Task Design • Results • Feasibility depends on workers’ motivation • Good quality/cost ratio • Ethical and legal issues • Evaluation: Percentage of crowdsourced validated terms per affective class
  • 21. Evaluation
  • 22. Evaluation • Data Annotation • Expert Annotation: 30,000 tweets (50% French, 50% German); French: Complete, German: In Progress • Annotation Guide • 7 Entities: Opinion Holder, Opinion Target, Opinion / Sentiment / Emotion Expression, Negation, Modifier, Global OSE Recipient • 6 Relations: SAYS, ABOUT, NEG, MOD and RECEIVER • Evaluation Campaign – DEFT2015 • 22 participants registered • Polarity, emotion, and opinion holder/target detection • DEFT Workshop at TALN 2015
  • 23. Dissemination & Impact (WP6) • Web Site: www.ucomp.eu; Twitter Presence: @uCompEU • Deliverables: 17 • Y1: D1.1, D1.2, D2.1, D3.1, D5.1, D6.1, D6.2, D7.1, D7.2, D7.3 • Y2: D1.3, D3.2, D3.4, D4.2, D5.2, D5.3, D7.4 • Scientific Publications: 24 • Open-Source Toolkits: 4 • eWRT, TwitIE, GATE HC Plugin, Protégé Plugin • Collaboration: DecarboNet (Climate Challenge), PHEME (Evaluation), Member of the European Center for Social Media • Training and Teaching • Two week-long courses on Mining and Crowdsourcing Social Media Corpora. GATE Summer School (8-12 June 2015; 9-13 June 2014) • Tutorial: Knowledge Extraction from Social Media with GATE. 12th Extended Semantic Web Conference (ESWC-2015) • Tutorial: NLP for Social Media. 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2014)
  • 24. Project Management (WP7) • Project duration extended by six months until 14 May 2016 (key staff leaving at MOD and USFD; recruitment delays at WU) • Changes to Work Plan • D2.2: Postpone to M30 (matching completion of T2.3 and T2.4); • D2.3: Postpone to M40 (matching T2.5); • D3.3: Postpone to M42 (matching completion of T3.4); • D5.2 v2 and D5.3 v2: Postpone to M36 (to allow prior completion of D2.2 at M30); • D5.4: Postpone to M42; • D6.3: Postpone to M42 (as this needs to report on all the work done until the end of the project).
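
Two rules in the deck can be stated as short computations: the agreement rule from slide 6 (a task is retired once enough players give the same input) and the precision/recall/F1 figures from slide 11. The Python sketch below is illustrative only and is not the uComp implementation. The threshold of 3 and the names `record_answer`, `MIN_AGREEMENT` and `f1` are assumptions; the slides say only "a certain number of players". The F1 helper uses the standard harmonic-mean definition to recompute the reported scores from the reported precision and recall.

```python
from collections import Counter

# Hypothetical threshold: the slides do not state the actual number.
MIN_AGREEMENT = 3


def record_answer(answers, new_answer, min_agreement=MIN_AGREEMENT):
    """Store one player's input; return the agreed value once enough match, else None."""
    answers.append(new_answer)
    value, count = Counter(answers).most_common(1)[0]
    return value if count >= min_agreement else None


def f1(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Agreement rule: the task stays open until three players give the same label.
answers = []
outcomes = [record_answer(answers, a)
            for a in ["positive", "negative", "positive", "positive"]]

# Slide 11 figures as (precision, recall, reported F1), all in percent.
reported = {
    "PER": (68.7, 56.2, 61.8),
    "LOC": (15.3, 91.7, 26.2),
    "ORG": (53.2, 67.1, 59.3),
}
recomputed = {label: round(f1(p, r), 1) for label, (p, r, _) in reported.items()}

for label, (p, r, stated) in reported.items():
    print(f"{label}: P={p} R={r} reported F1={stated} recomputed F1={recomputed[label]}")
```

In this sketch the task is retired only on the fourth input, when "positive" reaches three votes. The recomputed F1 values agree with the reported ones to one decimal place; the low LOC score follows from its precision of 15.3 despite a recall of 91.7.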