[poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

•

1 like•233 views

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) January 27 – February 1, 2019, Honolulu, Hawaii, USA.

Presentations & Public Speaking

 Performance comparison
 Performance over long text input
The proposed models showed robustness in handling long articles
 Evaluation in the real world supports the validity of the dataset / approach
- Manual validation - Surveys on AMT workers
Detecting Incongruity Between News Headline and Body Text
via a Deep Hierarchical Encoder
Seunghyun Yoon*, Kunwoo Park*, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha and Kyomin Jung
Seoul National University, KAIST
Full paper at emerging track on artificial intelligence for social impact (oral presentation at 29 Jan 14:00-15:30)
Research Problem
 Detecting incongruity between news headline and body text
(i.e., when a news headline does not correctly represent the story as in
advertisements, clickbaits, fake news, hijacked stories, etc.)
 Why is it important?
- People are less likely to read the whole contents but just read news headlines
- News headlines play an important role in making first impressions to readers,
which persist even after reading the whole article content
Most news are shared based on headlines in online platforms
 Previous efforts
- Public competition aiming to facilitate research on developing algorithms
(e.g., Fake News Challenge in 2017)
- Feature based approaches in detecting ambiguous headlines (Wei et al. 2017)
A sophisticated model could not be proposed or fully-evaluated
due to lack of available corpus for research
 Proposed approaches
- Attentive Hierarchical Dual Encoder (AHDE): encodes the full news article
from a word-level to a paragraph-level with attention mechanism
- Independent Paragraph (IP) Method: splits paragraphs in the body text and
learns the relationship between each paragraph and headline
Experimental Results
Dataset Generation
 Data source
- Main dataset: Crawl a near-complete set of news article published in South
Korea (period: January of 2017 – October of 2017)
- Supplementary dataset: 136K news articles in English (Horne et al. 2018)
 Two types of dataset
- Whole-corpus: Inject paragraphs that are negatively sampled from other
articles into an original article to make the headline become incongruent
- Paragraph-corpus: Transform a pair of headline and body text into multiple
sub-pairs of headline and paragraph
Methodology
 Baseline approaches
- XGBoost: the winning model in Fake News Challenge (FNC-1)
- Recurrent Dual Encoder (RDE): independently learn sequential patterns of
headline and body text via dual RNNs
- Convolutional Dual Encoder (CDE): independently learn textual patterns of
headline and body text via dual CNNs
Passage encoding
, , , ,
,,
Attention
tanh
∑
,
∑
Without IP With IP
AHDE was the best AHDE was (still) the bestEnjoy more advantages
compared to hierarchical models
Without IP With IP
Summary of contributions
 RELEASE a million-scale dataset for research on misinformation
 PROPOSE two neural networks that efficiently learn the textual relationship
between headline and body text
 SHOW that the models trained on the released corpus decent performances
on the real-world dataset
The code, data, and paper are available at http://david-yoon.github.io

What's hot

Lexicon Based Emotion Analysis on Twitter Dataijtsrd

Learning to learn with meta learningShreeGowriRadhakrish

Analyzing User Reviews in Tourism with Topic ModelsInternational Federation for Information Technologies in Travel and Tourism (IFITT)

Tutorial on Relationship Mining In Online Social Networkspjing2

Tweets ClassifierAvanti Gupta

GraphAshenafi Workie

News Session-Based Recommendations Using Deep Neural NetworksFelipe Ferreira

Online Learning to Rankewhuang3

On nonmetric similarity search problems in complex domainsunyil96

Selection of Tags for Tag CloudsAakash Gupta

Dynamic learning of keyword-based preferences for news recommendation (WI-2014)Antonio Moreno

An overview of text mining and sentiment analysis for Decision Support SystemGan Keng Hoon

36 students' interactin with librarians through twitterCITE

IJSRED-V2I2P09IJSRED

Summer internship 2014 report by Rishabh Misra, Thapar UniversityRishabh Misra

Unpacking Altmetric Donuts: Content Analysis of Tweets to Scholarly Journal A...smine

Domain Ontology Usage Analysis Framework (OUSAF)Jamshaid Ashraf

Unpacking Altmetric Donuts: Content Analysis of Tweets to Scholarly Journal A...smine

Supporting scientific discovery through linkages of literature and dataDon Pellegrino

Information Retrieval Models for Recommender Systems - PhD slidesDaniel Valcarce

What's hot (20)

Lexicon Based Emotion Analysis on Twitter Data

Learning to learn with meta learning

Analyzing User Reviews in Tourism with Topic Models

Tutorial on Relationship Mining In Online Social Networks

Tweets Classifier

Graph

News Session-Based Recommendations Using Deep Neural Networks

Online Learning to Rank

On nonmetric similarity search problems in complex domains

Selection of Tags for Tag Clouds

Dynamic learning of keyword-based preferences for news recommendation (WI-2014)

An overview of text mining and sentiment analysis for Decision Support System

36 students' interactin with librarians through twitter

IJSRED-V2I2P09

Summer internship 2014 report by Rishabh Misra, Thapar University

Unpacking Altmetric Donuts: Content Analysis of Tweets to Scholarly Journal A...

Domain Ontology Usage Analysis Framework (OUSAF)

Unpacking Altmetric Donuts: Content Analysis of Tweets to Scholarly Journal A...

Supporting scientific discovery through linkages of literature and data

Information Retrieval Models for Recommender Systems - PhD slides

Similar to [poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

Opinion mining on newspaper headlines using SVM and NLPIJECEIAES

Sensing complicated meanings from unstructured data: a novel hybrid approachIJECEIAES

Prediction of Answer Keywords using Char-RNNIJECEIAES

IRJET- Automatic Recapitulation of Text DocumentIRJET Journal

A Review Of Text Mining Techniques And ApplicationsLisa Graves

Text Mining: (Asynchronous Sequences)IJERA Editor

Meta-evaluation of machine translation evaluation methodsLifeng (Aaron) Han

1808.10245v1 (1).pdfKSHITIJCHAUDHARY20

TldrNarayana Murthy

Oles Petriv “Creating one concept embedding space for persons, brands and new...Lviv Startup Club

Discover How Scientific Data is Used for the Public Good with Natural Languag...BaoTramDuong2

Eat it, Review it: A New Approach for Review Predictionvivatechijri

[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.PadmapriyaIJET - International Journal of Engineering and Techniques

French machine reading for question answeringAli Kabbadj

team10.ppt.pptxREMEGIUSPRAVEENSAHAY

HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...Lifeng (Aaron) Han

Journal of Computer Science Research | Vol.1, Iss.3Bilingual Publishing Group

A-STUDY-ON-SENTIMENT-POLARITY.pdfSUDESHNASANI1

An in-depth review on News Classification through NLPIRJET Journal

IRJET- Automated Document Summarization and Classification using Deep Lear...IRJET Journal

Similar to [poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder (20)

Opinion mining on newspaper headlines using SVM and NLP

Sensing complicated meanings from unstructured data: a novel hybrid approach

Prediction of Answer Keywords using Char-RNN

IRJET- Automatic Recapitulation of Text Document

A Review Of Text Mining Techniques And Applications

Text Mining: (Asynchronous Sequences)

Meta-evaluation of machine translation evaluation methods

1808.10245v1 (1).pdf

Tldr

Oles Petriv “Creating one concept embedding space for persons, brands and new...

Discover How Scientific Data is Used for the Public Good with Natural Languag...

Eat it, Review it: A New Approach for Review Prediction

[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.Padmapriya

French machine reading for question answering

team10.ppt.pptx

HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...

Journal of Computer Science Research | Vol.1, Iss.3

A-STUDY-ON-SENTIMENT-POLARITY.pdf

An in-depth review on News Classification through NLP

IRJET- Automated Document Summarization and Classification using Deep Lear...

Recently uploaded

Work Remotely with Confluence ACE 2.pptxmavinoikein

Event 4 Introduction to Open Source.pptxaryanv1753

THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘謝

Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...marjmae69

The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella

PHYSICS PROJECT BY MSC - NANOTECHNOLOGYpruthirajnayak525

SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr

RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRachelAnnTenibroAmaz

INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRRsarwankumar4524

The Ten Facts About People With Autism PresentationNathan Young

Quality by design.. ppt for RA (1ST SEMCharmi13

SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella

Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Krijn Poppe

Anne Frank A Beacon of Hope amidst darkness ppt.pptxnoorehahmad

Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power

Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella

Chizaram's Women Tech Makers Deck. .pptxogubuikealex

PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.KathleenAnnCordero2

miladyskindiseases-200705210221 2.!!pptxCarrieButtitta

Genshin Impact PPT Template by EaTemp.pptxJohnree4

Recently uploaded (20)

Work Remotely with Confluence ACE 2.pptx

Event 4 Introduction to Open Source.pptx

THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...

Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...

The 3rd Intl. Workshop on NL-based Software Engineering

PHYSICS PROJECT BY MSC - NANOTECHNOLOGY

SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com

RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION

INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR

The Ten Facts About People With Autism Presentation

Quality by design.. ppt for RA (1ST SEM

SBFT Tool Competition 2024 -- Python Test Case Generation Track

Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...

Anne Frank A Beacon of Hope amidst darkness ppt.pptx

Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics

Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist

Chizaram's Women Tech Makers Deck. .pptx

PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.

miladyskindiseases-200705210221 2.!!pptx

Genshin Impact PPT Template by EaTemp.pptx

[poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

1.  Performance comparison  Performance over long text input The proposed models showed robustness in handling long articles  Evaluation in the real world supports the validity of the dataset / approach - Manual validation - Surveys on AMT workers Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder Seunghyun Yoon*, Kunwoo Park*, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha and Kyomin Jung Seoul National University, KAIST Full paper at emerging track on artificial intelligence for social impact (oral presentation at 29 Jan 14:00-15:30) Research Problem  Detecting incongruity between news headline and body text (i.e., when a news headline does not correctly represent the story as in advertisements, clickbaits, fake news, hijacked stories, etc.)  Why is it important? - People are less likely to read the whole contents but just read news headlines - News headlines play an important role in making first impressions to readers, which persist even after reading the whole article content Most news are shared based on headlines in online platforms  Previous efforts - Public competition aiming to facilitate research on developing algorithms (e.g., Fake News Challenge in 2017) - Feature based approaches in detecting ambiguous headlines (Wei et al. 2017) A sophisticated model could not be proposed or fully-evaluated due to lack of available corpus for research  Proposed approaches - Attentive Hierarchical Dual Encoder (AHDE): encodes the full news article from a word-level to a paragraph-level with attention mechanism - Independent Paragraph (IP) Method: splits paragraphs in the body text and learns the relationship between each paragraph and headline Experimental Results Dataset Generation  Data source - Main dataset: Crawl a near-complete set of news article published in South Korea (period: January of 2017 – October of 2017) - Supplementary dataset: 136K news articles in English (Horne et al. 2018)  Two types of dataset - Whole-corpus: Inject paragraphs that are negatively sampled from other articles into an original article to make the headline become incongruent - Paragraph-corpus: Transform a pair of headline and body text into multiple sub-pairs of headline and paragraph Methodology  Baseline approaches - XGBoost: the winning model in Fake News Challenge (FNC-1) - Recurrent Dual Encoder (RDE): independently learn sequential patterns of headline and body text via dual RNNs - Convolutional Dual Encoder (CDE): independently learn textual patterns of headline and body text via dual CNNs Passage encoding , , , , ,, Attention tanh ∑ , ∑ Without IP With IP AHDE was the best AHDE was (still) the bestEnjoy more advantages compared to hierarchical models Without IP With IP Summary of contributions  RELEASE a million-scale dataset for research on misinformation  PROPOSE two neural networks that efficiently learn the textual relationship between headline and body text  SHOW that the models trained on the released corpus decent performances on the real-world dataset The code, data, and paper are available at http://david-yoon.github.io

[poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to [poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

Similar to [poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder (20)

Recently uploaded

Recently uploaded (20)

[poster] Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder