Automated Identification of Framing by Word Choice and Labeling to Reveal Media Bias in News Articles

Anastasia Zhukova
Doctoral supervisor: Felix Hamborg
1st examiner: Prof. Dr. Bela Gipp
2nd examiner: Prof. Dr. Karsten Donnay
Date: 2019-03-07

Agenda
1. Introduction
2. Project motivation and research objectives
3. Related work and research gap
4. Word choice and labeling (WCL) analysis system
5. Usability prototype
6. Multi-step merging approach
7. Evaluation results
8. Future work
9. Conclusion
07-Feb-23 Anastasia Zhukova - Automated Identification of Framing by Word Choice and Labeling 2

Introduction
https://tgram.ru/channels/otsuka_bld
• Biased perception of the Russian president depends on how he was framed

Introduction
invasion forces
vs.
coalition forces
heart-wrenching tales of hardship
vs.
information on the lifestyles
http://umich.edu/~newsbias/wordchoice.html
Word Choice (WC)
Labeling (L)

Project motivation
WCL depends on… [1-5]
• actor or perspective selection
• author position
• goal of the message
http://www.anmbadiary.com/2015/04/framing-effect-and-marketing.html
*equal with some degree of approximation
When not identified, WCL influences… [2, 4-6]
• emotion evaluation
• decision making process
• false information propagation
Existing solutions… (cf. [15-17])
• involve manual annotation by social scientists
• automated approaches yield simplistic results
• results are not scalable and not interactive

Project research objectives
RQ: How can we automatically identify instances of bias by WCL referring to the
semantic concepts in a set of English news articles reporting on the same event by
using natural language processing (NLP)?
Research tasks:
1. Design and develop a modular WCL analysis system;
2. Develop a first usability prototype with interactive visualization to explore the results of
WCL analysis;
3. Research, propose, and implement an approach based on the NLP methods to identify
semantic concepts that can be a target of bias by WCL;
4. Conduct an evaluation of the proposed semantic concept identification approach.

Related work and research gap
1. Social science methodology
a. Content analysis [2, 7, 9]
b. Framing analysis [1, 4, 6, 10, 11]
→ effective but manual and time-consuming
2. Automated WCL identification
a. from topic perspective [12, 14-17]
b. from actor perspective [13, 18]
→ require interpretation of the word choice difference
→ no concept-to-concept automatic comparison
3. Natural language processing
a. Named Entity Recognition (NER) (cf. [21])
b. Coreference resolution (cf. [12, 20, 24])
c. Cross-document coreference resolution (cf. [22, 23])
→ do not resolve broad sense anaphora
→ do not analyze difference of word choice

Roadmap
RT1: WCL analysis methodology and system
RT2: Usability prototype
RT3: Candidate alignment task: methodology of multi-step merging approach
RT4: Evaluation of the multi-step merging approach

WCL analysis pipeline methodology
[Figure: pipeline stages: Data preprocessing → Semantic concept identification → Framing analysis of semantic concepts → Framing similarity across news articles. Example mentions of one concept: “Putin”, “president”, “savior”, “tyrant”, “humble man”, “thief”. Image source: https://tgram.ru/channels/otsuka_bld]

WCL analysis system
Input: related articles
• Preprocessing: sentence splitting, tokenization, POS tagging, parsing, dependency parsing, NE recognition, coreference resolution
• Concept identification
o Candidate extraction: corefs, NPs
o Candidate alignment: multi-step merging (core meaning, core meaning modifiers, frequent word patterns)
• Usability prototype
o Emotion frames: LIWC emotion dimensions, emotion clustering
o Visualization: matrix view, bar chart view, article view
• Inductive analysis, i.e., no prior knowledge given
• The implementation is focused on the candidate alignment task


Usability prototype
[Figure: screenshots of the Matrix view, the Bar chart view, and the Article view; annotation: WCL diversity]

Usability prototype
[Figure: screenshots of the selection mode of the Matrix view, the Candidate view, and the selection mode of the Article view]


Candidate alignment task
[Table: NER, coref. resolution, and cand. alignment compared by task: categorization/grouping; cross-document coreferences; linking of mentions (a. common knowledge anaphora, b. broad sense anaphora)]
• The candidate alignment task aims to resolve anaphora of both common knowledge and broad sense.

Multi-step merging approach (MSMA): overview
• Initial entities: coreferences and NPs
• Extract entity attributes to highlight certain properties
• Specify entity comparability to other entities
• Iterate multiple times over all entities
→ merge entities based on similar attributes
• Merging step = level in a hierarchy
[Figure: merging procedure]
Init.: all entities sorted by their size (similar color = similarity in one criterion)
Step 1 … Step N:
1. compare the first entity to the other entities
2. the considered entity merges similar entities
3. place the updated entity to the end and continue
4. sort entities by their size
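The merging procedure above can be sketched in a few lines of Python. This is a minimal illustration and not the thesis implementation: it assumes that an entity is a set of phrases, that "sorted by their size" means largest first, and that each merging step is given as a hypothetical predicate `similar(a, b)`. The names `merge_step` and `multi_step_merge` are invented for this sketch.

```python
from collections import deque
from typing import Callable, Iterable, List, Set

Entity = Set[str]  # assumption: an entity is a set of phrases
Similar = Callable[[Entity, Entity], bool]


def merge_step(entities: Iterable[Entity], similar: Similar) -> List[Entity]:
    """One merging step: every entity in turn absorbs the entities similar to it."""
    # Assumption: "sorted by their size" means largest entity first.
    queue = deque(sorted(entities, key=len, reverse=True))
    processed: List[Entity] = []
    while queue:
        current = set(queue.popleft())   # the considered entity
        remaining = deque()
        for other in queue:              # compare it to the other entities
            if similar(current, other):
                current |= other         # merge the similar entity
            else:
                remaining.append(other)
        queue = remaining
        processed.append(current)        # place the updated entity at the end
    return sorted(processed, key=len, reverse=True)


def multi_step_merge(entities: Iterable[Entity], criteria: List[Similar]) -> List[Entity]:
    """Apply the merging steps one after another (Step 1 … Step N)."""
    result = list(entities)
    for similar in criteria:
        result = merge_step(result, similar)
    return result
```

For example, with a crude stand-in for head matching (last token of each phrase, an assumption made only for this demo), `multi_step_merge([{"President Trump"}, {"Donald Trump", "Trump"}, {"illegal aliens"}], [same_head])` merges the two Trump entities and leaves “illegal aliens” untouched.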

Multi-step merging approach: steps
Step 1: Representative phrases’ heads
Matching phrases’ heads, e.g., “President Trump” and “Donald Trump”
Step 2: Head sets
Semantically similar head sets, e.g., {“Trump”, “president”} and {“billionaire”}
Step 3: Representative labeling phrases
Semantically similar labeling phrases, e.g., “undocumented immigrants” and “illegal aliens”
Step 4: Compounds
Semantically similar compounds, e.g., “DACA illegals” and “DACA recipients”
Step 5: Representative frequent wordsets
Semantically similar frequent wordsets, e.g., “United States” and “U.S.”
Step 6: Representative frequent phrases
String-similar frequent phrases, e.g., “Deferred Action for Childhood Arrivals” and “Childhood Arrivals”

Multi-step merging approach: summary
Type: Core meaning
• Step: Representative phrases’ heads
o Goal: Compare on the output of coreference resolution
o Problems: Applicable only for named entity (NE) entity types
• Step: Head sets
o Goal: Find synonyms of head words among entities
o Problems: Word collocations contain more meaning than head words
Type: Core meaning modifiers
• Step: Representative labeling phrases
o Goal: Identify most prominent adjective + noun patterns
o Problems: Adjective is not the only core meaning modifier
• Step: Compounds
o Goal: Compare noun-to-noun similar compounds
o Problems: More than two-word phrases are required to represent entities
Type: Frequent word patterns
• Step: Representative frequent wordsets
o Goal: Identify frequently repeated wording
o Problems: Wordsets disregard word order important for pattern identification
• Step: Representative frequent phrases
o Goal: Identify frequently repeated phrases
o Problems: Requires extensive repetitive wording


Experiment setup
• Dataset: extended NewsWCL50 corpus
o Ten topics of 5 articles each: NewsWCL50 [25]
o One topic of 25 articles collected according to the NewsWCL50 methodology
• Simplified content analysis (CA) annotation
o Used annotation codes referring to the entities
o Avoided complex semantic concepts, e.g., a reaction to something
o Annotated extracted NPs and coreferential chains
• Metrics
o Weighted precision, recall, F1-score (evaluation of the best matching entities (BMEs)) [27]
o Homogeneity, completeness, V-measure (general clustering evaluation) [26]
o WCL complexity metric (phrasing diversity)
• Baselines
o Random baseline (B1)
o CoreNLP coreference resolution: employ only coreferential chains (B2) [24]
o Candidate clustering in the word vector space (B3)
• Concept type categorization
o Actor, e.g., Donald Trump
o Group, i.e., group of people acting as one entity
o Country, i.e., country names, anaphora, and organizations related to the country
o Misc, i.e., events, objects, abstract entities
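The general clustering metrics named in the setup (homogeneity, completeness, V-measure [26]) can be computed from two parallel label lists. The sketch below follows the standard entropy-based definitions; the function names and the toy labels in the usage note are assumptions made for illustration, not part of the evaluation code behind the slides.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def _entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target: Sequence[Hashable], given: Sequence[Hashable]) -> float:
    """Conditional entropy H(target | given) for two parallel label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(c / n * math.log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(gold: Sequence[Hashable], predicted: Sequence[Hashable],
              beta: float = 1.0) -> Tuple[float, float, float]:
    """Return (homogeneity, completeness, V-measure)."""
    h_gold, h_pred = _entropy(gold), _entropy(predicted)
    # Homogeneity: each predicted cluster contains members of a single gold class.
    homogeneity = 1.0 if h_gold == 0 else 1.0 - _conditional_entropy(gold, predicted) / h_gold
    # Completeness: all members of a gold class end up in the same cluster.
    completeness = 1.0 if h_pred == 0 else 1.0 - _conditional_entropy(predicted, gold) / h_pred
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = (1 + beta) * homogeneity * completeness / (beta * homogeneity + completeness)
    return homogeneity, completeness, v
```

As a sanity check, putting every mention into its own cluster, e.g. `v_measure([0, 0, 1, 1], [0, 1, 2, 3])`, keeps homogeneity at 1 but halves completeness, which is the behavior one expects from a baseline that under-merges.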

Evaluation results
Approach                            Precision  Recall  F1-score
B1: Random guessing                 0.12       0.15    0.12
B2: CoreNLP coreference resolution  0.97       0.17    0.27
B3: Candidate clustering            0.87       0.32    0.42
M: Multi-step merging approach      0.91       0.82    0.84

WCL complexity evaluation
➢ WCL complexity is a metric representing the phrasing diversity of the anaphora that refer to a concept. High complexity = high phrasing variation
• Concept type split
Concept type  WCL   F1
Actor         2.10  0.97
Country       4.49  0.74
Misc          5.67  0.82
Group         9.20  0.78
• Topic split
Topic  WCL    F1
8      2.84   0.91
7      2.89   0.89
5      3.31   0.83
4      3.54   0.87
1      3.63   0.85
3      3.95   0.87
0      3.99   0.81
9      4.63   0.88
2      5.44   0.78
6      8.37   0.82
10     12.71  0.81
[Figures: F1 plotted against the WCL metric, once per concept type and once per topic]
• Logarithmic trend: concepts with high WCL diversity are harder to identify.
• The most phrase-diverse topics 6 and 10 perform comparably to the average performance (F1 = 0.84)
Merging steps evaluation: concept types
Steps Actor Country Misc Group
B1 0.123 0.124 0.107 0.112
B2 0.407 0.297 0.198 0.137
B3 0.450 0.428 0.468 0.289
Init. 0.408 0.298 0.204 0.140
Step 1 0.872 0.634 0.298 0.222
Step 2 0.927 0.685 0.779 0.502
Step 3 0.927 0.685 0.803 0.744
Step 4 0.970 0.700 0.803 0.744
Step 5 0.970 0.736 0.808 0.783
Step 6 0.970 0.736 0.817 0.783
Merging step types Actor Country Misc Group
Core meaning (Steps 1 & 2) 0.519 0.388 0.575 0.362
Core modifiers (Steps 3 & 4) 0.043 0.014 0.024 0.242
Word patterns (Steps 5 & 6) 0.000 0.037 0.014 0.039
Overall 0.562 0.439 0.613 0.643
• Development of F1-score at each step
o Gradual increase at all merging steps
o Init. step: extracted from CoreNLP coreferential chains and NPs
o Step 1 outperforms B3 on NE-based types
o Step 2 outperforms B3 on non-NE-based types
o Highest F1: F1_Actor = 0.97
o Lowest F1: “Country” and “Group” types
• Difference of F1-score
o Lowest F1 boost: “Country” type → lack of semantic similarity
o Highest F1 boost: “Group” type → many semantic patterns captured
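The F1 differences per merging step type can be recomputed from the step-wise F1 table. The sketch below copies the rounded values for the Init. step and Steps 2, 4, and 6 from the slide; the dictionary layout and the function name `f1_boosts` are assumptions for illustration. Because the inputs are already rounded, some results differ from the slide in the last digit (e.g., 0.387 instead of 0.388 for “Country”).

```python
# F1 per concept type after selected steps (rounded values from the slide).
F1_BY_STEP = {
    "Init.":  {"Actor": 0.408, "Country": 0.298, "Misc": 0.204, "Group": 0.140},
    "Step 2": {"Actor": 0.927, "Country": 0.685, "Misc": 0.779, "Group": 0.502},
    "Step 4": {"Actor": 0.970, "Country": 0.700, "Misc": 0.803, "Group": 0.744},
    "Step 6": {"Actor": 0.970, "Country": 0.736, "Misc": 0.817, "Group": 0.783},
}

# Each merging step type spans two steps: (label, state before, state after).
STEP_TYPES = [
    ("Core meaning (Steps 1 & 2)", "Init.", "Step 2"),
    ("Core modifiers (Steps 3 & 4)", "Step 2", "Step 4"),
    ("Word patterns (Steps 5 & 6)", "Step 4", "Step 6"),
    ("Overall", "Init.", "Step 6"),
]


def f1_boosts(f1_by_step=F1_BY_STEP, step_types=STEP_TYPES):
    """F1 gain contributed by each merging step type, per concept type."""
    return {
        label: {
            concept: round(f1_by_step[after][concept] - f1_by_step[before][concept], 3)
            for concept in f1_by_step[before]
        }
        for label, before, after in step_types
    }


if __name__ == "__main__":
    for label, gains in f1_boosts().items():
        print(label, gains)
```

Read this way, the “Group” type gains most overall, and most of its late gain comes from the core meaning modifier steps, in line with the observations listed above.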

Big vs. small topic comparison
➢ Does the approach perform better on small or big topics?
• Big topic: 25 articles per topic
• Small topic: three subsets of topics of 5 articles each
• We report average performance
• big: F1 = 0.81 small: F1 = 0.72
• Big topic outperforms on “Misc” and “Group” types
• Reason: semantically similar repetitive word choice occurs often enough in a big topic
DACA: F1
Concept type  R5_avg_F1  All25_F1
Actor         0.96       0.96
Misc          0.67       0.88
Group         0.63       0.75
Country       0.66       0.59
DACA: WCL metric (bar values in chart order)
R5_avg_WCL: 1.59, 6.68, 9.37, 12.65
All25_WCL: 1.79, 11.47, 19.34, 23.89

Discussion summary
• MSMA: F1 = 0.84 baseline B3: F1 = 0.42
• Best performance on “Actor” type: F1 = 0.97
• Largest phrasing diversity: “Group” type
• Largest performance boost on “Group” type: ∆ = 0.643
• Better performance on the larger topics: big: F1 = 0.81 small: F1 = 0.72
• Worst performance on “Group” and “Country” types:
“Group” type:
o Requires additional merging step(s)
o Concept sense disambiguation
“Country” type:
o Low word semantic representation by the chosen word vector model
o Broadly defined CA concepts: mix of country names and organizations
[Figures: F1-score of B1, B2, B3, and M (0.12, 0.27, 0.42, 0.84); F1-score per concept type after the Init. step and after all six steps]

Future work
• Additional merging step using local context
o e.g., “Kim Jong Un” = “Little Rocket Man”
• Concept sense disambiguation
o e.g., “American people” ≠ “foreign people”
• Different word vector models
o find better semantic representation of phrases
• More complex concepts
o Identify concepts such as an action or a reaction to something
• Next step: Deductive analysis
o collect large corpus of “silver”-quality annotated topics
o train a sequential neural network (SNN) model
o identify framing by WCL in any news topic

Conclusion
Contributions:
1. Proposed methodology of WCL analysis pipeline
2. Implemented WCL analysis system
3. Proposed, implemented and evaluated multi-step merging approach
MSMA: F1 = 0.84 baseline B3: F1 = 0.42
Approach benefits:
• resolves anaphora of broad sense
• uses only candidate phrases without their context
• no additional long model training required
• tested on a specific dataset for WCL analysis
4. Implemented the first usability prototype
Future work:
• Concept sense disambiguation
• SNN model for WCL deductive analysis

References
1. D. Kahneman and A. Tversky, “Choices, values, and frames,” American Psychologist, vol. 39, pp. 341–350, 1984.
2. F. Hamborg, K. Donnay, and B. Gipp, “Automated identification of media bias in news articles: an interdisciplinary literature review,” International Journal on Digital Libraries, 2018.
3. M. Linstrom and W. Marais, “Qualitative News Frame Analysis: A Methodology,” Communitas, vol. 17, no. 17, pp. 21–38, 2012.
4. D. Chong and J. N. Druckman, “Framing Theory,” Annual Review of Political Science, vol. 10, no. 1, pp. 103–126, 2007.
5. A. Duzett, “Media Bias in Strategic Word Choice,” http://www.aim.org/on-targetblog/media-bias-in-strategic-word-choice/, 2011.
6. J. N. Druckman, “Political Preference Formation: Competition and the (Ir)relevance of Framing Effects,” The American Political Science Review, vol. 98, no. 4, pp. 671–686, 2004.
7. M. Linstrom and W. Marais, “Qualitative News Frame Analysis: A Methodology,” Communitas, vol. 17, pp. 21–38, 2012.
8. F. Hamborg, A. Zhukova, and B. Gipp, “Illegal Aliens or Undocumented Immigrants? Towards the Automated Identification of Bias by Word Choice and Labeling,” in Proceedings of the iConference 2019, 2019.
9. M. Schreier, Qualitative content analysis in practice. Sage publications, 2012.
10. R. M. Entman, “Framing: Toward Clarification of a Fractured Paradigm,” Journal of Communication, vol. 43, no. 4, pp. 51–58, 1993.
11. R. M. Entman, “Framing bias: Media in the distribution of power,” Journal of Communication, vol. 57, no. 1, pp. 163–173, 2007.
12. Y. Tian and C. M. Stewart, “Framing the SARS crisis: A computer-assisted text analysis of CNN and BBC online news reports of SARS,” Asian Journal of Communication, vol. 15, no. 3, pp. 289–301, 2005.
13. M. G. Sendén, S. Sikström, and T. Lindholm, ““She” and “He” in news media messages: pronoun use reflects gender biases in semantic contexts,” Sex Roles, vol. 72, no. 1-2, pp. 40–49, 2015.
14. B. Fortuna, C. Galleguillos, and N. Cristianini, “Detection of bias in media outlets with statistical learning methods,” in Text Mining, Chapman and Hall/CRC, pp. 57–80, 2009.
15. M. Recasens, C. Danescu-Niculescu-Mizil, and D. Jurafsky, “Linguistic models for analyzing and detecting biased language,” Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, 2013.

16. Z. Papacharissi and M. de Fatima Oliveira, “News frames terrorism: A comparative analysis of frames employed in terrorism coverage in U.S. and U.K. newspapers,” International Journal of Press/Politics, vol. 13, no. 1, pp. 52–74, 2008.
17. D. M. Garyantes and P. J. Murphy, “Success or chaos?: Framing and ideology in news coverage of the Iraqi national elections,” International Communication Gazette, vol. 72, no. 2, pp. 151–170, 2010.
18. D. Card, J. H. Gross, A. E. Boydstun, and N. A. Smith, “Analyzing Framing through the Casts of Characters in the News,” Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP-16), pp. 1410–1420, 2016.
19. K. Clark and C. D. Manning, “Deep Reinforcement Learning for Mention-Ranking Coreference Models,” Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 2256–2262, 2016.
20. H. Lee, “A Scaffolding Approach to Coreference Resolution Integrating Statistical and Rule-based Models,” Natural Language Engineering, vol. 23, no. 5, pp. 733–762, 2017.
21. J. R. Finkel, T. Grenager, and C. Manning, “Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling,” Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), pp. 363–370, 2005.
22. S. Dutta and G. Weikum, “Cross-Document Co-Reference Resolution using Sample-Based Clustering with Knowledge Enrichment,” Transactions of the Association for Computational Linguistics, vol. 3, pp. 15–28, 2015.
23. S. Singh, A. Subramanya, F. Pereira, and A. McCallum, “Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 793–803, 2011.
24. K. Clark and C. D. Manning, “Improving Coreference Resolution by Learning Entity-Level Distributed Representations,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 643–653, 2016.
25. F. Hamborg, A. Zhukova, and B. Gipp, “Automated Identification of Media Bias by Word Choice and Labeling in News Articles,” Manuscript submitted for publication, pp. 1–10,
26. A. Rosenberg and J. Hirschberg, “V-measure: A conditional entropy-based external cluster evaluation measure,” Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007.
27. N. Chinchor and B. Sundheim, “MUC-5 Evaluation Metrics,” pp. 69–78, 1992.

Thank you for your attention!
Questions?

Back-up slides

Entity types
Entity type  Entity subtype             Source         Example      CA concept type
person       nn (noun single)           WordNet + POS  immigrant    Actor
person       nns (noun plural)          WordNet + POS  politicians  Group
person       ne (named entity)          NER            Trump        Actor
person       nes (named entity plural)  NER + POS      Democrats    Group
group        --                         WordNet        university   Group
group        ne                         NER            Congress     Country/Group
country      --                         WordNet        Homeland     Country
country      ne                         NER            Germany      Country
other        --                         --             vote         Misc
Idea:
• Words can be similar in the vector space, but the results will be irrelevant to CA concepts
• Identify entity types for effective results
• Entity types resemble concept types from manual CA

Step 1: Representative phrases’ heads
Entity 1: Donald Trump; Trump; Mr. Trump; forceful Mr. Trump
Entity 2: President Trump; Donald Trump; the president; The president of the US
Representative phrases: Trump (Entity 1), Donald Trump (Entity 2)
Heads of phrases: Trump (Entity 1), Trump (Entity 2) → identical by string comparison
Merged entities: Donald Trump; Trump; Mr. Trump; forceful Mr. Trump; President Trump; Donald Trump; the president; The president of the US

Step 2: Headsets
Entity 1: young illegals; the illegals; illegals who arrived as children; DACA illegals
Entity 2: roughly 800,000 young undocumented immigrants; young immigrants; illegal immigrants; undocumented immigrants
Entity 3: illegal aliens who were brought as children; nearly 800,000 illegal aliens; illegal aliens; young illegal aliens
Headsets: {illegals} (Entity 1), {immigrants} (Entity 2), {aliens} (Entity 3)
{illegals} and {immigrants} are similar in the vector space → merge Entity 1 and Entity 2
{aliens}: the word alone is related to UFOs; Entity 3 will be merged later as “illegal alien” at the third step
Merged entities: young illegals; the illegals; illegals who arrived as children; DACA illegals; roughly 800,000 young undocumented immigrants; young immigrants; illegal immigrants
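A headset comparison of this kind can be sketched with cosine similarity between averaged word vectors. Everything specific in the sketch is an assumption made for illustration: the three-dimensional vectors are made-up toy values (a real system would load a pretrained word vector model), averaging the head vectors is one possible way to represent a headset, and the threshold of 0.7 is arbitrary.

```python
import math
from typing import Dict, Iterable, List

# Made-up toy vectors, NOT taken from any real word vector model.
TOY_VECTORS: Dict[str, List[float]] = {
    "illegals": [0.9, 0.1, 0.0],
    "immigrants": [0.8, 0.3, 0.1],
    "aliens": [0.1, 0.2, 0.9],
}


def mean_vector(words: Iterable[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Represent a headset by the average of its head word vectors (assumption)."""
    words = list(words)
    dims = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def headsets_similar(a: Iterable[str], b: Iterable[str],
                     vectors: Dict[str, List[float]] = TOY_VECTORS,
                     threshold: float = 0.7) -> bool:
    """True if the two headsets are close in the vector space (hypothetical threshold)."""
    return cosine(mean_vector(a, vectors), mean_vector(b, vectors)) >= threshold
```

With these toy values, `headsets_similar({"illegals"}, {"immigrants"})` holds while `headsets_similar({"illegals"}, {"aliens"})` does not, mirroring the slide, where the bare head “aliens” is too far away and is only merged at a later step.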

Step 3: Representative labeling phrases
Entity 1: young illegals; the illegals; DACA illegals; young immigrants; illegal immigrants; endangered immigrants; additional illegals
Entity 2: this group of young people; nearly 800,000 people; a people; people who are American in every way except through birth; foreign people; bad people; people affected by the move; the estimated 800,000 people; these people; young people
Labeling phrases
Entity 1: young immigrants, undocumented immigrants, illegal immigrants, young illegals, endangered immigrants, additional illegals
Entity 2: young people, foreign people, bad people, estimated people
Representative labeling phrases
Entity 1: A1: young immigrants, A2: illegal immigrants, A3: young illegals
Entity 2: B1: young people, B2: foreign people
Sim. matrix (A1–A3 × B1–B2): three entries are 1, three entries are 0
3 / (2 × 3) ≥ 0.3 → similar in the vector space
Merged entities: Entity 1 and Entity 2
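The merge decision on this slide can be read as: count the similar pairs in the binary similarity matrix, divide by the matrix size, and merge if the share reaches 0.3. The sketch below encodes that reading. The function names are hypothetical, and the positions of the 1-entries in the example matrix are an assumption, since only their count (three of six) is recoverable from the slide.

```python
from typing import List, Sequence


def similar_share(sim_matrix: Sequence[Sequence[int]]) -> float:
    """Share of phrase pairs marked as similar (1) in a binary similarity matrix."""
    rows, cols = len(sim_matrix), len(sim_matrix[0])
    return sum(sum(row) for row in sim_matrix) / (rows * cols)


def entities_similar(sim_matrix: Sequence[Sequence[int]], threshold: float = 0.3) -> bool:
    """Merge decision: enough representative phrases are pairwise similar."""
    return similar_share(sim_matrix) >= threshold


# Rows A1..A3 (Entity 1), columns B1..B2 (Entity 2).
# Assumed layout: only the number of 1-entries is known from the slide.
STEP3_MATRIX: List[List[int]] = [
    [1, 0],  # A1: young immigrants
    [0, 0],  # A2: illegal immigrants
    [1, 1],  # A3: young illegals
]
```

For `STEP3_MATRIX` the share is 3 / (2 × 3) = 0.5, which is above 0.3, so the two entities are merged. The slides for Steps 4b and 5 show ratios of the same form, 2 / (2 × 2) and 5 / (4 × 3).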

Step 4a: Headword-compound match
Entity 1: PM Theresa May; Mrs. May; UK Prime Minister Theresa May
Entity 2: Prime Minister; The British prime minister
Compounds of Entity 1 (dependent + governor): {Minister May, PM May, Mrs. May, Theresa May}
Heads of phrases of Entity 2: {minister, Minister}
Compound dependent “Minister” and phrase head “Minister” are identical by string comparison
Merged entities: PM Theresa May; Mrs. May; UK Prime Minister Theresa May; Prime Minister; The British prime minister

Step 4b: Common compounds
Entity 1: DACA recipients; the program’s beneficiaries; DACA beneficiaries; 800,000 recipients
Entity 2: DACA participants; 800,000 participants; more than a quarter of DACA registrants; program participants
Compounds
Entity 1: DACA recipients, program’s beneficiaries, DACA beneficiaries
Entity 2: DACA participants, DACA registrants
Overlapping NE compounds: {DACA}
Compounds with overlapping words
Entity 1: A1: DACA recipients, A2: DACA beneficiaries
Entity 2: B1: DACA participants, B2: DACA registrants
Sim. matrix (A1–A2 × B1–B2): two entries are 1, two entries are 0
2 / (2 × 2)
Merged entities: DACA recipients; the program’s beneficiaries; DACA beneficiaries; 800,000 recipients; DACA participants; 800,000 participants; more than a quarter of DACA registrants

Step 5: Representative frequent wordsets
Entity 1: illegals whose DACA protection is pending; DACA illegals; young illegals; illegal alien applicants; DACA applicants
Entity 2: more than 2,000 DACA recipients; DACA beneficiaries; DACA recipients whose status expires on March 5; former DACA participants; the participants
Frequent wordsets
Entity 1: A1: {DACA, illegals}, A2: {illegals}, A3: {applicants}, A4: {DACA}
Entity 2: B1: {DACA, recipients}, B2: {DACA}, B3: {participants}
Sim. matrix (A1–A4 × B1–B3): five entries are 1, seven entries are 0
5 / (4 × 3)
Merged entities: Entity 1 and Entity 2

Step 6: Representative frequent phrases
Phrases of Entity 1 and Entity 2 (with frequencies): DACA program (x10); DACA (x10); Deferred Action Childhood Arrivals program (x5); Obama-era program (x5); Childhood Arrivals DACA (x4); Deferred Action for Childhood Arrivals program (x3); Deferred Action (x5); Deferred Action for Childhood Arrivals (x2)
Frequent phrases
A1: DACA, A2: program, A3: DACA program, A4: Childhood Arrivals program, A5: Obama-era program
B1: Deferred Action Childhood Arrivals, B2: Deferred Action, B3: Childhood Arrivals, B4: Deferred Action Childhood Arrivals program, B5: Childhood Arrivals DACA
Sim. matrix (A1–A5 × B1–B5): four entries are 1, all other entries are 0
sim_val = sim_hor = 4/5
Merged entities: Entity 1 and Entity 2

WCL complexity metric
$$WCL = \sum_{h \in H} \frac{|S_h|}{|L_h|}$$
• H is the set of phrases’ heads in a code,
• S_h is the set of unique phrases with the phrase’s head h,
• L_h is the list of non-unique phrases with the phrase’s head h.
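The metric can be computed directly from the mentions annotated with one code. The sketch below is a minimal reading of the definition above, assuming that the sizes of S_h and L_h are what enter the ratio and that each mention already comes with its head word (in practice the head would be taken from a dependency parse). The function name `wcl_complexity` is invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def wcl_complexity(mentions: Iterable[Tuple[str, str]]) -> float:
    """WCL complexity of one code.

    mentions: (phrase, head) pairs for all mentions annotated with the code;
    the head is assumed to be given.
    """
    phrases_by_head: Dict[str, List[str]] = defaultdict(list)
    for phrase, head in mentions:
        phrases_by_head[head].append(phrase)  # L_h: all phrases with head h
    # For each head h: number of unique phrases |S_h| over number of all phrases |L_h|.
    return sum(len(set(phrases)) / len(phrases) for phrases in phrases_by_head.values())


if __name__ == "__main__":
    example = [
        ("Donald Trump", "Trump"),
        ("Donald Trump", "Trump"),
        ("Mr. Trump", "Trump"),
        ("the president", "president"),
    ]
    # Head "Trump": 2 unique of 3 mentions; head "president": 1 of 1.
    print(wcl_complexity(example))
```

Each additional head adds up to 1 to the sum, so codes referred to with many different heads and little repetition receive high values. The sketch compares phrases as exact strings; whether the original normalizes case or determiners is not stated on the slide.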
