Social media enables users to spread information and opinions, including during crisis events such as riots, protests, or uprisings. Sensitive event-related content can lead to repercussions in the real world, so it is crucial for first responders, such as law enforcement agencies, to have ready access to such content and the ability to monitor its propagation. Obstacles to easy access include the lack of automatic moderation tools targeted at first responders. Efforts are further complicated by the multimodal nature of the content, which may be textual, pictorial, or both. In this work, as a means of providing intelligence to first responders, we investigate automatic moderation of sensitive event-related content across the two modalities by exploiting recent advances in Deep Neural Networks (DNNs). We combine image classification with Convolutional Neural Networks (CNNs) and text classification with Recurrent Neural Networks (RNNs); our multilevel content classifier is obtained by fusing the image classifier and the text classifier. We use feature engineering for preprocessing but bypass it during classification, relying on the DNNs, while achieving coverage by leveraging community guidelines. Our approach maintains a low false positive rate and high precision by learning first from a weakly labeled dataset and then from an expert-annotated dataset. We evaluate our system both quantitatively and qualitatively to gain a deeper understanding of its functioning. Finally, we benchmark our technique against current approaches to combating sensitive content and find that our system outperforms them by 16% in accuracy.
4. Why should we care about sensitive content?
- Event- or crisis-related sensitive content can cause offline ramifications
- It can have large-scale social and economic impact
5. Who does it affect?
- Community moderators are strongly affected by exposure to such content
6. Why multimodal?
● Most tweets contain multimedia content such as images, videos, etc.
● Current text-based models fail when the main content is in the tweet's image
● With a multimodal approach we can jointly model the different content sources of a tweet
7. Roadmap
- Why should we care about sensitive content?
- Previous Work
- What is sensitive content?
- Data Collection
- Methodology
- Results
- Takeaways
8. Previous Work and Research Gaps
Content Moderation
- Detecting personal attacks using Logistic Regression and large-scale annotations, by Wulczyn et al. [1] (forms our baseline)
- Detecting hate speech in Yahoo comments using advanced NLP techniques, by Nobata et al. [2]
9. Previous Work and Research Gaps
Multimodal detection
- Multimodal detection of pro-anorexia content using CNNs, by Chancellor et al. [3]
10. Previous Work and Research Gaps
(Venn diagram: our work sits at the intersection of content moderation and multimodal detection)
12. Sensitivity Rulebook
Hate Speech
Content that shows citizens disrespect "on grounds of religion, race, place of birth, residence, language, caste or community or any other ground whatsoever".
Violent/Gory
Violent or gory content that is primarily intended to be shocking, sensational, or disrespectful.
Political Criticism
Content that brings or attempts to bring into hatred or contempt, or excites or attempts to excite disaffection towards the Government.
Situational Information
Event-based content that is informative; content whose curation or production contributes to situational awareness; contextual information that helps better understand the situation.
Mobilisation
Content that seeks to organize a movement or protest, or content that reports such an event.
13. Text Sensitivity Dataset
● Level 1 Dataset:
○ Tweets collected from sensitive and non-sensitive hashtags.

Sensitive Hashtag       # Tweets
AsaramBapuji            190696
Freekashmir             74237
3rdhinduadhiveshan      38823
Owaisi                  33098
lovejihad               24297

Non-Sensitive Hashtag   # Tweets
Nifty                   202894
IndvsSA                 136096
MondayMotivation        110178
IPLfinal                103083
MWC16                   92309
14. Text Sensitivity Dataset
● Level 2 Dataset:
○ Tweets from sensitive hashtags, annotated manually using the codebook (a tweet matching one or more sensitive categories is marked as sensitive).

Hashtag            # Sensitive Tweets   # Non-Sensitive Tweets
CauveryProtest     2129                 796
JaichandKejriwal   768                  270
DhakaEid           1280                 64
TamilNaduBandh     334                  85
Kashmir            358                  110
Jallikattu         1329                 363
18. Detecting Sensitivity in Text
● We use a Recurrent Neural Network to classify text as sensitive or non-sensitive
● We learn randomly initialized word embeddings jointly with the RNN classifier
● The hidden state of the last time step is passed to a fully connected layer with softmax to predict the probability of sensitivity
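The steps above can be sketched minimally as follows. This uses a plain numpy RNN cell in place of the GRU/LSTM cells actually used, and all sizes, weights, and token ids are illustrative stand-ins, not the deck's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not the deck's hyperparameters).
vocab_size, embed_dim, hidden_dim, n_classes = 1000, 16, 32, 2

# Randomly initialized word embeddings, learned jointly with the classifier.
E = rng.normal(0, 0.1, (vocab_size, embed_dim))
# Plain RNN cell parameters (the deck uses GRU/LSTM cells).
W_xh = rng.normal(0, 0.1, (embed_dim, hidden_dim))
W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
# Fully connected softmax layer applied to the last hidden state.
W_out = rng.normal(0, 0.1, (hidden_dim, n_classes))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sensitivity_probs(token_ids):
    """Run the RNN over a tweet's tokens; return [P(non-sensitive), P(sensitive)]."""
    h = np.zeros(hidden_dim)
    for t in token_ids:                      # one time step per token
        h = np.tanh(E[t] @ W_xh + h @ W_hh)  # recurrent update
    return softmax(h @ W_out)                # softmax on the final hidden state

probs = sensitivity_probs([12, 407, 33, 901])  # arbitrary token ids
```

In training, the gradient flows back through the softmax layer, the recurrent steps, and into the embedding table, which is how the embeddings are learned along with the classifier.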
19. Detecting Sensitivity in Images
● We use a two-stream Convolutional Neural Network to classify sensitive images
● The object recognition stream is pre-trained on the ImageNet dataset
● The scene recognition stream is pre-trained on the MIT Places dataset
20. Multimodal Sensitivity Detection
● We combine the text model and the image model, which enables the features to be learned jointly
● We concatenate the intermediate outputs of the image model and the text model
● A final fully connected layer with softmax predicts the probability of sensitivity
● Combining the two models improves the results
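The fusion step can be sketched as below. The two feature vectors stand in for the models' intermediate outputs (the text GRU's last hidden state and the image CNN's penultimate-layer features), and all dimensions and weights are illustrative random values, not the trained model's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the intermediate outputs of the two unimodal models.
text_hidden = rng.normal(size=64)       # last hidden state of the text GRU
image_features = rng.normal(size=256)   # penultimate features of the image CNN

# Fuse by concatenation, then apply a fully connected layer with softmax.
fused = np.concatenate([text_hidden, image_features])
W = rng.normal(0, 0.05, (fused.size, 2))   # untrained random weights
logits = fused @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()
p_sensitive = probs[1]                     # probability the tweet is sensitive
```

Because the fused layer is trained end-to-end, the gradient reaches both unimodal branches, so the text and image features are learned jointly rather than calibrated separately.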
22. Multilevel Sensitivity Classification
● Due to the skew of the data, we get many false positives
● To address this, we first train a model to filter out tweets that are definitely not sensitive
● The level 1 model is trained on a large, weakly annotated dataset
● After filtering, a level 2 classifier gives the final sensitivity score
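The two-level cascade amounts to a few lines of control flow. The keyword-based stand-in models and the 0.5 threshold below are purely illustrative, not the deck's trained classifiers:

```python
def classify(tweet, level1, level2, threshold=0.5):
    """Two-level cascade: the level 1 model (trained on weakly annotated data)
    filters out tweets that are definitely not sensitive; the level 2 model
    (trained on expert annotations) scores the rest."""
    if level1(tweet) < threshold:   # threshold value is illustrative
        return 0.0                  # filtered out at level 1
    return level2(tweet)            # final sensitivity score from level 2

# Keyword-based stand-ins for the trained classifiers.
level1 = lambda t: 0.9 if "protest" in t else 0.1
level2 = lambda t: 0.8 if "riot" in t else 0.3

print(classify("the weather is nice today", level1, level2))   # 0.0
print(classify("protest turns into a riot", level1, level2))   # 0.8
```

The design choice here is precision: the cheap level 1 model absorbs the class skew, so the expert-trained level 2 model only ever scores tweets that plausibly matter.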
23. Quantitative Results
Method                  F1 Score   Accuracy
VGG16 Finetuning        0.5350     0.5500
VGG16 Features + SVM    0.8065     0.8069
Object Model            0.8343     0.8438
Object + Scene Model    0.8547     0.8550
● Results on the Image-Only Dataset
24. Quantitative Results
Method                                     F1 Score   Accuracy
SVM Baseline                               0.682      0.701
2-layer word LSTM (level 1 text model)     0.7372     0.7385
Character-level GRU (level 2 text model)   0.7180     0.7619
Word-level GRU (level 2 text model)        0.7760     0.7816
Image + Text Model                         0.8013     0.8051
● Results on the Tweets Dataset
25. Hyperparameters of the Best Performing Model (Text + Image)
We obtained the optimal hyperparameters via grid search with cross-validation.

Hyperparameter                     Value
Number of tokens                   30
Dimension of the word embeddings   150
Number of GRU units                512
Image size                         224 x 224
Learning rate                      0.01
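The grid search itself can be sketched as below. The candidate values and the dummy cross-validation scorer are illustrative; the scorer is rigged to peak at the table's reported optimum purely to show the mechanics:

```python
from itertools import product

# Hypothetical search grid (values chosen for illustration).
grid = {
    "embed_dim": [100, 150, 200],
    "gru_units": [256, 512],
    "learning_rate": [0.1, 0.01, 0.001],
}

def cross_val_score(params):
    """Stand-in for k-fold cross-validation of the text+image model.
    Returns a dummy score that peaks at the reported optimum."""
    target = {"embed_dim": 150, "gru_units": 512, "learning_rate": 0.01}
    return sum(params[k] == v for k, v in target.items()) / 3

# Evaluate every combination and keep the best-scoring one.
best = max(
    (dict(zip(grid, vals)) for vals in product(*grid.values())),
    key=cross_val_score,
)
print(best)  # {'embed_dim': 150, 'gru_units': 512, 'learning_rate': 0.01}
```

In practice each combination trains the full model under k-fold cross-validation, so the grid is kept small and coarse.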
26. Qualitative Results: Visualizing the text model
● We use gradient-based class activation mapping to find the words contributing to the sensitivity score
● We see that words like "boycott" and "fighters" contribute to the sensitivity score
Example tweets:
"Two suspected Bangladeshi terrorists arrested with fake aadhaar card along with an arms dealer in Kolkata"
"Entire nation should boycott this movie. We r never allow to someone destroy our history. We will fight & we will win."
"Indian commando, three fighters killed in Kashmir"
27. Visualizing the image model
● We use class activation mapping to visualize the areas of
the image contributing to the sensitivity
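A minimal sketch of class activation mapping: the heatmap is a weighted sum of the final convolutional feature maps, using the classifier weights of the sensitive class. The shapes and values below are random stand-ins, not the deck's model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: final conv layer activations and the fully connected
# weights of the "sensitive" class (shapes are illustrative).
feature_maps = rng.random((7, 7, 512))   # H x W x channels
w_sensitive = rng.normal(size=512)       # one weight per channel

cam = feature_maps @ w_sensitive         # (7, 7) weighted sum of channels
cam = np.maximum(cam, 0)                 # keep only positive evidence (ReLU)
cam /= cam.max() + 1e-8                  # normalize to [0, 1] for overlay
```

The normalized heatmap is then upsampled to the image size and overlaid on the input, highlighting the regions driving the sensitivity prediction.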
28. Qualitative analysis: Human Moderator Study
● We randomly sample 100 tweets our classifier labels non-sensitive and 100 it labels sensitive
● Two annotators review the scores given by our system and find 75% to be correctly labeled
● There is only one false negative, implying that our system has a very low miss rate

                    Labeled Positive   Labeled Negative
Actually Positive   99                 1
Actually Negative   33                 67
29. Conclusion
● We collect a large weakly labeled corpus and a smaller dataset annotated by first responders
● We build a multimodal classifier for detecting sensitive content on social media
● We show the superiority of our model by improving performance over other state-of-the-art models
● We inspect the model to see what it is learning
● Future work: extend to videos and GIFs, and include other kinds of sensitive content
30. References
1. Wulczyn, Ellery, Nithum Thain, and Lucas Dixon. "Ex machina:
Personal attacks seen at scale." Proceedings of the 26th
International Conference on World Wide Web. International World
Wide Web Conferences Steering Committee, 2017.
2. Nobata, Chikashi, et al. "Abusive language detection in online
user content." Proceedings of the 25th international conference on
world wide web. International World Wide Web Conferences
Steering Committee, 2016.
3. Chancellor, Stevie, et al. "Multimodal Classification of
Moderated Online Pro-Eating Disorder Content." Proceedings of
the 2017 CHI Conference on Human Factors in Computing Systems.
ACM, 2017.