Social media enables users to spread information and opinions, including during crisis events such as riots, protests, or uprisings. Sensitive event-related content can lead to repercussions in the real world, so it is crucial for first responders, such as law enforcement agencies, to have ready access to such content and the ability to monitor its propagation. Obstacles to easy access include the lack of automatic moderation tools targeted at first responders. Efforts are further complicated by the multimodal nature of the content, which may be textual, pictorial, or both. In this work, as a means of providing intelligence to first responders, we investigate automatic moderation of sensitive event-related content across the two modalities by exploiting recent advances in Deep Neural Networks (DNNs). We combine image classification with Convolutional Neural Networks (CNNs) and text classification with Recurrent Neural Networks (RNNs); our multilevel content classifier is obtained by fusing the image classifier and the text classifier. We use feature engineering for preprocessing but bypass it during classification, relying on the DNNs, while achieving coverage by leveraging community guidelines. Our approach maintains a low false positive rate and high precision by learning first from a weakly labeled dataset and then from an expert-annotated dataset. We evaluate our system both quantitatively and qualitatively to gain a deeper understanding of its functioning. Finally, we benchmark our technique against current approaches to combating sensitive content and find that our system outperforms them by 16% in accuracy.
4. Why should we care about sensitive content?
- Event- or crisis-related sensitive content can cause offline ramifications
- It can have large-scale social and economic impact
5. Who does it affect?
- Community moderators are strongly affected by exposure to such content
6. Why multimodal?
● Most tweets contain multimedia content such as images, videos, etc.
● Current text-based models fail when the main content is in the tweet's image
● With a multimodal approach we can jointly model the different content sources of a tweet
7. Roadmap
- Why should we care about sensitive content?
- Previous Work
- What is sensitive content?
- Data Collection
- Methodology
- Results
- Takeaways
8. Previous Work and Research Gaps
Content Moderation
- Detecting personal attacks using Logistic Regression and large-scale annotations, by Wulczyn et al. [1] (forms our baseline)
- Detecting hate speech in Yahoo comments using advanced NLP techniques, by Nobata et al. [2]
9. Previous Work and Research Gaps
Multimodal detection
- Multimodal detection of pro-anorexia content using CNNs, by Chancellor et al. [3]
10. Previous Work and Research Gaps
(Venn diagram: our work sits at the intersection of content moderation and multimodal detection)
12. Sensitivity Rulebook
Hate Speech
Content that shows citizens disrespect "on grounds of religion, race, place of birth, residence, language, caste or community or any other ground whatsoever".
Violent/Gory
Violent or gory content that is primarily intended to be shocking, sensational, or disrespectful.
Political Criticism
Content that brings or attempts to bring into hatred or contempt, or excites or attempts to excite disaffection towards the Government.
Situational Information
Event-based content that is informative; content whose curation or production contributes to situational awareness; contextual information that helps better understand the situation.
Mobilisation
Content that seeks to organize a movement or protest, or content that reports such an event.
13. Text Sensitivity Dataset
● Level 1 Dataset:
○ Tweets collected from sensitive and non-sensitive hashtags.

Sensitive Hashtag       # Tweets
AsaramBapuji            190696
Freekashmir             74237
3rdhinduadhiveshan      38823
Owaisi                  33098
lovejihad               24297

Non-Sensitive Hashtag   # Tweets
Nifty                   202894
IndvsSA                 136096
MondayMotivation        110178
IPLfinal                103083
MWC16                   92309
14. Text Sensitivity Dataset
● Level 2 Dataset:
○ Tweets from sensitive hashtags, annotated manually using the codebook (a tweet matching one or more sensitive categories is marked as sensitive).

Hashtag            # Sensitive Tweets   # Non-Sensitive Tweets
CauveryProtest     2129                 796
JaichandKejriwal   768                  270
DhakaEid           1280                 64
TamilNaduBandh     334                  85
Kashmir            358                  110
Jallikattu         1329                 363
18. Detecting Sensitivity in Text
● We use a Recurrent Neural Network to classify text as sensitive or non-sensitive
● We learn randomly initialized word embeddings jointly with the RNN classifier
● The hidden state of the last time step is passed to a fully connected layer with softmax to predict the probability of sensitivity
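The steps above can be sketched minimally as follows. This uses a plain numpy RNN cell in place of the GRU/LSTM cells actually used, and all sizes, weights, and token ids are illustrative stand-ins, not the deck's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not the deck's hyperparameters).
vocab_size, embed_dim, hidden_dim, n_classes = 1000, 16, 32, 2

# Randomly initialized word embeddings, learned jointly with the classifier.
E = rng.normal(0, 0.1, (vocab_size, embed_dim))
# Plain RNN cell parameters (the deck uses GRU/LSTM cells).
W_xh = rng.normal(0, 0.1, (embed_dim, hidden_dim))
W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
# Fully connected softmax layer applied to the last hidden state.
W_out = rng.normal(0, 0.1, (hidden_dim, n_classes))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sensitivity_probs(token_ids):
    """Run the RNN over a tweet's tokens; return [P(non-sensitive), P(sensitive)]."""
    h = np.zeros(hidden_dim)
    for t in token_ids:                      # one time step per token
        h = np.tanh(E[t] @ W_xh + h @ W_hh)  # recurrent update
    return softmax(h @ W_out)                # softmax on the final hidden state

probs = sensitivity_probs([12, 407, 33, 901])  # arbitrary token ids
```

In training, the gradient flows back through the softmax layer, the recurrent steps, and into the embedding table, which is how the embeddings are learned along with the classifier.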
19. Detecting Sensitivity in Images
● We use a two-stream Convolutional Neural Network to classify sensitive images
● The object recognition stream is pre-trained on the ImageNet dataset
● The scene recognition stream is pre-trained on the MIT Places dataset
20. Multimodal Sensitivity Detection
● We combine the text model and the image model, which enables the features to be learned jointly
● We concatenate the intermediate outputs of the image model and the text model
● A final fully connected layer with softmax predicts the probability of sensitivity
● Combining the two models improves the results
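The fusion step can be sketched as below. The two feature vectors stand in for the models' intermediate outputs (the text GRU's last hidden state and the image CNN's penultimate-layer features), and all dimensions and weights are illustrative random values, not the trained model's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the intermediate outputs of the two unimodal models.
text_hidden = rng.normal(size=64)       # last hidden state of the text GRU
image_features = rng.normal(size=256)   # penultimate features of the image CNN

# Fuse by concatenation, then apply a fully connected layer with softmax.
fused = np.concatenate([text_hidden, image_features])
W = rng.normal(0, 0.05, (fused.size, 2))   # untrained random weights
logits = fused @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()
p_sensitive = probs[1]                     # probability the tweet is sensitive
```

Because the fused layer is trained end-to-end, the gradient reaches both unimodal branches, so the text and image features are learned jointly rather than calibrated separately.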
22. Multilevel Sensitivity Classification
● Due to the skew of the data, we get many false positives
● To address this, we first train a model to filter out tweets that are definitely not sensitive
● The level 1 model is trained on a large, weakly annotated dataset
● After filtering, a level 2 classifier gives the final sensitivity score
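The two-level cascade amounts to a few lines of control flow. The keyword-based stand-in models and the 0.5 threshold below are purely illustrative, not the deck's trained classifiers:

```python
def classify(tweet, level1, level2, threshold=0.5):
    """Two-level cascade: the level 1 model (trained on weakly annotated data)
    filters out tweets that are definitely not sensitive; the level 2 model
    (trained on expert annotations) scores the rest."""
    if level1(tweet) < threshold:   # threshold value is illustrative
        return 0.0                  # filtered out at level 1
    return level2(tweet)            # final sensitivity score from level 2

# Keyword-based stand-ins for the trained classifiers.
level1 = lambda t: 0.9 if "protest" in t else 0.1
level2 = lambda t: 0.8 if "riot" in t else 0.3

print(classify("the weather is nice today", level1, level2))   # 0.0
print(classify("protest turns into a riot", level1, level2))   # 0.8
```

The design choice here is precision: the cheap level 1 model absorbs the class skew, so the expert-trained level 2 model only ever scores tweets that plausibly matter.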
23. Quantitative Results
Method                  F1 Score   Accuracy
VGG16 Finetuning        0.5350     0.5500
VGG16 Features + SVM    0.8065     0.8069
Object Model            0.8343     0.8438
Object + Scene Model    0.8547     0.8550
● Results on the Image-Only Dataset
24. Quantitative Results
Method                                     F1 Score   Accuracy
SVM Baseline                               0.682      0.701
2-layer word LSTM (level 1 text model)     0.7372     0.7385
Character-level GRU (level 2 text model)   0.7180     0.7619
Word-level GRU (level 2 text model)        0.7760     0.7816
Image + Text Model                         0.8013     0.8051
● Results on the Tweets Dataset
25. Hyperparameters of the Best Performing Model (Text + Image)
We obtained the optimal hyperparameters via grid search with cross-validation.

Hyperparameter                     Value
Number of tokens                   30
Dimension of the word embeddings   150
Number of GRU units                512
Image size                         224 x 224
Learning rate                      0.01
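The grid search itself can be sketched as below. The candidate values and the dummy cross-validation scorer are illustrative; the scorer is rigged to peak at the table's reported optimum purely to show the mechanics:

```python
from itertools import product

# Hypothetical search grid (values chosen for illustration).
grid = {
    "embed_dim": [100, 150, 200],
    "gru_units": [256, 512],
    "learning_rate": [0.1, 0.01, 0.001],
}

def cross_val_score(params):
    """Stand-in for k-fold cross-validation of the text+image model.
    Returns a dummy score that peaks at the reported optimum."""
    target = {"embed_dim": 150, "gru_units": 512, "learning_rate": 0.01}
    return sum(params[k] == v for k, v in target.items()) / 3

# Evaluate every combination and keep the best-scoring one.
best = max(
    (dict(zip(grid, vals)) for vals in product(*grid.values())),
    key=cross_val_score,
)
print(best)  # {'embed_dim': 150, 'gru_units': 512, 'learning_rate': 0.01}
```

In practice each combination trains the full model under k-fold cross-validation, so the grid is kept small and coarse.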
26. Qualitative Results: Visualizing the text model
● We use gradient-based class activation mapping to find the words contributing to the sensitivity score
● We see that words like "boycott" and "fighters" contribute to the sensitivity score
Example tweets:
"Two suspected Bangladeshi terrorists arrested with fake aadhaar card along with an arms dealer in Kolkata"
"Entire nation should boycott this movie. We r never allow to someone destroy our history. We will fight & we will win."
"Indian commando, three fighters killed in Kashmir"
27. Visualizing the image model
● We use class activation mapping to visualize the areas of
the image contributing to the sensitivity
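A minimal sketch of class activation mapping: the heatmap is a weighted sum of the final convolutional feature maps, using the classifier weights of the sensitive class. The shapes and values below are random stand-ins, not the deck's model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: final conv layer activations and the fully connected
# weights of the "sensitive" class (shapes are illustrative).
feature_maps = rng.random((7, 7, 512))   # H x W x channels
w_sensitive = rng.normal(size=512)       # one weight per channel

cam = feature_maps @ w_sensitive         # (7, 7) weighted sum of channels
cam = np.maximum(cam, 0)                 # keep only positive evidence (ReLU)
cam /= cam.max() + 1e-8                  # normalize to [0, 1] for overlay
```

The normalized heatmap is then upsampled to the image size and overlaid on the input, highlighting the regions driving the sensitivity prediction.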
28. Qualitative analysis: Human Moderator Study
● We randomly sample 100 tweets our classifier labels non-sensitive and 100 it labels sensitive
● Two annotators review the scores given by our system and find 75% to be correctly labeled
● There is only one false negative, implying that our system has a very low miss rate

                    Labeled Positive   Labeled Negative
Actually Positive   99                 1
Actually Negative   33                 67
29. Conclusion
● We collect a large weakly labeled corpus and a smaller dataset annotated by first responders
● We build a multimodal classifier for detecting sensitive content on social media
● We show the superiority of our model by improving performance over other state-of-the-art models
● We inspect the model to see what it is learning
● Future work: extend to videos and GIFs, and include other kinds of sensitive content
30. References
1. Wulczyn, Ellery, Nithum Thain, and Lucas Dixon. "Ex machina:
Personal attacks seen at scale." Proceedings of the 26th
International Conference on World Wide Web. International World
Wide Web Conferences Steering Committee, 2017.
2. Nobata, Chikashi, et al. "Abusive language detection in online
user content." Proceedings of the 25th international conference on
world wide web. International World Wide Web Conferences
Steering Committee, 2016.
3. Chancellor, Stevie, et al. "Multimodal Classification of
Moderated Online Pro-Eating Disorder Content." Proceedings of
the 2017 CHI Conference on Human Factors in Computing Systems.
ACM, 2017.