Empowering First Responders through Automated Multimodal Content Moderation
Divam Gupta, Indira Sen, Niharika Sachdeva, Ponnurangam Kumaraguru, Arun Balaji Buduru
Why should we care about Sensitive content?
- Event- or crisis-related sensitive content can have offline ramifications
- It can have large-scale social and economic impact
Who does it affect?
- Community moderators are strongly affected by exposure to such content
Why multimodal?
● Most tweets contain multimedia content such as images and videos
● Current text-based models fail when the main content of the tweet is in the attached media rather than in the text
● With a multimodal approach we can jointly model the different content sources of a tweet
Roadmap
- Why should we care about sensitive content?
- Previous Work
- What is sensitive content?
- Data Collection
- Methodology
- Results
- Takeaways
Previous Work and Research Gaps
Content Moderation
- Detecting personal attacks using logistic regression and large-scale annotations, by Wulczyn et al. [1] (forms our baseline)
- Detecting hate speech in Yahoo comments using advanced NLP techniques, by Nobata et al. [2]
Previous Work and Research Gaps
Multimodal Detection
- Multimodal detection of pro-anorexia content using CNNs [3]
Previous Work and Research Gaps
- Our work: content moderation combined with multimodal detection
What is sensitive content?
Sensitivity Rulebook
Hate Speech: Content that shows disrespect towards citizens "on grounds of religion, race, place of birth, residence, language, caste or community or any other ground whatsoever".
Violent/Gory: Violent or gory content that is primarily intended to be shocking, sensational, or disrespectful.
Political Criticism: Content that brings or attempts to bring into hatred or contempt, or excites or attempts to excite disaffection towards, the Government.
Situational Information: Event-based content that is informative: curated or produced content that contributes to situational awareness, situational information, or contextual information that helps to better understand the situation.
Mobilisation: Content that seeks to organize a movement or protest, or content that reports such an event.
Text Sensitivity Dataset
● Level 1 Dataset:
○ Tweets collected from sensitive and non-sensitive hashtags.
Sensitive hashtag No. of tweets
AsaramBapuji 190696
Freekashmir 74237
3rdhinduadhiveshan 38823
Owaisi 33098
lovejihad 24297
Non-sensitive hashtag No. of tweets
Nifty 202894
IndvsSA 136096
MondayMotivation 110178
IPLfinal 103083
MWC16 92309
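The Level 1 labels appear to come from the hashtag a tweet was collected under rather than from manual annotation. A minimal sketch of such a weak-labeling step, assuming tweets arrive as plain strings; the function name `weak_label` and the case-insensitive matching are illustrative choices, not details from the slides:

```python
import re

# Hashtags from the Level 1 tables above, lowercased for matching.
SENSITIVE = {"asarambapuji", "freekashmir", "3rdhinduadhiveshan", "owaisi", "lovejihad"}
NON_SENSITIVE = {"nifty", "indvssa", "mondaymotivation", "iplfinal", "mwc16"}

def weak_label(tweet):
    """Return 1 (sensitive), 0 (non-sensitive) or None (no known hashtag)."""
    tags = {tag.lower() for tag in re.findall(r"#(\w+)", tweet)}
    if tags & SENSITIVE:
        return 1
    if tags & NON_SENSITIVE:
        return 0
    return None

labels = [weak_label(t) for t in ["Markets up #Nifty", "Rally today #FreeKashmir", "hello"]]
```

Labels obtained this way are noisy by construction, which is why a smaller, manually annotated Level 2 set is still needed.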
Text Sensitivity Dataset
● Level 2 Dataset:
○ Tweets from sensitive hashtags, annotated manually using the codebook (a tweet matching one or more sensitive categories is marked as sensitive).
Hashtag # Sensitive Tweets # Non Sensitive Tweets
CauveryProtest 2129 796
JaichandKejriwal 768 270
DhakaEid 1280 64
TamilNaduBandh 334 85
Kashmir 358 110
Jallikattu 1329 363
Image Sensitivity Dataset
- 4,500
sensitive and
nonsensitive
images.
Multimodal Sensitivity detection
Detecting Sensitivity in Text
● We use Recurrent Neural Networks to classify the text as sensitive or non-sensitive
● We learn randomly initialized word embeddings jointly with the RNN classifier
● The hidden state of the last time-step is passed to a fully connected layer with softmax to predict the probability of sensitivity
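The forward pass described above can be sketched in a few lines. This is a toy illustration only: it uses a plain tanh recurrence without bias terms instead of the GRU/LSTM cells reported in the results, and the vocabulary, dimensions and class order are assumptions made for the example.

```python
import math
import random

random.seed(0)

VOCAB = {"<unk>": 0, "boycott": 1, "this": 2, "movie": 3}  # toy vocabulary
EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 4, 3, 2               # toy sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Randomly initialised parameters; training would update all of them jointly.
E = rand_matrix(len(VOCAB), EMBED_DIM)        # word embeddings
W_xh = rand_matrix(HIDDEN_DIM, EMBED_DIM)     # input-to-hidden weights
W_hh = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)    # hidden-to-hidden weights
W_out = rand_matrix(NUM_CLASSES, HIDDEN_DIM)  # fully connected output layer

def predict(tokens):
    """Return [P(non-sensitive), P(sensitive)] for a tokenised tweet."""
    h = [0.0] * HIDDEN_DIM
    for tok in tokens:
        x = E[VOCAB.get(tok, 0)]  # unknown words map to <unk>
        pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
        h = [math.tanh(p) for p in pre]
    # Only the last hidden state feeds the fully connected softmax layer.
    return softmax(matvec(W_out, h))

probs = predict(["boycott", "this", "movie"])
```

With untrained weights the two probabilities are close to 0.5; the point is the data flow from embeddings through the recurrence to the softmax.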
Detecting Sensitivity in Images
● We use a two-stream Convolutional Neural Network to classify sensitive images
● The object recognition model is pre-trained on the ImageNet dataset
● The scene recognition model is pre-trained on the MIT Places dataset
Multimodal Sensitivity detection
● We combine the text model and the image model, which enables the combined model to learn the features jointly
● We concatenate the intermediate outputs of the image model and the text model
● Finally, we use a fully connected layer with softmax to predict the probability of sensitivity
● Combining the two models improves the results
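The concatenation step can be illustrated as follows. This is a minimal sketch, assuming each branch has already produced a fixed-length feature vector; the feature values, weights and the name `fuse_and_classify` are made up for the example and do not come from the slides.

```python
import math

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(text_feat, image_feat, weights, bias):
    """Concatenate both feature vectors, then apply one dense layer + softmax."""
    fused = list(text_feat) + list(image_feat)
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

text_feat = [0.2, -0.1, 0.4]   # stand-in for the text model's intermediate output
image_feat = [0.5, 0.3]        # stand-in for the image model's intermediate output
weights = [[0.1] * 5, [-0.1] * 5]  # one row per class, one column per fused feature
bias = [0.0, 0.0]

probs = fuse_and_classify(text_feat, image_feat, weights, bias)
```

Because the dense layer sees both feature sets at once, gradients from the final loss flow back into both branches, which is what lets the features be learned jointly.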
Multilevel Sensitivity Classification
● Because the data is skewed, we get a lot of positives
● To address this, we first train a model that filters out tweets which are definitely not sensitive
● We train this level 1 model on the large, weakly annotated dataset
● After filtering, we train a level 2 classifier, which gives the final sensitivity score
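The two-level flow can be sketched as a simple cascade. The scoring functions below are keyword-based stand-ins for the trained models, and the threshold of 0.2 and the score of 0.0 assigned to filtered tweets are assumptions for illustration only.

```python
def cascade(tweets, level1_score, level2_score, filter_threshold=0.2):
    """Score tweets with level 2 only if level 1 does not filter them out."""
    results = []
    for tweet in tweets:
        if level1_score(tweet) < filter_threshold:
            # Level 1 considers this tweet definitely not sensitive.
            results.append((tweet, 0.0))
        else:
            results.append((tweet, level2_score(tweet)))
    return results

# Hypothetical stand-ins for the trained level 1 and level 2 models.
def toy_level1(tweet):
    return 0.9 if "protest" in tweet.lower() else 0.05

def toy_level2(tweet):
    return 0.8 if "violence" in tweet.lower() else 0.3

scored = cascade(
    ["Nifty closes higher", "Protest turns to violence", "Peaceful protest planned"],
    toy_level1,
    toy_level2,
)
```

The first tweet never reaches the level 2 model, so the more expensive classifier only runs on tweets that survive the coarse filter.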
Quantitative Results
Method F1 Score Accuracy
VGG16 Finetuning 0.5350 0.5500
VGG16 Features + SVM 0.8065 0.8069
Object Model 0.8343 0.8438
Object + Scene Model 0.8547 0.8550
● Results on the Image Only Dataset
Quantitative Results
Method F1 Score Accuracy
SVM Baseline 0.682 0.701
2-layer word LSTM (level 1 text model) 0.7372 0.7385
Character-level GRU (level 2 text model) 0.7180 0.7619
Word-level GRU (level 2 text model) 0.7760 0.7816
Image + Text Model 0.8013 0.8051
● Results on the Tweets Dataset
Hyperparameters of the Best Performing Model
(Text + Image)
We obtained the optimal hyperparameters via grid search using cross-validation
Hyperparameter Value
Number of tokens 30
Dimension of the word embeddings 150
Number of GRU units 512
Image Size 224 x 224
Learning rate 0.01
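Grid search with cross-validation can be sketched generically as below. The helper names and the candidate grid are hypothetical, and `placeholder_scorer` is a dummy objective that trains no model; in practice it would train on the training fold and return a validation metric such as F1.

```python
from itertools import product

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold = n_samples // k
    for i in range(k):
        start = i * fold
        stop = n_samples if i == k - 1 else start + fold
        val = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        yield train, val

def grid_search(grid, n_samples, k, train_and_score):
    """Return the parameter combination with the best mean fold score."""
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [train_and_score(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Dummy objective for demonstration only; it ignores the folds entirely.
def placeholder_scorer(params, train_idx, val_idx):
    return -abs(params["learning_rate"] - 0.01) - abs(params["gru_units"] - 512) / 1000

grid = {"learning_rate": [0.1, 0.01, 0.001], "gru_units": [256, 512]}
best, _ = grid_search(grid, n_samples=10, k=3, train_and_score=placeholder_scorer)
```

The cost grows with the product of the grid sizes times the number of folds, so grids are usually kept small.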
Qualitative Results: Visualizing the text model
● We use gradient-based class activation mapping to find the words contributing to the sensitivity score
● We see that words like "boycott" and "fighters" contribute to the sensitivity score
"Two suspected Bangladeshi terrorists arrested with fake aadhaar card along with an arms dealer in Kolkata"
"Entire nation should boycott this movie. We r never allow to someone destroy our history. We will fight & we will win."
"Indian commando, three fighters killed in Kashmir"
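The idea of attributing a score to individual words through gradients can be shown on a much simpler model. The sketch below is not the class activation mapping used on the RNN. It is gradient-times-input saliency on a toy logistic classifier over averaged word embeddings, and all embedding values and weights are invented for the example.

```python
import math

# Invented two-dimensional embeddings and classifier weights (hypothetical).
EMB = {
    "nation": [0.1, 0.0],
    "should": [0.0, 0.1],
    "boycott": [0.9, 0.2],
    "movie": [0.1, 0.1],
}
W = [2.0, 0.5]

def word_saliency(tokens):
    """Return the sensitivity score and a (word, saliency) pair per token."""
    vecs = [EMB[t] for t in tokens]
    n = len(vecs)
    mean = [sum(v[d] for v in vecs) / n for d in range(len(W))]
    score = 1.0 / (1.0 + math.exp(-sum(w * m for w, m in zip(W, mean))))
    # d(score)/d(embedding_i) = score * (1 - score) * W / n for every word i.
    grad_scale = score * (1.0 - score) / n
    saliency = [(t, grad_scale * sum(w * x for w, x in zip(W, v)))
                for t, v in zip(tokens, vecs)]
    return score, saliency

score, saliency = word_saliency(["nation", "should", "boycott", "movie"])
top_word = max(saliency, key=lambda pair: pair[1])[0]
```

In this toy setup the word whose embedding aligns most with the classifier weights receives the largest saliency, which mirrors how the highlighted words in the examples above are found.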
Visualizing the image model
● We use class activation mapping to visualize the areas of
the image contributing to the sensitivity
Qualitative analysis: Human Moderator Study
● We label 100 random non-sensitive tweets and 100 sensitive tweets with our classifier
● Two annotators review the scores given by our system and find 75% to be correctly labeled
● There is only one false negative, implying that our system
has a very low miss rate
Labeled Positive Labeled Negative
Positive 99 1
Negative 33 67
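The usual summary metrics follow directly from these four counts. The sketch assumes the rows are the true classes and the columns are the classifier's labels, which is the reading consistent with the single false negative mentioned above. Under that reading, 99 / (99 + 33) = 0.75, which matches the 75% figure.

```python
# Counts from the table above; the row/column interpretation is an assumption.
tp, fn = 99, 1    # sensitive tweets: labeled positive / labeled negative
fp, tn = 33, 67   # non-sensitive tweets: labeled positive / labeled negative

precision = tp / (tp + fp)
recall = tp / (tp + fn)
miss_rate = fn / (tp + fn)
accuracy = (tp + tn) / (tp + fn + fp + tn)
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"miss_rate={miss_rate:.2f} accuracy={accuracy:.2f} f1={f1:.2f}")
# prints: precision=0.75 recall=0.99 miss_rate=0.01 accuracy=0.83 f1=0.85
```

High recall with lower precision suits a moderation aid: few sensitive tweets are missed, at the cost of some extra tweets for a human to review.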
Conclusion
● A large corpus of weakly annotated tweets and a smaller dataset annotated by first responders
● A multimodal classifier for detecting sensitive content on social media
● Our model improves on the performance of other state-of-the-art models
● We also inspect the model to see what it is learning
● Future work: extend to videos and GIFs, and include other kinds of sensitive content
References
1. Wulczyn, Ellery, Nithum Thain, and Lucas Dixon. "Ex machina: Personal attacks seen at scale." Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
2. Nobata, Chikashi, et al. "Abusive language detection in online user content." Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2016.
3. Chancellor, Stevie, et al. "Multimodal Classification of Moderated Online Pro-Eating Disorder Content." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. ACM, 2017.
Thanks!
arunb@iiitd.ac.in

Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communication
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdf
 

Automated Multimodal Content Moderation for First Responders

  • 1. Empowering First Responders through Automated Multimodal Content Moderation Divam Gupta, Indira Sen, Niharika Sachdeva, Ponnurangam Kumaraguru, Arun Balaji Buduru
  • 2. Why should we care about Sensitive content?
  • 3. Why should we care about Sensitive content?
  • 4. Why should we care about Sensitive content? - Event or crises related sensitive content can cause offline ramifications - Have large-scale social and economic impact
  • 5. Who does it affect? - Community moderators strongly affected by exposure to such content
  • 6. Why multimodal? ● Most tweets contain multimedia content such as images, videos, etc. ● Current text-based models fail when the main content is in the tweet's media rather than its text ● With a multimodal approach we can jointly model the different content sources of the tweet
  • 7. Roadmap - Why should we care about sensitive content? - Previous Work - What is sensitive content? - Data Collection - Methodology - Results - Takeaways
  • 8. Previous Work and Research Gaps Content Moderation - Detecting personal attacks using Logistic Regression and large-scale annotations by Wulczyn et al. [1] (forms our baseline) - Detecting hate speech in Yahoo comments using advanced NLP techniques by Nobata et al. [2]
  • 9. Previous Work and Research Gaps Multimodal detection - Multimodal detection of pro-anorexia content using CNNs by Chancellor et al. [3]
  • 10. Previous Work and Research Gaps Content Moderation Multimodal detection Our work
  • 11. What is sensitive content?
  • 12. Sensitivity Rulebook
      Hate Speech: content that shows disrespect to citizens "on grounds of religion, race, place of birth, residence, language, caste or community or any other ground whatsoever".
      Violent/Gory: violent or gory content that's primarily intended to be shocking, sensational, or disrespectful.
      Political Criticism: content that brings or attempts to bring into hatred or contempt, or excites or attempts to excite disaffection towards the Government.
      Situational Information: event-based content that is informative; curates or produces content; contributes to situational awareness; gives contextual information to better understand the situation.
      Mobilisation: content that seeks to organize a movement or protest, or content that reports such an event.
  • 13. Text Sensitivity Dataset ● Level 1 Dataset: tweets collected from sensitive and non-sensitive hashtags.
      Sensitive hashtag        No. of tweets
      AsaramBapuji             190696
      Freekashmir              74237
      3rdhinduadhiveshan       38823
      Owaisi                   33098
      lovejihad                24297

      Non-sensitive hashtag    No. of tweets
      Nifty                    202894
      IndvsSA                  136096
      MondayMotivation         110178
      IPLfinal                 103083
      MWC16                    92309
  • 14. Text Sensitivity Dataset ● Level 2 Dataset: tweets from sensitive hashtags, annotated manually using the codebook (a tweet matching one or more sensitive categories is marked as sensitive).
      Hashtag                  # Sensitive tweets    # Non-sensitive tweets
      CauveryProtest           2129                  796
      JaichandKejriwal         768                   270
      DhakaEid                 1280                  64
      TamilNaduBandh           334                   85
      Kashmir                  358                   110
      Jallikattu               1329                  363
  • 15. Image Sensitivity Dataset - 4,500 sensitive and nonsensitive images.
  • 16. Roadmap - Why should we care about sensitive content? - What is sensitive content? - Data Collection - Methodology - Results - Takeaways
  • 18. Detecting Sensitivity in Text ● We use Recurrent Neural Networks for classifying the text as sensitive and non-sensitive ● We learn randomly initialized word embeddings along with the RNN classifier. ● The hidden state of the last time-step is passed to a fully connected layer with softmax to predict the probability of sensitivity
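The text model on slide 18 can be sketched as a forward pass: look up word embeddings, run a recurrent cell over the tokens, and feed the last hidden state to a fully connected softmax layer. The following is a minimal, untrained NumPy sketch, not the authors' implementation. It assumes a GRU cell (the results slide mentions GRUs), omits bias terms, and borrows the sizes 30 tokens, 150 embedding dimensions and 512 units from slide 25; the class name `TinyGRUClassifier` and the vocabulary size are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()


class TinyGRUClassifier:
    """Hypothetical, untrained GRU classifier (forward pass only, no biases)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return rng.normal(0.0, 0.1, size=shape)

        # Randomly initialized embeddings; in training they would be learned
        # jointly with the recurrent weights.
        self.embeddings = init(vocab_size, embed_dim)
        self.W_z = init(hidden_dim, embed_dim + hidden_dim)  # update gate
        self.W_r = init(hidden_dim, embed_dim + hidden_dim)  # reset gate
        self.W_h = init(hidden_dim, embed_dim + hidden_dim)  # candidate state
        self.W_out = init(2, hidden_dim)  # non-sensitive vs. sensitive
        self.hidden_dim = hidden_dim

    def predict_proba(self, token_ids):
        h = np.zeros(self.hidden_dim)
        for t in token_ids:
            x = self.embeddings[t]
            xh = np.concatenate([x, h])
            z = sigmoid(self.W_z @ xh)
            r = sigmoid(self.W_r @ xh)
            h_candidate = np.tanh(self.W_h @ np.concatenate([x, r * h]))
            h = (1.0 - z) * h + z * h_candidate
        # The hidden state of the last time step feeds the softmax layer.
        return softmax(self.W_out @ h)


# Sizes from slide 25; vocab_size is an assumption made for this demo.
model = TinyGRUClassifier(vocab_size=5000, embed_dim=150, hidden_dim=512)
token_ids = np.random.default_rng(1).integers(0, 5000, size=30)
probs = model.predict_proba(token_ids)
```

With untrained weights the two probabilities are close to uniform; the sketch only shows how data flows through the model.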
  • 19. Detecting Sensitivity in Images ● We use a two-stream Convolutional Neural Network to classify sensitive images ● The object recognition model is pre-trained on the ImageNet dataset ● The scene recognition model is pre-trained on the MIT Places dataset
  • 20. Multimodal Sensitivity detection ● We combine both the text models and the image models which enables the model to learn the features jointly ● We concatenate the intermediate outputs of the image model and the text model. ● In the end, we use a fully connected layer with softmax to predict the probability of sensitivity ● We show the improvement in the results if we combine the two models
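The fusion step on slide 20 (concatenate the intermediate outputs of both models, then apply a fully connected softmax layer) can be illustrated with a small pure-Python sketch. The feature vectors, weights and the function name `fuse_and_classify` are made up for illustration; in the real model the features would come from the trained text and image networks and the weights would be learned.

```python
import math
import random


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(text_features, image_features, weights, biases):
    """Concatenate both feature vectors and apply one dense softmax layer."""
    joint = text_features + image_features  # list concatenation
    logits = [
        sum(w * x for w, x in zip(row, joint)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


random.seed(0)
# Stand-ins for the intermediate outputs of the text and image models.
text_features = [random.uniform(-1, 1) for _ in range(4)]
image_features = [random.uniform(-1, 1) for _ in range(6)]
# One weight row per class (non-sensitive, sensitive); random, not trained.
weights = [[random.uniform(-0.5, 0.5) for _ in range(10)] for _ in range(2)]
biases = [0.0, 0.0]

probs = fuse_and_classify(text_features, image_features, weights, biases)
```

Because the dense layer sees both modalities at once, its weights can pick up interactions between text and image features that two separate classifiers would miss.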
  • 22. Multilevel Sensitivity Classification ● Due to the skewness of the data, we get a lot of positives. ● To solve this we train a model to filter out the tweets which are definitely not sensitive. ● We train the level 1 model on weakly annotated large data ● After filtering out the tweets, we train a level 2 classifier which gives the final sensitivity score
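The two-level scheme on slide 22 can be read as a cascade: a level 1 model discards tweets that are clearly not sensitive, and only the remaining tweets reach the level 2 classifier. A minimal sketch of that control flow follows. The keyword-based scorers, the threshold value and the choice to return 0.0 for filtered tweets are all assumptions made for illustration; the slides do not specify them.

```python
def cascade_score(tweet, level1, level2, filter_threshold=0.2):
    """Return a sensitivity score, skipping level 2 for filtered tweets."""
    if level1(tweet) < filter_threshold:
        return 0.0  # assumed convention: filtered tweets get score 0
    return level2(tweet)


# Toy stand-ins for the trained models (hypothetical keyword rules).
BROAD_HINTS = {"protest", "boycott", "terrorists", "killed"}


def toy_level1(tweet):
    words = tweet.lower().split()
    return 0.9 if any(w in BROAD_HINTS for w in words) else 0.05


def toy_level2(tweet):
    words = tweet.lower().split()
    return 0.8 if "boycott" in words else 0.3


filtered = cascade_score("Nifty closes higher today", toy_level1, toy_level2)
scored = cascade_score(
    "Entire nation should boycott this movie", toy_level1, toy_level2
)
```

Here `filtered` is 0.0 because level 1 rejects the tweet, while `scored` is the level 2 output.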
  • 23. Quantitative Results ● Results on the Image Only Dataset
      Method                                      F1 Score    Accuracy
      VGG16 Finetuning                            0.5350      0.5500
      VGG16 Features + SVM                        0.8065      0.8069
      Object Model                                0.8343      0.8438
      Object + Scene Model                        0.8547      0.8550
  • 24. Quantitative Results ● Results on the Tweets Dataset
      Method                                      F1 Score    Accuracy
      SVM Baseline                                0.682       0.701
      2 layer word LSTM (level 1 text model)      0.7372      0.7385
      Character Level GRU (level 2 text model)    0.7180      0.7619
      Word Level GRU (level 2 text model)         0.7760      0.7816
      Image + Text Model                          0.8013      0.8051
  • 25. Hyperparameters of the Best Performing Model (Text + Image) ● We got the optimal hyperparameters via grid search using cross validation
      Hyperparameter                              Value
      Number of tokens                            30
      Dimension of the word embeddings            150
      Number of GRU units                         512
      Image size                                  224 x 224
      Learning rate                               0.01
  • 26. Qualitative Results: Visualizing the text model ● We use gradient-based class activation mapping to find the words contributing to the sensitivity score ● We see that words like boycott, fighters, etc. contribute to the sensitivity score
      Example tweets:
      "Two suspected Bangladeshi terrorists arrested with fake aadhaar card along with an arms dealer in Kolkata"
      "Entire nation should boycott this movie. We r never allow to someone destroy our history. We will fight & we will win."
      "Indian commando, three fighters killed in Kashmir"
  • 27. Visualizing the image model ● We use class activation mapping to visualize the areas of the image contributing to the sensitivity
  • 28. Qualitative analysis: Human Moderator Study ● We label 100 random nonsensitive tweets and 100 sensitive tweets with our classifier. ● Two annotators look at the scores given by our system and find 75 % to be correctly labeled ● There is only one false negative, implying that our system has a very low miss rate
                     Labeled Positive    Labeled Negative
      Positive       99                  1
      Negative       33                  67
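The confusion matrix on slide 28 can be turned into standard metrics with a few lines of arithmetic. The sketch below assumes that rows are the true classes and columns are the classifier's labels, which matches the slide's statement that there is only one false negative. Under that reading, precision works out to 75 %, which may be the figure the slide quotes; the slides do not say so explicitly.

```python
# Counts read off the slide 28 matrix (rows: true class, columns: label).
tp, fn = 99, 1   # truly sensitive tweets
fp, tn = 33, 67  # truly nonsensitive tweets

precision = tp / (tp + fp)                  # share of flagged tweets that are sensitive
recall = tp / (tp + fn)                     # share of sensitive tweets that are flagged
miss_rate = fn / (tp + fn)                  # false negative rate
accuracy = (tp + tn) / (tp + fn + fp + tn)  # share of all tweets labeled correctly
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"miss_rate={miss_rate:.2f} accuracy={accuracy:.2f} f1={f1:.3f}")
```

The low miss rate comes at the cost of 33 false positives, so a human moderator would still review flagged tweets.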
  • 29. Conclusion ● A large corpus of weakly annotated data and a smaller dataset annotated by first responders ● A multimodal classifier for detecting sensitive content on social media ● We show that our model improves performance over the baseline models we compared against ● We also inspect the model to see what it is learning ● Future work: extend to videos and GIFs, and include other kinds of sensitive content
  • 30. References
      1. Wulczyn, Ellery, Nithum Thain, and Lucas Dixon. "Ex machina: Personal attacks seen at scale." Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2017.
      2. Nobata, Chikashi, et al. "Abusive language detection in online user content." Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2016.
      3. Chancellor, Stevie, et al. "Multimodal Classification of Moderated Online Pro-Eating Disorder Content." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. ACM, 2017.