1.
Alberto Mendelzon Workshop (AMW) 23rd May 2018
Mining Social Media Data for Policing
Presenting: Miriam Fernandez, Knowledge Media Institute
Work done in collaboration with some
fantastic colleagues!
@miriam_fs
fernandezmiriam
@miriamfs
3.
Three lines of work presented in this talk
• Detecting Grooming Behaviour on Social Media
• Radicalisation detection on Social Media
• Policing Engagement via Social Media
6.
Detecting Grooming Behaviour on Social Media
Cano, A. E.; Fernandez, M.; and Alani, H. (2014). Detecting child grooming behaviour patterns on social media. The 6th International Conference on Social Informatics (SocInfo), Barcelona, Spain.
Some of the next slides from: https://www.slideshare.net/halani
7.
Child Grooming
Premeditated behaviour intending to secure the trust of a minor as a first step towards future engagement in sexual conduct.
Choo, K-K R. Responding to online child sexual grooming: an industry perspective, Trends & issues in crime and criminal justice, no. 379. July 2009
8.
“findings show that approximately 190,000 UK children (1 in 58) will suffer contact sexual abuse by a non-related adult before turning 18, with approximately 10,000 new child victims of contact sexual abuse being reported in the UK each year.”
Claire Lilley, Ruth Ball, Heather Vernon, The experiences of 11-16 year olds on social networking sites, NSPCC 2014
9.
“50% of all 11 and 12 year-olds in the UK use a social networking site, according to our research. This is because it's easy for children to access sites intended for older users.”
https://www.nspcc.org.uk/preventing-abuse/keeping-children-safe/share-aware/
10.
https://www.statista.com/statistics/271348/facebook-users-in-the-united-kingdom-uk-by-age/
11.
Children’s use of mobile phones - A special report 2014.
http://www.gsma.com/publicpolicy/wp-content/uploads/2012/03/GSMA_Childrens_use_of_mobile_phones_2014.pdf
12.
Online Grooming
https://www.thinkuknow.co.uk/parents/articles/Online-grooming/
13.
Signs of Online Grooming
https://www.thinkuknow.co.uk/14_plus/Need-advice/Online-grooming/
14.
Predator: hey whats up?…
Predator: I like your pic, very cute
Predator: so you're in san diego?
13-yr-old-girl: not far
Predator: ok, you like older guys?
13-yr-old-girl: thers nice or bad ppl all ages
Predator: have some pics if you want to see
Predator: do your parents look on your computer?
Predator: so are you by yourself or is someone else there with you?
Predator: so it should just be us, our little secret
Predator: so have you ever snuck out?
13-yr-old-girl: not rlly lol
Predator: yeah, what about tonight?
Predator: think you could sneak out tonight?
Predator: well if the wrong person found out then I'd be screwed
13-yr-old-girl: im not a teller lol
Predator: I know, just wouldn't want your dad to find out
Predator: if you are still up why not sneak out for a few minutes
Predator: but that's the fun of it
13-yr-old-girl: fun to sneak?
Predator: yes
Predator: so your dad doesn't know
Predator: would take a nap but I leave for bible study around 6:30
Predator: I know I'm bad, going to bible study and talking about sex with you
Predator: yeah, there's nothing wrong with us being friends, we have the same lord remember ;)
Predator: would take me like an hour and a half to get there
Predator: see you in a little while
Grooming in Action: ~700 messages over a 5-month period
15.
Olson’s Luring Communication Theory (LCT)
Olson, L. N., Daggs, J. L., Ellevold, B. L. and Rogers, T. K. K. (2007), Entrapping the Innocent: Toward a Theory of Child Sexual Predators’ Luring Communication. Communication Theory, 17: 231–251
16.
Stages annotated on the slide-14 conversation: Approach, Grooming, Trust Development, Isolation, Physical Approach
17.
Can we automatically identify these stages?
18.
Identifying Grooming Stages
Input line: “think you could sneak out tonight?”
Automatic Classifiers, one per label: Grooming, Trust Development, Physical Approach, Other
Classifier outputs shown: Yes, No, No, No
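The arrangement on this slide, one yes/no decision per stage for each message line, can be sketched as follows. This is only an illustration of the one-classifier-per-label layout: the keyword rules are hypothetical stand-ins for trained models, and the names `STAGE_KEYWORDS` and `classify_line` do not come from the original work.

```python
# Hypothetical keyword rules standing in for trained binary classifiers.
STAGE_KEYWORDS = {
    "trust_development": ("your pic", "friends"),
    "grooming": ("secret", "parents look"),
    "physical_approach": ("sneak out", "get there"),
}


def classify_line(line):
    """Return one yes/no decision per stage, plus 'other' when no stage fires."""
    text = line.lower()
    decisions = {
        stage: any(keyword in text for keyword in keywords)
        for stage, keywords in STAGE_KEYWORDS.items()
    }
    # 'other' is the fallback label when every stage classifier says no.
    decisions["other"] = not any(decisions.values())
    return decisions


print(classify_line("think you could sneak out tonight?"))
```

In a real system each rule would be replaced by a classifier trained on the annotated transcripts, but the per-line output keeps the same shape.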
19.
Dataset
• 50 transcripts of conversations between convicted predators and volunteers who posed as minors
• Conversations vary from 83 to 12K lines.
• Each predator line manually labelled by two annotators.
• Annotation labels: 1) Trust development, 2) Grooming, 3) Seek physical approach, 4) Other.
Sentences per label:
Trust Dev. | Grooming | Phys. Approach | Other
1225 | 3304 | 2700 | 3304
20.
Processing Chat Text
• Challenges in processing chat-room conversations
– Use of irregular and ill-formed words
– Use of chat slang and teen-lingo
– Use of emoticons
Generated a list of over 1K terms and definitions:
Chat term | Translation | Emoticon | Translation
ASLP | Age, sex, location, picture | :’-( | I’m crying
AWGTHTHTTA | Are we going to have to go through this again? | o/o | High five
BRB | Be right back | @_@ | I’m tired, trying to stay awake
CWOT | Complete waste of time | ( ‘}{‘ ) | kiss
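A term list like this is typically applied as a normalisation step before feature extraction. The sketch below assumes a simple word-by-word dictionary lookup; the function name `expand_chat_terms` is hypothetical, and only three entries from the table are included.

```python
import re

# A few entries from the slide's table; the full list has over 1K terms.
CHAT_TERMS = {
    "aslp": "age, sex, location, picture",
    "brb": "be right back",
    "cwot": "complete waste of time",
}


def expand_chat_terms(message, terms=CHAT_TERMS):
    """Replace known chat abbreviations with their translations, word by word."""
    def replace(match):
        word = match.group(0)
        # Unknown words are returned unchanged.
        return terms.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", replace, message)


print(expand_chat_terms("brb, this is a CWOT"))
```

Emoticons would need a separate lookup, since they are not made of letters and the pattern above does not match them.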
21.
Analysis Features and Results
Feature | Description
N-gram | word combinations extracted from text (N=1,2,3)
Part-of-speech tagging | noun, verb, adjective, plural, etc.
Sentiment | average sentiment of terms in sentence
Length | number of words in sentence
Psycho-linguistic patterns | 62 psycho-linguistic patterns in English (swearing, sexual, agreement, etc.), LIWC
Semantic frames | type of event, relation, or entity in text, e.g., secrecy, desirability, emotion, kinship (SEMAPHORE)
Results with all features:
Measure | Trust Development | Grooming | Phys. Approach | Average
Precision | 79.2% | 87.6% | 87.2% | 84.7%
Recall | 82.3% | 88.8% | 88.7% | 86.6%
F1 | 80.7% | 88.2% | 87.9% | 85.6%
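The F1 row and the average column follow directly from the precision and recall figures. A short check, assuming F1 is the usual harmonic mean and the average is the unweighted mean over the three stages:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Per-stage precision and recall (%) as reported on the slide.
scores = {
    "Trust Development": (79.2, 82.3),
    "Grooming": (87.6, 88.8),
    "Phys. Approach": (87.2, 88.7),
}

per_stage_f1 = {stage: round(f1(p, r), 1) for stage, (p, r) in scores.items()}
macro_precision = round(sum(p for p, _ in scores.values()) / len(scores), 1)
macro_recall = round(sum(r for _, r in scores.values()) / len(scores), 1)

print(per_stage_f1, macro_precision, macro_recall)
```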
22.
Next Steps
• Explore development into apps
• Understand how and when alerts should be provided
• Decide what action alerts should enforce or suggest
• Determine how to assess vulnerability and how to inform the child
• Explore the use of more features to reach higher accuracy
23.
Radicalisation detection on Social Media
Fernandez, M., Asif, M. & Alani, H. Understanding the roots of radicalisation on Twitter. WebScience 2018.
Saif, H., Fernandez, M., Dickinson, T., Kastler, L. & Alani, H. A Semantic Graph-based Approach for Radicalisation Detection on Social Media. ESWC 2017.
Saif, H., Fernandez, M., Rowe, M. & Alani, H. On the Role of Semantics for Detecting pro-ISIS stances on social media. ISWC 2016.
Rowe, M. & Saif, H. Mining Pro-ISIS Radicalisation Signals from Social Media Users. ICWSM 2016.
Some of the next slides from: https://www.slideshare.net/Staano/
24.
Online Radicalisation
• Is the process by which individuals are introduced to ideological messages and belief systems that encourage movement from mainstream beliefs toward extreme views, primarily through the use of online media [International Assoc of Chiefs of Police and United States of America]
25.
Islamic State in Iraq and Syria (ISIS)
Social Media Propaganda & Recruiting
29.
Research Questions and Objectives
• RQ1: How can we detect when a user has adopted a pro-ISIS stance?
• RQ2: What happens to Twitter users before and after they exhibit radicalised behaviour?
• RQ3: What influences users to adopt pro-ISIS language?
30.
Data Collection and Analysis
[Figure from O’Callaghan et al. 2014] Fig. 1: Syrian account network (652 nodes, 3,260 edges). Four major categories: Jihadist (gold, right), Kurdish (red, top), Pro-Assad (purple, left), and Secular/Moderate opposition (blue, center). Black nodes are members of multiple communities. Visualization was performed with the OpenOrd layout in Gephi.
625 Users; 2.4M Users; 154K EU Users; 104M Tweets
Languages: English 43%, Arabic 41%, Others 16%
31.
Identifying Signals of Radicalisation
Lexicon- and Network-based Approach
H1 – Sharing Incitement Material: Known suspended ISIS Accounts
H2 – Using Extremist Language: Radicalization Lexicon, e.g. الخلافة دولة (caliphate state), ISIS, Shirk, Caliphate, Islamic State, ارهاب (terrorism)
727
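The two hypotheses can be read as two ways of dating a user's "activation". The sketch below assumes activation is the first date on which a signal appears; the timeline format, the tiny lexicon, the account name, and the function `activation_dates` are all hypothetical, and the original study may apply additional conditions.

```python
from datetime import date

# Hypothetical inputs: a tiny lexicon and a set of known suspended account names.
LEXICON = {"caliphate", "shirk"}
KNOWN_ACCOUNTS = {"suspended_account_1"}


def activation_dates(timeline):
    """Return the first date each hypothesis fires for one user's timeline.

    `timeline` is a list of (date, text, retweeted_from) tuples, where
    retweeted_from is None for original tweets.
    """
    first = {"H1": None, "H2": None}
    for day, text, retweeted_from in sorted(timeline, key=lambda item: item[0]):
        if first["H1"] is None and retweeted_from in KNOWN_ACCOUNTS:
            first["H1"] = day  # H1: shares content from a known account
        words = set(text.lower().split())
        if first["H2"] is None and words & LEXICON:
            first["H2"] = day  # H2: uses a term from the lexicon
    return first


timeline = [
    (date(2014, 6, 2), "the caliphate was announced", None),
    (date(2014, 5, 1), "weather is nice", "suspended_account_1"),
]
print(activation_dates(timeline))
```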
32.
Activation Points (RQ1)
• Increase in users activated between May 2014 and November 2014 coincides with execution of 6 hostages by ISIS and the videos of these executions posted via social media
• The majority of users post pro-ISIS terms before sharing content from pro-ISIS accounts
Table 2: Significant events involving ISIS/ISIL and the West.
Date Description
08-04-2013 ISIS expand into Syria
04-01-2014 Fallujah captured by ISIS
15-01-2014 ISIL retake Ar-Raqqah
01-05-2014 ISIS carry out public executions in Ar-Raqqah
09-06-2014 Mosul falls under ISIS control
02-09-2014 Hostage Steven Sotloff executed
13-09-2014 Hostage David Haines executed
22-09-2014 Hostage Samira Salih al-Nuaimi executed
03-10-2014 Hostage Alan Henning executed
07-10-2014 Abu Bakr al-Baghdadi injured in US air strike
16-10-2014 Hostage Peter Kassig executed
14-01-2015 Christopher Lee Cornell arrested for bomb plot
25-01-2015 Hostage Haruna Yukawa executed
31-01-2015 Hostage Kenji Goto executed
06-02-2015 Hostage Kayla Mueller killed in air strike
26-02-2015 Jihadi John is identified as Mohammed Emwazi
18-03-2015 ISIS responsible for Tunisia museum attack
15-05-2015 Abu Sayyaf killed by US special forces
30-06-2015 Alaa Saadeh arrested for attempts to aid ISIS
11-07-2015 Maher Meshaal killed in coalition air strike
Figure 2(a) and figure 2(b) show the number of users who are activated on each day according to each hypothesis. We note that the span of activations of H1 users is shorter than that of H2 users, as the former requires sharing content from banned or pro-ISIS accounts, while the latter looks at the use of pro-ISIS terms. One thing that is immediately apparent from the plots is that there is a large surge in activity from May 2014 onwards, for both H1 and H2 activations. To investigate why this surge occurs, we identified a series of key events related to ISIS/ISIL from 2013 onwards; these are shown in Table 2. As noted, the increase in activations between May 2014 and November 2014 coincides with the execution of 6 hostages by ISIS and the videos of these executions posted via social media. We cannot discern causation (of activation) from correlation here.
33.
Behaviour Before/After Activation (RQ2)
• Users exhibit a large divergence in their language once
activated
– Before activation the majority of topics users discuss focus on politics,
where words like Syria, Israel and Egypt are mentioned in a negative
context and with high frequency
– After activation religious words (e.g. Allah, muslims, quran) become
more popular.
[Figure panels: Pre-Activation, Activation, Post-Activation]
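One standard way to quantify a change in language between two time windows is relative entropy (Kullback-Leibler divergence) between their word distributions. The sketch below assumes that measure with add-one smoothing; the toy token lists and function names are illustrative, not the study's data or code.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, smoothing=1.0):
    """Smoothed relative frequency of each vocabulary term among `tokens`."""
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + smoothing * len(vocabulary)
    return {term: (counts[term] + smoothing) / total for term in vocabulary}


def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits; p and q must share the same keys."""
    return sum(p[term] * math.log2(p[term] / q[term]) for term in p)


# Toy windows built from words mentioned on the slide.
before = "syria egypt politics syria election".split()
after = "quran allah muslims allah quran".split()
vocabulary = set(before) | set(after)

p_before = term_distribution(before, vocabulary)
p_after = term_distribution(after, vocabulary)
print(kl_divergence(p_after, p_before))
```

Smoothing keeps every probability above zero, so the logarithm is always defined even when a word appears in only one window.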
34.
Influencing Pro-ISIS Term Adoption (RQ3)
• We study the effect of
– Lexical Homophily: similarity in language
– Sharing Homophily: diffusion of information from the same accounts
– Interaction Homophily: common communications
Social dynamics play a strong role in term uptake. Subcommunities act as bridges between the radicalised user and the future adopter.
[Figure labels: pro-ISIS User, Potential Adopter]
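Each kind of homophily compares two users on some set of items (words used, accounts shared from, users interacted with). The slide does not state the similarity measure, so the sketch below assumes Jaccard overlap as one simple option; the example term sets are made up.

```python
def jaccard(items_a, items_b):
    """Share of items two users have in common, between 0.0 and 1.0."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Lexical homophily: overlap between the vocabularies of two users.
user_terms = ["caliphate", "state", "news"]
adopter_terms = ["caliphate", "news", "football"]
print(jaccard(user_terms, adopter_terms))
```

The same function applies to sharing or interaction homophily by passing sets of account names instead of words.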
36.
Radicalisation Detection Background
Lexicon-based Approaches
Radicalization Lexicon: الخلافة دولة (caliphate state), ISIS, Shirk, Caliphate, Islamic State, ارهاب (terrorism)
Machine Learning Approaches
Labelled examples (table columns: Stance, Label):
• Gonna kidnap journalists and cut their heads off
• ISIS isn’t evil, it’s made up of people doing what they think is best for their community
• The brothers from Charlie Hebdo attack did their part. It’s time for brothers in the UK to do their part
37.
Semantic Graph-based Approach for Pro-ISIS Stance Detection
Pipeline of detecting pro-ISIS stances using semantic sub-graph mining-based feature extraction:
Tweets -> Conceptual Semantics Extraction (DBpedia) -> Semantic Graph Representation -> Frequent Semantic Subgraph Mining -> Classifier Training
• Extract and use the semantic interdependencies and relations between words to learn patterns of radicalisation.
Entities: ISIS, Syria
Concepts: Jihadist Group, Country
Semantic Relations: (Military Intervention Against ISIL, place, Syria)
38.
Semantic Graph-based Approach for Pro-ISIS Stance Detection
Training Data: 566 pro-ISIS users / 566 anti-ISIS users (extracted using lexicons)
Step 1. Conceptual Semantic Extraction
Entity Extraction and Semantics Mapping:
Syria -> Country
ISIL -> Jihadist Group
Table 1 (pro-ISIS column only; the rest of the table is cut off on the slide): total number and top 10 frequent entities and their associated concepts from our dataset.
No. of Unique Entities: 32,406
No. of Unique Concepts: 35
Entity | Concept
MSNBC | Company
Iraq | Country
Allah | Person
America | Continent
Muslim | Person
Officer | JobTitle
Wounds | HealthCondition
Syria | Country
WAPO | PrintMedia
Israel | Country
Step 2. Semantic Graph Representation (DBpedia)
For two named entities, the approach takes as input a source entity es, a target entity et and an integer value K that bounds the length of the relations between them. The output is a set of SPARQL queries that retrieve paths of length at most K connecting the two entities. To extract all the paths, all the combinations of ingoing and outgoing relations must be considered. For example, for paths connecting es = Syria and et = ISIL, the approach considers the following SPARQL queries:
SELECT * WHERE {:Syria ?p1 :ISIL}
SELECT * WHERE {:ISIL ?p1 :Syria}
SELECT * WHERE {:Syria ?p1 ?n1. ?n1 ?p2 :ISIL}
SELECT * WHERE {:Syria ?p1 ?n1. :ISIL ?p2 ?n1}
SELECT * WHERE {?n1 ?p1 :Syria. :ISIL ?p2 ?n1}
SELECT * WHERE {?n1 ?p1 :Syria. ?n1 ?p2 :ISIL}
The first two queries consider paths of length 1: since a path may exist in two directions, two queries are required. Paths of length 2 require 4 queries. In general, paths of length k require 2^k queries. Figure 2 (in the source paper) shows an example for the entities Syria and ISIL: the two are connected by a direct semantic relation (e.g., ISIL <headquarters> Syria) and by longer paths (e.g., ISIL <ideology> Pan Islam <ideology> Musl[…]).
Step 3. Sub-graph Mining: CloseGraph Method (Yan and Han 2003)
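The six SPARQL queries listed for Syria and ISIL follow a regular pattern: for each path length, every edge is tried in both directions. A minimal sketch of a generator for such queries is shown below; the function name `path_queries` is hypothetical, and the `:Entity` prefix notation simply mirrors the slide.

```python
from itertools import product


def path_queries(source, target, max_len):
    """Build SELECT queries for all paths of length 1..max_len between two
    entities, trying both directions for every edge on the path."""
    queries = []
    for k in range(1, max_len + 1):
        # Nodes along the path: source, ?n1, ..., ?n(k-1), target.
        nodes = [f":{source}"] + [f"?n{i}" for i in range(1, k)] + [f":{target}"]
        for directions in product((True, False), repeat=k):
            patterns = []
            for i, forward in enumerate(directions):
                left, right = nodes[i], nodes[i + 1]
                subj, obj = (left, right) if forward else (right, left)
                patterns.append(f"{subj} ?p{i + 1} {obj}")
            queries.append("SELECT * WHERE {" + ". ".join(patterns) + "}")
    return queries


for query in path_queries("Syria", "ISIL", 2):
    print(query)
```

With `max_len=2` this yields 2 + 4 = 6 queries, the same set as on the slide (in a slightly different order).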
39.
Evaluation & Results
• Baseline for comparison: SVM classifiers trained on Unigrams, Topic, Sentiment and Network feature sets.
• 10-fold cross validation over 30 runs
[…] classifiers trained from the 4 sets of features described in Section 4.2. Results in all experiments are computed using 10-fold cross validation over 10 runs of different random splits of the data to test their significance. Statistical significance is done using Wilcoxon signed-rank test [16]. Note that all the results in average Precision, Recall and F1-measure reported in this section are statistically significant with ρ < 0.001.
Table 3 shows the results of our binary stance classification (pro-ISIS vs. anti-ISIS) using Unigrams, Sentiment, Topic, and Semantic features after feature selection, applied over the 1,132 users in our dataset. The table reports three sets of precision (P), recall (R), and F1-measure (F1), one for anti-ISIS stance identification, one for pro-ISIS stance identification, and the third shows the averages of the two. The table also reports the total number of features used for classification under each feature set.
Feature set | No. of Features | anti-ISIS P | anti-ISIS R | anti-ISIS F1 | pro-ISIS P | pro-ISIS R | pro-ISIS F1 | Average P | Average R | Average F1
UNIGRAMS | 41,200 | 0.814 | 0.919 | 0.863 | 0.907 | 0.79 | 0.844 | 0.86 | 0.854 | 0.854
SENTIMENT | 41,362 | 0.814 | 0.919 | 0.863 | 0.907 | 0.79 | 0.844 | 0.86 | 0.854 | 0.854
TOPICS | 992 | 0.771 | 0.943 | 0.848 | 0.927 | 0.719 | 0.81 | 0.849 | 0.831 | 0.829
NETWORK | 25,532 | 0.897 | 0.827 | 0.86 | 0.839 | 0.905 | 0.871 | 0.868 | 0.866 | 0.866
SEMANTICS | 8,798 | 0.994 | 0.852 | 0.917 | 0.87 | 0.995 | 0.928 | 0.932 | 0.923 | 0.923
Table 3: Classification performance of the five feature sets with IG feature selection. The values highlighted in grey correspond to the best results obtained for each feature. Results in average P, R and F1 are statistically significant with ρ < 0.001.
According to the results presented in Table 3, the proposed Semantic features outperform the 4 baseline feature sets in all average measures by a large margin. In particular, classifiers trained from Semantic features produce 7.8% higher Recall, 7.7% higher precision, and 7.82% higher F1 than all baselines on average. Network features come next, followed by Unigrams features, with approximately 87% and 85% in average F1 respectively. On the other hand, Topic features produce the lowest classification […]
[Bar chart: F1 (%) per feature set. anti-ISIS: Unigrams 86.3, Sentiment 86.3, Topics 84.8, Network 86, Semantics 91.7. pro-ISIS: Unigrams 84.4, Sentiment 84.4, Topics 81, Network 87.1, Semantics 92.8]
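The evaluation protocol, k-fold cross validation repeated over several random splits, can be sketched with the standard library alone. The function below is a hypothetical illustration of the splitting step only (no classifier is trained), and it does not stratify by class, which the original setup may or may not have done.

```python
import random


def repeated_kfold(n_items, k=10, runs=10, seed=0):
    """Yield (run, fold, train_idx, test_idx) for `runs` shuffled k-fold splits."""
    rng = random.Random(seed)
    for run in range(runs):
        order = list(range(n_items))
        rng.shuffle(order)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for fold, test_idx in enumerate(folds):
            train_idx = [i for j, other in enumerate(folds) if j != fold for i in other]
            yield run, fold, train_idx, test_idx


# 1,132 users, as in the dataset described on the slide.
splits = list(repeated_kfold(1132))
print(len(splits), len(splits[0][2]), len(splits[0][3]))
```

Each split would be used to train on `train_idx` and score on `test_idx`, with the per-split scores then averaged and compared across feature sets.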
Exploration of semantic sub-graphs
• pro-ISIS users tend to discuss religion, historical events and ethnicity
• anti-ISIS users focus more on politics, geographical locations and interventions against ISIS
40.
Automatic Detection of pro-ISIS stances
When pro-ISIS and “general” users both use radical terminology
41.
Research Questions
• Problem
– Existing methods to automatically identify radical content online
mainly rely on the use of glossaries (i.e., lists of terms and
expressions associated with religion, war, offensive language,
etc.)
– These methods are not always effective and we continue to
observe that many who use radicalisation terminology in their
tweets are simply reporting current events, or sharing harmless
religious rhetoric
• Research question
– Are there significant variances between the semantic contexts of
radicalisation terminology when this terminology is used to
convey ’radicalised’ meaning vs. when it is not?
42.
Contextual Divergence in the use of Radical Terminology
• 17K tweets from pro-ISIS users
• 97K tweets from “general” users using the same terminology
• Radicalisation Lexicon: 556 terms
43.
Results
Contextual divergence exists
The most discriminative contextual dimension among categories, topics, entities and types is entities
45.
Social Science vs. Computer Science
Social Science:
• What are the factors that drive people to get radicalised? (e.g., failed integration, poverty, discrimination)
• What are the roots of radicalisation? (micro-level, meso-level, macro-level)
• How does the radicalisation process happen and evolve, i.e., what are its different stages? (e.g., pre-radicalisation, self-identification, indoctrination, Jihadisation)
Computer Science:
• Analysis (how do ISIS members use Twitter to radicalise and recruit other users?)
• Detection (can we create methods for the automatic detection of radical content and radicalised users?)
• Prediction (can we predict whether someone will interact with radical content or users? Can we predict whether someone will become radicalised?)
46.
Roots of Radicalisation
• Micro or Individual roots
• Meso or group roots
• Macro or Global roots
RADICALISATION INFLUENCE
49.
Challenges of researching online radicalisation
• Lack of gold-standard datasets
– Existing datasets are rarely verified by experts
– Annotating this data requires religious, cultural and political knowledge
• Data Collection
– Once accounts are closed it is not possible to access the data
– Data among researchers is not commonly shared (sensitive data)
• Data Analysis
– Need to be *extremely careful* with the false positives
– Dynamics (changes in terminology, procedures, etc.)
• ETHICS!
51.
Policing Engagement via Social Media
Miriam Fernandez, Tom Dickinson, and Harith Alani. "An analysis of UK policing engagement via social media." International Conference on Social Informatics. Springer International Publishing, 2017.
Miriam Fernandez, A. Elizabeth Cano, and Harith Alani. "Policing engagement via social media." International Conference on Social Informatics. Springer International Publishing, 2014.
52.
Policing Engagement via Social Media
• Policing organisations use social media to spread the word on crime, severe weather, missing people, …
• Many forces have staff dedicated to this purpose and to improve the spreading of key messages to wider social media communities
• Research shows that exchanges between police and citizens are infrequent
53.
Goal
• Understand what attracts citizens to social media policing content
– What are the characteristics of the content that generates higher attention levels
• Writing style
• Time of posting
• Topics
– Help police forces to identify actions and recommendations to increase public engagement
55.
Understanding Engagement
• Social media engagement has been studied
– Through multiple lenses (marketing, social sciences, computer science)
– In multiple scenarios (product selling, elections, campaigns, etc.)
• Study the literature of social media engagement
– [Ariely] Very clear message with a very concrete action
• Patrol, missing persons, incidents, emergencies, local authorities? What
can/should I do?
– [Vaynerchuk] Need to differentiate each social medium (context)
• What happens in the world? To whom is the message targeted?
• Study the literature of social media police engagement
– Works mainly focus on studying the different social media strategies that police
forces use to interact with the public
• [Denef] UK Riots 2011. Instrumental vs. expressive approach
56.
Barriers of Social Media Police Engagement (I)
• Legitimacy
The police need the trust and confidence of the communities they serve
57.
Barriers of Social Media Police Engagement (II)
• Reputation
• Official communication channels (911)
• Surveillance
• Variety of topics
• Budget
58.
Approach (I)
• Data Collection
– 154,679 posts from 48 corporate Twitter accounts
– 1,300,070 posts from 2,450 non-corporate Twitter
accounts
– January 2017
• Engagement Indicators
– Retweets
• % of tweets retweeted
• Average number of retweets per tweet
– Favourites (likes)
• % of tweets favourited (liked)
• Average number of likes per tweet
– Replies
• At the time of analysis, the Twitter API did not allow replies to be collected per tweet
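The two indicators defined for retweets and likes, the share of tweets that received at least one interaction and the average number per tweet, can be computed as in the sketch below. The function name and the sample counts are illustrative only.

```python
def engagement_indicators(counts):
    """Share of posts with at least one interaction, and the mean per post."""
    if not counts:
        return {"share_engaged": 0.0, "mean_per_post": 0.0}
    engaged = sum(1 for count in counts if count > 0)
    return {
        "share_engaged": engaged / len(counts),
        "mean_per_post": sum(counts) / len(counts),
    }


# Retweet counts for four made-up tweets from one account.
retweets = [0, 3, 1, 0]
print(engagement_indicators(retweets))
```

The same function applies unchanged to per-tweet like counts.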
59.
Approach (II)
• Feature Extractors
– Describe tweets in terms of their characteristics
– Content Features
• Length / Readability / Informativeness / Complexity / Sentiment
• Media / mentions / hashtags / URLs
• Time in the day
– User Features
• Network: In-degree / out-degree
• Activity: Post count / post rate / age in the system
– Semantic Features
• Use knowledge bases to extract entities and concepts
– Persons / Organisations / Locations
• Using feature selection / regression to determine the characteristic “patterns” of those tweets receiving higher engagement levels
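A few of the content features listed above can be extracted with simple pattern matching. The sketch below is a hypothetical extractor covering only length, mentions, hashtags and URLs; readability, informativeness, complexity and sentiment need dedicated measures that the slide does not specify. The example tweet and handle are made up.

```python
import re


def content_features(tweet):
    """Count a few surface features of a tweet's text."""
    return {
        "length": len(tweet.split()),  # number of whitespace-separated tokens
        "mentions": len(re.findall(r"@\w+", tweet)),
        "hashtags": len(re.findall(r"#\w+", tweet)),
        "urls": len(re.findall(r"https?://\S+", tweet)),
    }


example = "Road closed on A5 near #MiltonKeynes, details: https://example.org/a5 @MKCouncil"
print(content_features(example))
```

Feature dictionaries like this one can then be fed into feature selection or regression against the engagement indicators.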
60.
Results (I)
• Tweets receiving higher engagement are:
– Longer, easier to read, more informative, lower complexity (avoid
complex terms), include media items (images, videos).
– In terms of user features they tend to be posted by accounts with a
high number of followers (corporate) or with a high post rate and a
high in-out degree ratio (non-corporate).
[Plots comparing neg vs. pos tweets on four features: length, readability, informativeness, polarity]
61.
Results (II)
• Tweets receiving higher engagement talk about
– Weather / roads and infrastructures / events / missing persons
– Raise awareness (domestic abuse, hate crime, modern slavery)
– Tend to mention locations
• Tweets receiving lower engagement talk about
– Crime updates: such as burglary, assault or driving under the influence of alcohol
– Following requests (#ff)
– Advice on staying safe
62.
Results (III)
• Non-corporate accounts generate on average higher engagement
– Offer help, ask for help, advise on local issues, reassure safety, etc. (#wearehereforyou)
• Additional ingredients
– They engage more closely with the communities (direct messages and mentions to citizens)
– They are fun!
63.
Engagement Guidelines
• Focus
– Consider the key goal to achieve / the audience to engage (general public, local communities, teenagers) & provide a clear message with a concrete set of actions associated with it
• Be clear
– Complex messages with police jargon are difficult to understand. Messages
should be simple, informative and useful. Use images/videos and humour to
enhance dissemination
• Interact
– Engage with the communities rather than only broadcast. Identify highly
engaging police staff members and community leaders and involve them
• Stay active
– Engagement is a long-term commitment. Accounts active for longer time
receive higher engagement.
• Be respectful
– Reputation and legitimacy are extremely important. Post polite, safe and
respectful content