Making More Sense Out of Social Data
Harith Alani
http://people.kmi.open.ac.uk/harith/
@halani
harith-alani
4th Workshop on Linked Science 2014: Making Sense Out of Data (LISC2014)
ISWC 2014 - Riva del Garda, Italy
Topics
• Social media monitoring
• Behaviour role analysis
• Semantic sentiment
• Engagement in microblogs
• Cross platform and topic studies
• Semantic clustering
• Application examples
Take home messages
• Social media has many more challenges and opportunities to offer
• Fusing semantics and statistical methods is gooood
• Studying isolated social media platforms is baaaad … or not good enough … anymore!
Sociograms
• Capturing and graphing social relationships
• Moreno: founder of sociograms and sociometry
• Assessing psychological well-being from social configurations of individuals and groups
Friendship Choices Among Fourth Graders (from Moreno, 1934, p. 38)
http://diana-jones.com/wp-content/uploads/Emotions-Mapped-by-New-Geography.pdf
Computational Social Science
Behaviour role Analysis
“A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviours.”
“what does existing sociological network theory, built mostly on a foundation of one-time “snapshot” data, typically with only dozens of people, tell us about massively longitudinal data sets of millions of people ..?”
Original slide by Markus Strohmaier
http://gking.harvard.edu/files/LazPenAda09.pdf
Social semantic linking …. in 2003
• Domain ontologies
• Semantics for integrating people, projects, and publications
• Identify communities of practice
• Browse evolution of social relationships and collaborations
Alani, H.; Dasmahapatra, S.; O'Hara, K. and Shadbolt, N. Identifying communities of practice through ontology network analysis. IEEE Intelligent Systems, 18(2) 2003.
Linking scientists …. in 2005
• Who is collaborating with whom?
• How did funding programmes impact collaborations over time?
[Diagram: data sources; gatherers and mediators; ontology; knowledge repository (triplestore); applications]
Alani, H.; Gibbins, N.; Glaser, H.; Harris, S. and Shadbolt, N. Monitoring research collaborations using semantic web technologies. ESWC, Crete, 2005.
Bigger data, greater sociograms
Social Media
Jan 29, 2013 
In-house Social Platforms
Tools for monitoring social networks
Reputation Monitoring 
• http://www.robust-project.eu/videos-demos
Challenges and Opportunities
• Integration
– How to represent and connect this data?
• Behaviour
– How can we measure and predict behaviour?
– Which behaviours are good/bad in which community type?
• Community Health
– What health signs should we look for?
– How to predict this health?
• Engagement
– How can we measure and maximise engagement?
• Sentiment
– How to measure it?
– Track it towards entities and contexts?
Patterns
[Diagram labels: Semantic Web & Linked Data; Semantic Sentiment Analysis; Macro/Micro Behaviour Analysis (lurkers, initiators, followers, leaders); Statistical Analysis; Community Engagement]
[Paper excerpt, cropped on the slide; line beginnings and endings are missing]
Cumulative density functions of each dimension showing [...] distributions for initiated and in-degree ratio [...] and do not deviate away, at the other extreme [...] users are found to post in a large range [...] initiated (initiation) and in-degree ratio [...] the density functions are skewed towards [...] where only a few users initiate discussions [...] to by large portions of the community. [...] points per post (quality) is also skewed towards [...] values indicating that the majority of users [...] the best answers consistently. [...] indicate that feature levels derived from [...] distributions will be skewed towards lower values, [...] initiated the definition of high for this [...] anything exceeding 1.55 × 10^-5. [...]
Figure 8: Boxplots of the feature distributions in each of the 11 [...]
Feature distributions are matched against the feature levels from equal-frequency binning [...]. This mapping is shown in Table 2 where certain clusters are combined together as they have the same feature-level mapping patterns (i.e. 5,7 and 8,9). [...] then interpreted the role labels from these clusters, and their subsequent patterns, as follows:
• 0 - Focussed Expert Participant: this user type provides high quality answers but only within forums that they do not deviate from. They also have a mix of asking questions and answering them.
• 1 - Focussed Novice: this user is focussed within few select forums but does not provide good quality [...]
Technologies
MODELLING AND LINKING 
SOCIAL MEDIA DATA
June 25, 2013
Semantically-Interlinked Online Communities (SIOC)
• SIOC is an ontology for representing and integrating data from the social web
• Simple, concise, and popular
Still seeking the one size that’ll fit all
sioc-project.org
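To make the idea of representing social web data in SIOC more concrete, here is a minimal sketch that describes one forum post as a list of subject-predicate-object triples. It uses a few SIOC Core terms (sioc:Post, sioc:has_creator, sioc:has_container, sioc:content, sioc:reply_of). The function name post_triples and the example.org URIs are hypothetical and are not taken from the talk.

```python
# Namespace of the SIOC Core ontology and the rdf:type property.
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def post_triples(post_uri, author_uri, forum_uri, text, reply_of=None):
    """Describe one post as (subject, predicate, object) triples.

    Hypothetical helper for illustration only; all URIs are supplied by the caller.
    """
    triples = [
        (post_uri, RDF_TYPE, SIOC + "Post"),
        (post_uri, SIOC + "has_creator", author_uri),
        (post_uri, SIOC + "has_container", forum_uri),
        (post_uri, SIOC + "content", text),
    ]
    # A reply points back to the post it answers.
    if reply_of is not None:
        triples.append((post_uri, SIOC + "reply_of", reply_of))
    return triples


if __name__ == "__main__":
    # Made-up example URIs.
    for triple in post_triples(
        "http://example.org/post/2",
        "http://example.org/user/alice",
        "http://example.org/forum/rugby",
        "Great match!",
        reply_of="http://example.org/post/1",
    ):
        print(triple)
```

A microblog post would have no obvious forum to put in has_container, which is one way to see why the slides say SIOC needs extending for platforms without a forum structure.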
SIOC for Discussion forums
• SIOC is well tailored to fit discussion forum communities
• Needs extension to fit other communities, such as microblogs and Q&A
Twitter in SIOC
• Microblogs
• No forum structure
IBM Connections in SIOC
SAP Community Network in SIOC
BEHAVIOUR ROLES
http://www.smrfoundation.org/wp-content/uploads/2008/12/distinguishing-attributes-of-social-roles.png
Why do we monitor behaviour?
• Understand the role of people in a community
• Monitor the impact of behaviour on community evolution
• Forecast the community's future
• Learn which behaviour should be encouraged or discouraged
• Find the best mix of behaviour to increase engagement in an online community
• See which users need more support, which ones should be confined, and which ones should be promoted
Linking networks
Linking people via sensors, social media, papers, projects
• Integration of physical presence and online information
• Semantic user profile generation
• Logging of face-to-face contact
• Social network browsing
• Analysis of online vs offline social networks
[Slide background: excerpt of the tagging ontology in RDF/XML, truncated on the slide]
<?xml version="1.0"?>
<rdf:RDF
    xmlns="http://tagora.ecs.soton.ac.uk/schemas/tagging#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xml:base="http://tagora.ecs.soton.ac.uk/schemas/tagging">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Post"/>
  <owl:Class rdf:ID="TagInfo"/>
  <owl:Class rdf:ID="GlobalCooccurrenceInfo"/>
  <owl:Class rdf:ID="DomainCooccurrenceInfo"/>
  <owl:Class rdf:ID="UserTag"/>
  <owl:Class rdf:ID="UserCooccurrenceInfo"/>
  <owl:Class rdf:ID="Resource"/>
  <owl:Class rdf:ID="GlobalTag"/>
  <owl:Class rdf:ID="Tagger"/>
  <owl:Class rdf:ID="DomainTag"/>
  <owl:ObjectProperty rdf:ID="hasPostTag">
    <rdfs:domain rdf:resource="#TagInfo"/>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:ID="hasDomainTag">
    <rdfs:domain rdf:resource="#UserTag"/>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:ID="isFilteredTo">
    <rdfs:range rdf:resource="#GlobalTag"/>
    <rdfs:domain rdf:resource="#GlobalTag"/>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:ID="hasResource">
    <rdfs:domain rdf:resource="#Post"/>
    <rdfs:range =…
Alani, H.; Szomszor, M.; Cattuto, C.; den Broeck, W.; Correndo, G. and Barrat, A. Live social semantics. ISWC, Washington, DC, 2009
Online+offline social networks
[Figure: H.Index, F2F Degree and F2F Strength plotted with a y-axis from 0 to 1.2 and an x-axis from 1 to 45]
• What’s your social configuration?
• What does it say about you?
• And what you’ll become?
Barrat, A.; Cattuto, C.; Szomszor, M.; Van den Broeck, W. and Alani, H. Social dynamics in conferences: analyses of data from the Live Social Semantics application. ISWC, Shanghai, China, 2010.
http://www.tehowners.com/info/Popular%20Culture%20&%20Social%20Media/Online%20Communities.jpg
1.000 0.274 0.086 0.909** 
0.274 1.000 -0.059 0.513 
0.086 -0.059 1.000 0.065 
0.909** 0.513 0.065 1.000 
Clustering for identifying emerging roles 
– Map the distribution of each 
feature in each cluster to a 
level (i.e. low, mid, high) 
– Align the mapping patterns 
with role labels 
Figure 8: Boxplots of the feature distributions in each of the 11 clusters.
Table 2: Mapping of cluster dimensions to levels
Cluster  Dispersion  Initiation  Quality  Popularity
0        L           M           H        L
1        L           L           L        L
2        M           H           L        H
3        H           H           H        H
4        L           H           H        M
5,7      H           H           L        H
6        L           H           M        M
8,9      M           H           H        H
10       L           H           M        H
• 1 - Focussed Novice: focussed within a few select forums but does not provide good quality content.
• 2 - Mixed Novice: a novice across a medium range of topics
• 3 - Distributed Expert: an expert on a variety of topics and participates across many different forums
• 4 - Focussed Expert Initiator: similar to cluster 0 in that this type of user is focussed on certain topics and is an expert on those, but to a large extent starts discussions and threads, indicating that his/her shared content is useful to the community
• 5,7 - Distributed Novice: participates across a range of forums but is not knowledgeable on any topics
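The step of mapping a feature's distribution to a level (low, mid, high) can be sketched with equal-frequency binning, which the paper excerpt elsewhere in the deck names as the binning method. This is a minimal illustration and not the authors' code. The function name equal_frequency_levels is hypothetical, and tied values are split by sort order, which a real implementation may want to handle differently.

```python
def equal_frequency_levels(values, labels=("L", "M", "H")):
    """Give each value a level so that every level holds roughly the same number of values.

    Hypothetical helper: values are ranked, and the rank decides the bin.
    """
    if not values:
        return []
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    levels = [None] * len(values)
    for rank, idx in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto bins 0..len(labels)-1.
        levels[idx] = labels[rank * len(labels) // len(values)]
    return levels


if __name__ == "__main__":
    # Made-up per-cluster median values of one feature (e.g. initiation).
    medians = [0.1, 0.5, 0.9, 0.2, 0.7, 0.3]
    print(equal_frequency_levels(medians))
```

Running the same function once per feature (dispersion, initiation, quality, popularity) yields one L/M/H pattern per cluster, which is the shape of the mapping table above.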
Encoding Roles in Ontologies with SPIN
Behaviour role extraction from Social Media Data
Structural, social network, reciprocity, persistence, participation
• Bottom Up analysis
– Every community member is classified into a “role”
– Unknown roles might be identified
– Copes with role changes over time
[Diagram labels: initiators, lurkers, followers, leaders]
Feature levels change with the dynamics of the community
Associations of roles with a collection of feature-to-level mappings, e.g. in-degree -> high, out-degree -> high
Run rules over each user’s features and derive the community role composition
Angeletou, S.; Rowe, M. and Alani, H. Modelling and analysis of user behaviour in online communities. ISWC 2011, Bonn, Germany
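As a rough illustration of running rules over each user's feature levels and deriving the community role composition, the sketch below uses a plain dictionary lookup in place of the SPIN rules mentioned in the slides. The level patterns and role names are copied from the cluster-to-level table and role list earlier in the deck. The names ROLE_RULES, assign_role and role_composition, and the "Unlabelled" fallback, are assumptions made for this example.

```python
from collections import Counter

# (dispersion, initiation, quality, popularity) -> role label.
# Patterns and labels come from the cluster table and role list in the slides.
ROLE_RULES = {
    ("L", "M", "H", "L"): "Focussed Expert Participant",
    ("L", "L", "L", "L"): "Focussed Novice",
    ("M", "H", "L", "H"): "Mixed Novice",
    ("H", "H", "H", "H"): "Distributed Expert",
    ("L", "H", "H", "M"): "Focussed Expert Initiator",
    ("H", "H", "L", "H"): "Distributed Novice",
}


def assign_role(levels):
    """Return the role whose pattern matches; 'Unlabelled' is a fallback added here."""
    return ROLE_RULES.get(tuple(levels), "Unlabelled")


def role_composition(user_levels):
    """Map {user_id: levels} to {role: share of users holding that role}."""
    counts = Counter(assign_role(levels) for levels in user_levels.values())
    total = sum(counts.values())
    return {role: n / total for role, n in counts.items()}


if __name__ == "__main__":
    # Made-up users and level patterns.
    users = {
        "u1": ("L", "L", "L", "L"),
        "u2": ("H", "H", "H", "H"),
        "u3": ("L", "L", "L", "L"),
        "u4": ("M", "M", "M", "M"),
    }
    print(role_composition(users))
```

Because the levels are recomputed per time window, the same user can move between roles over time, which matches the "copes with role changes over time" point.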
Correlation of behaviour roles with community activity
• How do certain behaviour roles impact activity in different community types?
[Figure panels: Forum on Commuting and Transport; Forum on Rugby; Forum on Mobile Phones and PDAs]
Community types 
• So do communities of different types behave differently? 
• Analysed IBM Connections communities to study participation, 
activity, and behaviour of users 
• Compare exhibited community behaviour with what users say they use the community for
– Does macro behaviour match micro needs?
Community types
[Diagram: Community; Wiki Page; Blog Post; Forum Thread; Wiki Edit; Blog Comment; Forum Reply; Tag; Bookmark; File]
§ Data consists of non-private info on IBM Connections Intranet deployment
§ Communities:
  § ID
  § Creation date
  § Members
  § Used applications (blogs, Wikis, forums)
§ Forums:
  § Discussion threads
  § Comments
  § Dates
  § Authors and responders
Community types
• Muller, M. (CHI 2012) identified five distinct community types in IBM Connections:
– Communities of Practice (CoP): for sharing information and networking
– Teams: shared goal for a particular project or client
– Technical Support: support for a specific technology
– Idea Labs Communities: for focused brainstorming
– Recreation Communities: recreational activities unrelated to work
• Our data consisted of the 186 most active communities:
– 100 CoPs, 72 Teams, and 14 Technical Support communities
– No Idea Labs or Recreation communities
Behaviour roles in different community 
types 
• Members of Team communities are 
more engaged, popular, and initiate 
more discussions 
• Technical Support community 
members are mostly active in a few 
communities, and don’t initiate or 
contribute much! 
• CoP members are active across 
many communities, and contribute 
more 
Rowe, M., Fernandez, M., Alani, H., Ronen, I., Hayes, C., Karnstedt, M.: Behaviour Analysis across different types of Enterprise Online Communities. WebSci 2012
Behaviour roles and community health 
• Machine learning models to predict community health based on compositions and evolution of user behaviour
Health categories
• Churn rate: proportion of community leavers in a given time segment.
• User count: number of users who posted at least once.
• Seeds to Non-seeds ratio: proportion of posts that get responses to those that don’t
• Clustering coefficient: extent to which the community forms a clique.
[Figure: ROC curves (False Positive Rate vs True Positive Rate, both from 0.0 to 1.0) for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient]
The fewer Focused Experts in the community, the more posts will receive a reply!
There is no “one size fits all” model!
Rowe, M. and Alani, H. What makes communities tick? Community health analysis using role compositions. SocialCom 2012, Amsterdam, The Netherlands.
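Three of the health indicators defined on this slide can be computed directly from post records, as in the sketch below. It is an illustration under stated assumptions and not the authors' implementation. Churn is taken relative to the users active in the previous time segment, only thread-starting posts are counted as seeds or non-seeds, and the record fields id, author and reply_to are hypothetical. The clustering coefficient is left out to keep the example short.

```python
def churn_rate(active_before, active_now):
    """Share of users active in the previous segment who are absent from the current one."""
    before = set(active_before)
    if not before:
        return 0.0
    leavers = before - set(active_now)
    return len(leavers) / len(before)


def user_count(posts):
    """Number of distinct users who posted at least once."""
    return len({p["author"] for p in posts})


def seed_to_non_seed_ratio(posts):
    """Ratio of thread starters that received a reply to those that did not."""
    starters = [p for p in posts if p["reply_to"] is None]
    replied_ids = {p["reply_to"] for p in posts if p["reply_to"] is not None}
    seeds = sum(1 for p in starters if p["id"] in replied_ids)
    non_seeds = len(starters) - seeds
    return seeds / non_seeds if non_seeds else float("inf")


if __name__ == "__main__":
    # Made-up posts for one time segment.
    posts = [
        {"id": 1, "author": "a", "reply_to": None},
        {"id": 2, "author": "b", "reply_to": 1},
        {"id": 3, "author": "c", "reply_to": None},
        {"id": 4, "author": "a", "reply_to": None},
        {"id": 5, "author": "b", "reply_to": 4},
    ]
    print(user_count(posts), seed_to_non_seed_ratio(posts))
    print(churn_rate({"a", "b", "c", "d"}, {"a", "b"}))
```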
SEMANTIC SENTIMENT 
ANALYSIS
Semantic sentiment analysis on social media 
• Range of features and statistical classifiers have been used in 
social media sentiment analysis in recent years 
• Semantics have often been overlooked 
– Semantic Features 
– Semantic Patterns 
• Semantic concepts can help determine sentiment even when no good lexical clues are present
Sentiment Analysis
Lexical-Based Approach
Sentiment Lexicon
hate         negative
honest       positive
inefficient  negative
Love         positive
…
I really love the iPhone
I hate the iPhone
Machine Learning Approach
Naïve Bayes, SVM, MaxEnt, etc.
[Diagram: Training Set -> Learn Model -> Model -> Apply Model -> Test Set]
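The lexical-based approach on this slide can be sketched in a few lines: look up each word of the tweet in the sentiment lexicon and compare the positive and negative hits. The lexicon entries below are the four shown on the slide. The function name lexicon_sentiment, the tokenisation rule and the "neutral" fallback are assumptions made for this example.

```python
import re

# The four lexicon entries shown on the slide, lower-cased.
LEXICON = {
    "hate": "negative",
    "honest": "positive",
    "inefficient": "negative",
    "love": "positive",
}


def lexicon_sentiment(text, lexicon=LEXICON):
    """Label a text by counting positive and negative lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for token in tokens:
        polarity = lexicon.get(token)
        if polarity == "positive":
            score += 1
        elif polarity == "negative":
            score -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No lexical clue found, or the clues cancel out.
    return "neutral"


if __name__ == "__main__":
    for tweet in ("I really love the iPhone", "I hate the iPhone", "Destroy Invading Germs"):
        print(tweet, "->", lexicon_sentiment(tweet))
```

A text with no lexicon words falls through to "neutral", which is the kind of case where the slides argue that semantic concepts can help.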
Semantic Concept Extraction 
• Extract semantic concepts from tweets data and incorporate them into the supervised classifier training.
[Paper excerpt, cropped on the slide]
[...] OpenCalais and Zemanta. Their experimental results showed that AlchemyAPI performs best for entity extraction and semantic concept mapping. Our datasets consist of informal tweets, and hence are intrinsically different from those used in [10]. Therefore we conducted our own evaluation, and randomly selected 500 tweets from the STS corpus and asked 3 evaluators to evaluate the semantic concept extraction outputs generated from AlchemyAPI, OpenCalais and Zemanta.
Table 2. Evaluation results of AlchemyAPI, Zemanta and OpenCalais.
Extraction Tool  No. of Concepts Extracted  Entity-Concept Mapping Accuracy (%)
                                            Evaluator 1  Evaluator 2  Evaluator 3
AlchemyAPI       108                        73.97        73.8         72.8
Zemanta          70                         71           71.8         70.4
OpenCalais       65                         68           69.1         68.7
The assessment of the outputs was based on (1) the correctness of the extracted entities; and (2) the correctness of the entity-concept mappings. The evaluation results presented in Table 2 show that AlchemyAPI extracted the largest number of concepts and also has the highest entity-concept mapping accuracy compared to OpenCalais and Zemanta. As such, we chose AlchemyAPI to extract the semantic concepts from our three datasets. Table 3 lists the total number of entities extracted and the number of semantic concepts mapped against them for each dataset.
Table 3. Entity/concept extraction statistics of STS, OMD and HCR using AlchemyAPI.
                 STS    HCR  OMD
No. of Entities  15139  723  1194
No. of Concepts  29     17   14
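One simple way to incorporate extracted concepts into classifier training is to append a concept feature next to each recognised entity before the tweet is turned into features. The sketch below shows only that appending step. The entity-to-concept dictionary is a made-up stand-in for the output of an extraction tool, and the function name augment_with_concepts and the CONCEPT= prefix are assumptions, not the method used in the paper.

```python
# Made-up entity -> concept mapping standing in for an extraction tool's output.
ENTITY_CONCEPTS = {
    "iphone": "Product",
    "ipad": "Product",
    "obama": "Person",
}


def augment_with_concepts(tokens, entity_concepts=ENTITY_CONCEPTS):
    """Return the original tokens plus one CONCEPT=... feature per recognised entity."""
    features = list(tokens)
    for token in tokens:
        concept = entity_concepts.get(token.lower())
        if concept is not None:
            features.append("CONCEPT=" + concept)
    return features


if __name__ == "__main__":
    print(augment_with_concepts(["I", "love", "the", "iPhone"]))
```

With this step, tweets about different entities of the same concept share a feature, so sentiment evidence learned for one product can carry over to another that is rare in the training data.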
Impact of adding semantic features 
• Incorporating semantics increases accuracy against the 
baseline by: 
– 6.5% for negative sentiment, 
– 4.8% for positive sentiment 
– F1 = 75.95%, with 77.18% Precision and 75.33% Recall 
[Example: “Destroy Invading Germs”, labelled Negative / Negative Concept]
• OK, but what about such cases?
• Can semantics help?
Saif, H., He, Y. and Alani, H. Semantic sentiment analysis of twitter. ISWC 2012, Boston, US.
Semantic Pattern Approaches 
• Apply syntactic and semantic processing techniques
• Use external semantic resources (e.g. DBpedia, Freebase) to identify semantic concepts in Tweets
[Example cluster: Threat, Trojan Horse, Hack, Code, Program, Malware, Dangerous, Harm, Spyware]
• Extract clusters of similar contextual semantics and sentiment, and use as patterns in sentiment analysis
Tweet-Level Sentiment Analysis 
Based on 9 Twitter datasets
MaxEnt Classifier
Table 6: Win/Loss in Accuracy and F-measure of using different features for sentiment classification on all nine datasets.
Features                       Accuracy                     F-Measure
                               Minimum  Maximum  Average    Minimum  Maximum  Average
Syntactic   Twitter Features   -0.23    3.91     1.24       -0.25    4.53     1.62
            POS                -0.89    2.92     0.79       -0.91    5.67     1.25
            Lexicon            -0.44    4.23     1.30       -0.38    5.81     1.83
            Average            -0.52    3.69     1.11       -0.52    5.33     1.57
Semantic    Concepts           -0.22    2.76     1.20       -0.40    4.80     1.51
            LDA-Topics         -0.47    3.37     1.20       -0.68    6.05     1.68
            SS-Patterns        0.70     9.87     3.05       1.23     9.78     3.76
            Average            0.00     5.33     1.82       0.05     6.88     2.32
[Paper excerpt, cropped on the slide]
[...] classifier described in Section 4.2. Note that STS-Gold is the only dataset among the other 9 that provides named entities manually annotated with their sentiment labels (positive, negative, neutral). Therefore, our evaluation in this task is done using the [...]
Saif, H., He, Y., Fernandez, M. and Alani, H. Semantic Patterns for Sentiment Analysis of Twitter. ISWC 2014, Trento, Italy
Entity-Level Sentiment Analysis 
[Figure: bar chart of Accuracy and F1 (y-axis from 55.00 to 67.00) for Unigrams, LDA-Topics, Semantic Concepts and SS-Patterns]
Gold standard of 58 entities
Saif, H., He, Y., Fernandez, M. and Alani, H. Semantic Patterns for Sentiment Analysis of Twitter. ISWC 2014, Trento, Italy
ONLINE 
ENGAGEMENT 
ENGAGEMENT ANALYSIS
Different Engagement Patterns 
Forum on a celebrity 
Forum on transport
Different Engagement Parameters
Different Engagement Parameters
… “few people took part”
• 309 invitees from media, academia, and public engagement bodies
• 2 invitees contributed to the site, with 2 edits!!
Recipe for more engaging posts?
Ask the (Social) Data
• What’s the model of good/bad tweets?
• What features are associated with each group?
Feature Engineering
[Paper excerpt, cropped on the slide]
[...] term influenced by external factors. Properties influencing popularity include user attributes - describing the reputation of the user - and attributes of a post’s content - generally referred to as content features. In Table 1 we define user and content features and study their influence on the discussion “continuation”.
Table 1. User and Content Features
User Features
In Degree: Number of followers of U
Out Degree: Number of users U follows
List Degree: Number of lists U appears on. Lists group users by topic
Post Count: Total number of posts the user has ever posted
User Age: Number of minutes from user join date
Post Rate: Posting frequency of the user, $\frac{PostCount}{UserAge}$
Content Features
Post length: Length of the post in characters
Complexity: Cumulative entropy of the unique words in post p, λ of total word length n and $p_i$ the frequency of each word, $\frac{1}{\lambda}\sum_{i\in[1,n]} p_i(\log\lambda - \log p_i)$
Uppercase count: Number of uppercase words
Readability: Gunning fog index using average sentence length (ASL) [7] and the percentage of complex words (PCW), $0.4(ASL + PCW)$
Verb Count: Number of verbs
Noun Count: Number of nouns
Adjective Count: Number of adjectives
Referral Count: Number of @user
Time in the day: Normalised time in the day measured in minutes
Informativeness: Terminological novelty of the post wrt other posts. The cumulative tfIdf value of each term t in post p, $\sum_{t\in p} tfidf(t, p)$
Polarity: Cumulation of polar term weights in p (using Sentiwordnet3 lexicon) normalised by polar terms count, $\frac{Po+Ne}{|terms|}$
• Focus Features
– Topic entropy: the distribution of the author across community forums
– Topic Likelihood: the likelihood that a user posts in a specific forum given his post history
  • Measures the affinity that a user has with a given forum
  • Lower likelihood indicates a user posting on an unfamiliar topic
4.2 Experiments
Experiments are intended to test the performance of different classification models in identifying seed posts. Therefore we used four classifiers: discriminative classifiers Perceptron and SVM, the generative classifier Naive Bayes and the decision-tree classifier J48. For each classifier we used three feature settings: user features, content features and user+content features.
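A few of the features in Table 1 are simple enough to sketch directly. The code below computes Complexity, Post Rate and Uppercase count for a post. It assumes one particular reading of the Complexity definition: λ is the total number of words in the post and each p_i is the number of times a unique word occurs. The function names, whitespace tokenisation and the rule that single-letter words do not count as uppercase words are choices made for this example and not taken from the paper.

```python
import math
from collections import Counter


def complexity(post):
    """Cumulative entropy of the unique words: (1/lam) * sum p_i * (log lam - log p_i).

    Assumed reading: lam = total number of words, p_i = count of each unique word.
    """
    words = post.lower().split()
    lam = len(words)
    if lam == 0:
        return 0.0
    counts = Counter(words)
    return sum(c * (math.log(lam) - math.log(c)) for c in counts.values()) / lam


def post_rate(post_count, user_age_minutes):
    """Posting frequency: PostCount / UserAge, with UserAge in minutes."""
    return post_count / user_age_minutes if user_age_minutes > 0 else 0.0


def uppercase_count(post):
    """Number of fully uppercase words; single letters such as 'I' are skipped."""
    return sum(1 for w in post.split() if len(w) > 1 and w.isupper())


if __name__ == "__main__":
    print(complexity("a a a a"))   # one repeated word
    print(complexity("a b c d"))   # four distinct words
    print(post_rate(120, 60))
    print(uppercase_count("I LOVE the NEW phone"))
```

Under this reading, a post that repeats one word has complexity 0, and a post of all-distinct words reaches the maximum, log λ.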
Classification of Posts 
[Diagram: posts split into Seed Posts and Non-Seed Posts]
§ Binary classification model
§ Trained with social, content, and combined features
§ 80/20 training/testing
§ Identify best feature types, and top individual features, in predicting post classification
Engagement on Boards.ie 
• Which posts are more likely to stimulate responses and discussions?
• What impacts engagement more: user features, post content, forum affinity?
• Which individual features are most influential?
Top Features for Engagement on Boards.ie 
• Content features were key!
• Best predictions were achieved when combining user, content, and focus features
• URLs (Referral Count) in a post negatively impact discussion activity
• Seed Posts (posts that receive replies) are associated with greater forum likelihood
• Lower informativeness is associated with seed posts
– i.e. seeds use language that is familiar to the community
[Paper excerpt, cropped on the slide]
[...] activity levels, and because it has already been used in other investigations (e.g., [14]).
Boards.ie does not provide explicit social relations between community members, unlike for example Facebook and Twitter. We followed the same strategy proposed in [3] for extracting social networks from Digg, and built the Boards.ie social network for users, weighting edges cumulatively by the number of replies between any two users.
TABLE I: DESCRIPTION OF THE BOARDS.IE DATASET
Posts      Seeds   Non-Seeds  Replies    Users
1,942,030  90,765  21,800     1,829,465  29,908
Rowe, M.; Angeletou, S. and Alani, H. Anticipating discussion activity on community forums. SocialCom 2011, Boston, MA, USA.
Top Features for Engagement on Twitter
• Top are list-degree, in-degree, informativeness, and #posts
• Top are list-degree, time of posting, in-degree, and #posts
[Figure panels: Haiti Earthquake; State Union Address]
[Paper excerpts, cropped on the slide]
[...] former dataset contains tweets which relate to the Haiti earthquake disaster, covering a varying timespan. The latter dataset contains all tweets published during the duration of president Barack Obama’s State of the Union Address speech. Our goal is to predict discussion activity based on the features of a given post by first identifying seed posts, before moving on to predict the discussion level.
Within the above datasets many of the posts are not seeds, but are instead replies to previous posts, thereby featuring in the discussion chain as a node. In [13] retweets are considered as part of the discussion activity. In our work we identify discussions using the explicit “in reply to” information obtained by the Twitter API, which does not include retweets. We make this decision based on the work presented in boyd et al. [4], where an analysis of retweeting as a discussion practice is presented, arguing that message forwards adhere to different motives which do not necessarily designate a response to the initial message. Therefore, we only investigate explicit replies to messages. To gather our discussions, and our seed posts, we iteratively move up the reply chain - i.e. from reply to parent post - until we reach the seed post in the discussion. We define this process as dataset enrichment, and it is performed by querying Twitter’s REST API using the in reply to id of the parent post, and moving one step at a time up the reply chain. This same approach has been employed successfully in work by [12] to gather a large-scale conversation dataset from Twitter.
Table 2. Statistics of the datasets used for experiments
Dataset        Users   Tweets  Seeds  Non-Seeds  Replies
Haiti          44,497  65,022  1,405  60,686     2,931
Union Address  66,300  80,272  7,228  55,169     17,875
Table 2 shows the statistics that explain our collected datasets. One can [...]
The top-most ranks from each dataset are dominated by user features [...]
Fig. 3. Contributions of top-5 features to identifying Non-seeds (N) [...]. Upper plots are for the Haiti dataset and the lower plots are for the [...] dataset.
[Table residue: lower rows (ranks 12 to 17) of a per-dataset feature ranking, e.g. rank 12: user-age (0.015) and content-noun-count (0.002); the remaining rows are too garbled to recover]
Rowe, M., Angeletou, S., Alani, H. Predicting Discussions on the Social Semantic Web. ESWC, Crete, 2011
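The "dataset enrichment" step quoted on this slide, moving up the reply chain one step at a time until the seed post is reached, can be sketched offline as follows. The sketch works on an in-memory dictionary that maps each post id to the id of its parent and does not call the Twitter API. The function name find_seed, the max_steps guard and the dictionary layout are assumptions made for this example.

```python
def find_seed(post_id, in_reply_to, max_steps=1000):
    """Follow parent links one step at a time until a post with no parent is reached.

    in_reply_to maps a post id to its parent id; seeds map to None or are absent.
    max_steps guards against cyclic or corrupted data.
    """
    current = post_id
    for _ in range(max_steps):
        parent = in_reply_to.get(current)
        if parent is None:
            return current
        current = parent
    raise ValueError("reply chain too long or cyclic")


if __name__ == "__main__":
    # Made-up chain: c replies to b, b replies to a, a is the seed.
    chain = {"c": "b", "b": "a", "a": None}
    print(find_seed("c", chain))
```

In the live setting each dictionary lookup would be replaced by one API request for the parent post, which is why the walk is done one step at a time.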
Top Features for Engagement on Twitter – 
Earth Hour 2014 
[Figure: boxplots comparing neg and pos posts on Length, Complexity, Readability and Polarity]
• Top influential features do not match those found for Boards.ie or for two non-random Twitter datasets
Top Features for Engagement on Twitter – 
Dorset Police 
[Figure: boxplots comparing neg and pos posts on Length, Complexity, Polarity and Mentions]
• Top 4 features share 3 with Twitter Earth Hour dataset
Fernandez, M., Cano, E., and Alani, H. Policing Engagement via Social Media. CityLabs workshop, SocInfo, Barcelona, 2014
Publications about social media 
by Katrin Weller - http://kwelle.files.wordpress.com/2014/04/figure1.jpg
Moving on … 
§ How can we move on 
from these (micro) 
studies? 
§ Are results consistent 
across datasets, and 
platforms? 
§ One way forward is: 
§ Multiple platforms 
§ Multiple topics
Papers studying single/multiple 
social media platforms 
Survey done on all submitted papers to Web Science conferences
Apples and Oranges 
• We mix and 
compare different 
datasets, topics, 
and platforms 
• Aim is to test 
consistency and 
transferability of 
results
7 datasets from 5 platforms 
Platform                                Posts      Users    Seeds    Non-seeds  Replies
Boards.ie                               6,120,008  65,528   398,508  81,273     5,640,227
Twitter Random                          1,468,766  753,722  144,709  930,262    390,795
Twitter (Haiti Earthquake)              65,022     45,238   1,835    60,686     2,501
Twitter (Obama State of Union Address)  81,458     67,417   11,298   56,135     14,025
SAP                                     427,221    32,926   87,542   7,276      332,403
Server Fault                            234,790    33,285   65,515   6,447      162,828
Facebook                                118,432    4,745    15,296   8,123      95,013
Seed posts are those that receive a reply
Non-seed posts are those with no replies
Data Balancing 
Platform                                Seeds    Non-seeds  Instance Count
Boards.ie                               398,508  81,273     162,546
Twitter Random                          144,709  930,262    289,418
Twitter (Haiti Earthquake)              1,835    60,686     3,670
Twitter (Obama State of Union Address)  11,298   56,135     22,596
SAP                                     87,542   7,276      14,552
Server Fault                            65,515   6,447      12,894
Facebook                                15,296   8,123      16,246
Total                                                       521,922
For each dataset, an equal number of seeds and non-seed posts are used in the analysis.
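The instance counts in the table are consistent with keeping as many posts from the larger class as there are in the smaller one, i.e. twice the size of the smaller class. The sketch below shows one way to do that, by random undersampling. The slides do not say how the posts were selected, so the sampling strategy, the fixed random seed and the function names are assumptions.

```python
import random


def balanced_instance_count(n_seeds, n_non_seeds):
    """Number of instances left after balancing: twice the smaller class."""
    return 2 * min(n_seeds, n_non_seeds)


def balance(seeds, non_seeds, rng=None):
    """Randomly undersample both lists to the size of the smaller one."""
    rng = rng or random.Random(0)  # fixed seed so the example is repeatable
    n = min(len(seeds), len(non_seeds))
    return rng.sample(seeds, n), rng.sample(non_seeds, n)


if __name__ == "__main__":
    # Counts taken from the Boards.ie and Twitter (Haiti Earthquake) rows.
    print(balanced_instance_count(398508, 81273))
    print(balanced_instance_count(1835, 60686))
    kept_seeds, kept_non_seeds = balance(list(range(10)), list(range(100, 103)))
    print(len(kept_seeds), len(kept_non_seeds))
```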
Classification Results 
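[Results tables from the slide. The dataset labels were part of the slide image; only "(Random)", "(Haiti Earthquake)" and "(Obama’s State Union Address)" survive as text, so the seven blocks are listed in extraction order without dataset names. Rows within each block follow the Social, Content, Social+Content order shown in the labelled blocks.]
Block  Feature         P      R      F1
1      Social          0.592  0.591  0.591
1      Content         0.664  0.660  0.658
1      Social+Content  0.670  0.666  0.665
2      Social          0.561  0.561  0.560
2      Content         0.612  0.612  0.611
2      Social+Content  0.628  0.628  0.628
3      Social          0.968  0.966  0.966
3      Content         0.752  0.747  0.747
3      Social+Content  0.974  0.973  0.973
4      Social          0.542  0.540  0.539
4      Content         0.650  0.642  0.639
4      Social+Content  0.656  0.649  0.646
5      Social          0.650  0.631  0.628
5      Content         0.575  0.541  0.521
5      Social+Content  0.652  0.632  0.629
6      Social          0.528  0.380  0.319
6      Content         0.626  0.380  0.275
6      Social+Content  0.568  0.407  0.359
7      Social          0.635  0.632  0.632
7      Content         0.641  0.641  0.641
7      Social+Content  0.660  0.660  0.660
§ Performance of the logistic regression classifier trained over different feature sets and applied to the test set.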
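For readers less familiar with the P, R and F1 columns, the sketch below computes precision, recall and F1 for one class from true and predicted labels. It is a generic illustration with a hypothetical function name. The figures on the slide are probably averaged over both classes, since some F1 values are lower than both P and R, so this single-class version would not reproduce them exactly.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up labels: 1 = seed post, 0 = non-seed post.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(precision_recall_f1(y_true, y_pred))
```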
Effect of features on engagement 
[Figure: one bar chart of β per feature for each of Boards.ie, Twitter Random, Twitter Haiti, Twitter Union, Server Fault, SAP and Facebook. Features on the x-axis: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness]
Logistic regression coefficients for each platform's features
Comparison to literature
§ How does the performance of our shared features compare to other studies on different datasets and platforms?
[Figure legend: Positive impact; Negative impact; Mismatch; Match]
Let’s Share More Data!
Semantic Clustering 
• Statistical models play important roles in social data analyses
• Keeping such models up to date often means regular, expensive, and time-consuming retraining
• Semantic features are likely to decay more slowly than lexical features
• Could adding semantics to the models extend their value and life expectancy?
Cano, E., He, Y., Alani, H. Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs. ISWC 2014, Trento, Italy.
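One rough way to picture feature decay is to measure how much of a feature vocabulary survives from one epoch to the next. The sketch below uses Jaccard overlap for this; the word lists and class names are invented for illustration, and the overlap measure is a stand-in chosen for the example, not the method of the cited paper.

```python
def jaccard(a, b):
    """Share of features common to both sets, relative to their union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical feature vocabularies for two epochs of tweets.
lexical_2010 = {"haiti", "earthquake", "relief", "obama"}
lexical_2013 = {"syria", "crisis", "obama", "aid"}
classes_2010 = {"dbo:Country", "dbo:NaturalEvent", "dbo:President"}
classes_2013 = {"dbo:Country", "dbo:MilitaryConflict", "dbo:President"}

lexical_overlap = jaccard(lexical_2010, lexical_2013)
class_overlap = jaccard(classes_2010, classes_2013)
print(f"lexical: {lexical_overlap:.2f}, classes: {class_overlap:.2f}")
```

In this toy case the words change with the news cycle while the more abstract classes largely persist, which is the intuition behind expecting semantic features to age better.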
Semantic Representation of a Tweet 
[Figure: DBpedia graph for the entities mentioned in a tweet.
Nodes: <dbp:Barack_Obama>, <dbp:Hosni_Mubarak>, <dbp:CNN>, <dbp:Egypt>, <dbp:Egyptian_Arabic>, <dbp:Country>, <dbo:PresidentOfUnitedStateofAmerica>, <skos:Nobel_Peace_Price_laureates>, <skos:PresidentsOfEgypt>, <skos:English-language_television_stations>, <skos:Arab_republics>, American.
Edge labels: rdf:type, dcterms:subject, dbprop:nationality, dbprop:languages.]
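The graph on this slide can be treated as a set of subject-predicate-object triples from which semantic features are collected. The sketch below is an assumption-heavy illustration: the pairing of edges to nodes is a plausible reading of the figure rather than something recoverable from it, and the mapping of `rdf:type` to "class", `dcterms:subject` to "category", and other predicates to "property" is a simplification introduced for the example.

```python
from collections import defaultdict

# Assumed reading of the slide's graph; edge-to-node pairing is a guess.
TRIPLES = [
    ("dbp:Barack_Obama", "rdf:type", "dbo:PresidentOfUnitedStateofAmerica"),
    ("dbp:Barack_Obama", "dcterms:subject", "skos:Nobel_Peace_Price_laureates"),
    ("dbp:Barack_Obama", "dbprop:nationality", "American"),
    ("dbp:Hosni_Mubarak", "dcterms:subject", "skos:PresidentsOfEgypt"),
    ("dbp:CNN", "dcterms:subject", "skos:English-language_television_stations"),
    ("dbp:Egypt", "rdf:type", "dbp:Country"),
    ("dbp:Egypt", "dcterms:subject", "skos:Arab_republics"),
    ("dbp:Egypt", "dbprop:languages", "dbp:Egyptian_Arabic"),
]


def semantic_features(entities, triples):
    """Group the features reachable from a tweet's entities by feature type."""
    feats = defaultdict(set)
    for subject, predicate, obj in triples:
        if subject not in entities:
            continue
        feats["resource"].add(subject)
        if predicate == "rdf:type":
            feats["class"].add(obj)
        elif predicate == "dcterms:subject":
            feats["category"].add(obj)
        else:
            feats["property"].add(predicate)
    return {kind: sorted(values) for kind, values in feats.items()}


# Hypothetical tweet mentioning two of the entities.
features = semantic_features({"dbp:Barack_Obama", "dbp:Egypt"}, TRIPLES)
for kind, values in sorted(features.items()):
    print(kind, values)
```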
Evolution of Semantics 
• Renewed DBpedia graph snapshots are taken over time
• Semantic features are updated based on new knowledge in DBpedia
[Figure: the graph around <Barack_Obama> across DBpedia snapshots v3.6, v3.7, and v3.8. Nodes: <Budget_Control_Act_of_2011>, <UnitedStatesPresidentialCandidates>, <Hawaii>, <MechelleObama>. Edge labels: wikiPageWikiLink, spouse, birthPlace.]
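Comparing two snapshots amounts to a set difference over their triples. In the sketch below, which triple appears in which DBpedia version is invented for illustration (the slide only shows that the graph grows between v3.6 and v3.8), and `diff_snapshots` is a hypothetical helper.

```python
def diff_snapshots(old, new):
    """Triples gained and lost between two graph snapshots."""
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical snapshot contents; the version assignment is an assumption.
v36 = {
    ("Barack_Obama", "birthPlace", "Hawaii"),
    ("Barack_Obama", "spouse", "MechelleObama"),
}
v37 = v36 | {
    ("Barack_Obama", "wikiPageWikiLink", "UnitedStatesPresidentialCandidates"),
}
v38 = v37 | {
    ("Budget_Control_Act_of_2011", "wikiPageWikiLink", "Barack_Obama"),
}

changes = diff_snapshots(v36, v38)
for triple in changes["added"]:
    print("new in v3.8:", triple)
```

The added triples are the "new knowledge" from which time-stamped semantic features would be refreshed.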
Experiments 
Extending fitness of model to subsequent epochs
• 12,000 annotated tweets
• Adding Classes as clustering features provides the best performance

Cross-epoch (F1)
Feature    2010-2011  2010-2013  2011-2013  Average
BoW        0.634      0.481      0.261      0.458
Category   0.683      0.539      0.524      0.582
Property   0.665      0.557      0.502      0.603
Resource   0.774      0.544      0.445      0.587
Class      0.691      0.665      0.669      0.675

Same-epoch
Feature    2010-2010  2011-2011  Average
BoW        0.831      0.875      0.845
APPLICATIONS
What do policymakers really want from social media?
1. "Fish where the fish is"
– one interface to access multiple SNS
– layman monitoring of users and topics
2. "My constituency first"
– communicating with users in own constituency
– find local groups, events, and topics
3. "What are their needs, complaints, and preferences?"
– what citizens talk about, complain about
– what are the top 5-10 topics of the day
4. Who should I talk to?
– who are the influential citizens
– whom to engage with
5. What about tomorrow?
– which topics will get hotter?
– which discussions are likely to grow further?
6. Presence and popularity
– what writing recipe to follow to reach more people
7. Privacy
– concerns on citizens' privacy when extracting info
– concerns on their own privacy with 3rd party SNS access tools
Interviews with 31 policymakers
Wandhöfer, T.; Taylor, S.; Alani, H.; Joshi, S.; Sizov, S.; et al. Engaging politicians with citizens on social networking sites: the WeGov Toolbox. IJEGR, 8(3), 2012
Monitoring SCN
Monitoring the evolution of community activities and the level of contributions in SAP Community Networks (SCN)
Demo
SCN Behaviour
Community managers can monitor the behaviour composition of forums and its association with activity evolution
For Education 
https://twitter.com/OpenUniversity/status/346911297704714240
FB Groups 
Sentiment 
Macro Behaviour 
Micro Behaviour 
Topics
Course tutors
Real-time monitoring
Behaviour Analysis
Sentiment Analysis
Topic Analysis
• How active and engaged is the course group?
• How is sentiment towards a course evolving?
• Are the leaders of the group providing positive/negative comments?
• What topics are emerging?
• Is the group flourishing or diminishing?
• Do students get the answers and support they need?
Thomas, K.; Fernández, M.; Brown, S.; Alani, H. OUSocial2: a platform for gathering students' feedback from social media. (Demo) ISWC 2014, Trento, Italy.
DEMO
Thanks to collaborators
Hassan Saif
Lara Piccolo
Thomas Dickensen
Gregoire Burel 
Miriam Fernandez 
Smitashree Choudhury 
Elizabeth Cano 
Matthew Rowe 
Keerthi Thomas 
Sofia Angeletou
Heads-up 
Semantic Patterns for Sentiment Analysis of Twitter
Thursday 15:40 - Session: Social Media
Semantic Patterns for Sentiment Analysis of Twitter
Thursday 16:00 - Session: Social Media
User Profile Modeling in Online Communities
Sunday 2:05 pm - SWCS Workshop
OUSocial2: a platform for gathering students' feedback from social media (DEMO)
The Topics they are a-Changing - Characterising Topics with Time-Stamped Semantic Graphs (POSTER)
Automatic Stopword Generation using Contextual Semantics for Sentiment Analysis of Twitter (POSTER)
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangeThinkInnovation
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制vexqp
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1ranjankumarbehera14
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样wsppdmt
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdftheeltifs
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 

Último (20)

Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Making More Sense Out of Social Data

  • 1. Making More Sense Out of Social Data Harith Alani http://people.kmi.open.ac.uk/harith/ @halani harith-alani 4th Workshop on Linked Science 2014: Making Sense Out of Data (LISC2014) ISWC 2014 - Riva del Garda, Italy
  • 2. Topics • Social media monitoring • Behaviour role analysis • Semantic sentiment • Engagement in microblogs • Cross platform and topic studies • Semantic clustering • Application examples
  • 3. Take home messages • Social media has many more challenges and opportunities to offer • Fusing semantics and statistical methods is gooood • Studying isolated social media platforms is baaaad … or not good enough … anymore!
  • 4. Sociograms • Capturing and graphing social relationships • Moreno: founder of sociograms and sociometry • Assessing psychological well-being from social configurations of individuals and groups Friendship Choices Among Fourth Graders (from Moreno, 1934, p. 38) http://diana-jones.com/wp-content/uploads/Emotions-Mapped-by-New-Geography.pdf
  • 5. Computational Social Science Behaviour role Analysis “A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviours.” “what does existing sociological network theory, built mostly on a foundation of one-time “snapshot” data, typically with only dozens of people, tell us about massively longitudinal data sets of millions of people .. ?” Original slide by Markus Strohmaier http://gking.harvard.edu/files/LazPenAda09.pdf
  • 6. Social semantic linking …. in 2003! • Domain ontologies • Semantics for integrating people, projects, and publications • Identify communities of practice • Browse evolution of social relationships and collaborations Alani, H.; Dasmahapatra, S.; O'Hara, K. and Shadbolt, N. Identifying communities of practice through ontology network analysis. IEEE Intelligent Systems, 18(2) 2003.
  • 7. Linking scientists …. in 2005 • Who is collaborating with whom? • How have funding programmes impacted collaborations over time? Architecture: data sources, gatherers and mediators, ontology, knowledge repository (triplestore), applications. Alani, H.; Gibbins, N.; Glaser, H.; Harris, S. and Shadbolt, N. Monitoring research collaborations using semantic web technologies. ESWC, Crete, 2005.
  • 8. Bigger data, greater sociograms
  • 10. Jan 29, 2013 In-house Social Platforms
  • 11. Tools for monitoring social networks
  • 12.
  • 13. Reputation Monitoring • http://www.robust-project.eu/videos-demos
  • 14. Challenges and Opportunities • Integration – How to represent and connect this data? • Behaviour – How can we measure and predict behaviour? – Which behaviours are good/bad in which community type? • Community Health – What health signs should we look for? – How to predict this health? • Engagement – How can we measure and maximise engagement? • Sentiment – How to measure it? – Track it towards entities and contexts?
  • 16. Technologies: Semantic Web & Linked Data; Semantic Sentiment Analysis; Macro/Micro Behaviour Analysis (lurkers, initiators, followers, leaders); Statistical Analysis; Community Engagement. [Background image: partly legible paper excerpt on behaviour role analysis, showing cumulative density functions and boxplots of the feature distributions in each of the 11 clusters, with role labels such as: 0 - Focussed Expert Participant: provides high quality answers but only within forums that they do not deviate from, with a mix of asking questions and answering them; 1 - Focussed Novice: focussed within a few select forums but does not provide good quality content]
  • 17. MODELLING AND LINKING SOCIAL MEDIA DATA
  • 19. Semantically-Interlinked Online Communities (SIOC) • SIOC is an ontology for representing and integrating data from the social web • Simple, concise, and popular • Still seeking the one size that’ll fit all sioc-project.org
  • 20. SIOC for Discussion forums • SIOC is well tailored to fit discussion forum communities • Needs extension to fit other communities, such as microblogs and Q&A
  • 21. Twitter in SIOC • Microblogs • No forum structure
  • 26. Why do we monitor behaviour? • Understand the role of people in a community • Monitor the impact of behaviour on community evolution • Forecast the community's future • Learn which behaviour should be encouraged or discouraged • Find the best mix of behaviour to increase engagement in an online community • See which users need more support, which ones should be confined, and which ones should be promoted
  • 28. Linking people via sensors, social media, papers, projects • Integration of physical presence and online information • Semantic user profile generation • Logging of face-to-face contact • Social network browsing • Analysis of online vs offline social networks Alani, H.; Szomszor, M.; Cattuto, C.; Van den Broeck, W.; Correndo, G. and Barrat, A. Live social semantics. ISWC, Washington, DC, 2009
    Background (truncated RDF/XML of the tagging ontology): <?xml version="1.0"?> <rdf:RDF xmlns="http://tagora.ecs.soton.ac.uk/schemas/tagging#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:owl="http://www.w3.org/2002/07/owl#" xml:base="http://tagora.ecs.soton.ac.uk/schemas/tagging"> <owl:Ontology rdf:about=""/> <owl:Class rdf:ID="Post"/> <owl:Class rdf:ID="TagInfo"/> <owl:Class rdf:ID="GlobalCooccurrenceInfo"/> <owl:Class rdf:ID="DomainCooccurrenceInfo"/> <owl:Class rdf:ID="UserTag"/> <owl:Class rdf:ID="UserCooccurrenceInfo"/> <owl:Class rdf:ID="Resource"/> <owl:Class rdf:ID="GlobalTag"/> <owl:Class rdf:ID="Tagger"/> <owl:Class rdf:ID="DomainTag"/> <owl:ObjectProperty rdf:ID="hasPostTag"> <rdfs:domain rdf:resource="#TagInfo"/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID="hasDomainTag"> <rdfs:domain rdf:resource="#UserTag"/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID="isFilteredTo"> <rdfs:range rdf:resource="#GlobalTag"/> <rdfs:domain rdf:resource="#GlobalTag"/> </owl:ObjectProperty> <owl:ObjectProperty rdf:ID="hasResource"> <rdfs:domain rdf:resource="#Post"/> <rdfs:range =…
  • 29. Online+offline social networks [Chart: H-Index, F2F Degree and F2F Strength, values between 0 and 1.2, plotted over an axis running from 1 to 45] • What’s your social configuration? • What does it say about you? • And what you’ll become? Barrat, A.; Cattuto, C.; Szomszor, M.; Van den Broeck, W. and Alani, H. Social dynamics in conferences: analyses of data from the Live Social Semantics application. ISWC, Shanghai, China, 2010.
  • 31. Clustering for identifying emerging roles – Map the distribution of each feature in each cluster to a level (i.e. low, mid, high) – Align the mapping patterns with role labels
    Correlation matrix shown on the slide (row and column labels not recoverable):
      1.000     0.274     0.086    0.909**
      0.274     1.000    -0.059    0.513
      0.086    -0.059     1.000    0.065
      0.909**   0.513     0.065    1.000
    Figure 8: Boxplots of the feature distributions in each of the 11 clusters.
    Table 2: Mapping of cluster dimensions to levels
      Cluster  Dispersion  Initiation  Quality  Popularity
      0        L           M           H        L
      1        L           L           L        L
      2        M           H           L        H
      3        H           H           H        H
      4        L           H           H        M
      5,7      H           H           L        H
      6        L           H           M        M
      8,9      M           H           H        H
      10       L           H           M        H
    Role labels:
    • 1 - Focussed Novice: focussed within a few select forums but does not provide good quality content
    • 2 - Mixed Novice: a novice across a medium range of topics
    • 3 - Distributed Expert: an expert on a variety of topics who participates across many different forums
    • 4 - Focussed Expert Initiator: similar to cluster 0 in that this type of user is focussed on certain topics and is an expert on those, but to a large extent starts discussions and threads, indicating that his/her shared content is useful to the community
    • 5,7 - Distributed Novice: participates across a range of forums but is not knowledgeable on any topics
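The two steps on this slide (map each feature to a level, then align level patterns with role labels) can be sketched roughly as below. This is only an illustration under stated assumptions: the cut points come from equal-frequency binning into three bins, a cluster is represented by one value per feature (for instance its median, which is an assumption), and the helper names are hypothetical. The level patterns are copied from Table 2.

```python
from typing import Dict, List, Tuple

FEATURES = ("dispersion", "initiation", "quality", "popularity")

# Level patterns per role, copied from Table 2 on the slide.
ROLE_PATTERNS = {
    ("L", "M", "H", "L"): "Focussed Expert Participant",  # cluster 0
    ("L", "L", "L", "L"): "Focussed Novice",              # cluster 1
    ("M", "H", "L", "H"): "Mixed Novice",                 # cluster 2
    ("H", "H", "H", "H"): "Distributed Expert",           # cluster 3
    ("L", "H", "H", "M"): "Focussed Expert Initiator",    # cluster 4
    ("H", "H", "L", "H"): "Distributed Novice",           # clusters 5,7
}


def equal_frequency_cutoffs(values: List[float]) -> Tuple[float, float]:
    """Cut points that split the values into three bins of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def to_level(value: float, cutoffs: Tuple[float, float]) -> str:
    """Map a raw feature value to L, M or H using the two cut points."""
    low_cut, high_cut = cutoffs
    if value < low_cut:
        return "L"
    if value < high_cut:
        return "M"
    return "H"


def label_cluster(cluster_values: Dict[str, float],
                  population: Dict[str, List[float]]) -> str:
    """Turn a cluster's representative feature values into a role label."""
    pattern = tuple(
        to_level(cluster_values[f], equal_frequency_cutoffs(population[f]))
        for f in FEATURES
    )
    return ROLE_PATTERNS.get(pattern, "unlabelled")


# Toy population: every feature takes the values 1..9 across all users.
population = {f: list(range(1, 10)) for f in FEATURES}
cluster = {"dispersion": 2, "initiation": 5, "quality": 8, "popularity": 1}
role = label_cluster(cluster, population)
```

Patterns that are not listed in the table fall back to "unlabelled" rather than being forced into a role.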
  • 32. Encoding Roles in Ontologies with SPIN
  • 33. Behaviour role extraction from Social Media Data • Features: structural, social network, reciprocity, persistence, participation • Bottom Up analysis – Every community member is classified into a “role” – Unknown roles might be identified – Copes with role changes over time • Example roles: initiators, lurkers, followers, leaders • Feature levels change with the dynamics of the community • Associations of roles with a collection of feature-to-level mappings, e.g. in-degree -> high, out-degree -> high • Run rules over each user’s features and derive the community role composition Angeletou, S.; Rowe, M. and Alani, H. Modelling and analysis of user behaviour in online communities. ISWC 2011, Bonn, Germany
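A minimal sketch of running feature-to-level rules over each user and deriving the role composition of a community. The rule set below is hypothetical: the slide names the roles and gives only the example "in-degree -> high, out-degree -> high", so the concrete conditions and the feature name `threads_started` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical rules, checked in order; the empty rule is a catch-all fallback.
RULES: List[Tuple[str, Dict[str, str]]] = [
    ("leader", {"in_degree": "high", "out_degree": "high"}),
    ("initiator", {"threads_started": "high"}),
    ("follower", {"out_degree": "high"}),
    ("lurker", {}),
]


def assign_role(user_levels: Dict[str, str]) -> str:
    """Return the first role whose feature-to-level conditions all hold."""
    for role, conditions in RULES:
        if all(user_levels.get(f) == level for f, level in conditions.items()):
            return role
    return "unclassified"


def role_composition(users: Dict[str, Dict[str, str]]) -> Dict[str, float]:
    """Share of each role among the community's users."""
    counts = Counter(assign_role(levels) for levels in users.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {role: count / total for role, count in counts.items()}


community = {
    "alice": {"in_degree": "high", "out_degree": "high", "threads_started": "low"},
    "bob": {"in_degree": "low", "out_degree": "low", "threads_started": "high"},
    "carol": {"in_degree": "low", "out_degree": "high", "threads_started": "low"},
    "dave": {"in_degree": "low", "out_degree": "low", "threads_started": "low"},
}
composition = role_composition(community)
```

Because levels are recomputed per time window, the same user can move between roles as the community changes, which is the point made on the slide about coping with role changes over time.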
  • 34. Correlation of behaviour roles with community activity • How do certain behaviour roles impact activity in different community types? Examples shown: Forum on Commuting and Transport; Forum on Rugby; Forum on Mobile Phones and PDAs
  • 35. Community types • So do communities of different types behave differently? • Analysed IBM Connections communities to study participation, activity, and behaviour of users • Compare exhibited community with what users say they use the community for – Does macro behaviour match micro needs?
  • 36. Community types Community Wiki Page Blog Post Forum Thread Wiki Edit Blog Comment Forum Reply Tag Bookmark File § Data consists of non-private info on IBM Connections Intranet deployment § Communities: § ID § Creation date § Members § Used applications (blogs, Wikis, forums) § Forums: § Discussion threads § Comments § Dates § Authors and responders
  • 37. Community types • Muller, M. (CHI 2012) identified five distinct community types in IBM Connections: – Communities of Practice (CoP): for sharing information and networking – Teams: shared goal for a particular project or client – Technical Support: support for a specific technology – Idea Labs Communities: for focused brainstorming – Recreation Communities: recreational activities unrelated to work • Our data consisted of the 186 most active communities: – 100 CoPs, 72 Teams, and 14 Technical Support communities – No Idea Labs or Recreation communities
  • 38. Behaviour roles in different community types • Members of Team communities are more engaged, popular, and initiate more discussions • Technical Support community members are mostly active in a few communities, and don’t initiate or contribute much! • CoP members are active across many communities, and contribute more Rowe, M. Fernandez, M., Alani, H., Ronen, I., Hayes, C., Karnstedt, M.: Behaviour Analysis across different types of Enterprise Online Communities. WebSci 2012
  • 39. Behaviour roles and community health • Machine learning models to predict community health based on compositions and evolution of user behaviour • Health categories: – Churn rate: proportion of community leavers in a given time segment – User count: number of users who posted at least once – Seeds to Non-seeds ratio: proportion of posts that get responses to those that don’t – Clustering coefficient: extent to which the community forms a clique [ROC plots (False Positive Rate vs True Positive Rate) for Churn Rate, User Count, Seeds / Non-seeds Prop and Clustering Coefficient] • The fewer Focused Experts in the community, the more posts will receive a reply! • There is no “one size fits all” model! Rowe, M. and Alani, H. What makes communities tick? Community health analysis using role compositions. SocialCom 2012, Amsterdam, The Netherlands.
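Three of the health indicators defined on this slide can be computed along the following lines. This is a sketch under assumptions: posts are thread starters represented as dictionaries with hypothetical `author` and `replies` keys, and churn is taken as the share of the previous segment's active users who do not post in the current segment, which is one possible reading of "leavers".

```python
from typing import Dict, List, Optional, Set

Post = Dict[str, object]


def active_users(posts: List[Post]) -> Set[object]:
    """Users who posted at least once in the segment."""
    return {post["author"] for post in posts}


def user_count(posts: List[Post]) -> int:
    return len(active_users(posts))


def churn_rate(previous_posts: List[Post], current_posts: List[Post]) -> float:
    """Share of the previous segment's users who are absent from the current one."""
    previous = active_users(previous_posts)
    if not previous:
        return 0.0
    leavers = previous - active_users(current_posts)
    return len(leavers) / len(previous)


def seeds_to_non_seeds(posts: List[Post]) -> Optional[float]:
    """Ratio of posts that received replies to posts that did not."""
    seeds = sum(1 for post in posts if post["replies"] > 0)
    non_seeds = len(posts) - seeds
    return seeds / non_seeds if non_seeds else None


previous_segment = [
    {"author": "a", "replies": 1},
    {"author": "b", "replies": 0},
    {"author": "c", "replies": 0},
    {"author": "d", "replies": 2},
]
current_segment = [
    {"author": "a", "replies": 2},
    {"author": "b", "replies": 0},
    {"author": "e", "replies": 1},
]
```

The clustering coefficient is left out here because it needs the reply graph rather than a list of posts.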
  • 41. Semantic sentiment analysis on social media • A range of features and statistical classifiers have been used in social media sentiment analysis in recent years • Semantics have often been overlooked – Semantic Features – Semantic Patterns • Semantic concepts can help determine sentiment even when no good lexical clues are present
  • 42. Sentiment Analysis • Lexical-Based Approach: a sentiment lexicon (hate: negative; honest: positive; inefficient: negative; love: positive; …) applied to text such as “I really love the iPhone” and “I hate the iPhone” • Machine Learning Approach: learn a model (Naïve Bayes, SVM, MaxEnt, etc.) from a training set, then apply the model to a test set
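The lexical-based approach can be illustrated with a very small scorer. The lexicon entries and the two example sentences are taken from the slide; the scoring rule (count positive minus negative words) is an assumption chosen for simplicity and not necessarily what any particular tool does.

```python
import string

# Lexicon entries as listed on the slide.
LEXICON = {
    "hate": "negative",
    "honest": "positive",
    "inefficient": "negative",
    "love": "positive",
}


def classify(text: str) -> str:
    """Label text by counting positive and negative lexicon words."""
    score = 0
    for token in text.lower().split():
        word = token.strip(string.punctuation)
        polarity = LEXICON.get(word)
        if polarity == "positive":
            score += 1
        elif polarity == "negative":
            score -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


first = classify("I really love the iPhone")
second = classify("I hate the iPhone")
```

A sentence with no lexicon words comes out as neutral, which is exactly the gap that the semantic features on the following slides try to address.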
  • 43. Semantic Concept Extraction • Extract semantic concepts from tweets data and incorporate them into the supervised classifier training.
    Background (paper excerpt, beginning truncated): … OpenCalais and Zemanta. Their experimental results showed that AlchemyAPI performs best for entity extraction and semantic concept mapping. Our datasets consist of informal tweets, and hence are intrinsically different from those used in [10]. Therefore we conducted our own evaluation, and randomly selected 500 tweets from the STS corpus and asked 3 evaluators to evaluate the semantic concept extraction outputs generated from AlchemyAPI, OpenCalais and Zemanta.
    Table 2. Evaluation results of AlchemyAPI, Zemanta and OpenCalais.
      Extraction Tool  No. of Concepts Extracted  Entity-Concept Mapping Accuracy (%)
                                                  Evaluator 1  Evaluator 2  Evaluator 3
      AlchemyAPI       108                        73.97        73.8         72.8
      Zemanta          70                         71           71.8         70.4
      OpenCalais       65                         68           69.1         68.7
    The assessment of the outputs was based on (1) the correctness of the extracted entities; and (2) the correctness of the entity-concept mappings. The evaluation results presented in Table 2 show that AlchemyAPI extracted the largest number of concepts and also has the highest entity-concept mapping accuracy compared to OpenCalais and Zemanta. As such, we chose AlchemyAPI to extract the semantic concepts from our three datasets. Table 3 lists the total number of entities extracted and the number of semantic concepts mapped against them for each dataset.
    Table 3. Entity/concept extraction statistics of STS, OMD and HCR using AlchemyAPI.
                       STS     HCR   OMD
      No. of Entities  15139   723   1194
      No. of Concepts  29      17    14
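One simple way to incorporate extracted concepts into classifier training is to append them to a tweet's token list as extra features. The sketch below assumes this augmentation strategy; the entity-to-concept dictionary is a hypothetical stand-in for the output of an extraction tool, and the `CONCEPT:` prefix is an illustrative convention.

```python
import string
from typing import Dict, List

# Hypothetical entity-to-concept mappings standing in for a tool's output.
ENTITY_TO_CONCEPT: Dict[str, str] = {
    "iphone": "Product",
    "obama": "Person",
    "boston": "City",
}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = (token.strip(string.punctuation) for token in text.lower().split())
    return [token for token in tokens if token]


def augment_with_concepts(text: str) -> List[str]:
    """Return the tweet's tokens followed by one feature per matched concept."""
    tokens = tokenize(text)
    concepts = [
        "CONCEPT:" + ENTITY_TO_CONCEPT[token]
        for token in tokens
        if token in ENTITY_TO_CONCEPT
    ]
    return tokens + concepts


features = augment_with_concepts("I love my iPhone")
```

With features like these, a classifier can learn sentiment associated with a concept such as Product even for entities it has not seen in training.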
  • 44. Impact of adding semantic features • Incorporating semantics increases accuracy against the baseline by: – 6.5% for negative sentiment – 4.8% for positive sentiment – F1 = 75.95%, with 77.18% Precision and 75.33% Recall • OK, but what about such cases? Example: “Destroy Invading Germs” (labels on slide: Negative; Negative Concept) • Can semantics help? Saif, H., He, Y. and Alani, H. Semantic sentiment analysis of Twitter. ISWC 2012, Boston, US.
  • 45. Semantic Pattern Approaches • Apply syntactic and semantic processing techniques • Use external semantic resources (e.g. DBpedia, Freebase) to identify semantic concepts in Tweets • Example word cluster: Threat, Trojan Horse, Hack, Code, Program, Malware, Dangerous, Harm, Spyware • Extract clusters of similar contextual semantics and sentiment, and use them as patterns in sentiment analysis
  • 46. Tweet-Level Sentiment Analysis • Based on 9 Twitter datasets • MaxEnt Classifier
    Table 6: Win/Loss in Accuracy and F-measure of using different features for sentiment classification on all nine datasets.
                                    Accuracy                     F-Measure
      Features                      Minimum  Maximum  Average    Minimum  Maximum  Average
      Syntactic  Twitter Features   -0.23    3.91     1.24       -0.25    4.53     1.62
                 POS                -0.89    2.92     0.79       -0.91    5.67     1.25
                 Lexicon            -0.44    4.23     1.30       -0.38    5.81     1.83
                 Average            -0.52    3.69     1.11       -0.52    5.33     1.57
      Semantic   Concepts           -0.22    2.76     1.20       -0.40    4.80     1.51
                 LDA-Topics         -0.47    3.37     1.20       -0.68    6.05     1.68
                 SS-Patterns         0.70    9.87     3.05        1.23    9.78     3.76
                 Average             0.00    5.33     1.82        0.05    6.88     2.32
    Background (paper excerpt, truncated): … classifier described in Section 4.2. Note that STS-Gold is the only dataset among the other 9 that provides named entities manually annotated with their sentiment labels (positive, negative, neutral). Therefore, our evaluation in this task is done using the …
    Hassan S., He, Y., Miriam F. and Harith A., Semantic Patterns for Sentiment Analysis of Twitter, ISWC 2014, Trento, Italy
  • 47. Entity-Level Sentiment Analysis • Gold standard of 58 entities [Bar chart: Accuracy and F1, axis from 55.00 to 67.00, for Unigrams, LDA-Topics, Semantic Concepts and SS-Patterns] Hassan S., He, Y., Miriam F. and Harith A., Semantic Patterns for Sentiment Analysis of Twitter, ISWC 2014, Trento, Italy
  • 49. Different Engagement Patterns Forum on a celebrity Forum on transport
  • 52.
  • 53. … “few people took part” • 309 invitees from media, academia, and public engagement bodies • 2 invitees contributed to the site, with 2 edits!!
  • 54. Recipe for more engaging posts?
  • 55.
  • 56. Ask the (Social) Data • What’s the model of good/bad tweets? • What features are associated with each group?
  • 57. Feature Engineering
    Background (paper excerpt): Properties influencing popularity include user attributes, describing the reputation of the user, and attributes of a post’s content, generally referred to as content features. In Table 1 we define user and content features and study their influence on the discussion “continuation”.
    Table 1. User and Content Features
    User Features
      – In Degree: Number of followers of U
      – Out Degree: Number of users U follows
      – List Degree: Number of lists U appears on. Lists group users by topic
      – Post Count: Total number of posts the user has ever posted
      – User Age: Number of minutes from user join date
      – Post Rate: Posting frequency of the user, $\frac{PostCount}{UserAge}$
    Content Features
      – Post length: Length of the post in characters
      – Complexity: Cumulative entropy of the unique words in post $p$, $\lambda$ of total word length $n$ and $p_i$ the frequency of each word: $\frac{1}{\lambda}\sum_{i\in[1,n]} p_i(\log\lambda - \log p_i)$
      – Uppercase count: Number of uppercase words
      – Readability: Gunning fog index using average sentence length (ASL) [7] and the percentage of complex words (PCW): $0.4(ASL + PCW)$
      – Verb Count: Number of verbs
      – Noun Count: Number of nouns
      – Adjective Count: Number of adjectives
      – Referral Count: Number of @user
      – Time in the day: Normalised time in the day measured in minutes
      – Informativeness: Terminological novelty of the post wrt other posts. The cumulative tfIdf value of each term $t$ in post $p$: $\sum_{t\in p} tfidf(t, p)$
      – Polarity: Cumulation of polar term weights in $p$ (using Sentiwordnet3 lexicon) normalised by polar terms count: $\frac{Po+Ne}{|terms|}$
    • Focus Features – Topic entropy: the distribution of the author across community forums – Topic Likelihood: the likelihood that a user posts in a specific forum given his post history • Measures the affinity that a user has with a given forum • Lower likelihood indicates a user posting on an unfamiliar topic
    4.2 Experiments (paper excerpt, truncated): Experiments are intended to test the performance of different classification models in identifying seed posts. Therefore we used four classifiers: discriminative classifiers Perceptron and SVM, the generative classifier Naive Bayes and the decision-tree classifier J48. For each classifier we used three feature settings: user features, content features and user+content features.
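A few of the features in Table 1 can be sketched as plain functions. This is an illustration under assumptions: words are obtained by whitespace splitting, λ is read as the total number of words in the post, and p_i as the count of each unique word; the function names are hypothetical.

```python
import math
from collections import Counter


def post_rate(post_count: int, user_age_minutes: float) -> float:
    """Posting frequency: PostCount / UserAge."""
    return post_count / user_age_minutes if user_age_minutes else 0.0


def complexity(post: str) -> float:
    """(1 / lambda) * sum over unique words of p_i * (log lambda - log p_i)."""
    words = post.lower().split()
    total = len(words)  # lambda, read here as the total number of words
    if total == 0:
        return 0.0
    counts = Counter(words)  # p_i, read here as the count of each unique word
    return sum(c * (math.log(total) - math.log(c)) for c in counts.values()) / total


def uppercase_count(post: str) -> int:
    """Number of words written entirely in uppercase."""
    return sum(1 for word in post.split() if word.isupper())


def post_length(post: str) -> int:
    """Length of the post in characters."""
    return len(post)
```

A post that repeats a single word scores 0 on complexity, while a post in which every word is distinct scores log(λ), so the measure grows with lexical variety.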
  • 58. Classification of Posts Seed Posts Non-Seed Posts § Binary classification model § Trained with social, content, and combined features § 80/20 training/testing § Identify best feature types, and top individual features, in predicting post classification
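The setup on this slide (a binary seed / non-seed classifier with an 80/20 training/testing split) can be sketched with a small perceptron, one of the classifier families named in the excerpt above. Everything here is illustrative: the data is synthetic and linearly separable by construction, the two features are arbitrary, and the code is not the authors' pipeline.

```python
import random
from typing import List, Tuple

Row = Tuple[Tuple[float, float], int]


def make_toy_data(n: int = 100, seed: int = 0) -> List[Row]:
    """Synthetic posts with two features; label 1 (seed) when a > b."""
    rng = random.Random(seed)
    rows: List[Row] = []
    while len(rows) < n:
        a, b = rng.random(), rng.random()
        if abs(a - b) < 0.2:
            continue  # keep a margin so the toy data stays linearly separable
        rows.append(((a, b), 1 if a > b else 0))
    return rows


def split(rows: List[Row], train_fraction: float = 0.8):
    """Split rows into training and testing parts."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def predict(weights: List[float], x: Tuple[float, ...]) -> int:
    activation = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if activation > 0 else 0


def train_perceptron(train: List[Row], epochs: int = 1000) -> List[float]:
    """Classic perceptron updates; stops once an epoch makes no mistakes."""
    weights = [0.0] * (len(train[0][0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, y in train:
            error = y - predict(weights, x)
            if error:
                mistakes += 1
                weights[0] += error
                for i, value in enumerate(x):
                    weights[i + 1] += error * value
        if mistakes == 0:
            break
    return weights


def accuracy(weights: List[float], rows: List[Row]) -> float:
    return sum(predict(weights, x) == y for x, y in rows) / len(rows)


train_rows, test_rows = split(make_toy_data())
model = train_perceptron(train_rows)
test_accuracy = accuracy(model, test_rows)
```

Training the same model on user features only, content features only, and both together is how the feature settings would be compared.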
  • 59. Engagement on Boards.ie • Which posts are more likely to stimulate responses and discussions? • What impacts engagement more: user features, post content, or forum affinity? • Which individual features are most influential?
  • 60. Top Features for Engagement on Boards.ie • Content features were key! • Best predictions were achieved when combining user, content, and focus features • URLs (Referral Count) in a post negatively impact discussion activity • Seed Posts (posts that receive replies) are associated with greater forum likelihood • Lower informativeness is associated with seed posts – i.e. seeds use language that is familiar to the community
    Background (paper excerpt, beginning truncated): … activity levels, and because it has already been used in other investigations (e.g., [14]). Boards.ie does not provide explicit social relations between community members, unlike for example Facebook and Twitter. We followed the same strategy proposed in [3] for extracting social networks from Digg, and built the Boards.ie social network for users, weighting edges cumulatively by the number of replies between any two users.
    TABLE I. DESCRIPTION OF THE BOARDS.IE DATASET
      Posts      Seeds   Non-Seeds  Replies    Users
      1,942,030  90,765  21,800     1,829,465  29,908
    Rowe, M.; Angeletou, S. and Alani, H. Anticipating discussion activity on community forums. SocialCom 2011, Boston, MA, USA.
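Building a social network from replies, with edges weighted cumulatively by the number of replies between two users, can be sketched as follows. The input format (pairs of replier and original author) is an assumption, as is the choice to treat edges as undirected.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Tuple


def build_reply_network(replies: Iterable[Tuple[str, str]]) -> Counter:
    """Count replies exchanged between each unordered pair of users."""
    weights: Counter = Counter()
    for replier, author in replies:
        if replier == author:
            continue  # ignore users replying to their own posts
        edge: FrozenSet[str] = frozenset((replier, author))
        weights[edge] += 1
    return weights


reply_log = [("ann", "ben"), ("ben", "ann"), ("ann", "cat"), ("cat", "cat")]
network = build_reply_network(reply_log)
```

Degree-style user features can then be read off this weighted graph in the same way they are read from explicit follower relations on other platforms.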
  • 61. Top Features for Engagement on Twitter • Top are list-degree, in-degree, informativeness, and #posts • Top are list-degree, time of posting, in-degree, and #posts • Datasets shown: Haiti Earthquake, State Union Address • The top-most ranks from each dataset are dominated by user features
    Paper excerpt shown on the slide: [...] former dataset contains tweets which relate to the Haiti earthquake disaster, covering a varying timespan. The latter dataset contains all tweets published during the duration of President Barack Obama’s State of the Union Address speech. Our goal is to predict discussion activity based on the features of a given post by first identifying seed posts, before moving on to predict the discussion level.
    Within the above datasets many of the posts are not seeds, but are instead replies to previous posts, thereby featuring in the discussion chain as a node. In [13] retweets are considered as part of the discussion activity. In our work we identify discussions using the explicit “in reply to” information obtained by the Twitter API, which does not include retweets. We make this decision based on the work presented in boyd et al. [4], where an analysis of retweeting as a discussion practice is presented, arguing that message forwards adhere to different motives which do not necessarily designate a response to the initial message. Therefore, we only investigate explicit replies to messages. To gather our discussions, and our seed posts, we iteratively move up the reply chain - i.e. from reply to parent post - until we reach the seed post in the discussion. We define this process as dataset enrichment; it is performed by querying Twitter’s REST API using the in reply to id of the parent post, and moving one step at a time up the reply chain. This same approach has been employed successfully in work by [12] to gather a large-scale conversation dataset from Twitter.
    [Table fragment: feature ranks 12-17, listing user-age and content features (noun, adjective, verb, and uppercase counts; readability; complexity; informativeness)]
    Fig. 3. Contributions of top-5 features to identifying Non-seeds (N) [...] Upper plots are for the Haiti dataset and the lower plots are for the [...] dataset.
    Table 2. Statistics of the datasets used for experiments:
    Haiti: 44,497 users; 65,022 tweets; 1,405 seeds; 60,686 non-seeds; 2,931 replies
    Union Address: 66,300 users; 80,272 tweets; 7,228 seeds; 55,169 non-seeds; 17,875 replies
    Table 2 shows the statistics that explain our collected datasets. One can [...]
    Rowe, M., Angeletou, S., Alani, H. Predicting Discussions on the Social Semantic Web. ESWC, Crete, 2011
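The "dataset enrichment" step in the slide 61 excerpt (moving from a reply to its parent, one step at a time, until the seed post is reached) can be sketched as below. This is only an illustration under stated assumptions: `parent_of` is a hypothetical in-memory mapping that stands in for one API lookup per step, and `find_seed` is not a name from the paper or from any Twitter client library.

```python
def find_seed(post_id, parent_of, max_steps=10_000):
    """Follow parent links from a reply up to the seed post.

    `parent_of` maps a post id to the id of the post it replies to, or to
    None for a seed post. It is a hypothetical stand-in for querying an API
    once per step. Posts missing from the mapping are treated as the top of
    the chain, since their parent cannot be retrieved.
    """
    chain = [post_id]
    seen = {post_id}
    for _ in range(max_steps):
        parent = parent_of.get(chain[-1])
        if parent is None:
            break  # reached the seed (or a post whose parent is unknown)
        if parent in seen:
            raise ValueError("cycle detected in reply chain")
        chain.append(parent)
        seen.add(parent)
    return chain[-1], chain


# Toy reply structure, invented for illustration only.
toy_parents = {"s1": None, "r1": "s1", "r2": "r1", "r3": "r2"}
seed, path = find_seed("r3", toy_parents)
```

The returned path lists every post visited on the way up, so the length of a discussion chain can be read off directly.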
  • 62. Top Features for Engagement on Twitter – Earth Hour 2014 [Figure: four panels comparing the neg and pos classes: Length, Complexity, Readability, Polarity] • Top influential features do not match those found for Boards.ie or for the two non-random Twitter datasets
  • 63. Top Features for Engagement on Twitter – Dorset Police [Figure: four panels comparing the neg and pos classes: Length, Complexity, Polarity, Mentions] • Top 4 features share 3 with the Twitter Earth Hour dataset
    Fernandez, M., Cano, E., and Alani, H. Policing Engagement via Social Media. CityLabs workshop, SocInfo, Barcelona, 2014
  • 64.
  • 65. Publications about social media by Katrin Weller - http://kwelle.files.wordpress.com/2014/04/figure1.jpg
  • 66. Moving on … • How can we move on from these (micro) studies? • Are results consistent across datasets and platforms? • One way forward is: – Multiple platforms – Multiple topics
  • 67. Papers studying single/multiple social media platforms • Survey done on all submitted papers to Web Science conferences
  • 71. Apples and Oranges • We mix and compare different datasets, topics, and platforms • Aim is to test consistency and transferability of results
  • 72. 7 datasets from 5 platforms
    Boards.ie: 6,120,008 posts; 65,528 users; 398,508 seeds; 81,273 non-seeds; 5,640,227 replies
    Twitter Random: 1,468,766 posts; 753,722 users; 144,709 seeds; 930,262 non-seeds; 390,795 replies
    Twitter (Haiti Earthquake): 65,022 posts; 45,238 users; 1,835 seeds; 60,686 non-seeds; 2,501 replies
    Twitter (Obama State of Union Address): 81,458 posts; 67,417 users; 11,298 seeds; 56,135 non-seeds; 14,025 replies
    SAP: 427,221 posts; 32,926 users; 87,542 seeds; 7,276 non-seeds; 332,403 replies
    Server Fault: 234,790 posts; 33,285 users; 65,515 seeds; 6,447 non-seeds; 162,828 replies
    Facebook: 118,432 posts; 4,745 users; 15,296 seeds; 8,123 non-seeds; 95,013 replies
    Seed posts are those that receive a reply. Non-seed posts are those with no replies.
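  • 73. Data Balancing
    Boards.ie: 398,508 seeds; 81,273 non-seeds; instance count 162,546
    Twitter Random: 144,709 seeds; 930,262 non-seeds; instance count 289,418
    Twitter (Haiti Earthquake): 1,835 seeds; 60,686 non-seeds; instance count 3,670
    Twitter (Obama State of Union Address): 11,298 seeds; 56,135 non-seeds; instance count 22,596
    SAP: 87,542 seeds; 7,276 non-seeds; instance count 14,552
    Server Fault: 65,515 seeds; 6,447 non-seeds; instance count 12,894
    Facebook: 15,296 seeds; 8,123 non-seeds; instance count 16,246
    Total instance count: 521,922
    For each dataset, an equal number of seed and non-seed posts is used in the analysis.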
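The instance counts on slide 73 are each twice the size of the smaller class, which is what balancing by undersampling the larger class would produce. A minimal sketch of that idea follows; the slide does not say how the retained posts were chosen, so random sampling here is an assumption, and `balance` and `instance_count` are hypothetical names.

```python
import random


def instance_count(n_seeds, n_non_seeds):
    """Size of a balanced dataset: the smaller class, taken twice."""
    return 2 * min(n_seeds, n_non_seeds)


def balance(seeds, non_seeds, rng=None):
    """Randomly undersample the larger class so both classes are equal in size.

    Random sampling is an assumption; the slide only states that equal
    numbers of seed and non-seed posts were used.
    """
    rng = rng or random.Random(0)  # fixed seed for a reproducible sketch
    n = min(len(seeds), len(non_seeds))
    return rng.sample(seeds, n), rng.sample(non_seeds, n)


# (seeds, non-seeds) per dataset, copied from the slide.
slide_counts = {
    "Boards.ie": (398_508, 81_273),
    "Twitter Random": (144_709, 930_262),
    "Twitter Haiti": (1_835, 60_686),
    "Twitter Union": (11_298, 56_135),
    "SAP": (87_542, 7_276),
    "Server Fault": (65_515, 6_447),
    "Facebook": (15_296, 8_123),
}
total = sum(instance_count(s, n) for s, n in slide_counts.values())
```

Summing `instance_count` over the seven datasets reproduces the slide's total of 521,922.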
  • 74. Classification Results • Performance of the logistic regression classifier trained over different feature sets and applied to the test set.
    Panel labels visible on the slide: (Random), (Haiti Earthquake), (Obama’s State Union Address). Panels are listed in the order they appear on the slide.
    Each panel gives P / R / F1 for the feature sets Social; Content; Social+Content.
    Panel 1: 0.592 / 0.591 / 0.591; 0.664 / 0.660 / 0.658; 0.670 / 0.666 / 0.665
    Panel 2: 0.561 / 0.561 / 0.560; 0.612 / 0.612 / 0.611; 0.628 / 0.628 / 0.628
    Panel 3: 0.968 / 0.966 / 0.966; 0.752 / 0.747 / 0.747; 0.974 / 0.973 / 0.973
    Panel 4: 0.542 / 0.540 / 0.539; 0.650 / 0.642 / 0.639; 0.656 / 0.649 / 0.646
    Panel 5: 0.650 / 0.631 / 0.628; 0.575 / 0.541 / 0.521; 0.652 / 0.632 / 0.629
    Panel 6: 0.528 / 0.380 / 0.319; 0.626 / 0.380 / 0.275; 0.568 / 0.407 / 0.359
    Panel 7: 0.635 / 0.632 / 0.632; 0.641 / 0.641 / 0.641; 0.660 / 0.660 / 0.660
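Some rows on slide 74 report an F1 that is lower than both P and R (for example 0.528 / 0.380 / 0.319), which cannot happen for a single class but can happen when scores are averaged across the two classes. The sketch below shows this with macro-averaging on invented data. The slide does not state which averaging was used, so macro-averaging is an assumption, and the function names are hypothetical.

```python
def prf(y_true, y_pred, label):
    """Precision, recall, and F1 for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_prf(y_true, y_pred, labels):
    """Unweighted mean of the per-class precision, recall, and F1."""
    per_class = [prf(y_true, y_pred, label) for label in labels]
    return tuple(sum(m[i] for m in per_class) / len(per_class) for i in range(3))


# Invented balanced test set: the classifier labels almost everything "seed".
y_true = ["seed"] * 10 + ["non-seed"] * 10
y_pred = ["seed"] * 10 + ["seed"] * 9 + ["non-seed"] * 1
macro_p, macro_r, macro_f1 = macro_prf(y_true, y_pred, ["seed", "non-seed"])
```

On a balanced test set such as this one, class-weighted and macro averages coincide, because both classes carry the same weight.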
  • 75. Effect of features on engagement [Figure: logistic regression coefficients (β) for each platform's features. Panels: Boards.ie, Twitter Random, Twitter Haiti, Twitter Union, Server Fault, SAP, Facebook. Features: In-degree, Out-degree, Post Count, Age, Post Rate, Post Length, Referrals Count, Polarity, Complexity, Readability, Readability Fog, Informativeness]
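For readers less familiar with logistic regression, the sign of a coefficient β indicates the direction of a feature's effect: a positive β raises the predicted probability that a post is a seed, a negative β lowers it, and exp(β) is the change in odds per unit increase of the feature. The sketch below illustrates this standard reading; the coefficient values and feature names in `toy_beta` are invented for the example and are not taken from the slides.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def seed_probability(features, coefficients, intercept=0.0):
    """Predicted probability of the positive (seed) class."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return sigmoid(z)


def odds_ratio(beta):
    """Multiplicative change in the odds per unit increase of a feature."""
    return math.exp(beta)


# Hypothetical coefficients, for illustration only.
toy_beta = {"referrals_count": -0.8, "post_length": 0.3}
p_without_url = seed_probability({"referrals_count": 0.0, "post_length": 1.0}, toy_beta)
p_with_url = seed_probability({"referrals_count": 1.0, "post_length": 1.0}, toy_beta)
```

With these toy values, adding one referral lowers the predicted probability, which is how a negative coefficient would be read in the per-platform plots.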
  • 76. Comparison to literature • How does the performance of our shared features compare with other studies on different datasets and platforms?
  • 77. Comparison to literature [Figure legend: Positive impact, Negative impact, Match, Mismatch]
  • 80. Semantic Clustering • Statistical models play important roles in social data analyses • Keeping such models up to date often means regular, expensive, and time-consuming retraining • Semantic features are likely to decay more slowly than lexical features • Could adding semantics to the models extend their value and life expectancy?
    Cano, E., He, Y., Alani, H. Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs. ISWC 2014, Trento, Italy.
  • 81. Semantic Representation of a Tweet [Figure: RDF graph of the entities in a tweet. Resources: <dbp:Barack_Obama>, <dbp:Hosni_Mubarak>, <dbp:CNN>, <dbp:Egypt>, <dbp:Egyptian_Arabic>, <dbp:Country>. Other nodes: <dbo:PresidentOfUnitedStateofAmerica>, <skos:Nobel_Peace_Price_laureates>, <skos:English-language_television_stations>, <skos:PresidentsOfEgypt>, <skos:Arab_republics>, American. Edge labels: rdf:type, dcterms:subject, dbprop:nationality, dbprop:languages]
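One way to picture how a graph like the one on slide 81 becomes classifier features is to group the triples around each resource by predicate. The sketch below does this for the feature types named later in the deck (Resource, Class, Category, Property). It is an illustration under assumptions: the exact edges of the original figure are not recoverable, so the triples are a plausible guess built from the visible labels, and treating the predicate itself as the "property" feature is also an assumption.

```python
# Hypothetical triples loosely based on labels visible on the slide;
# the exact edges of the original figure are an assumption.
TRIPLES = [
    ("dbp:Barack_Obama", "rdf:type", "dbo:PresidentOfUnitedStateofAmerica"),
    ("dbp:Barack_Obama", "dcterms:subject", "skos:Nobel_Peace_Price_laureates"),
    ("dbp:Barack_Obama", "dbprop:nationality", "American"),
    ("dbp:Egypt", "rdf:type", "dbp:Country"),
    ("dbp:Egypt", "dcterms:subject", "skos:Arab_republics"),
    ("dbp:Egypt", "dbprop:languages", "dbp:Egyptian_Arabic"),
]


def semantic_features(resource, triples):
    """Group the triples whose subject is `resource` into feature types."""
    features = {
        "resource": {resource},
        "class": set(),
        "category": set(),
        "property": set(),
    }
    for subject, predicate, obj in triples:
        if subject != resource:
            continue
        if predicate == "rdf:type":
            features["class"].add(obj)  # ontology class of the resource
        elif predicate == "dcterms:subject":
            features["category"].add(obj)  # category the resource belongs to
        else:
            features["property"].add(predicate)  # any other predicate
    return features


egypt = semantic_features("dbp:Egypt", TRIPLES)
```

In a real pipeline the triples would come from a knowledge base lookup for each entity detected in the tweet, rather than from a hard-coded list.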
  • 82. Evolution of Semantics • Renewed DBpedia graph snapshots are taken over time • Semantic features are updated based on new knowledge in DBpedia [Figure: DBpedia snapshots v3.6, v3.7, v3.8 around <Barack_Obama>. Nodes: <Budget_Control_Act_of_2011>, <UnitedStatesPresidentialCandidates>, <Hawaii>, <MechelleObama>. Edge labels: wikiPageWikiLink, spouse, birthPlace]
  • 83. Experiments: extending fitness of model to subsequent epochs • 12,000 annotated tweets • Adding Classes as clustering features provides the best performance
    Cross-epoch F1 (2010-2011; 2010-2013; 2011-2013; Average):
    BoW: 0.634; 0.481; 0.261; 0.458
    Category: 0.683; 0.539; 0.524; 0.582
    Property: 0.665; 0.557; 0.502; 0.603
    Resource: 0.774; 0.544; 0.445; 0.587
    Class: 0.691; 0.665; 0.669; 0.675
    Same-epoch F1 (2010-2010; 2011-2011; Average):
    BoW: 0.831; 0.875; 0.845
  • 85. What do policymakers really want from social media? (Interviews with 31 policymakers)
    1. "Fish where the fish is" – one interface to access multiple SNS – layman monitoring of users and topics
    2. "My constituency first" – communicating with users in own constituency – find local groups, events, and topics
    3. "What are their needs, complaints, and preferences?" – what citizens talk about, complain about – what are the top 5-10 topics of the day
    4. Who should I talk to? – who are the influential citizens – whom to engage with
    5. What about tomorrow? – which topics will get hotter? – which discussions are likely to grow further?
    6. Presence and popularity – what writing recipe to follow to reach more people
    7. Privacy – concerns on citizens’ privacy when extracting info – concerns on their own privacy with 3rd party SNS access tools
  • 86. Wandhöfer, T.; Taylor, S.; Alani, H.; Zoshi, S.; Sizov, S.; et al. Engaging politicians with citizens on social networking sites: the WeGov Toolbox. IJEGR, 8(3), 2012
  • 87. Monitoring SCN • Monitoring of evolution of community activities and level of contributions in SAP Community Networks (SCN) • Demo
  • 88. SCN Behaviour • Community managers can monitor behaviour composition of forums, and its association to activity evolution
  • 90. FB Groups Sentiment Macro Behaviour Micro Behaviour Topics
  • 91. Course tutors • Real time monitoring • Behaviour Analysis • Sentiment Analysis • Topic Analysis • How active and engaged is the course group? • How is sentiment towards a course evolving? • Are the leaders of the group providing positive/negative comments? • What topics are emerging? • Is the group flourishing or diminishing? • Do students get the answers and support they need?
    Thomas, K.; Fernández, M.; Brown, S., Alani, H. OUSocial2: a platform for gathering students’ feedback from social media. (Demo) ISWC 2014, Trento, Italy.
  • 92. DEMO
  • 94. Thanks to .. Hassan Saif Lara Piccolo Thomas Dickensen Gregoire Burel Miriam Fernandez Smitashree Choudhury Elizabeth Cano Matthew Rowe Keerthi Thomas Sofia Angeletou
  • 95. Heads-up • Semantic Patterns for Sentiment Analysis of Twitter, Thursday 15.40 - Session: Social Media • Semantic Patterns for Sentiment Analysis of Twitter, Thursday 16:00 - Session: Social Media • User Profile Modeling in Online Communities, Sunday 2:05 pm - SWCS Workshop • OUSocial2: a platform for gathering students’ feedback from social media (DEMO) • The Topics they are a-Changing - Characterising Topics with Time-Stamped Semantic Graphs (POSTER) • Automatic Stopword Generation using Contextual Semantics for Sentiment Analysis of Twitter (POSTER)