WHAT MAKES COMMUNITIES
TICK?
COMMUNITY HEALTH ANALYSIS
USING ROLE COMPOSITIONS

MATTHEW ROWE (1) AND HARITH ALANI (2)
(1) SCHOOL OF COMPUTING AND COMMUNICATIONS, LANCASTER UNIVERSITY, LANCASTER, UK
(2) KNOWLEDGE MEDIA INSTITUTE, THE OPEN UNIVERSITY, MILTON KEYNES, UK

2012 ASE/IEEE INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING
AMSTERDAM, THE NETHERLANDS

http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
m.rowe@lancaster.ac.uk
Managing Online Communities


            Many businesses provide online communities to:
                 Increase customer loyalty
                 Raise brand awareness
                 Spread word-of-mouth
                 Facilitate idea generation
            Online communities incur significant investment in terms of:
                 Money spent on hosting and bandwidth
                 Time and effort for maintenance
            Community managers monitor community ‘health’ to:
               Ensure longevity

               Enable value generation

            However, the notion of ‘health’ is hard to pin down


What makes Communities Tick? Community Health Analysis using Role Compositions
The Need for Interpretation


            Online communities are dynamic behavioural ecosystems
                 Users in communities can be defined by their roles
                      i.e. Exhibiting similar collective behaviour
                 Prevalent behaviour can impact upon community members and health
            Management of communities is helped by:
               Understanding the relation between behaviour and health
                      How user behaviour changes are associated with health
                      Encouraging users to modify behaviour, in turn affecting health
                              e.g. content recommendation to specific users
                 Predicting health changes
                      Enables early decision making on community policy


            Can we accurately and effectively detect positive and negative changes in
             community health from its composition of behavioural roles?
Outline


            SAP Community Network
            Community Health Indicators
            Measuring Role Compositions:
                 Measuring user behaviour
                 Inferring behaviour roles
                 Mining behaviour roles
            Experiments:
                 Health Indicator Regression
                 Health Change Detection
            Findings and Conclusions




SAP Community Network

            Collection of SAP forums in which users discuss:
                 Software development
                 SAP Products
                 Usage of SAP tools
            Points system for awarding best answers
                 Enables development of user reputation


            Provided with a dataset covering 33 communities:
                 Spanning 2004 - 2011
                 95,200 threads
                 421,098 messages
                      78,690 were allocated points
                 32,942 users

            [Figure: Post Count (y-axis, 0 to 1400) over time, 2004 to 2011 (x-axis)]

Community Health Indicators

            From the literature there is no single agreed measure of ‘community health’
                 Multi-faceted nature: loyalty, participation, activity, social capital
                 Different communities and platforms look at different indicators


            Indicator 1: Churn Rate (loyalty)
                 The proportion of users who participate in a community for the final time
            Indicator 2: User Count (participation)
                 The number of participating users in the community
            Indicator 3: Seeds-to-Non-Seeds Posts Proportion (activity)
                 The proportion of seed posts (i.e. thread starters that receive a reply) to non-seed posts (i.e. thread starters that receive no reply)
            Indicator 4: Clustering Coefficient (social capital)
                 The average of users’ clustering coefficients within the largest strongly connected component
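
The first three indicators can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the post record layout `(user, time_step, is_thread_starter, got_reply)`, the sample data, and the choice of "users active at step t" as the churn denominator are all assumptions made for the example. The clustering coefficient is left out because it needs a reply graph.

```python
# Hypothetical post records: (user, time_step, is_thread_starter, got_reply)
POSTS = [
    ("ann", 1, True, True),
    ("bob", 1, True, True),
    ("cat", 1, True, False),
    ("bob", 1, False, False),
    ("ann", 2, True, True),
    ("dan", 2, True, False),
]


def user_count(posts, t):
    """Number of distinct users who posted at time step t."""
    return len({user for user, step, _, _ in posts if step == t})


def churn_rate(posts, t):
    """Share of users active at t whose last recorded post falls in step t.

    Assumption: the denominator is the set of users active at t.
    """
    last_seen = {}
    for user, step, _, _ in posts:
        last_seen[user] = max(step, last_seen.get(user, step))
    active = {user for user, step, _, _ in posts if step == t}
    if not active:
        return 0.0
    return sum(1 for user in active if last_seen[user] == t) / len(active)


def seed_proportion(posts, t):
    """Seeds (starters with a reply) divided by non-seeds (starters without)."""
    starters = [got_reply for _, step, is_starter, got_reply in posts
                if step == t and is_starter]
    seeds = sum(starters)
    non_seeds = len(starters) - seeds
    return seeds / non_seeds if non_seeds else float("inf")
```

Note that with this definition the churn rate of the final observed time step is always 1.0, since no later activity can be seen; a real analysis would need to handle that boundary.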


Measuring Role Compositions I:
        Modelling and Measuring User Behaviour

            According to existing literature, user behaviour can be defined using 6 dimensions:
                 (Hautz et al., 2010), (Nolker and Zhou, 2005), (Zhu et al., 2009), (Zhu et al., 2011)
                 Focus Dispersion
                      Measure: Forum entropy of the user
                 Engagement
                      Measure: Out-degree proportioned by potential maximal out-degree
                 Popularity
                      Measure: In-degree proportioned by potential maximal in-degree
                 Contribution
                      Measure: Proportion of thread replies created by the user
                 Initiation
                      Measure: Proportion of threads that were initiated by the user
                 Content Quality
                      Measure: Average points per post awarded to the user
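
Three of these measures are simple enough to sketch directly. The following is an illustrative reading of the slide, under stated assumptions: entropy is taken in base 2, the "potential maximal" degree is assumed to be the number of other users (n - 1), and all function names are hypothetical.

```python
import math
from collections import Counter


def forum_entropy(forum_ids):
    """Focus dispersion: Shannon entropy (base 2, an assumption) of the
    forums a user posted in. A value of 0 means all posts are in one forum."""
    counts = Counter(forum_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalised_degree(degree, n_users):
    """Engagement (out-degree) or popularity (in-degree), divided by the
    assumed maximum possible degree of n_users - 1."""
    return degree / (n_users - 1) if n_users > 1 else 0.0


def average_points(points_per_post):
    """Content quality: mean number of points awarded per post."""
    if not points_per_post:
        return 0.0
    return sum(points_per_post) / len(points_per_post)
```

For example, a user whose four posts are split evenly across two forums has a forum entropy of 1 bit, while a user posting in a single forum has an entropy of 0.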



Measuring Role Compositions II:
        Inferring Roles




            1. Construct features for community users at a given time step
            2. Derive bins using equal frequency binning
                 Popularity-low cutoff = 0.5, Initiation-high cutoff = 0.4
            3. Use skeleton rule base to construct rules using bin levels
                 Popularity = low, Initiation = high -> roleA
                 Popularity < 0.5, Initiation > 0.4 -> roleA
            4. Apply rules to infer user roles and community composition
            5. Repeat 1-4 for following time steps
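
Steps 1 to 4 can be sketched as follows for a single time step. This is a simplified illustration rather than the authors' code: the three-level binning, the two-rule `RULES` table, the `roleB` rule, the `"unassigned"` fallback, and the sample users are all hypothetical.

```python
from collections import Counter

LEVELS = ("low", "mid", "high")

# Hypothetical skeleton rule base: role -> required level per dimension.
RULES = {
    "roleA": {"popularity": "low", "initiation": "high"},
    "roleB": {"popularity": "high", "initiation": "low"},
}

# Hypothetical per-user features at one time step.
USERS = {
    "u1": {"popularity": 0.1, "initiation": 0.9},
    "u2": {"popularity": 0.2, "initiation": 0.8},
    "u3": {"popularity": 0.5, "initiation": 0.5},
    "u4": {"popularity": 0.6, "initiation": 0.4},
    "u5": {"popularity": 0.9, "initiation": 0.1},
    "u6": {"popularity": 0.8, "initiation": 0.2},
}


def equal_frequency_cutoffs(values, n_bins=3):
    """Cut points splitting the sorted values into n_bins equally sized groups."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]


def to_level(value, cutoffs):
    """Map a continuous value to low / mid / high using the cut points."""
    for cutoff, label in zip(cutoffs, LEVELS):
        if value < cutoff:
            return label
    return LEVELS[-1]


def infer_roles(users, rules=RULES):
    """Bin every dimension, then assign the first rule whose levels all match."""
    dims = sorted({d for feats in users.values() for d in feats})
    cutoffs = {d: equal_frequency_cutoffs([f[d] for f in users.values()])
               for d in dims}
    roles = {}
    for name, feats in users.items():
        levels = {d: to_level(v, cutoffs[d]) for d, v in feats.items()}
        roles[name] = next(
            (role for role, cond in rules.items()
             if all(levels.get(d) == lvl for d, lvl in cond.items())),
            "unassigned",
        )
    return roles


def composition(roles):
    """Proportion of users holding each role (the community's role composition)."""
    counts = Counter(roles.values())
    return {role: n / len(roles) for role, n in counts.items()}
```

Because the cut points are recomputed from the users present at each time step, the same raw feature value can map to different levels over time, which is what step 5 relies on.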

Measuring Role Compositions III:
        Mining Roles (Skeleton rule base compilation)

            1. Select the tuning segment
            2. Discover correlated behaviour dimensions
                 Removed Engagement and Contribution, and kept Popularity (Pearson r > 0.75, p < 0.01)
            3. Cluster users into behavioural groups
            4. Derive role labels for clusters
                 Feature distributions are matched against the feature levels derived from equal-frequency binning

            Role labels by cluster:
                 •  1 - Focussed Novice
                 •  2,5 - Mixed Novice
                 •  7 - Distributed Novice
                 •  3 - Distributed Expert
                 •  8,9 - Mixed Expert
                 •  0 - Focussed Expert Participant
                 •  4 - Focussed Expert Initiator
                 •  6 - Knowledgeable Member
                 •  10 - Knowledgeable Sink

            [Fig. 2: Boxplots of the feature distributions in each of the 11 clusters. Panels: Dispersion, Initiation, Quality, Popularity; x-axis: Cluster]

            TABLE II: Mapping of cluster dimensions to levels. The clusters are ordered from low patterns to high patterns to aid legibility.

                 Cluster   Dispersion   Initiation   Quality   Popularity
                 1         L            L            L         L
                 0         L            M            H         L
                 6         L            H            M         M
                 10        L            H            M         H
                 4         L            H            H         M
                 2,5       M            H            L         H
                 8,9       M            H            H         H
                 7         H            H            L         H
                 3         H            H            H         H

            [Paper excerpt shown on the slide]

                 … as a parameter k. To judge the best model - i.e. cluster method and number of clusters - we measure the cohesion and separation of a given clustering as follows: For each clustering algorithm (Ψ) we iteratively increase the number of clusters … to use where 2 ≤ k ≤ 30. At each increment of k we record the silhouette coefficient produced by Ψ, this is defined for a given element (i) in a given cluster as:

                      s_i = \frac{b_i - a_i}{\max(a_i, b_i)}     (3)

                 Where a_i denotes the average distance to all other items in the same cluster and b_i is given by calculating the average distance with all other items in each other distinct cluster … taking the minimum distance. The value of s_i ranges between −1 and 1 where the former indicates a poor clustering where distinct items are grouped together and the latter indicates perfect cluster cohesion and separation. To derive the silhouette coefficient (s(Ψ(k))) for the entire clustering we take the average silhouette coefficient of all items. We … that the best clustering model and number of clusters to use is K-means with 11 clusters. We found that for smaller cluster numbers (k = [3, 8]) each clustering algorithm achieves comparable performance, however as we begin to increase the cluster numbers K-means improves while the two remaining algorithms produce worse cohesion and separation.

                 …) Deriving Role Labels: Provided with the most cohesive and separated clustering of users we then derive role labels for each cluster. Role label derivation first involves inspecting the dimension distribution in each cluster and aligning the distribution with a level mapping (i.e. low, mid, high). This enables the conversion of continuous dimension ranges into discrete values which our rule-based approach requires in the Skeleton Rule Base. To perform this alignment we assess the …

                 … decision node, we measure the entropy of the dimensions and their levels across the clusters, we then choose the dimension with the largest entropy. This is defined formally as:

                      H(dim) = -\sum_{level}^{|levels|} p(level \mid dim) \log p(level \mid dim)     (4)

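
The silhouette coefficient used in this section to compare clusterings can be computed directly from its definition, s_i = (b_i - a_i) / max(a_i, b_i). The sketch below is illustrative only: it scores one given labelling with Euclidean distance (an assumption), and the sweep over clustering algorithms and values of k is not reproduced. It expects at least two clusters.

```python
import math


def silhouette_scores(points, labels):
    """Per-item silhouette s_i = (b_i - a_i) / max(a_i, b_i).

    a_i: mean distance to the other items in the same cluster.
    b_i: smallest mean distance to the items of any other cluster.
    Requires at least two distinct cluster labels.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


def mean_silhouette(points, labels):
    """Silhouette of the whole clustering: the average over all items."""
    scores = silhouette_scores(points, labels)
    return sum(scores) / len(scores)


# Two well separated pairs of points, labelled sensibly and badly.
POINTS = [(0, 0), (0, 1), (10, 0), (10, 1)]
GOOD_LABELS = [0, 0, 1, 1]
BAD_LABELS = [0, 1, 0, 1]
```

On the sample points, the sensible labelling scores close to 1 and the labelling that mixes the two groups scores below 0, which matches the interpretation of the range given in the excerpt.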
Experiment 1: Health Indicator Regression


            Managing online communities is helped by understanding the
             relation between behaviour and health


            Experimental Setup
                 Induced Linear Regression Models for each Health Indicator and Community
                      Using a time-series dataset
                      Independent variables: 9 roles with composition proportions as values at a given time point
                             E.g. @ t = k: Mixed Expert = 0.05, Distributed Novice = 0.51, etc.
                      Dependent variable: health indicator (e.g. churn rate) at the same time point
                             E.g. @ t = k: Churn Rate = 0.21

                 PCA of each community health indicator model using the model’s coefficients
                      Look for a common health composition pattern
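
A regression of one health indicator on role proportions can be sketched with ordinary least squares. This is a minimal stand-in, not the authors' setup: the solver (normal equations with Gauss-Jordan elimination), the two-role feature set, and the synthetic data are assumptions. Only a subset of roles is used here because proportions that sum to one are perfectly collinear with an intercept.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    X: list of feature tuples (role proportions per time step).
    y: list of health indicator values at the same time steps.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented system [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear features?)")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


# Synthetic example: proportions of two hypothetical roles over five time steps,
# with churn generated as 0.1 + 0.5 * role_1 - 0.2 * role_2.
ROLE_PROPORTIONS = [(0.1, 0.5), (0.2, 0.4), (0.3, 0.1), (0.4, 0.3), (0.5, 0.2)]
CHURN = [0.1 + 0.5 * a - 0.2 * b for a, b in ROLE_PROPORTIONS]
COEFFICIENTS = fit_linear_regression(ROLE_PROPORTIONS, CHURN)
```

The sign of each fitted coefficient is what the slide's analysis interprets: a negative coefficient means a larger share of that role goes with a lower value of the indicator.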

Experiment 1: Health Indicator Regression
        Results
            [Figure: PCA of the per-community regression model coefficients (PC1 vs PC2), one panel per health indicator: Churn Rate, User Count, Seeds / Non-seeds Prop, Clustering Coefficient. Points are labelled with forum IDs.]


            Common Health Composition Pattern
                   Churn Rate: Differences for Focussed Expert Participant & Mixed Expert, similarities for
                    Focussed Expert Initiators (decrease in role correlated with increase in churn rate)
                   User Count: Differences for Focussed Expert Initiators, commonalities for knowledgeable roles
                   Seeds-to-Non-Seeds: Similar effects for Focussed Expert Initiators and Participants, and
                    Distributed Experts (all decrease in role correlated with increased proportion)
                   Clustering Coefficient: no common patterns
            Idiosyncratic Health Composition Pattern
                   Divergence patterns between outlier communities
            No general pattern exists that describes the relation between roles and health
Experiment 2: Health Change Detection

            Can we accurately and effectively detect positive and negative changes in
             community health from its composition of behavioural roles?

            Experimental Setup
                 Binary classification of indicator change
                 At t=k+1: predict increase or decrease in health indicator from t=k
                 Time-ordered dataset:
                      Features @ t=k+1: 9 roles with composition proportions as values
                      Class @ t=k+1: positive (if increase from t=k), negative (if decrease)
                      Divide dataset into 80/20 split maintaining time-ordering
                 Tested using a logistic regression classifier
                      Platform-level model
                      Community-specific model
                 Evaluated using Matthews Correlation Coefficient (MCC) and Area under the ROC
                  Curve (AUC)
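
The supporting pieces of this setup (change labels, the time-ordered 80/20 split, and MCC) can be sketched without any library. The logistic regression classifier itself is not reproduced here. Function names are hypothetical, and treating "no change" as the negative class is an assumption, since the slide only mentions increase and decrease.

```python
import math


def change_labels(indicator):
    """Label step k+1 as 1 if the indicator rose from step k, else 0.

    Assumption: an unchanged value is treated as the negative class.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(indicator, indicator[1:])]


def time_ordered_split(rows, train_fraction=0.8):
    """Split without shuffling, so training data always precedes test data."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which makes it informative when the increase and decrease classes are unbalanced.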


Experiment 2: Health Change Detection
        Results

            Per-forum models outperform platform models for each health indicator
                 Demonstrates the need to assess and understand communities individually
                 We also yield good performance for outlier communities
            ROC Curves surpass baseline for:
                 Churn rate: 20/25 forums
                 User Count: 20/25 forums
                 Seeds-to-Non-Seeds: 19/25 forums
                 Clustering Coefficient: 17/25 forums

            [Figure: ROC curves, one panel per health indicator: Churn Rate, User Count, Seeds / Non-seeds Prop, Clustering Coefficient; y-axis: TPR]

            TABLE IV: Performance of detecting health changes using a logistic regression model induced: across the entire platform (Figure IV(a)), per-forum (Figure IV(b)) and for specific central and outlier forums (Figure IV(c)). In this latter case we report the Matthews Correlation Coefficient and the F1 score.

            (a) Platform
                 Class                    MCC       Prec    Recall   F1      AUC
                 Churn                    0.047     0.573   0.630    0.531   0.590
                 User Count               0.035     0.591   0.646    0.522   0.598
                 Seeds / Non-seeds        0.078     0.592   0.640    0.566   0.617
                 Clustering Coefficient   0.077.    0.591   0.641    0.581   0.647
                 Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

            (b) Per-forum
                 Class                    MCC       Prec    Recall   F1      AUC
                 Churn                    0.110**   0.618   0.634    0.619   0.569
                 User Count               0.175**   0.652   0.661    0.650   0.589
                 Seeds / Non-seeds        0.163*    0.637   0.657    0.639   0.589
                 Clustering Coefficient   0.089**   0.624   0.642    0.626   0.568
                 Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

            (c) Forum Specific Results. MCC / F1 (last column truncated in the source)
                                          Central                                            Outliers
                 Class                    252              412              414              353              419              50
                 Churn                    0.105 / 0.564    0.042 / 0.621    0.284 / 0.700    -0.076 / 0.543   0.173 / 0.633    0.092 / 0.58…
                 User Count               0.088 / 0.543    0.580 / 0.903    -0.106 / 0.701   0.279 / 0.648    0.299 / 0.667    0.343 / 0.69…
                 Seeds / Non-seeds        0.117 / 0.575    0.339 / 0.717    0.189 / 0.744    0.007 / 0.519    0.265 / 0.632    0.400 / 0.81…
                 Clustering Coefficient   0.057 / 0.536    -0.043 / 0.568   0.353 / 0.727    0.156 / 0.582    0.127 / 0.568    0.282 / 0.64…

            [Paper excerpt shown on the slide]

                 … find that for the 412 and 414 central forums we achieve poorer performance than the baseline for the User Count and Clustering Coefficient.

                 1) Results: Health Danger Detection: Thus far we have assessed how well our detection models work in both class …
                   0.2




                                               0.2




                                                                             0.2




                                                                                                                  0.2
                                                          settings (i.e. increase and 0.2 0.4 0.6 0.8 1.0 We now move to a
                                                                                               decrease).
                   0.0




                                               0.0




                                                                             0.0




                                                                                                                  0.0
               0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0     0.0 0.2 0.4 0.6 0.8 1.0 0.0
                         FPR                     FPR      scenario in which we wish to FPR
                                                                             FPR                  detect health dangers, and in
What makes Communties Tick? Community Health Analysis using Role Compositions warnings to community managers of the
                                                          doing so provide
                                                          likely reduction in health of their communities. To do this
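The MCC, F1 and AUC figures reported above can be related to raw classifier output with a short sketch. This is a minimal illustration under assumptions, not the authors' evaluation code: the function names (`mcc`, `f1`, `auc`) are hypothetical, the task is taken to be binary (health increase vs. decrease), and the slides do not say whether F1 is per-class or averaged, so the per-class form is shown.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def f1(tp, fp, fn):
    """F1 for the positive class: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An MCC near 0 means the predictions are no better than chance even when F1 looks moderate, which is why the forum-specific table is easier to read with both numbers side by side.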
Findings and Conclusions
 13

            No global composition pattern for the entirety of SCN
                 Identified key differences as to ‘What makes Communities tick’
                 Decrease in Focussed Experts correlated with an increase in Seeds-to-Non-Seeds
            (Marin et al., 2009) found a correlation between an increase in Core Users and
             Network Cohesion
                 We found a correlation between an increase in Knowledgeable Sinks and Social Capital
            Accurate detection of community health change is possible using role composition
             information
                 Significantly outperformed baseline models
                 Per-forum models outperformed platform-level models
            Future Work:
                 Explore co-dependencies between health indicators
                 Application of our approach over different communities and platforms
                      E.g. IBM Connections, Boards.ie
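The findings above are stated in terms of changes in role composition, which can be read as the share of a forum's users assigned to each role in a time window. The following is a minimal sketch under that reading, not the authors' pipeline: the function names, the user ids, the "Other" role and the toy assignments are all hypothetical, and the role labelling step itself is assumed to have been done already.

```python
from collections import Counter


def role_composition(user_roles):
    """Map each role to the fraction of users holding it.

    user_roles: dict of user id -> role label for one forum and time window.
    """
    counts = Counter(user_roles.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {role: n / total for role, n in counts.items()}


def composition_change(before, after):
    """Per-role difference in share between two windows (after minus before)."""
    roles = set(before) | set(after)
    return {r: after.get(r, 0.0) - before.get(r, 0.0) for r in roles}


# Toy data only: two consecutive windows for one hypothetical forum.
window_1 = {"u1": "Focussed Expert", "u2": "Focussed Expert",
            "u3": "Knowledgeable Sink", "u4": "Other"}
window_2 = {"u1": "Focussed Expert", "u2": "Knowledgeable Sink",
            "u3": "Knowledgeable Sink", "u4": "Other"}

change = composition_change(role_composition(window_1),
                            role_composition(window_2))
```

In this toy case the share of Focussed Experts falls by 0.25 and the share of Knowledgeable Sinks rises by 0.25; vectors like `change` are the kind of input a health-change classifier could be trained on.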



14




         Questions?
         Web: http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
         Email: m.rowe@lancaster.ac.uk
         Twitter: @mattroweshow






What makes communities tick? Community health analysis using role compositions

  • 1. What Makes Communities Tick? Community Health Analysis Using Role Compositions
       Matthew Rowe (1) and Harith Alani (2)
       (1) School of Computing and Communications, Lancaster University, Lancaster, UK
       (2) Knowledge Media Institute, The Open University, Milton Keynes, UK
       2012 ASE/IEEE International Conference on Social Computing, Amsterdam, The Netherlands
       http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
       m.rowe@lancaster.ac.uk
  • 2. Managing Online Communities
       Many businesses provide online communities to:
         Increase customer loyalty
         Raise brand awareness
         Spread word-of-mouth
         Facilitate idea generation
       Online communities incur significant investment in terms of:
         Money spent on hosting and bandwidth
         Time and effort for maintenance
       Community managers monitor community 'health' to:
         Ensure longevity
         Enable value generation
       However, the notion of 'health' is hard to pin down
  • 3. The Need for Interpretation
       Online communities are dynamic behavioural ecosystems
         Users in communities can be defined by their roles
           i.e. exhibiting similar collective behaviour
         Prevalent behaviour can impact upon community members and health
       Management of communities is helped by:
         Understanding the relation between behaviour and health
           How user behaviour changes are associated with health
           Encouraging users to modify behaviour, in turn affecting health
             e.g. content recommendation to specific users
         Predicting health changes
           Enables early decision making on community policy
       Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?
  • 4. Outline
       SAP Community Network
       Community Health Indicators
       Measuring Role Compositions:
         Measuring user behaviour
         Inferring behaviour roles
         Mining behaviour roles
       Experiments:
         Health Indicator Regression
         Health Change Detection
       Findings and Conclusions
  • 5. SAP Community Network
       Collection of SAP forums in which users discuss:
         Software development
         SAP products
         Usage of SAP tools
       Points system for awarding best answers
         Enables development of user reputation
       Provided with a dataset covering 33 communities:
         Spanning 2004 - 2011
         95,200 threads
         421,098 messages
           78,690 were allocated points
         32,942 users
       [Figure: post count over time, 2004 to 2011]
  • 6. Community Health Indicators
       From the literature there is no single agreed measure of 'community health'
         Multi-faceted nature: loyalty, participation, activity, social capital
         Different communities and platforms look at different indicators
       Indicator 1: Churn Rate (loyalty)
         The proportion of users who participate in a community for the final time
       Indicator 2: User Count (participation)
         The number of participating users in the community
       Indicator 3: Seeds-to-Non-Seeds Posts Proportion (activity)
         The proportion of seed posts (i.e. thread starters that receive a reply) to non-seeds (i.e. thread starters that receive no reply)
       Indicator 4: Clustering Coefficient (social capital)
         The average of users' clustering coefficients within the largest strongly connected component
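The first and third indicators can be sketched in a few lines. This is only an illustration under stated assumptions: the input shapes (a list of `(user, timestamp)` pairs, and a list of reply counts per thread starter) and both function names are hypothetical, and the slides do not say how the time windows were chosen.

```python
def churn_rate(posts, window_start, window_end):
    """Fraction of users active in [window_start, window_end) whose
    last-ever post also falls inside that window.

    posts: list of (user, timestamp) pairs (assumed input shape).
    """
    last_seen = {}
    for user, ts in posts:
        last_seen[user] = max(ts, last_seen.get(user, ts))
    active = {user for user, ts in posts if window_start <= ts < window_end}
    if not active:
        return 0.0
    churned = [user for user in active if last_seen[user] < window_end]
    return len(churned) / len(active)


def seed_to_non_seed_ratio(reply_counts):
    """Ratio of seed posts (thread starters with at least one reply)
    to non-seed posts (thread starters with no reply).

    reply_counts: number of replies received by each thread starter.
    """
    seeds = sum(1 for count in reply_counts if count > 0)
    non_seeds = len(reply_counts) - seeds
    return seeds / non_seeds if non_seeds else float("inf")


posts = [("a", 1), ("a", 5), ("b", 2), ("c", 3), ("c", 9)]
print(churn_rate(posts, 0, 6))
print(seed_to_non_seed_ratio([0, 2, 1, 0]))
```

Note that in the final window of any dataset every active user looks churned, so a sketch like this would need the last window excluded or handled separately.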
  • 7. Measuring Role Compositions I: Modelling and Measuring User Behaviour
       According to existing literature, user behaviour can be defined using 6 dimensions:
         (Hautz et al., 2010), (Nolker and Zhou, 2005), (Zhu et al., 2009), (Zhu et al., 2011)
       Focus Dispersion
         Measure: Forum entropy of the user
       Engagement
         Measure: Out-degree proportioned by potential maximal out-degree
       Popularity
         Measure: In-degree proportioned by potential maximal in-degree
       Contribution
         Measure: Proportion of thread replies created by the user
       Initiation
         Measure: Proportion of threads that were initiated by the user
       Content Quality
         Measure: Average points per post awarded to the user
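As an illustration of the Focus Dispersion measure, forum entropy can be read as the Shannon entropy of a user's posts over the forums they post in. The sketch below assumes that reading; the function name, the input shape and the base-2 logarithm are assumptions, not details given on the slide.

```python
import math
from collections import Counter


def forum_entropy(forum_ids):
    """Shannon entropy (base 2, an assumption) of one user's posts
    across forums. forum_ids holds one forum identifier per post.
    0 means all posts are in a single forum; higher values mean the
    user's activity is spread over more forums.
    """
    counts = Counter(forum_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(forum_entropy(["252", "252", "252"]))   # focussed user
print(forum_entropy(["252", "412", "414"]))   # dispersed user
```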
  • 8. Measuring Role Compositions II: Inferring Roles
       1. Construct features for community users at a given time step
       2. Derive bins using equal frequency binning
            Popularity-low cutoff = 0.5, Initiation-high cutoff = 0.4
       3. Use skeleton rule base to construct rules using bin levels
            Popularity = low, Initiation = high -> roleA
            Popularity < 0.5, Initiation > 0.4 -> roleA
       4. Apply rules to infer user roles and community composition
       5. Repeat 1-4 for following time steps
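Steps 2 to 4 can be sketched as follows, assuming three levels per dimension and a single illustrative rule. `SKELETON_RULES`, the helper names and the `"unassigned"` fallback are hypothetical; the actual rule base covers more dimensions and roles than this.

```python
from collections import Counter

LABELS = ("low", "mid", "high")

# Hypothetical one-rule skeleton rule base:
# (popularity level, initiation level) -> role
SKELETON_RULES = {("low", "high"): "roleA"}


def equal_frequency_cutoffs(values, n_bins=3):
    """Cutoffs that split the sorted values into n_bins bins of
    roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def level(value, cutoffs):
    """Map a continuous value to low / mid / high using the cutoffs."""
    for cutoff, label in zip(cutoffs, LABELS):
        if value < cutoff:
            return label
    return LABELS[-1]


def role_composition(users):
    """users: list of dicts with 'popularity' and 'initiation' features
    for one time step. Returns the proportion of users in each role."""
    cutoffs = {
        dim: equal_frequency_cutoffs([u[dim] for u in users])
        for dim in ("popularity", "initiation")
    }
    roles = Counter(
        SKELETON_RULES.get(
            (level(u["popularity"], cutoffs["popularity"]),
             level(u["initiation"], cutoffs["initiation"])),
            "unassigned",
        )
        for u in users
    )
    return {role: count / len(users) for role, count in roles.items()}


values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
users = [{"popularity": p, "initiation": i}
         for p, i in zip(values, reversed(values))]
print(role_composition(users))
```

Because the cutoffs are recomputed from the users present at each time step, the same raw feature value can map to different levels in different steps, which is what step 5 implies.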
  • 9. Measuring Role Compositions III: Mining Roles (Skeleton rule base compilation)
       1. Select the tuning segment
       2. Discover correlated behaviour dimensions
            Removed Engagement and Contribution, kept Popularity and Initiation (Pearson r > 0.75, p < 0.01)
       3. Cluster users into behavioural groups
       4. Derive role labels for clusters
            Cluster 1: Focussed Novice
            Clusters 2, 5: Mixed Novice
            Cluster 7: Distributed Novice
            Cluster 3: Distributed Expert
            Clusters 8, 9: Mixed Expert
            Cluster 0: Focussed Expert Participant
            Cluster 4: Focussed Expert Initiator
            Cluster 6: Knowledgeable Member
            Cluster 10: Knowledgeable Sink
       [Fig. 2: Boxplots of the feature distributions (Dispersion, Initiation, Quality, Popularity) in each of the 11 clusters. Feature distributions are matched against the feature levels derived from equal-frequency binning.]
       Table II: Mapping of cluster dimensions to levels (L = low, M = mid, H = high). The clusters are ordered from low patterns to high patterns to aid legibility.
         Cluster   Dispersion   Initiation   Quality   Popularity
         1         L            L            L         L
         0         L            M            H         L
         6         L            H            M         M
         10        L            H            M         H
         4         L            H            H         M
         2, 5      M            H            L         H
         8, 9      M            H            H         H
         7         H            H            L         H
         3         H            H            H         H
       Paper excerpt shown on the slide (cut off at the edges):
         [...] as a parameter k. To judge the best model, i.e. cluster method and number of clusters, we measure the cohesion and separation of a given clustering as follows: for each clustering algorithm (Ψ) we iteratively increase the number of clusters (k) to use, where 2 ≤ k ≤ 30. At each increment of k we record the silhouette coefficient produced by Ψ; this is defined for a given element (i) in a given cluster as:
         $$ s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (3) $$
         where a_i denotes the average distance to all other items in the same cluster, and b_i is given by calculating the average distance to all items in each other distinct cluster and then taking the minimum distance. The value of s_i ranges between −1 and 1, where the former indicates a poor clustering where distinct items are grouped together and the latter indicates perfect cluster cohesion and separation. To derive the silhouette coefficient s(Ψ(k)) for the entire clustering we take the average silhouette coefficient of all items. [...] that the best clustering model and number of clusters to use is K-means with 11 clusters. We found that for smaller cluster numbers (k = [3, 8]) each clustering algorithm achieves comparable performance; however, as we begin to increase the cluster numbers K-means improves while the two remaining algorithms produce worse cohesion and separation.
         [...] Deriving Role Labels: Provided with the most cohesive and separated clustering of users, we then derive role labels for each cluster. Role label derivation first involves inspecting the dimension distribution in each cluster and aligning the distribution with a level mapping (i.e. low, mid, high). This enables the conversion of continuous dimension ranges into discrete values, which our rule-based approach requires in the Skeleton Rule Base. To perform this alignment we assess the [...]
         [...] decision node, we measure the entropy of the dimensions and their levels across the clusters; we then choose the dimension with the largest entropy. This is defined formally as:
         $$ H(dim) = -\sum_{level}^{|levels|} p(level \mid dim) \log p(level \mid dim) \qquad (4) $$
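The silhouette coefficient used to pick the clustering can be computed directly from its definition: for each item, a is the mean distance to the other items in its own cluster and b is the smallest mean distance to any other cluster. The sketch below assumes Euclidean distance and scores singleton clusters as 0; both are assumptions rather than details from the slides, and the function names are illustrative.

```python
import math


def silhouette_scores(points, labels):
    """Per-item silhouette s_i = (b_i - a_i) / max(a_i, b_i).

    points: list of coordinate tuples; labels: cluster id per point.
    Needs at least two distinct clusters.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # convention chosen here for singletons
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


def mean_silhouette(points, labels):
    """Silhouette of the whole clustering: the mean over all items."""
    scores = silhouette_scores(points, labels)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(mean_silhouette(points, [0, 0, 1, 1]))  # well separated grouping
print(mean_silhouette(points, [0, 1, 0, 1]))  # poor grouping
```

Sweeping k and keeping the clustering with the highest mean silhouette mirrors the model selection described in the excerpt.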
  • 10. Experiment 1: Health Indicator Regression
       Managing online communities is helped by understanding the relation between behaviour and health
       Experimental Setup
         Induced linear regression models for each health indicator and community
         Using a time-series dataset
           Independent variables: 9 roles with composition proportions as values at a given time point
             E.g. @ t = k: Mixed Expert = 0.05, Distributed Novice = 0.51, etc.
           Dependent variable: health indicator (e.g. churn rate) at the same time point
             E.g. @ t = k: Churn Rate = 0.21
         PCA of each community health indicator model using the model's coefficients
           Look for a common health composition pattern
  • 11. Experiment 1: Health Indicator Regression Results
       [Figure: PCA plots (PC1 against PC2) of the per-forum regression coefficients, points labelled by forum ID, one panel per indicator: Churn Rate, User Count, Seeds / Non-seeds Prop, Clustering Coefficient]
       Common Health Composition Pattern
         Churn Rate: Differences for Focussed Expert Participant and Mixed Expert, similarities for Focussed Expert Initiators (decrease in role correlated with increase in churn rate)
         User Count: Differences for Focussed Expert Initiators, commonalities for knowledgeable roles
         Seeds-to-Non-Seeds: Similar effects for Focussed Expert Initiators and Participants, and Distributed Experts (all decrease in role correlated with increased proportion)
         Clustering Coefficient: no common patterns
       Idiosyncratic Health Composition Pattern
         Divergence patterns between outlier communities
       No general pattern exists that describes the relation between roles and health
  • 12. Experiment 2: Health Change Detection
       Can we accurately and effectively detect positive and negative changes in community health from its composition of behavioural roles?
       Experimental Setup
         Binary classification of indicator change
           At t=k+1: predict increase or decrease in health indicator from t=k
         Time-ordered dataset:
           Features @ t=k+1: 9 roles with composition proportions as values
           Class @ t=k+1: positive (if increase from t=k), negative (if decrease)
         Divide dataset into 80/20 split maintaining time-ordering
         Tested using a logistic regression classifier
           Platform-level model
           Community-specific model
         Evaluated using Matthews Correlation Coefficient (MCC) and Area under the ROC Curve (AUC)
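The data preparation and the MCC metric from this setup can be sketched without any library. The helper names are hypothetical, and treating "no change" as the negative class is an assumption, since the slide only defines increase and decrease. The classifier itself is left out.

```python
import math


def change_labels(indicator):
    """Label each step k+1 as 1 if the indicator rose from step k,
    else 0 (ties counted as negative, an assumption)."""
    return [1 if later > earlier else 0
            for earlier, later in zip(indicator, indicator[1:])]


def time_ordered_split(rows, train_fraction=0.8):
    """Split without shuffling: the earliest rows train, the latest test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


churn = [0.20, 0.30, 0.10, 0.40]
print(change_labels(churn))
print(time_ordered_split(list(range(10))))
print(mcc([1, 0, 1, 0], [1, 0, 1, 0]))
```

Keeping the split time-ordered means the model is always evaluated on steps later than anything it was trained on, which a shuffled split would not guarantee.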
  • 13. Experiment 2: Health Change Detection Results
       Per-forum models outperform platform models for each health indicator
         Demonstrates the need to assess and understand communities individually
       We also yield good performance for outlier communities
       ROC curves surpass baseline for:
         Churn Rate: 20/25 forums
         User Count: 20/25 forums
         Seeds-to-Non-Seeds: 19/25 forums
         Clustering Coefficient: 17/25 forums
       [Figure: ROC curves (TPR against FPR), one panel per indicator: Churn Rate, User Count, Seeds / Non-seeds Prop, Clustering Coefficient]
       Paper excerpt shown on the slide (cut off at the edges):
         [...] find that for the 412 and 414 central forums we achieve poorer performance than the baseline for the User Count and Clustering Coefficient.
         Table IV: Performance of detecting health changes using a logistic regression model induced: across the entire platform (Figure IV(a)), per-forum (Figure IV(b)) and for specific central and outlier forums (Figure IV(c)). In this latter case we report the Matthews Correlation Coefficient and the F1 score.
         (a) Platform
           Class                    MCC        Prec    Recall   F1      AUC
           Churn                    0.047      0.573   0.630    0.531   0.590
           User Count               0.035      0.591   0.646    0.522   0.598
           Seeds / Non-seeds        0.078      0.592   0.640    0.566   0.617
           Clustering Coefficient   0.077.     0.591   0.641    0.581   0.647
           Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1
         (b) Per-forum
           Class                    MCC        Prec    Recall   F1      AUC
           Churn                    0.110**    0.618   0.634    0.619   0.569
           User Count               0.175**    0.652   0.661    0.650   0.589
           Seeds / Non-seeds        0.163*     0.637   0.657    0.639   0.589
           Clustering Coefficient   0.089**    0.624   0.642    0.626   0.568
           Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1
         (c) Forum-specific results, MCC / F1 (the last digit of the final column is cut off in the source)
                                    Central                                          Outliers
           Class                    252             412              414              353              419             50
           Churn                    0.105 / 0.564   0.042 / 0.621    0.284 / 0.700    -0.076 / 0.543   0.173 / 0.633   0.092 / 0.58
           User Count               0.088 / 0.543   0.580 / 0.903    -0.106 / 0.701   0.279 / 0.648    0.299 / 0.667   0.343 / 0.69
           Seeds / Non-seeds        0.117 / 0.575   0.339 / 0.717    0.189 / 0.744    0.007 / 0.519    0.265 / 0.632   0.400 / 0.81
           Clustering Coefficient   0.057 / 0.536   -0.043 / 0.568   0.353 / 0.727    0.156 / 0.582    0.127 / 0.568   0.282 / 0.64
         1) Results: Health Danger Detection: Thus far we have assessed how well our detection models work in both class settings (i.e. increase and decrease). We now move to a scenario in which we wish to detect health dangers, and in doing so provide warnings to community managers of the likely reduction in health of their communities. To do this [...]
  • 14. Findings and Conclusions
       No global composition pattern for the entirety of SCN
       Identified key differences as to 'What makes communities tick'
         Decrease in Focussed Experts correlated with an increase in Seeds-to-Non-Seeds
         (Marin et al., 2009) found a correlation between increase in Core Users and Network Cohesion
           We found a correlation between an increase in Knowledgeable Sinks and Social Capital
       Accurate detection of community health change is possible using role composition information
         Significantly outperformed baseline models
         Per-forum models outperformed platform-level models
       Future Work:
         Explore co-dependencies between health indicators
         Application of our approach over different communities and platforms
           E.g. IBM Connections, Boards.ie
  • 15. Questions?
       Web: http://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem
       Email: m.rowe@lancaster.ac.uk
       Twitter: @mattroweshow

Editor's notes

  1. Assess three forums in the central cluster: 252 (SAP Business One E-Commerce), 412
  2. For the common health composition pattern: assess three forums in the central cluster and the differences in their coefficients: 252 (SAP Business One E-Commerce), 412 (Business Planning), 414 (Strategy Management). The differences show that no general pattern exists.