Social computation of emergent networks on user generated content
1. Knowledge Management Institute
Social Computation of Emergent
Networks on User-Generated Content
GI Workshop on “Web-Science” at
Informatik 2010, the 40th Annual Conference of the Gesellschaft für Informatik
Leipzig, Germany
Markus Strohmaier
Assistant Professor
Knowledge Management Institute
Graz University of Technology, Austria
e-mail: markus.strohmaier@tugraz.at
web: http://www.kmi.tugraz.at/staff/markus
Markus Strohmaier 2010
1
2. Knowledge Management Institute
Social-Computational Systems
… is the title of a new National Science Foundation (NSF) Program.
the genesis of a new class of computational systems,
which generate emergent behaviors that arise out of the complex and
dynamic interactions among people and computers.
Source: National Science Foundation http://www.nsf.gov/pubs/2010/nsf10600/nsf10600.htm
3 observations:
• Rise of User-Generated Content
  – 5 out of the top 10 websites in the world have a focus on user-generated content (Alexa.com 2010)
• Rise of Online Social Networks
  – More than 500 million active Facebook users, 50% log on any given day (Facebook 2010)
• Integration of user data and system functionality
  – User data becomes an integral part of system functions
(Facebook 2010) https://www.facebook.com/press/info.php?statistics
3. Knowledge Management Institute
Social Computational Systems
Interaction between individuals and computational systems
is mediated by the aggregate behavior of users.
4. Knowledge Management Institute
Social Computation influences system properties (X)
X=Findability, X=Utility, X=Navigability, X=Relevance
It is through the process of social computation, i.e.
the combination of social behavior and algorithmic computation,
that system properties and functions emerge.
5. Knowledge Management Institute
System Properties of
Social-Computational Systems
• Findability:
  • the ease with which a document can be found by a user
• Utility:
  • the degree to which a system maximizes usefulness of its functions for users
• Navigability:
  • the ease with which a user can navigate from A to B
• Relevance:
  • the extent to which offered information is considered relevant
• Privacy:
  • the extent to which private information is kept private
• Profit:
  • the extent to which functions can be monetized
• …
These properties are influenced by social computation processes.
6. Knowledge Management Institute
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
8. Knowledge Management Institute
Example:
X = Connectivity (of the web graph)
Questions:
• What is X like? Bow-tie architecture of the web [Broder et al 2000]
• What causes X?
9. Knowledge Management Institute
Example:
X = Connectivity (of the web graph)
Questions:
• What is X like? Bow-tie architecture of the web [Broder et al 2000]
• What causes X? Social mechanisms, such as preferential attachment [Barabasi 1999]
• How can we improve X? An open issue
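To make the preferential attachment mechanism concrete, here is a minimal Python sketch. It is a simplified illustration, not the model as specified in [Barabasi 1999]: it assumes each new node adds exactly one link, and the function name and parameters are hypothetical.

```python
import random
from collections import Counter

def preferential_attachment(num_nodes, seed=0):
    """Grow a graph node by node; each new node links to one existing node
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # start from a single edge
    endpoints = [0, 1]      # each node appears once per incident edge
    for new in range(2, num_nodes):
        # Picking uniformly from the endpoint list is a degree-proportional choice.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

edges = preferential_attachment(1000)
degree = Counter(node for edge in edges for node in edge)
print(degree.most_common(3))  # the best-connected nodes ("hubs")
```

Nodes that already have many links are more likely to receive new ones, which is the "rich get richer" effect behind the heavily skewed degree distributions observed on the web.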
10. Knowledge Management Institute
Social Computational Systems:
What type of questions are we asking?
e.g. X = Connectivity of the web graph
• Description and Classification:
  • What is X like?
  • What are its properties?
  • How can it be categorized?
  • How can we measure it?
• Descriptive Process:
  • How does X work?
  • What is the process by which X happens?
  • How does X evolve?
• Descriptive Comparative:
  • How does X differ from Y?
• Relationship:
  • Are X and Y related?
  • Do occurrences of X correlate with occurrences of Y?
• Causality:
  • Does X cause Y?
  • Does X prevent Y?
  • What causes X?
  • What effect does X have on Y?
• Causality - Comparative:
  • Does X cause more Y than does Z?
  • Is X better at preventing Y than is Z?
  • Does X cause more Y than does Z under one condition but not others?
• Design:
  • What is an effective way to achieve X?
  • How can we improve X?
cf. [Easterbrook et al. 2007]
Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007
11. Knowledge Management Institute
Attempting a Definition:
Social-Computational Systems
…refer to systems in which essential system properties and
functions (“X”) are influenced by the behavior of users.
Thus, certain system properties and functions are not engineered
by a single person, but they are emergent, i.e. the result of
aggregating information from a large group of users.
In this sense, certain system properties and functions of social-
computational systems are beyond the direct control of system
designers.
New approaches for designing and shaping
social-computational systems are needed.
12. Knowledge Management Institute
The Dual Nature of Web-Science
Science: What is X like?
Engineering: Improve X? Prevent Y? (typically beyond control)
social computation =
social behavior + algorithmic computation
Emergent social-computational system properties and functions
arise through aggregation.
13. Knowledge Management Institute
Social Computational Systems:
What type of questions are we asking?
Today's talk:
• X1 = Navigability
• X2 = Semantics
of User-Generated Content
cf. [Easterbrook et al. 2007]
Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007
14. Knowledge Management Institute
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
15. Knowledge Management Institute
X1=Navigability
Question:
How can we Measure and Improve
Navigability in Social Tagging Systems?
Tag clouds as an instrument for navigation
16. Knowledge Management Institute
Tag Clouds are Supposed to be Efficient
Tools for Navigating Tagging Systems
The Navigability Assumption:
• An implicit assumption among designers of social tagging
systems that tag clouds are specifically useful to
support navigation.
• This has hardly been tested or critically reflected in the past.
Navigating tagging systems via tag clouds:
1) The system presents a tag cloud to the user.
2) The user selects a tag from the tag cloud.
3) The system presents a list of resources tagged with the
selected tag.
4) The user selects a resource from the list of resources.
5) The system transfers the user to the selected resource,
and the process potentially starts anew.
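The five navigation steps above can be sketched in a few lines of Python. This is only an illustration: the toy data set, the function names `tag_cloud` and `resources_for`, and the choice to show a resource's own tags as the next tag cloud are assumptions, not part of the talk.

```python
from collections import Counter

# Hypothetical toy data: resource -> set of tags.
TAGGED = {
    "r1": {"python", "web"},
    "r2": {"python", "graphs"},
    "r3": {"web", "design"},
    "r4": {"graphs"},
}

def tag_cloud(tagged, size):
    """Step 1: the `size` most frequently used tags (ties broken by name)."""
    counts = Counter(tag for tags in tagged.values() for tag in tags)
    ranked = sorted(counts, key=lambda tag: (-counts[tag], tag))
    return ranked[:size]

def resources_for(tagged, tag):
    """Step 3: all resources that carry the selected tag."""
    return sorted(res for res, tags in tagged.items() if tag in tags)

cloud = tag_cloud(TAGGED, size=3)              # step 1: system shows a tag cloud
chosen_tag = cloud[1]                          # step 2: user picks "python"
listing = resources_for(TAGGED, chosen_tag)    # step 3: system lists resources
chosen_resource = listing[1]                   # step 4: user picks "r2"
next_cloud = sorted(TAGGED[chosen_resource])   # step 5: tags of "r2", start anew
print(cloud, listing, next_cloud)
```

Viewed this way, tags and resources form a bipartite network, and navigation is a walk that alternates between the two kinds of nodes.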
17. Knowledge Management Institute
Navigability of Social Tagging Systems
Question: How do
(i) the size of tag clouds and
(ii) the number of resources per tag
influence the navigability (X1) of social tagging systems?
[Figure: established systems with many users vs. a new system with few users]
18. Knowledge Management Institute
Defining Navigability
A network is navigable iff:
There is a path between all or almost all pairs of nodes
in the network.
Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
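The two conditions can be checked on a small graph with a breadth-first search. The sketch below makes several assumptions that are not in the talk: the graph is undirected, a "giant component" is taken to mean one that covers at least 90% of the nodes, and the effective diameter is approximated by the average shortest path length, compared against log2(n). All function names are hypothetical.

```python
import math
from collections import deque

def undirected(edges):
    """Build an adjacency dict from a list of undirected edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_distances(adj, source):
    """Hop distances from `source` to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def largest_component(adj):
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def is_navigable(adj, giant_fraction=0.9):
    n = len(adj)
    giant = largest_component(adj)
    if len(giant) < giant_fraction * n:   # condition 1: giant component
        return False
    total = pairs = 0
    for node in giant:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                total += d
                pairs += 1
    avg = total / pairs if pairs else 0.0
    return avg <= math.log2(n)            # condition 2: short paths

star = undirected([(0, i) for i in range(1, 9)])    # 9 nodes, one hub
path = undirected([(i, i + 1) for i in range(8)])   # 9 nodes in a line
split = undirected([(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7), (7, 8)])
print(is_navigable(star), is_navigable(path), is_navigable(split))
```

With nine nodes, the star passes both conditions, the line fails the path-length condition (average distance 10/3 exceeds log2(9) ≈ 3.17), and the split graph has no giant component.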
19. Knowledge Management Institute
Navigability: Examples
Example 1:
Not navigable: No giant component
Example 2:
Not navigable: Giant component, BUT
avg. shortest path > log2(9)
20. Knowledge Management Institute
Navigability: Examples
Example 3:
Navigable: Giant component AND
avg. shortest path ≤ 2 < log2(9)
Is this efficiently navigable?
There are short paths between all nodes, but can an
agent or algorithm find them with local knowledge
only?
21. Knowledge Management Institute
Efficiently navigable
A network is efficiently navigable iff:
There is an algorithm that can find a short path with
only local knowledge (with branching factor k), and
the delivery time of the algorithm is bounded
polynomially by log_k(n).
Example 4 (nodes A, B, C):
Efficiently navigable, if the algorithm knows it needs to
go through A, B, C
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
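One common way to illustrate "local knowledge only" is greedy decentralized search: at each node, the algorithm sees only that node's neighbors and moves to the one that appears closest to the target. The following sketch assumes a ring of 16 nodes with one shortcut and uses ring distance as the notion of closeness; the graph, the distance function, and all names are illustrative assumptions rather than the setup used in the talk.

```python
def greedy_search(adj, distance, start, target):
    """Walk from start to target using only local knowledge: at each step,
    move to the neighbor that is closest to the target under `distance`.
    Returns the path, or None if no neighbor is closer (search is stuck)."""
    path = [start]
    current = start
    while current != target:
        best = min(adj[current], key=lambda nb: distance(nb, target))
        if distance(best, target) >= distance(current, target):
            return None
        path.append(best)
        current = best
    return path

N = 16

def ring_distance(a, b):
    """Number of hops between a and b along the ring."""
    return min((a - b) % N, (b - a) % N)

# Ring of N nodes plus one long-range shortcut between nodes 0 and 7.
adj = {i: {(i - 1) % N, (i + 1) % N} for i in range(N)}
adj[0].add(7)
adj[7].add(0)

route = greedy_search(adj, ring_distance, 0, 8)
print(route)  # [0, 7, 8]
```

The shortcut only helps because the distance function tells the searcher that node 7 is nearer to the target than its ring neighbors. Without such background knowledge, short paths may exist but cannot be found efficiently.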
22. Knowledge Management Institute
User Interface constraints
Tag Cloud Size n
n: number of tags shown per tag cloud
(topN most common algorithm)
Pagination of resources / tag
k: number of resources shown per page
(reverse chronological ordering)
23. Knowledge Management Institute
How UI Constraints Affect Navigability
Tag cloud size: Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does
not influence navigability (this is not very surprising).
Pagination: BUT limiting the out-degree k of high-frequency tags (e.g. through pagination
with resources sorted in reverse-chronological order) leaves the network
vulnerable to fragmentation. This destroys navigability of prevalent approaches
to tag clouds.
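A small simulation can show how pagination cuts off parts of the network. The sketch below is a simplified model with hypothetical names and toy data: each resource shows its n most common tags, each tag links only to the k most recent resources (the first page), and we count what can be reached from one starting resource.

```python
from collections import Counter, defaultdict

def build_links(posts, n, k):
    """posts: list of (timestamp, resource, tags), one post per resource.
    Returns (resource -> tags in its tag cloud,
             tag -> resources on the first result page)."""
    freq = Counter(tag for _, _, tags in posts for tag in tags)
    by_tag = defaultdict(list)
    cloud = {}
    for ts, res, tags in posts:
        # Tag cloud: the n globally most common tags of this resource.
        cloud[res] = sorted(tags, key=lambda tag: (-freq[tag], tag))[:n]
        for tag in tags:
            by_tag[tag].append((ts, res))
    # Pagination: only the k most recent resources per tag are linked.
    page = {tag: [res for _, res in sorted(items, reverse=True)[:k]]
            for tag, items in by_tag.items()}
    return cloud, page

def reachable(cloud, page, start):
    """All resources reachable from `start` via tag clouds and first pages."""
    seen, stack = {start}, [start]
    while stack:
        res = stack.pop()
        for tag in cloud[res]:
            for nxt in page[tag]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen

# Six resources, all tagged "python", posted at times 1..6.
posts = [(ts, f"r{ts}", {"python"}) for ts in range(1, 7)]
small_page = reachable(*build_links(posts, n=5, k=2), start="r1")
full_page = reachable(*build_links(posts, n=5, k=6), start="r1")
print(len(small_page), len(full_page))  # 3 6
```

With k=2, only the two newest resources are ever linked, so the four older ones are unreachable from anywhere else. This is the kind of fragmentation that reverse-chronological pagination can cause.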
24. Knowledge Management Institute
Findings
1. For certain specific, but popular, tag cloud scenarios, the
so-called Navigability Assumption does not hold.
2. While we could confirm that tag-resource networks have
efficient navigational properties in theory, we found that
popular user interface decisions significantly impair
navigability.
These results make a theoretical and an empirical argument
against existing approaches to tag cloud construction.
How can we improve the navigability of social tagging
systems?
25. Knowledge Management Institute
Recovering Navigability in Social Tagging
Systems
Instead of reverse-chronological ordering of resources,
we apply a random ordering.
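The difference between the two orderings can be sketched as follows. This is an illustration under assumptions not stated in the talk: the random order is re-drawn on every page view, the tag has 100 resources, and the page size is 10. The function name `first_page` is hypothetical.

```python
import random

def first_page(resources_newest_first, k, order="chronological", rng=None):
    """Return the k resources shown for a tag.
    'chronological' keeps the newest k; 'random' samples k uniformly."""
    if order == "random":
        rng = rng or random.Random()
        size = min(k, len(resources_newest_first))
        return rng.sample(resources_newest_first, size)
    return resources_newest_first[:k]

# Hypothetical tag with 100 resources, listed newest first.
resources = [f"r{i}" for i in range(100, 0, -1)]
rng = random.Random(42)

chron_seen, random_seen = set(), set()
for _ in range(50):  # 50 page views of the same tag
    chron_seen.update(first_page(resources, 10))
    random_seen.update(first_page(resources, 10, "random", rng))

print(len(chron_seen))   # 10: the same newest resources on every view
print(len(random_seen))  # typically most of the 100 resources
```

Reverse-chronological ordering always links to the same few recent resources, while random ordering spreads links across the whole resource set, which helps keep older resources connected.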
26. Knowledge Management Institute
Efficient Navigability in Social Tagging
Systems
Instead of random ordering, we use hierarchical
background knowledge for ranking paginated
resources [Kleinberg 2001].
J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS) 14. MIT Press, 2001.
27. Knowledge Management Institute
Social Computational Systems
Implications
• Navigability in social tagging systems is an emergent
system property
• Some of our initial intuitions about navigability (tag
clouds) are wrong
• The UI represents an opportunity to influence
emergent system properties
28. Knowledge Management Institute
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
29. Knowledge Management Institute
X2=Semantics
Question:
How can we Measure and Influence
Emergent Semantics in Social Tagging
Systems?
31. Knowledge Management Institute
Pragmatics influence emergent properties
Motivations for Tagging
• Future Retrieval
• Contribution and Sharing
• Attracting Attention (Flickr)
• Play and Competition (ESP Game)
• Self Presentation
• Opinion Expression
• Task Organization (“toread”)
• Social Signalling (“for:scott”)
• Money (Amazon Mechanical Turk)
• Categorization / Description
Kinds of Tags
• Content-based
• Context-based
• Attribute Tags
• Ownership Tags
• Subjective Tags
• Organizational Tags
• Purpose Tags
• Factual Tags
• Personal Tags
• Self-referential tags
• Tag Bundles
This suggests that emergent semantics are influenced by the
underlying motivation for tagging
(cf., for example, [Heckner 2009])
Gupta et al. 2010
32. Knowledge Management Institute
Why Do Users Tag?
One (of many) answers:
To categorize or to describe resources
                        Categorizer (C)    Describer (D)
Goal                    later browsing     later search
Change of vocabulary    costly             cheap
Size of vocabulary      limited            open
Tags                    subjective         objective
[Figure: example tag clouds for a categorizer and a describer]
Semantic Assumption:
Categorizers produce more precise emergent semantics than Describers.
M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.
33. Knowledge Management Institute
Measures for
Tagging Pragmatics vs. Tag Semantics
Categorizer/Describer:
• Size of tag vocabulary
• Tags per resource
• Tags per post
• Orphaned tags
Semantics: [Cattuto et al 2008]
• Co-occurrence count
• Cosine similarity (TagCont)
• FolkRank [Hotho et al 2006]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
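As a rough illustration of how such pragmatic measures could be computed per user, the sketch below derives vocabulary size, tags per post, and an orphaned-tag ratio from a user's posts. The exact definitions are assumptions for this example: in particular, a tag is treated as "orphaned" if it is used at most ceil(max usage / 100) times, and the function name and data layout are hypothetical. Tags per resource is omitted.

```python
from collections import Counter
from math import ceil

def pragmatic_measures(posts):
    """posts: non-empty list of (resource, tags) for a single user."""
    usage = Counter(tag for _, tags in posts for tag in tags)
    vocab_size = len(usage)
    tags_per_post = sum(len(tags) for _, tags in posts) / len(posts)
    # Assumed definition: orphaned = used at most ceil(max usage / 100) times.
    threshold = ceil(max(usage.values()) / 100)
    orphans = sum(1 for count in usage.values() if count <= threshold)
    return {
        "vocab_size": vocab_size,
        "tags_per_post": tags_per_post,
        "orphan_ratio": orphans / vocab_size,
    }

# A user who files resources into a few reused categories.
categorizer = [("r1", {"work"}), ("r2", {"work"}), ("r3", {"fun"}), ("r4", {"fun"})]
# A user who describes each resource with many, mostly one-off tags.
describer = [("r1", {"python", "tutorial", "2010"}), ("r2", {"python", "graph", "bfs"})]

print(pragmatic_measures(categorizer))
print(pragmatic_measures(describer))
```

In this toy data, the categorizer has a small vocabulary, one tag per post, and no orphaned tags, while the describer has a larger vocabulary, three tags per post, and mostly orphaned tags.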
34. Knowledge Management Institute
Experimental Setup
As dataset, we used
• a crawl from Delicious (University of Kassel)
• from November 2006 (containing 667,128 users)
• 10,000 most common tags, minimum of 100 resources / user
For semantic grounding, we used
• WordNet as a knowledge base (cf. [Cattuto et al. 2008])
• Jiang-Conrath as a measure of similarity
  • combines the taxonomic path length between two nodes in WordNet with an information-theoretic similarity measure [Jiang and Conrath 1997]
• A WordNet library as an implementation
  • by [Pedersen et al 2004]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
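For reference, the Jiang-Conrath measure is commonly written as a distance between two WordNet concepts. The notation below (IC for information content, lcs for the lowest common subsumer, p(c) for the corpus probability of concept c) is the standard textbook form and is not taken from the slides.

```latex
\mathrm{IC}(c) = -\log p(c)
\qquad
d_{\mathrm{JC}}(c_1, c_2) = \mathrm{IC}(c_1) + \mathrm{IC}(c_2) - 2\,\mathrm{IC}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)
```

A smaller distance means the two concepts are semantically closer: they share a specific (high information content) common ancestor in the WordNet taxonomy.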
35. Knowledge Management Institute
Results
Describers outperform categorizers on precision of
emergent tag semantics
• Categorizers perform worse than random users
• Describers perform better than random users
[Figure: precision of emergent tag semantics for categorizers and describers, each compared with random users]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
36. Knowledge Management Institute
Social Computational Systems
Implications
• Semantics in social tagging systems is an emergent
system property
• Some of our initial intuitions about semantics are
wrong
  • describers outperform categorizers on a particular task
• User behavior influences emergent system properties
37. Knowledge Management Institute
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
38. Knowledge Management Institute
Social-Computational Systems:
Conclusions
1. Certain properties of social-computational systems (such as
navigability or semantics) are emergent properties; they are
beyond the direct influence of system designers.
2. The user interface is an opportunity to influence these emergent
properties.
3. If user motivation or behavior changes over time, system
properties may change.
It is through the process of social computation, i.e.
the combination of social behavior and algorithmic computation,
that system properties and functions emerge.
39. Knowledge Management Institute
Web-Science: A Call to Action
As web scientists, we need to
• study and map the complex relationships between user behavior,
user interfaces and emergent properties
• understand the potentials and limits of influencing emergent
system properties
As web engineers, we need to
• shift perspective away from designing towards shaping social-
computational systems
• reconcile user behaviors with desired system properties
40. Knowledge Management Institute
End of Presentation
Thank you!
Markus Strohmaier
Graz University of Technology, Austria
in collaboration with:
H.P. Grahsl, D. Helic, C. Körner, R. Kern, C. Trattner,
D. Benz, A. Hotho, G. Stumme
41. Knowledge Management Institute
Related Publications
• Intent and motivation in social media
M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social
Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington,
DC, USA, May 23-26, 2010.
• Social computation and emergent structures
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge
From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA,
April 26-30, ACM, 2010.
D. Helic, C. Trattner, M. Strohmaier and K. Andrews, On the Navigability of Social Tagging Systems, The 2nd
IEEE International Conference on Social Computing (SocialCom 2010), Minneapolis, Minnesota, USA, 2010.
• Knowledge acquisition from social media
C. Wagner, M. Strohmaier, The Wisdom in Tweetonomies: Acquiring Latent Conceptual Structures from
Social Awareness Streams, Semantic Search 2010 Workshop (SemSearch2010), in conjunction with the 19th
International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.