ASC Research given at the PARC Forum on 2008-05-01

Ed H. Chi 

Augmented Social Cognition Area 
Palo Alto Research Center 

Peter Pirolli, Lichan Hong, Bongwon Suh, Les Nelson, Rowan Nairn 
Alumni: Raluca Budiu, Bryan Pendleton, Niki Kittur, Todd Mytkowicz 

Image from: http://www.flickr.com/photos/ourcommon/480538715/
2008-05-01 Ed H. Chi ASC Overview 1

And how are they related? 


12 years of work in foraging and sensemaking
Information Scent 
 

–  WUFIS / IUNIS (Basic scent modeling algorithms) 
[CHI2000,2001] 
–  Bloodhound (Simulation of web navigation) [CHI2003] 
–  LumberJack (Log analysis of user needs) [CHI2002] 
Foraging 
 

–  ScentTrails [TOCHI2003] 
–  ScentIndex [CHI2004] 
–  ScentHighlight [IUI2005] 
–  Visual foraging of highlighted text [to appear, HCII] 
–  Proximal Search [to be published] 
Sensemaking 
 

–  Visualization of Web Ecologies [CHI98] 
–  Visualization Spreadsheets [Infovis97, Infovis99] 


Source: Starship Exeter

Lessig

http://en.wikipedia.org/wiki/Star_Trek_fan_productions


Groups use systems to make sense of and share complex topics and materials.

Wikipedia (social status) 
 

Slashdot (karma points) 
 

WikiHow.com 
 

Lostpedia.com 
 


Systems that evolve structures that can be used to organize information.

Del.icio.us  
 

Flickr  
 

YouTube  
 

Friendster 
 


Counting votes 
 

–  A way to increase signal‐to‐noise ratio 
–  Information faddishness 
Examples: 
 

–  Digg.com 
–  Most bookmarked items on del.icio.us 

–  Estimating the weight of an ox or 
temperature of a room 
–  The true value of a stock 

–  PageRank or Hub / Authority algorithms 
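
The ox-weight example can be illustrated with a small simulation. This is a minimal sketch, not something from the talk: the true weight, the noise level, and the number of guessers are all hypothetical, and it assumes the guesses are independent and unbiased.

```python
import random
import statistics


def crowd_estimate(guesses):
    """Aggregate independent guesses by taking their mean."""
    return statistics.mean(guesses)


# Hypothetical setup: 500 people guess the weight of an ox.
rng = random.Random(0)
true_weight = 1198  # assumed value, in pounds
guesses = [true_weight + rng.gauss(0, 100) for _ in range(500)]

# Error of the averaged guess versus the average error of one person.
crowd_error = abs(crowd_estimate(guesses) - true_weight)
typical_individual_error = statistics.mean(
    [abs(g - true_weight) for g in guesses]
)
```

The error of the mean can never exceed the mean individual error, which is the sense in which counting votes raises the signal-to-noise ratio. The benefit shrinks when guesses are correlated, as in an information cascade.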


Diagram: spectrum toward heavier collaboration: Voting systems, Col. Information Structures, Collaborative Co-Creation
Examples shown: Digg.com, PageRank, Slashdot, IBM dogear, Del.icio.us, Flickr, Naver, eHow.com, Wikipedia


Diagram: spectrum toward heavier collaboration: Voting systems, Col. Information Structures, Collaborative Co-Creation
Examples shown: Digg.com, PageRank, Slashdot, IBM dogear, Del.icio.us, Flickr, Naver, eHow.com, Wikipedia

Voting systems: understanding of micro-economics
•  of foraging [PARC]
•  Personal vs. group [Huberman, Adamic]
•  Wisdom of Crowd [Surowiecki]
•  Information cascades [Anderson and Holt]

Col. Information Structures: understanding of info and social networks
•  Tag network analysis [PARC, Golder, Yahoo]
•  Structural holes (info brokerage) [Burt]
•  Network constraints and structure [various]
•  Semantic of semiotic structures / words [IR, LSA]

Collaborative Co-Creation: understanding of conflicts and coordination
•  Wikipedia coordination costs [PARC]
•  Invisible Colleges [Sandstrom]
•  Interference effects [Pirolli]
•  Co-laboratories [Olson and Olson]
•  Community networks / Col. Problem solving [Carroll]


Cognition: the ability to remember, think, and reason; the faculty of knowing.
Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
–  (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
Augmented Social Cognition: supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.


Characterization  Models

Evaluations  Prototypes


John Tukey  
(not a direct quote) 


(joint work with Niki Kittur, Bongwon Suh, 
Bryan Pendleton) 

Aniket Kittur, Bongwon Suh, Bryan Pendleton, Ed H. Chi. He Says, She
Says: Conflict and Coordination in Wikipedia. In Proc. of ACM Conference
on Human Factors in Computing Systems (CHI2007), pp. 453-462, April
2007. ACM Press. San Jose, CA.


“Wikipedia is the best thing ever. Anyone in the world can write
anything they want about any subject, so you know you’re getting the
best possible information.”
– Steve Carell, The Office


Understanding coordination costs is vital for the long-term viability of collaborative information environments

Data: 
 

–  Entire dump on July 2, 2006 
–  58 million revisions 
–  4.7 million wiki pages 
–  2.4 million article pages 
–  800 gigabytes 


source: xkcd


Decrease in proportion of edits to article page 
 

Chart: proportion of edits to article pages by year, 2001 to 2006 (y-axis: edit proportion, 0.5 to 1); callout: 70%


Increase in proportion of edits to user talk 
 
Chart: proportion of edits to user talk pages by year, 2001 to 2006 (y-axis: edit proportion, 0 to 0.2); callout: 8%


Increase in proportion of edits to procedure 
 

Chart: proportion of edits to procedure pages by year, 2001 to 2006 (y-axis: edit proportion, 0 to 0.2); callout: 11%


Increase in proportion of edits that are reverts 
 

Chart: proportion of edits that are reverts by year, 2001 to 2006 (y-axis: edit proportion, 0 to 0.2); callout: 7%


Increase in proportion of edits reverting vandalism 
 
Chart: % edits (marked vandalism) by year, 2001 to 2005 (y-axis: edit proportion, 0 to 0.03); callout: 1-2%


Conflict and coordination costs are growing
 

–  Less direct work (articles) 
+  More indirect work (article talk, user, procedure) 
+  More maintenance work (reverts, vandalism) 

Stacked chart: percentage of total edits by type per year, 2001 to 2006 (y-axis: 60% to 100%); series: Article, Article Talk, User, User Talk, Other, Maintenance
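
The yearly trends above are shares of all edits falling on each page type. The sketch below shows one way such shares could be computed. It assumes a hypothetical list of (year, page type) records, one per revision; the sample data is invented and is not taken from the Wikipedia dump.

```python
from collections import Counter, defaultdict


def edit_proportions(revisions):
    """Return {year: {page_type: share of that year's edits}}."""
    per_year = defaultdict(Counter)
    for year, page_type in revisions:
        per_year[year][page_type] += 1
    return {
        year: {ptype: n / sum(counts.values()) for ptype, n in counts.items()}
        for year, counts in per_year.items()
    }


# Invented sample: direct work (article edits) shrinks as a share over time.
sample = [
    (2001, "article"), (2001, "article"), (2001, "article"),
    (2001, "article talk"),
    (2006, "article"), (2006, "article"),
    (2006, "user talk"), (2006, "article talk"),
]
shares = edit_proportions(sample)
```

In this invented sample the article share falls from 0.75 in 2001 to 0.5 in 2006, the same kind of shift from direct to indirect work that the charts report.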


Conflict is growing at the global level, and we have 
 

some idea about where it is. 
But what defines conflict inside Wikipedia? 
 

Build a characterization model of article conflict 
 

–  Identify metrics relevant to conflict 
–  Automatically identify high‐conflict articles 


“Controversial” tag

Use the number of revisions tagged controversial
 


Possible metrics for identifying conflict in articles

Metric type                      Page type
Revisions (#)                    Article, talk, article/talk
Page length                      Article, talk, article/talk
Unique editors                   Article, talk, article/talk
Unique editors / revisions       Article, talk
Links from other articles        Article, talk
Links to other articles          Article, talk
Anonymous edits (#, %)           Article, talk
Administrator edits (#, %)       Article, talk
Minor edits (#, %)               Article, talk
Reverts (#, by unique editors)   Article


5x cross-validation, R² = 0.897

Scatter plot: predicted vs. actual controversial revisions (both axes 0 to 10000)

Highly weighted features of conflict model:
 

Revisions (talk) 

Minor edits (talk) 

Unique editors (talk) 

Revisions (article) 

Unique editors (article) 

Anonymous edits (talk) 

Anonymous edits (article) 
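
The cross-validated fit reported for the conflict model can be sketched as follows. The slides do not name the regression method, so this sketch substitutes ordinary least squares on a single feature. It also assumes "5x" means five folds, and all data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def cross_validated_r2(xs, ys, folds=5):
    """R^2 of held-out predictions, pooled over interleaved folds."""
    predictions = [0.0] * len(xs)
    for k in range(folds):
        train = [i for i in range(len(xs)) if i % folds != k]
        held_out = [i for i in range(len(xs)) if i % folds == k]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            predictions[i] = a + b * xs[i]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical articles: talk-page revisions vs. controversial revisions.
talk_revisions = [10, 40, 80, 120, 200, 260, 300, 420, 500, 640]
controversial = [25, 78, 170, 236, 410, 515, 590, 850, 1010, 1275]
r2 = cross_validated_r2(talk_revisions, controversial)
```

Each article is predicted only by a model that never saw it, so the pooled R² estimates how well the metrics would identify conflict in unseen articles. A multi-feature model would follow the same loop with a different fitting step.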



(joint work with Todd Mytkowicz) 

Ed H. Chi, Todd Mytkowicz. Understanding the Efficiency of Social
Tagging Systems using Information Theory. In Proc. of ACM Conference
on Hypertext 2008. (to appear). ACM Press, 2008. Pittsburgh, PA.


Diagram: tagging as a noisy channel. Elements shown: Topics, Concepts, Documents, Users, Tags T1…Tn, Encoding, Decoding, Noise


How do we evaluate a tagging system? 
 

Given a tag vocabulary, how effective is it in describing a set of URLs?

Approach: 
 

–  Crawled the del.icio.us bookmark set 
–  Information theory provides a nice framework for analysis 


Measures the uncertainty about a particular event associated with a 
 

probability distribution 

Thought experiment: drawing colored balls out of a box 
 

–  Maximum when p is uniform, no single color predominates 
–  Minimum when p is 1, only one color 
Entropy measures the amount of information associated with a drawn ball.


Entropy increases when 
 

–  (a) total number of events x increases 
–  (b) distribution on X becomes more uniform 
Conditional Entropy, H(Y|X) 
 

–  Measures how much entropy a random variable Y has 
remaining if we have already learned completely the value 
of a second variable X. 
–  Can be understood by thinking about the joint entropy 
H(Y|X) = H(X,Y) – H(X) 
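
The two measures above can be sketched in a few lines. The code uses the standard Shannon definition, H(X) = -Σ p(x) log2 p(x), estimated from observed counts, and the identity H(Y|X) = H(X,Y) – H(X) from the slide. The balls and the (tag, document) pairs are hypothetical illustrations, not data from the del.icio.us crawl.

```python
import math
from collections import Counter


def entropy(events):
    """Shannon entropy in bits of a list of observed events."""
    counts = Counter(events)
    total = len(events)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """H(Y|X) = H(X,Y) - H(X) for a list of (x, y) observations."""
    xs = [x for x, _ in pairs]
    return entropy(pairs) - entropy(xs)


# Thought experiment: colored balls in a box.
h_uniform = entropy(["red", "green", "blue", "yellow"])  # no color dominates
h_single = entropy(["red"] * 4)                          # only one color

# Hypothetical bookmarks as (tag, document) pairs.
bookmarks = [
    ("python", "doc1"), ("python", "doc2"),
    ("travel", "doc3"), ("travel", "doc3"),
]
h_doc_given_tag = conditional_entropy(bookmarks)
```

Here the uniform box gives 2 bits and the single-color box gives 0. Knowing the tag leaves 0.5 bits of uncertainty about the document: "python" still leaves two candidates, while "travel" identifies its document exactly. A lower H(document | tag) means the tags work better as a guide.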

2008-03-28 ICWSM Poster

Source: Hypertext 2008 study on del.icio.us (Chi & Mytkowicz)


Entropy can be used effectively as a measure for social tagging systems.
As a map, over time, social tagging systems seem to lose their ability to guide users efficiently.
–  However, there are ways to deal with this pressure.


Create a Living Laboratory as a platform to 
develop, test, and market innovations 


Joint work with  
Rowan Nairn, Peter Lai, Lichan Hong, Lawrence Lee 


fitness
 

Java, AJAX 
 

Ireland travel 
 

Web2.0 
 

Social search 
 

Second Life 
 

Su 
 


Joint work with  
Bongwon Suh, Aniket Kittur, Bryan Pendleton 

Bongwon Suh, Ed H. Chi, Aniket Kittur, Bryan A. Pendleton. Lifting the 
Veil: Improving Accountability and Social Transparency in Wikipedia with 
WikiDashboard. In Proceedings of the ACM Conference on Human‐factors 
in Computing Systems (CHI2008). ACM Press, 2008. Florence, Italy. 


Factual accuracy 
 

Motives of editors 
 

Uncertain expertise 
 

Volatility 
 

Spotty coverage 
 

Unproven/non‐independent source 
 


Social translucence for effective communication and collaboration [Erickson and Kellogg 2002]
–  Make socially significant information visible and salient
–  Support awareness of the rules and constraints
–  Accountability for actions

Wikis can be a prime candidate 
 

–  Every edit is logged and retrievable 
–  WikiScanner.com: analyze anonymous IP edits 
–  WikiRage.com: top edits 


List of every edit that a user made
 

Let readers examine each individual revision for validity, which is hard to accomplish 
 
when only provided with aggregate visual summaries. 


Surfacing hidden social context to users 
 

For readers 
 
–  Any incidents in the past, e.g., a sudden burst of edits?
–  Who are the editors?
–  What are their motivations / points of view / expertise / topics of interest?
–  Help them judge the quality/trustworthiness/usefulness of an article
For writers 
 
–  Measure expertise / contribution / reputation 
–  Motivate them to be more active / responsible (?) 


Joint work with 
Lichan Hong, Raluca Budiu, Les Nelson, Peter Pirolli  

Lichan Hong, Ed H. Chi, Raluca Budiu, Peter Pirolli, and Les Nelson. SparTag.us: A Low
Cost Tagging System for Foraging of Web Content. In Proceedings of the Advanced
Visual Interfaces (AVI2008), (to appear). ACM Press, 2008.


Interaction costs determine number of people who participate
Surplus of attention & motivation at small transaction costs
Therefore…
Important to keep interaction costs low
Chart: # people willing to produce for “free” (y-axis) vs. cost of participation (x-axis)


In situ tagging while reading
 

–  No new window
–  Clicking vs typing
Tagging + highlighting
 


Intuition: sub‐doc nuggets useful 
 
–  Entities, facts, concepts, paragraphs 
Annotations attached to  paragraphs 
 

Portable across pages and other contents (e.g. 
 
Word documents) 
–  Dynamic pages 
–  Duplicate content 


Encoding  Retrieval

“video  people  talks technology”

http://www.ted.com/index.php/speakers

http://edge.org

“science  research cognition”


Crowdsourcing [collaborative co‐creation] 
 

–  Is there a wisdom of the crowd in Wikipedia? 
Collective Intelligence [folksonomy] 
 

–  Are social tags collectively gathered useful for organization of a large 
document collection? 
Collective Averaging [social attention]  
 

–  Do voting systems identify the best-quality and most interesting information for that community?
Participation Architecture [AJAX]  
 

–  Does lowering the interaction cost barrier increase participation 
productively? 
Expertise finding [social networking]

–  Does finding experts through a social network get you to better-quality information sooner?


Research Vision: Understand how social computing 
 

systems can enhance the ability of a group of 
people to remember, think, and reason. 
Living Laboratory: Create applications that harness 
 

collective intelligence to improve knowledge 
capture, transfer, and discovery. 

http://asc-parc.blogspot.com
 

http://www.edchi.net 
 

echi@parc.com 
 


