Death and edits
Miles Lincoln

LIS590MT
Wikipedia!

You probably already know
what it is!
Bursts in social networks

 Bursts of edits on Wikipedia in particular

 When do those occur?
What can we learn by looking at
spikes in edit frequency?

 How have edit spikes changed over Wikipedia’s ten
  years of existence?

 Does the size of an edit spike correlate to anything?
Bursts in other social networks
Google Trends
Celebrity deaths!
Revision history
But first…

 We need to process the data so that we can answer that
  question
Perl
Regular Expressions (Regex)




A Perl script uses regular expressions to find and output matching pieces of text.
In this case, I am pulling out dates in Wikipedia’s day-month-year format and rewriting them in a more machine-readable MM/DD/YYYY format.

Example output: 11/08/2011
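The date rewriting described above can be sketched roughly as follows. This is a Python illustration, not the Perl script from the talk; the function name `extract_dates` and the sample history line are assumptions made for the example.

```python
import re

# Month names as they appear in Wikipedia's "8 November 2011" style dates.
MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Day (1-2 digits), month name, four-digit year.
DATE_RE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")

def extract_dates(text):
    """Return every day-month-year date in `text` rewritten as MM/DD/YYYY."""
    return [f"{MONTHS[month]:02d}/{int(day):02d}/{year}"
            for day, month, year in DATE_RE.findall(text)]

# Hypothetical revision-history line, pasted as plain text.
line = "02:14, 8 November 2011 ExampleUser (talk | contribs)"
print(extract_dates(line))  # ['11/08/2011']
```

Each edit in the pasted history yields one date, so the output list has one entry per edit.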
Data manipulation


Copy/paste the revision history of wiki pages into a text document, which I feed to my Perl script

The result is a list with one date entry per edit, so each date repeats once for every edit made that day

Copying/pasting isn’t super elegant, but I haven’t gotten LWP::UserAgent to work yet
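One possible way around the copy/paste step is to ask the MediaWiki API for revision timestamps directly. The sketch below is Python rather than Perl and is only an outline under assumptions: the helper names are made up for illustration, the User-Agent string is a placeholder, and it reads a single response without following the API's continuation tokens, so long histories would need more work.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

def build_query_url(title, limit="max"):
    """Build a revisions query that asks only for timestamps."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp",
        "rvlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)

def dates_from_response(data):
    """Turn ISO timestamps (UTC) in an API response into MM/DD/YYYY strings."""
    dates = []
    for page in data.get("query", {}).get("pages", {}).values():
        for revision in page.get("revisions", []):
            year, month, day = revision["timestamp"][:10].split("-")
            dates.append(f"{month}/{day}/{year}")
    return dates

def fetch_revision_dates(title):
    """Fetch one batch of revision dates (no continuation handling)."""
    request = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "edit-burst-sketch/0.1 (placeholder contact)"},
    )
    with urllib.request.urlopen(request) as response:
        return dates_from_response(json.load(response))
```

The parsing step is kept separate from the network call so it can be checked against a saved response.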
Excel!

 Throw my lists of dates into a pivot table, which shows how often each date occurs

 Some VLOOKUP magic allows me to combine these edit frequencies of individual actors into one big list covering every day from 6/1/2001 to the present
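The pivot-table and lookup steps can be approximated in a few lines of Python. This is a sketch of the same idea, not the spreadsheet from the talk; the actor labels and dates below are placeholders.

```python
from collections import Counter
from datetime import date, datetime, timedelta

def edits_per_day(date_strings):
    """Count how often each MM/DD/YYYY date occurs (the pivot-table step)."""
    return Counter(datetime.strptime(s, "%m/%d/%Y").date() for s in date_strings)

def combine(per_actor, start, end):
    """Build one row per calendar day with a count for every actor.

    Days with no edits get 0, which is the job the lookup step does
    when it joins each actor's frequencies onto a full list of days.
    """
    names = sorted(per_actor)
    rows = []
    day = start
    while day <= end:
        rows.append((day, [per_actor[name].get(day, 0) for name in names]))
        day += timedelta(days=1)
    return names, rows

# Placeholder data for two hypothetical actors.
counts = {
    "actor_1": edits_per_day(["11/08/2011", "11/08/2011", "11/10/2011"]),
    "actor_2": edits_per_day(["11/09/2011"]),
}
names, rows = combine(counts, date(2011, 11, 8), date(2011, 11, 10))
for day, values in rows:
    print(day.isoformat(), values)
```

Keeping the counts in dictionaries avoids filling a cell for every actor and every day until the final table is built.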
Et voilà!
Problems

9 actors over 10 years means close to 100k cells
Excel is not built for speed
MATLAB might work better
What does the data look like over
time?

 Yearly windows from 6/1 to 5/31, starting in 2001 (when Wikipedia’s current edit numbers begin) and ending in 2010 (when all of the bursts have settled down)
6/1/2001-5/31/2002
[Chart: Series1 to Series9; x-axis 6/1/01 to 5/1/02 in monthly ticks; y-axis 0 to 1.2]
6/1/2002-5/31/2003
[Chart: Series1 to Series9; x-axis 6/1/02 to 5/1/03 in monthly ticks; y-axis 0 to 14]
6/1/2003-5/31/2004
[Chart: Series1 to Series9; x-axis 6/1/03 to 5/1/04 in monthly ticks; y-axis 0 to 30]
6/1/2004-5/31/2005
[Chart: Series1 to Series9; x-axis 6/1/04 to 5/1/05 in monthly ticks; y-axis 0 to 60]
6/1/2005-5/31/2006
[Chart: Series1 to Series9; x-axis 6/1/05 to 5/1/06 in monthly ticks; y-axis 0 to 30]
6/1/2006-5/31/2007
[Chart: Series1 to Series9; x-axis 6/1/06 to 5/1/07 in monthly ticks; y-axis 0 to 50]
6/1/2007-5/31/2008
[Chart: Series1 to Series9; x-axis 6/1/07 to 5/1/08 in monthly ticks; y-axis 0 to 400]
6/1/2008-5/31/2009
[Chart: Series1 to Series9; x-axis 6/1/08 to 5/1/09 in monthly ticks; y-axis 0 to 80]
6/1/2009-5/31/2010
[Chart: Series1 to Series10; x-axis 6/1/09 to 5/1/10 in monthly ticks; y-axis 0 to 200]
Spike sizes over the years
[Chart: Series2; x-axis 2002 to 2009; y-axis 0 to 400]
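The slide does not say how spike size was measured. One plausible reading, assumed here, is the largest number of edits on any single day within each 6/1 to 5/31 window. A Python sketch under that assumption, with made-up counts:

```python
from datetime import date

def window_start_year(day):
    """Return the starting year of the 6/1-5/31 window that contains `day`."""
    return day.year if day.month >= 6 else day.year - 1

def peak_per_window(daily_counts):
    """Map each window's starting year to its largest single-day edit count."""
    peaks = {}
    for day, count in daily_counts.items():
        start = window_start_year(day)
        peaks[start] = max(peaks.get(start, 0), count)
    return dict(sorted(peaks.items()))

# Invented numbers, for illustration only.
daily_counts = {
    date(2008, 2, 10): 120,
    date(2008, 2, 11): 45,
    date(2008, 5, 31): 5,
    date(2008, 6, 1): 7,
    date(2009, 10, 3): 60,
}
print(peak_per_window(daily_counts))  # {2007: 120, 2008: 7, 2009: 60}
```

Other definitions, such as total edits in the week around the peak, would give a different curve.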
Let’s take a closer look at the more
interesting actors
Actors #4-9 6/1/2008-5/31/2009
[Chart: Series1 to Series6; x-axis 6/1/08 to 5/1/09 in monthly ticks; y-axis 0 to 80]
Actors #4-9 6/1/2008-5/31/2009 (log)
[Chart: Series1 to Series6; x-axis 6/1/08 to 5/1/09 in monthly ticks; y-axis 0 to 2]
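Taking the logarithm of the daily counts keeps one very large spike from flattening everything else on the chart. The sketch below is in Python and assumes base 10, which seems consistent with the axis ranges on these slides but is not stated in them. Leaving zero-edit days as gaps is also an assumption, since log(0) is undefined.

```python
import math

def log_scale(counts):
    """Return log10 of each daily count; days with no edits become None."""
    return [math.log10(count) if count > 0 else None for count in counts]

# A quiet day, a single edit, and two bursts of increasing size.
scaled = log_scale([0, 1, 10, 100])
print(scaled)
```

On this scale a day with 100 edits sits only two units above a day with one edit, so smaller bursts remain visible next to the largest one.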
One actor at a time ~10 years
Actor #1 (date of death: 6/27/2001): edits/day
[Chart: Series1; x-axis 6/28/01 to 6/28/11 in yearly ticks; y-axis 0 to 14]
Actor #1: log(edits)/day
[Chart: Series1; x-axis 6/28/01 to 6/28/11 in yearly ticks; y-axis 0 to 1.2]
Actor #7: edits/day
[Chart: Series1; x-axis 9/24/03 to 9/24/11 in yearly ticks; y-axis 0 to 100]
Actor #7: log(edits)/day
[Chart: Series1; x-axis 9/24/03 to 9/24/11 in yearly ticks; y-axis 0 to 2.5]
Actor #8: edits/day
[Chart: Series1; x-axis 12/10/03 to 12/10/10 in yearly ticks; y-axis 0 to 400]
Actor #8: log(edits)/day
[Chart: Series1; x-axis 12/10/03 to 12/10/10 in yearly ticks; y-axis 0 to 3]
Actor #9: edits/day
[Chart: Series1; x-axis 2/28/04 to 2/28/11 in yearly ticks; y-axis 0 to 200]
Actor #9: log(edits)/day
[Chart: Series1; x-axis 2/28/04 to 2/28/11 in yearly ticks; y-axis 0 to 2.5]
If we tweak the data to take
importance into consideration…

 Average gross, adjusted for inflation*
    Only available for a small number of the actors in the sample set
    Taken from boxofficemojo.com
       Extremely reliable source
Actor #8 vs. Actor #9
[Chart: series ledger and swayze; x-axis days 1 to 361; y-axis 0 to 400]
Actor #8 vs. Actor #9 (adjusted)
[Chart: series ledger and swayze adjusted; x-axis days 1 to 361; y-axis 0 to 400]
Actor #8 vs. Actor #9 (adjusted)
[Chart: series ledger log and swayze adjusted log; x-axis days 1 to 361; y-axis 0 to 3]
The same data on Google Trends
-10 days to +40 days (log)
[Chart: series coburn log, peck log, brando log, davis log, palance log, goulet log, ledger log, swayze log; x-axis days 1 to 50; y-axis 0 to 3]
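Comparing actors who died in different years means lining each series up on the day of death rather than on the calendar. A Python sketch of that alignment follows; the function name, dates, and counts are placeholders. It returns 51 points for -10 to +40 inclusive, whereas the chart axis shows 50 ticks, so the exact endpoints used in the talk are not certain.

```python
from datetime import date, timedelta

def align_to_event(daily_counts, event_day, before=10, after=40):
    """Return (offset, count) pairs for days around `event_day`.

    Offsets run from -before to +after inclusive; days without data count as 0.
    """
    return [
        (offset, daily_counts.get(event_day + timedelta(days=offset), 0))
        for offset in range(-before, after + 1)
    ]

# Invented counts around a hypothetical event date.
event = date(2010, 3, 15)
counts = {date(2010, 3, 14): 2, date(2010, 3, 15): 90, date(2010, 3, 16): 40}
print(align_to_event(counts, event, before=2, after=2))
# [(-2, 0), (-1, 2), (0, 90), (1, 40), (2, 0)]
```

Once every series shares the same offsets, the curves can be overlaid or log-scaled together regardless of the year in which each burst happened.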
Other things I should consider

 Age at death

 Cause of death

 Were they still acting?
Future directions

 New sample of Wikipedia pages
   Need to compare more contemporary pages
   Need new metrics for comparison

 Better workflows
Thanks!

 Questions?

 http://www.slideshare.net/mlincol2/informetrics

Death and Edits

  • 1. Death and edits Miles Lincoln LIS590MT
  • 3. Bursts in social networks  Bursts of edits on Wikipedia in particular  When do those occur?
  • 4. What can we learn by looking at spikes in edit frequency?  How have edit spikes changed over Wikipedia’s ten years of existence?  Does the size of an edit spike correlate to anything?
  • 5. Bursts in other social networks
  • 10. But first…  We need to process the data so that we can answer that question
  • 11. Perl
  • 12. Regular Expressions (Regex) Perl script uses regular expressions to find and output matching pieces of text. In this case, I am pulling out dates in Wikipedia’s day month year format and re-writing them in a more machine-readable MM/DD/YYYY format. 11/08/2011
  • 13. Data manipulation Copy/pase the revision history of wiki pages into a text document which I feed to my perl script Results in lists consisting of one date per edit that occurred on that date Copying/pasting isn’t super elegant, but I haven’t gotten LWP/useragent stuff to work yet
  • 14. Excel!  Throw my lists of dates into a pivot table, which shows me the frequency that each date occurs  Some vlookup magic allows me to combine these edit frequencies of individual actors into one big list covering every day from 6/1/2001 to the present
  • 16. Problems 9 actors over 10 years means close to 100k cells Excel is not built for speed Matlab might work better
  • 17. What does the data look like over time?  6/1-5/31 from 2001 (when Wikipedia’s current edit no.’s begin) to 2010 (when all of the bursts have settled down)
  • 18. 6/1/2001-5/31/2002 [Chart: Series1 to Series9; monthly ticks 6/1/01 to 5/1/02; y-axis 0 to 1.2]
  • 19. 6/1/2002-5/31/2003 [Chart: Series1 to Series9; monthly ticks 6/1/02 to 5/1/03; y-axis 0 to 14]
  • 20. 6/1/2003-5/31/2004 [Chart: Series1 to Series9; monthly ticks 6/1/03 to 5/1/04; y-axis 0 to 30]
  • 21. 6/1/2004-5/31/2005 [Chart: Series1 to Series9; monthly ticks 6/1/04 to 5/1/05; y-axis 0 to 60]
  • 22. 6/1/2005-5/31/2006 [Chart: Series1 to Series9; monthly ticks 6/1/05 to 5/1/06; y-axis 0 to 30]
  • 23. 6/1/2006-5/31/2007 [Chart: Series1 to Series9; monthly ticks 6/1/06 to 5/1/07; y-axis 0 to 50]
  • 24. 6/1/2007-5/31/2008 [Chart: Series1 to Series9; monthly ticks 6/1/07 to 5/1/08; y-axis 0 to 400]
  • 25. 6/1/2008-5/31/2009 [Chart: Series1 to Series9; monthly ticks 6/1/08 to 5/1/09; y-axis 0 to 80]
  • 26. 6/1/2009-5/31/2010 [Chart: Series1 to Series10; monthly ticks 6/1/09 to 5/1/10; y-axis 0 to 200]
  • 27. Spike sizes over the years [Chart: Series2; x-axis 2002 to 2009; y-axis 0 to 400]
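One way to get a "spike size" per year is to take the largest daily edit count inside each 6/1 to 5/31 window. The slides do not say how the spike sizes were computed, so the Python sketch below is an assumption: it labels each window by the year in which it starts, and the function names and sample counts are invented for illustration.

```python
from datetime import date

def period_start_year(day):
    """Return the starting year of the 6/1-5/31 window that contains day."""
    return day.year if day.month >= 6 else day.year - 1

def peak_by_period(counts):
    """Map each window's starting year to the largest daily count inside it."""
    peaks = {}
    for day, n in counts.items():
        year = period_start_year(day)
        peaks[year] = max(peaks.get(year, 0), n)
    return dict(sorted(peaks.items()))

# Made-up daily counts spanning two windows.
counts = {
    date(2005, 6, 1): 3,
    date(2006, 2, 14): 25,
    date(2006, 5, 31): 10,
    date(2006, 6, 1): 8,
}
print(peak_by_period(counts))  # {2005: 25, 2006: 8}
```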
  • 28. Let’s take a closer look at the more interesting actors
  • 29. Actors #4-9, 6/1/2008-5/31/2009 [Chart: Series1 to Series6; monthly ticks 6/1/08 to 5/1/09; y-axis 0 to 80]
  • 30. Actors #4-9, 6/1/2008-5/31/2009, log [Chart: Series1 to Series6; monthly ticks 6/1/08 to 5/1/09; y-axis 0 to 2]
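The log view compresses large spikes so that smaller ones stay visible on the same axis. A minimal Python sketch follows, assuming a base-10 logarithm (the slides do not state the base) and assuming days with zero edits are plotted as 0, since log(0) is undefined; `log_counts` is a hypothetical helper name.

```python
import math

def log_counts(counts):
    """Return log10 of each daily edit count, mapping zero-edit days to 0.0."""
    return [math.log10(c) if c > 0 else 0.0 for c in counts]

# Made-up daily counts: a quiet stretch followed by a spike.
print(log_counts([0, 1, 10, 100]))
```

Note that under this convention a day with one edit and a day with no edits both plot at 0, so the log view cannot tell them apart.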
  • 31. One actor at a time ~10 years
  • 32. Actor #1 (DoD: 6/27/2001): edits/day [Chart: Series1; yearly ticks 6/28/01 to 6/28/11; y-axis 0 to 14]
  • 33. Actor #1: log(edits)/day [Chart: Series1; yearly ticks 6/28/01 to 6/28/11; y-axis 0 to 1.2]
  • 34. Actor #7: edits/day [Chart: Series1; yearly ticks 9/24/03 to 9/24/11; y-axis 0 to 100]
  • 35. Actor #7: log(edits)/day [Chart: Series1; yearly ticks 9/24/03 to 9/24/11; y-axis 0 to 2.5]
  • 36. Actor #8: edits/day [Chart: Series1; yearly ticks 12/10/03 to 12/10/10; y-axis 0 to 400]
  • 37. Actor #8: log(edits)/day [Chart: Series1; yearly ticks 12/10/03 to 12/10/10; y-axis 0 to 3]
  • 38. Actor #9: edits/day [Chart: Series1; yearly ticks 2/28/04 to 2/28/11; y-axis 0 to 200]
  • 39. Actor #9: log(edits)/day [Chart: Series1; yearly ticks 2/28/04 to 2/28/11; y-axis 0 to 2.5]
  • 40. If we tweak the data to take importance into consideration…  Average gross, adjusted for inflation*  Only available for a small number of the actors in the sample set  Taken from boxofficemojo.com  Extremely reliable source
  • 41. Actor #8 vs. Actor #9 [Chart: series ledger and swayze; x-axis 1 to 361; y-axis 0 to 400]
  • 42. Actor #8 vs. Actor #9 (adjusted) [Chart: series ledger and swayze adjusted; x-axis 1 to 361; y-axis 0 to 400]
  • 43. Actor #8 vs. Actor #9 (adjusted) [Chart: series ledger log and swayze adjusted log; x-axis 1 to 361; y-axis 0 to 3]
  • 44. The same data on Google Trends
  • 45. -10 days to +40 days (log) [Chart: series coburn log, peck log, brando log, davis log, palance log, goulet log, ledger log, swayze log; x-axis 1 to 50; y-axis 0 to 3]
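Comparing actors on one axis means lining each series up on its own event date rather than on the calendar. The Python sketch below shows one way to cut such a window. It assumes daily counts keyed by date and a 50-point window starting 10 days before the event, which matches the 1 to 50 axis but is a guess at the exact endpoints; the function name and sample data are invented.

```python
from datetime import date, timedelta

def event_window(counts, event_day, before=10, after=40):
    """Return daily counts for offsets -before .. after-1 around event_day.

    Days missing from counts are treated as zero edits.
    """
    return [counts.get(event_day + timedelta(days=offset), 0)
            for offset in range(-before, after)]

# Made-up counts around a hypothetical event date.
event_day = date(2020, 1, 11)
counts = {date(2020, 1, 1): 1, date(2020, 1, 11): 5, date(2020, 1, 12): 7}
window = event_window(counts, event_day)
print(len(window), window[:12])
```

Because every window has the same length and the event always sits at the same index, the series for different actors can be plotted directly against each other.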
  • 46. Other things I should consider  Age at death  Cause of death  Were they still acting?
  • 47. Future directions  New sample of Wikipedia pages  Need to compare more contemporary pages  Need new metrics for comparison  Better workflows