3. Bursts in social networks
Bursts of edits on Wikipedia in particular
When do those occur?
4. What can we learn by looking at spikes in edit frequency?
How have edit spikes changed over Wikipedia’s ten
years of existence?
Does the size of an edit spike correlate with anything?
12. Regular Expressions (Regex)
A Perl script uses regular expressions to find and
output matching pieces of text.
In this case, I am pulling out dates in Wikipedia’s
day month year format and re-writing them in a
more machine-readable MM/DD/YYYY format.
11/08/2011
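The date rewrite described above can be sketched roughly as follows. This is a Python illustration of the idea, not the Perl script from the talk; the names `MONTHS`, `DATE_RE`, and `extract_dates`, and the layout of the sample history line, are assumptions made for the example.

```python
import re

# Month names as they appear in "day month year" dates.
MONTHS = {
    "January": 1, "February": 2, "March": 3, "April": 4,
    "May": 5, "June": 6, "July": 7, "August": 8,
    "September": 9, "October": 10, "November": 11, "December": 12,
}

# Matches dates such as "8 November 2011".
DATE_RE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")


def extract_dates(text):
    """Return every day-month-year date in text as an MM/DD/YYYY string."""
    return [
        f"{MONTHS[month]:02d}/{int(day):02d}/{year}"
        for day, month, year in DATE_RE.findall(text)
    ]


# Hypothetical revision-history line; the real page layout may differ.
sample = "14:32, 8 November 2011 ExampleUser (talk | contribs) (1,234 bytes)"
print(extract_dates(sample))  # ['11/08/2011']
```

Each match yields one date string, so a pasted revision history produces one date per edit.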
13. Data manipulation
Copy/paste the revision history of wiki
pages into a text document, which I
feed to my Perl script
The result is a list of dates, with each date
appearing once for every edit made on that date
Copying/pasting isn’t super
elegant, but I haven’t gotten
LWP::UserAgent to work yet
14. Excel!
Throw my lists of dates into a pivot table, which
shows me how often each date occurs
Some vlookup magic allows me to combine
these edit frequencies of individual actors into
one big list covering every day from 6/1/2001 to
the present
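For comparison, the pivot-table and lookup steps can be approximated in a few lines of Python. This is only a sketch of the same idea outside Excel; the function names `count_edits` and `combine`, the actor labels, and the sample dates are made up for illustration and are not data from the study.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def count_edits(date_strings):
    """Pivot-table step: count how often each MM/DD/YYYY date occurs."""
    return Counter(datetime.strptime(s, "%m/%d/%Y").date() for s in date_strings)


def combine(per_actor, start, end):
    """Lookup step: one row per day from start to end, one count per actor."""
    names = sorted(per_actor)
    rows = []
    day = start
    while day <= end:
        # A Counter returns 0 for days with no edits, like a lookup defaulting to 0.
        rows.append((day, [per_actor[name][day] for name in names]))
        day += timedelta(days=1)
    return names, rows


# Hypothetical input: placeholder actors and dates.
edits = {
    "Actor A": ["06/01/2001", "06/01/2001", "06/03/2001"],
    "Actor B": ["06/02/2001"],
}
per_actor = {name: count_edits(dates) for name, dates in edits.items()}
names, rows = combine(per_actor, date(2001, 6, 1), date(2001, 6, 3))
for day, counts in rows:
    print(day.strftime("%m/%d/%Y"), counts)
```

The combined list has a row for every day in the range, including days on which nobody was edited.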
16. Problems
9 actors over 10 years means close to 100k cells
Excel is not built for speed
Matlab might work better
17. What does the data look like over time?
Each year runs from 6/1 to 5/31, starting in 2001 (when Wikipedia’s
current edit counts begin) and ending in 2010 (when all of the bursts
have settled down)
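One way to bucket daily counts into these June-to-May periods is sketched below in Python. It assumes each period is labeled by the calendar year in which it starts; `june_year` and `edits_per_period` are hypothetical helper names, and the sample counts are invented.

```python
from collections import Counter
from datetime import date


def june_year(day):
    """Return the starting year of the 6/1-5/31 period that contains day."""
    return day.year if day.month >= 6 else day.year - 1


def edits_per_period(daily_counts):
    """Sum daily edit counts into one total per 6/1-5/31 period."""
    totals = Counter()
    for day, n in daily_counts.items():
        totals[june_year(day)] += n
    return totals


# Hypothetical daily counts, for illustration only.
daily = {
    date(2001, 6, 1): 2,
    date(2002, 5, 31): 3,
    date(2002, 6, 1): 4,
}
print(dict(edits_per_period(daily)))  # {2001: 5, 2002: 4}
```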
40. If we tweak the data to take importance into consideration…
Average gross, adjusted for inflation*
Only available for a small number of the actors in the
sample set
Taken from boxofficemojo.com
Extremely reliable source