Finding self-organized criticality in collaborative work via repository mining
J. J. Merelo1, P. A. Castillo1, Mario García-Valdez2
1 University of Granada (Spain)
2 Instituto Tecnológico de Tijuana (México)
2. Motivation
Development teams eventually become complex
systems, mainly in collaborative work environments.
Relations and collaborations take place through the
environment.
Pattern mining and analysing social-based information
is a complex problem.
3. Objectives
Analysing self-organization in collaborative work
environments.
Using graphic tools to analyse the dynamics in
collaborative work environments.
To explore and analyse relations-based data:
Do developers self-organize?
Contribute to open science tools and methodologies.
4. Theory
In Statistical Physics, criticality is defined as a type of
behaviour observed when a system undergoes a
phase transition.
A state on the edge between two different types of
behaviour is called the critical state, and in this state
the system is at criticality.
5. Example: The sandpile model
The sandpile model of self-organized criticality:
Dropping an additional grain on the pile may set off
avalanches that slide down the pile's slopes.
Image: http://journal.frontiersin.org/article/10.3389/fnsys.2014.00166/full
6. Small variation, large effect
We add one grain to the pile, so on average the
steepness of the slopes increases.
The slopes evolve to a critical state in which a single
additional grain of sand may either settle on the pile
or trigger an avalanche.
Image: https://es.pinterest.com/pin/222435669066944427/
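The sandpile behaviour described on the two slides above can be tried out with a minimal sketch. It assumes the standard Bak-Tang-Wiesenfeld toppling rule (a cell holding 4 or more grains passes one grain to each neighbour), a square grid with open edges, and hypothetical names `drop_grain` and `simulate`; none of these details come from the slides.

```python
import random


def drop_grain(grid, row, col, threshold=4):
    """Add one grain at (row, col), relax the pile, return the number of topplings."""
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= threshold
        topplings += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:  # grains pushed past the edge are lost
                grid[nr][nc] += 1
                if grid[nr][nc] >= threshold:
                    unstable.append((nr, nc))
    return topplings


def simulate(size=10, grains=5000, seed=1):
    """Drop grains at random cells; return the avalanche sizes and the final grid."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanches = [
        drop_grain(grid, rng.randrange(size), rng.randrange(size))
        for _ in range(grains)
    ]
    return avalanches, grid
```

Calling `simulate()` and sorting the returned avalanche sizes in descending order gives a ranked list that can be inspected for a heavy tail: most grains settle with no toppling, while a few trigger large avalanches.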
7. Our aim
To present the underlying concepts and ideas from
Statistical Physics and nonlinear dynamics that could
explain relations in collaborative work environments.
Find out the dynamics underlying collaboration and
their mechanisms.
8. The study
We examined 4 repositories where the collaborative
writing of scientific papers takes place.
Analysing changes in files, looking for the existence of:
1. a scale free structure
2. long-distance correlations
3. pink noise
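Before any of these three properties can be checked, the size of each change has to be extracted from the repository. The following is a minimal sketch of one way to do that, not necessarily the procedure used in the study: it parses the output of `git log --numstat`, with an assumed `@` marker in front of each commit hash, and the hypothetical helper names `commit_sizes` and `read_log`.

```python
import subprocess


def commit_sizes(numstat_log, suffix=""):
    """Return lines changed (added + deleted) per commit, in log order.

    `suffix` optionally restricts the count to paths ending with it.
    """
    sizes = []
    for line in numstat_log.splitlines():
        if line.startswith("@"):  # assumed marker that opens each commit
            sizes.append(0)
        elif line.strip() and sizes:
            added, deleted, path = line.split("\t", 2)
            # Binary files are reported as "-" and are skipped here.
            if path.endswith(suffix) and added != "-":
                sizes[-1] += int(added) + int(deleted)
    return sizes


def read_log(repo_path):
    """Run git and return the numstat log, oldest commit first."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse", "--numstat",
         "--pretty=format:@%H"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `commit_sizes(read_log("path/to/repo"), suffix=".tex")` would count only changes to LaTeX sources; the path and the suffix are placeholders.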
9. The study
In this report we work on a repository for several papers.
Repositories with a certain “length”: more than 50
commits (changes)
Macro measures extracted from the size of changes.
10. Measures
Several macro measures extracted from the size of
changes to the files in the repository.
• Sequence of changes
• Timeline of commit sizes
• Change sizes ranked in descending order
• Long-distance correlations
• Presence of pink noise (1/f)
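Two of the measures listed above, the ranked change sizes and the correlations between changes some distance apart, can be sketched as follows. This is an illustration under stated assumptions, not the code used for the study: the function names are hypothetical, the scale-free check is reduced to a least-squares slope on log-log axes, and the correlation is the ordinary sample autocorrelation at a given lag.

```python
import math


def ranked(sizes):
    """Change sizes in descending order (rank 1 is the largest change)."""
    return sorted(sizes, reverse=True)


def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank).

    A roughly straight line on log-log axes is what a scale-free
    distribution would look like; zero-sized changes are skipped.
    """
    points = [
        (math.log(rank), math.log(size))
        for rank, size in enumerate(ranked(sizes), start=1)
        if size > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance
```

Autocorrelation values that stay clearly away from zero at large lags would be the signature of long-distance correlations in the sequence of changes.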
15. Presence of pink noise, as measured by the power spectral density (1/f)
For pink noise, the spectrum should present a slope equal to -1.
There is no clear downward trend. The presence of
pink noise is not as clear as the other two characteristics.
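The comparison against a slope of -1 can be made concrete with a short sketch. It assumes a plain periodogram computed with a direct discrete Fourier transform (adequate for the few hundred commits of a typical repository) and a least-squares fit on log-log axes; the names `periodogram` and `loglog_slope` are hypothetical and the slides do not state which estimator was used.

```python
import cmath
import math


def periodogram(series):
    """Return (frequency, power) pairs for the positive frequencies."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


def loglog_slope(spectrum):
    """Least-squares slope of log(power) against log(frequency)."""
    points = [(math.log(f), math.log(p)) for f, p in spectrum if p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx
```

Applying `loglog_slope(periodogram(sizes))` to a sequence of commit sizes gives a single number to compare with -1; a value near 0 would point to uncorrelated (white) noise instead.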
16. Conclusions
After analysing several repositories for scientific papers,
we find that they are in a critical state:
• changes have a scale-free form
• there are long-distance correlations
• pink noise is present (only in some cases)
Open Science + reproducibility: draw your own
conclusions using the programs and data published at:
http://github.com/JJ/literaturame
“Measuring progress in literature and in other creative
endeavours, like programming”