Data quality can be evaluated with knowledge of the processes underlying the production of the data. Understanding users' interactions is a necessary step toward finding which areas are the most curated in OpenStreetMap. By extracting information from each user's contribution history, it is possible to find the users' interaction network and their preferred areas of activity. In this presentation we show how social network analysis over a given area can be performed to obtain a "collaboration score" for a single user, and we present our work on the analysis of the OSM users' social network for Italy.
The collaboration network in OSM - the case of Italy
1. The collaboration network in OSM: the case of Italy.
Maurizio Napolitano
<napo@fbk.eu>
State of the Map 2013
The OpenStreetMap Event
6-8 September 2013
Birmingham, UK
2. How does collaboration work in OpenStreetMap?
What can we understand from the data?
3. Construct the collaborative network
Steps 1-3: simone modifies a tag made by Tim; SteveC adds a point; simone adds a tag
Step 4: Tim assigns a name to a street drawn by simone
Step 5: SteveC adds a tag
Step 6: Tim moves the point
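The construction above can be sketched in a few lines of Python. This is only an illustration, not the code in the osmsna repository: the edit log, the object ids and the `build_edges` helper are hypothetical, and the edge direction (from the user who edits an object to the user who last touched it) is an assumption chosen to match the "corrections made / corrections received" reading used later in the slides.

```python
from collections import Counter

# Hypothetical edit log: (object_id, user) pairs in chronological order.
# User names follow the slide; the object ids are made up for illustration.
edits = [
    ("node/1", "SteveC"),  # SteveC adds a point
    ("node/1", "Tim"),     # Tim moves the point
    ("way/7", "simone"),   # simone draws a street
    ("way/7", "Tim"),      # Tim assigns a name to the street
    ("way/7", "simone"),   # simone modifies the tag made by Tim
]


def build_edges(edit_log):
    """Return a Counter of directed edges (editor, previous_editor).

    Assumption: an edge u -> v means "u modified something last touched by v".
    Consecutive edits by the same user on the same object create no edge.
    """
    last_editor = {}
    edges = Counter()
    for obj, user in edit_log:
        prev = last_editor.get(obj)
        if prev is not None and prev != user:
            edges[(user, prev)] += 1
        last_editor[obj] = user
    return edges


edges = build_edges(edits)
for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst} (weight {weight})")
```

The edge weights count how often one user followed another on the same object, which could later be used to weight the graph.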
5. What we did
OpenStreetMap history data of 3 cities:
- Trento
- Rome
- Milan
source code: https://github.com/napo/osmsna/
social graph
+ users details
6. The amazing tools created by Pascal Neis
How did you contribute to OSM? - user EdoM
Who's around me? - Milan City
7. ABC of SNA
by Michela Ferron
http://www.slideshare.net/fbk.eu/fbk-seminar-michela-ferron-presentation
8. Some social network analysis indicators (1/3)
DEGREE: number of lines incident with a node
IN-DEGREE: number of lines directed into a node
measure of RECEPTIVITY
OUT-DEGREE: number of lines directed from a node to another one
measure of EXPANSIVENESS
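As a minimal sketch of these three indicators, the snippet below counts in-degree, out-degree and total degree from a list of directed edges. The edge list and the `degrees` helper are hypothetical; they only assume that each edge is stored as a (source, target) pair.

```python
from collections import defaultdict

# Hypothetical directed edges (source, target).
edges = [("Tim", "SteveC"), ("Tim", "simone"), ("simone", "Tim")]


def degrees(edge_list):
    """Return {node: {"in": ..., "out": ..., "degree": ...}} for a directed graph."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edge_list:
        out_deg[src] += 1  # a line directed from src
        in_deg[dst] += 1   # a line directed into dst
    nodes = set(in_deg) | set(out_deg)
    return {
        n: {"in": in_deg[n], "out": out_deg[n], "degree": in_deg[n] + out_deg[n]}
        for n in nodes
    }


stats = degrees(edges)
for name in sorted(stats):
    print(name, stats[name])
```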
9. Some social network analysis indicators (2/3)
BETWEENNESS centrality: interactions between two nonadjacent actors
might depend on other actors, who might have some control over the
interactions of the others.
An actor has a high betweenness centrality if he/she lies between
many other actors (technically, on their geodesics)
Prominence = "CONTROL ON COMMUNICATION"
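To make the definition concrete, here is a small sketch of betweenness centrality using Brandes' algorithm for unweighted directed graphs. It is not taken from the talk (the slides credit Gephi for the graphs); the adjacency dictionary is a hypothetical three-user chain, and the scores are left unnormalized.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    adj maps every node to a list of successors; every node must be a key.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Hypothetical chain: simone -> Tim -> SteveC. Tim lies on the only
# geodesic between the two nonadjacent users.
graph = {"simone": ["Tim"], "Tim": ["SteveC"], "SteveC": []}
scores = betweenness(graph)
print(scores)
```

In this toy chain only Tim gets a non-zero score, because the single shortest path from simone to SteveC passes through him.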
10. Some social network analysis indicators (3/3)
DENSITY of a graph: proportion of possible lines that
are actually present in the graph (the ratio of the
number of present lines to the maximum possible)
measure of COHESION
Figure: a HIGH DENSITY graph next to a LOW DENSITY graph
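The ratio described above can be written as a short function. This is a sketch under the usual conventions: a directed graph with n nodes has at most n(n-1) lines, an undirected one half as many, and self-loops are not counted. The `density` helper and the two toy graphs are hypothetical.

```python
def density(n_nodes, n_edges, directed=True):
    """Ratio of present lines to the maximum possible number of lines."""
    if n_nodes < 2:
        return 0.0
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        possible //= 2
    return n_edges / possible


# Two hypothetical directed graphs on 4 nodes (at most 12 lines each).
high = density(4, 10)  # most possible lines are present
low = density(4, 2)    # few possible lines are present
print(f"high density: {high:.2f}, low density: {low:.2f}")
```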
11. In the case of the OpenStreetMap users:
• DEGREE: level of activity in the community
• IN-DEGREE: level of corrections received
• OUT-DEGREE: level of corrections made
• BETWEENNESS: level of collaboration in the community
• DENSITY: community cohesion indicator
12. The three cities
         ROME           MILAN         TRENTO
People   2,638,842      1,247,379     117,307
Area     1,285.31 km²   181.76 km²    157.9 km²
Density  2,100/km²      6,900/km²     740/km²
data & pictures from Wikipedia
15. The social graph - Trento
nodes: 289
edges: 1169
average degree: 4.05
network diameter: 7
graph density: 0.014
modularity: 0.308 | 71 communities
Number of Weakly Connected Components: 64
Number of Strongly Connected Components: 136
graph made with gephi
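For readers unfamiliar with the component counts reported above: a weakly connected component is a group of users linked to each other when edge direction is ignored, so every isolated user counts as a component of its own. The sketch below counts them with a simple traversal; the node list, edge list and `weak_components` helper are hypothetical and unrelated to the Trento data.

```python
def weak_components(nodes, edges):
    """Return the weakly connected components as lists of nodes."""
    # Ignore direction: store each edge both ways.
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            n = stack.pop()
            component.append(n)
            for m in neighbours[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        components.append(component)
    return components


# Hypothetical graph: three connected users and two isolated ones.
users = ["a", "b", "c", "d", "e"]
links = [("a", "b"), ("c", "b")]
parts = weak_components(users, links)
print(len(parts), "weakly connected components")
```

Strongly connected components additionally require a directed path in both directions between every pair of members, which is why that count is never smaller than the weak one.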
17. The social graph - Milan
nodes: 519
edges: 1730
average degree: 3.333
network diameter: 8
graph density: 0.006
modularity: 0.25 | 171 communities
Number of Weakly Connected Components: 151
Number of Strongly Connected Components: 307
graph made with gephi
18. Social Graph Milan – users' centroids view
Data calculated using Pascal Neis' tool:
“How did you contribute to OpenStreetMap ?”
http://hdyc.neis-one.org/
20. The social graph - Rome
nodes: 793
edges: 162
average degree: 0.2
network diameter: 7
graph density: 0
modularity: 0.45 | 743 communities
Number of Weakly Connected Components: 732
Number of Strongly Connected Components: 770
graph made with gephi
21. Rome - 3D view of the social graph
Node size is based on the degree indicator
A huge number of contributors with a small degree index
23. Comparison of the Social Network Analysis results
               TRENTO  MILAN  ROME
nodes          289     519    793
edges          1169    1730   162
graph density  0.014   0.006  0
modularity     0.308   0.250  0.45
communities    71      171    743
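As a consistency check, the average degree and graph density reported on the city slides can be recomputed from the node and edge counts alone. The formulas below (edges / nodes and edges / (nodes × (nodes - 1))) are an assumption about how the figures were derived for a directed graph; they reproduce the reported values up to rounding.

```python
# Node and edge counts taken from the comparison table.
cities = {
    "Trento": (289, 1169),
    "Milan": (519, 1730),
    "Rome": (793, 162),
}

results = {}
for city, (nodes, edges) in cities.items():
    # Assumed directed-graph formulas (not stated in the slides).
    avg_degree = edges / nodes
    graph_density = edges / (nodes * (nodes - 1))
    results[city] = (round(avg_degree, 3), round(graph_density, 3))
    print(f"{city}: average degree {avg_degree:.3f}, density {graph_density:.3f}")
```

Rome's density is not exactly zero: it is about 0.0003 and only rounds to 0 at three decimals, which reflects how few of its 793 contributors touch each other's data.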
24. SNA metrics and more for a single user
http://napo.github.io/osmsna/
25. Conclusion and future work
Summary
- a social graph can be extracted from the OpenStreetMap history file
- the results of the social network analysis give useful information for understanding the community and individual users' behavior
Next steps
- implement longitudinal analyses
- extend the analysis to larger regions
- implement a continuous auto-update
- define a "crowdquality" indicator to provide a measure of data quality
26. Thank you for your attention!
twitter: @napo
blog: http://de.straba.us
email: napo@fbk.eu
slides: http://slideshare.net/napo
This work is supported by T2DataExchange – http://trentino.dandelion.eu/
a project by Spaziodati Srl, Edizioni Curcu&Genovese, Fondazione Bruno Kessler
with funds from the European Regional Development Fund