Analyzing Rich-Club Behavior
in Open Source Projects
OpenSym 2019, the 15th International Symposium on Open Collaboration
Skövde, Sweden
Mattia Gasparini (1), Javier Luis Cànovas Izquierdo (2),
Robert Clarisò (2), Marco Brambilla (1), Jordi Cabot (2)
(1) Politecnico di Milano, (2) Universitat Oberta de la Catalunya
Introduction
• Git and GitHub data are used to analyze the evolution, success, and management of Open Source Software.
• Define developers' behavioral patterns.
• Discover how collaborations between developers work.
Problem Statement
• Analysis of collaboration networks
• Commits, issues, and pull requests as sources
• Discover the presence of specific collaboration structures: rich-clubs
Rich-club coefficient
• Graph structural property:
It represents the tendency of well-connected nodes (i.e., hubs) to interact with other well-connected nodes.
• Formulation:
$$\phi(k) = \frac{2E_k}{N_k (N_k - 1)}, \qquad \rho(k) = \frac{\phi(k)}{\phi_{\mathrm{random}}(k)}$$
$E_k$: number of edges between nodes of degree greater or equal to $k$
$N_k$: number of nodes with degree greater or equal to $k$
$\phi(k)$: rich-club coefficient
$\rho(k)$: normalized rich-club coefficient
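The formulation above can be sketched directly in code. This is a minimal illustration that follows the slide's "degree greater or equal to k" convention; the function name `rich_club` and the toy edge list are assumptions made for the example, not part of the original study.

```python
from collections import defaultdict


def rich_club(edges, k):
    """Return phi(k) = 2*E_k / (N_k*(N_k - 1)), or None when undefined.

    `edges` is an iterable of (u, v) pairs of an undirected graph.
    Nodes count as "rich" when their degree is >= k (slide convention).
    """
    # Deduplicate edges and ignore self-loops.
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = defaultdict(int)
    for edge in edge_set:
        for node in edge:
            degree[node] += 1
    rich = {node for node, d in degree.items() if d >= k}
    n_k = len(rich)
    if n_k < 2:
        return None  # fewer than two rich nodes: coefficient undefined
    # E_k: edges whose two endpoints are both rich.
    e_k = sum(1 for edge in edge_set if edge <= rich)
    return 2 * e_k / (n_k * (n_k - 1))


# Hypothetical toy graph: a triangle a-b-c with two pendant nodes d and e.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("b", "e")]
print(rich_club(toy_edges, 1))  # all five nodes, five edges
print(rich_club(toy_edges, 2))  # only the triangle remains
```

In the toy graph the coefficient rises from 0.5 at k = 1 to 1.0 at k = 2, because the three nodes of degree at least 2 are fully connected to each other.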
Related Work
• Rich-club phenomenon for a specific project [2],
or for a single FLOSS community [3].
• Study of the presence of a rich-club effect
across the whole GitHub social network [4].
• Analysis on open source communities exploiting
email exchanges among participants [5].
[2] Weifeng Pan, Bing Li, Yutao Ma, and Jing Liu. 2011. Multi-granularity evolution analysis of software using complex network theory
[3] Guido Conaldi. 2010. Flat for the few, steep for the many: Structural cohesion and Rich-Club effect as measures of hierarchy and control in FLOSS communities
[4] Antonio Lima, Luca Rossi, and Mirco Musolesi. 2014. Coding Together at Scale: GitHub as a Collaborative Social Network
[5] Sergi Valverde and Ricard V. Solé. 2007. Self-organization versus hierarchy in open-source social networks
Case Study
• Top 100 starred projects on GitHub in 2016
• 926K commits produced by 50K Git users
• 1.3M issue-related events generated by 118K GitHub users
• 280K pull-request-related events generated by 20K GitHub users
Analysis Pipeline
Data Collection & Preprocessing
• Git repository cloning for commit data using Gitana
• GitHub issue and PR activities obtained by querying GHArchive
• Duplicity and clashing problem
Graphs Construction
• Definition of 4 undirected graphs:
a. PR graph
b. Commits graph
c. Issues graph
d. Supergraph (a + b + c)
• Nodes: users
• Edges connect a pair of users if
they interacted on the same
element (issue, PR, file)
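The construction rule above can be sketched as follows. This is an illustrative reading, not the authors' code: the record format `(user, element_id)`, the helper name `build_graph`, the sample data, and the choice to form the supergraph as the union of the three edge sets are all assumptions.

```python
from itertools import combinations


def build_graph(interactions):
    """Build an undirected graph as a set of frozenset({u, v}) edges.

    `interactions` is an iterable of (user, element_id) records; two users
    are connected when they interacted on the same element.
    """
    users_by_element = {}
    for user, element in interactions:
        users_by_element.setdefault(element, set()).add(user)
    edges = set()
    for users in users_by_element.values():
        # Connect every pair of users who touched this element.
        for u, v in combinations(sorted(users), 2):
            edges.add(frozenset((u, v)))
    return edges


# Hypothetical sample records for one project.
pr_graph = build_graph([("ana", "pr1"), ("bob", "pr1"), ("cy", "pr2"), ("ana", "pr2")])
commits_graph = build_graph([("bob", "src/a.py"), ("cy", "src/a.py")])
issues_graph = build_graph([("ana", "i7"), ("dee", "i7")])

# Supergraph (a + b + c), assumed here to be the union of the edge sets.
supergraph = pr_graph | commits_graph | issues_graph
print(len(supergraph))
```

Representing edges as frozensets keeps the graphs undirected and removes duplicate edges when the same pair of users meets on several elements. Users who never share an element with anyone do not appear in this edge-only representation.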
Graphs Example
[Figure: Materialize project graphs: (a) PR graph, (b) commits graph, (c) issues graph, (d) supergraph]
Rich-club Coefficient Calculation
• Calculation using the algorithm implementation included in NetworkX [6]
• Normalized coefficient $\rho(k)$: rich-club effect relevant if $\rho(k) > 1$
• Discard networks for which randomization fails
[6] https://networkx.github.io/documentation/stable/reference/algorithms/rich_club.html
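The normalization step can be illustrated with a small pure-Python sketch. It is a stand-in for the idea, not NetworkX's implementation: the graph is randomized with degree-preserving double edge swaps, and the observed coefficient is divided by the coefficient of the randomized graph. The names `degrees`, `phi`, `randomize`, and `rho`, the swap budget, and the `max_tries` limit are assumptions. The sketch uses the slide's "degree greater or equal to k" convention, whereas NetworkX's `rich_club_coefficient` counts nodes with degree strictly larger than k.

```python
import random


def degrees(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def phi(edges, k):
    """Rich-club coefficient for nodes of degree >= k, or None if undefined."""
    deg = degrees(edges)
    rich = {n for n, d in deg.items() if d >= k}
    if len(rich) < 2:
        return None
    e_k = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e_k / (len(rich) * (len(rich) - 1))


def randomize(edges, swaps, rng, max_tries=10_000):
    """Degree-preserving rewiring: (a-b, c-d) becomes (a-d, c-b)."""
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = tries = 0
    while done < swaps:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("randomization failed")
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # edges share a node: the swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # the swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def rho(edges, k, swaps=100, seed=0):
    """Normalized coefficient; None when undefined or randomization fails."""
    try:
        shuffled = randomize(edges, swaps, random.Random(seed))
    except (RuntimeError, ValueError):
        return None  # mirrors "discard networks for which randomization fails"
    observed, baseline = phi(edges, k), phi(shuffled, k)
    if observed is None or not baseline:
        return None
    return observed / baseline
```

A star graph is a simple case where randomization fails: every pair of edges shares the hub, so no valid swap exists and `rho` returns `None`, which corresponds to discarding that network.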
Rich-club Coefficient Results
• 60 projects have a defined coefficient for the supergraph.
• Each graph presents a rich-club effect, since $\rho(k) > 1$ for some $k$.
Materialize [7]: Rich-Club Supergraph Coefficient
The maximum normalized coefficient (at k = 49) corresponds to the strongest rich-club effect, among nodes of degree at least 49.
[7] https://materializecss.com
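Locating the degree at which the normalized coefficient peaks can be sketched as below. The helper name `strongest_club` and the sample values are hypothetical and are not taken from the Materialize data.

```python
def strongest_club(rho_by_k):
    """Return (k, rho(k), has_effect) for the maximum defined rho(k).

    `rho_by_k` maps a degree threshold k to its normalized coefficient,
    with None marking thresholds where the coefficient is undefined.
    """
    defined = {k: r for k, r in rho_by_k.items() if r is not None}
    if not defined:
        return None
    k_max = max(defined, key=defined.get)
    # A rich-club effect is considered relevant when rho(k) > 1.
    return k_max, defined[k_max], defined[k_max] > 1


# Hypothetical coefficients for illustration only.
example = {1: 0.9, 2: 1.2, 3: None, 4: 1.05}
print(strongest_club(example))
```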
Materialize: Supergraph
Swift [8]: Rich-Club Supergraph Coefficient
[8] https://swift.org/
Swift: Supergraph
Rich-club Coefficient Results
Maximum coefficient distribution
• Distribution of the maximum rich-club coefficient for each type of graph across the studied projects.
• Mean value around 1 for the issues and commits graphs: weak rich-club presence.
• Mean value around 1.4 for the PR graphs: strong rich-club presence.
Further insights
Multi-club users
• 25 out of 60 projects have a set of users belonging to multiple rich-clubs.
• Distribution of multi-club users across the 25 projects.
• Developers form a community with strong influence at each level of the project.
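One way to identify multi-club users is sketched below, assuming a multi-club user is someone who belongs to the rich-club of at least two of a project's graphs. That reading, the helper name `multi_club_users`, and the sample memberships are assumptions made for illustration.

```python
from collections import Counter


def multi_club_users(clubs, minimum=2):
    """Return users who appear in at least `minimum` rich-clubs.

    `clubs` maps a graph name (e.g. "pr", "commits", "issues") to the
    collection of users in that graph's rich-club.
    """
    counts = Counter(user for members in clubs.values() for user in set(members))
    return {user for user, n in counts.items() if n >= minimum}


# Hypothetical rich-club memberships for one project.
clubs = {
    "pr": {"ana", "bob"},
    "commits": {"bob", "cy"},
    "issues": {"ana", "bob"},
}
print(sorted(multi_club_users(clubs)))
```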
Conclusions
First systematic evaluation of rich-club behavior in open source projects:
• 60% of projects show rich-clubs in the supergraph, mostly with a slight effect.
• Rich-club behavior could undermine the open paradigm, but the phenomenon requires further analysis.
• The strong rich-club presence in PR graphs may be due to the criticality of that activity.
• 25 out of 60 projects have users belonging to multiple rich-clubs.
Future Work
• Weighted rich-club coefficient
• Rich-club effect at module and ecosystem level
• Time dimension to highlight temporal clubs
Questions?

 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 

Último (20)

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 

Analyzing rich club behavior in open source projects

  • 1. Analyzing Rich-Club Behavior in Open Source Projects. OpenSym 2019, the 15th International Symposium on Open Collaboration, Skövde, Sweden. Mattia Gasparini (1), Javier Luis Cànovas Izquierdo (2), Robert Clarisò (2), Marco Brambilla (1), Jordi Cabot (2). (1) Politecnico di Milano, (2) Universitat Oberta de la Catalunya
  • 2. Introduction • Git and GitHub data to analyze the evolution, success and management of Open Source Software. • Define developers' behavioral patterns. • Discover how collaborations between developers work.
  • 3. Problem Statement • Analysis of collaboration networks • Commits, issues and pull requests as sources • Discover the presence of specific collaboration structures: rich-clubs
  • 4. Rich-club coefficient • Graph structural property: it represents the tendency of well-connected nodes (i.e., hubs) to interact with other well-connected nodes. • Formulation: $\phi(k) = \frac{2E_k}{N_k(N_k - 1)}$ and $\rho(k) = \frac{\phi(k)}{\phi_{\mathrm{random}}(k)}$, where $E_k$ is the number of edges between nodes of degree greater than or equal to $k$, $N_k$ is the number of nodes with degree greater than or equal to $k$, $\phi(k)$ is the rich-club coefficient, and $\rho(k)$ is the normalized rich-club coefficient.
  • 5. Related Work • Rich-club phenomenon for a specific project [2], or for a single FLOSS community [3]. • Study of the presence of a rich-club effect across the whole GitHub social network [4]. • Analysis of open source communities based on email exchanges among participants [5].
    [2] Weifeng Pan, Bing Li, Yutao Ma, and Jing Liu. 2011. Multi-granularity evolution analysis of software using complex network theory.
    [3] Guido Conaldi. 2010. Flat for the few, steep for the many: Structural cohesion and Rich-Club effect as measures of hierarchy and control in FLOSS communities.
    [4] Antonio Lima, Luca Rossi, and Mirco Musolesi. 2014. Coding Together at Scale: GitHub as a Collaborative Social Network.
    [5] Sergi Valverde and Ricard V. Solé. 2007. Self-organization versus hierarchy in open-source social networks.
  • 6. Case Study • Top-100 starred projects in 2016 on GitHub • 926K commits produced by 50K Git users • 1.3M issue-related events generated by 118K GitHub users • 280K pull-request-related events generated by 20K GitHub users
  • 8. Data Collection & Preprocessing • Commit data obtained by cloning the Git repositories with Gitana • Issue and PR activity on GitHub obtained by querying GHArchive • Duplicity and clashing problems
  • 9. Graph Construction • Definition of 4 undirected graphs: (a) PR graph, (b) commits graph, (c) issues graph, (d) supergraph (a + b + c) • Nodes: users • Edges connect a pair of users if they interacted on the same element (issue, PR, file)
  • 10. Graphs Example • Figure: Materialize project graphs: (a) PR graph, (b) commits graph, (c) issues graph, (d) supergraph
  • 11. Rich-club Coefficient Calculation • Calculation using the algorithm implementation included in NetworkX [6] • Normalized coefficient $\rho(k)$: the rich-club effect is relevant if $\rho(k) > 1$ • Discard networks for which randomization fails
    [6] https://networkx.github.io/documentation/stable/reference/algorithms/rich_club.html
  • 12. Rich-club Coefficient Results • 60 projects have a defined coefficient for the supergraph. • Each graph presents a rich-club effect, since $\rho(k) > 1$ for some $k$
  • 13. Materialize [7]: Rich-Club Supergraph Coefficient • The maximum normalized coefficient (k = 49) corresponds to the strongest club effect, among nodes of degree at least 49.
    [7] https://materializecss.com
  • 18. Further insights: maximum coefficient distribution • Distribution of the maximum rich-club coefficient for each type of graph across the studied projects. • Mean value around 1 for the issues and commits graphs: weak rich-club presence. • Mean value around 1.4 for the PR graphs: strong rich-club presence.
  • 19. Further insights: multi-club users • 25 of the 60 projects have a set of users belonging to multiple rich-clubs. • Distribution of multi-club users across the 25 projects. • These developers form a community with strong influence at every level of the project.
  • 20. Conclusions First systematic evaluation of rich-club behavior in open source projects: • 60% of projects show rich-clubs in the supergraph, mostly with a slight effect. • Rich-club behavior could undermine the open paradigm, but the phenomenon requires further analysis. • The strong rich-club presence in PR graphs may stem from the criticality of the activity. • 25 of the 60 projects have users belonging to multiple rich-clubs.
  • 21. Future Work • Weighted rich-club coefficient • Rich-club effect at module and ecosystem level • Time dimension to highlight temporal clubs
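
The graph construction on slide 9 and the unnormalized coefficient on slide 4 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `interactions` mapping (element id to the users who touched it), the function names, and the sample data are all hypothetical. It follows the slide's convention of counting nodes with degree greater than or equal to k.

```python
from itertools import combinations


def build_collaboration_graph(interactions):
    """Build an undirected graph as a dict: user -> set of neighbours.

    `interactions` maps an element id (issue, PR or file) to the users who
    interacted on it. This input shape is an assumption for illustration.
    """
    adjacency = {}
    for users in interactions.values():
        unique = set(users)
        for user in unique:
            adjacency.setdefault(user, set())
        # Connect every pair of users who touched the same element.
        for a, b in combinations(sorted(unique), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def rich_club_phi(adjacency, k):
    """phi(k) = 2 * E_k / (N_k * (N_k - 1)) over nodes of degree >= k."""
    rich = {user for user, nbrs in adjacency.items() if len(nbrs) >= k}
    n_k = len(rich)
    if n_k < 2:
        return None  # the ratio is undefined for fewer than two nodes
    # Each edge inside the club is seen from both endpoints, so halve the sum.
    e_k = sum(len(adjacency[user] & rich) for user in rich) // 2
    return 2 * e_k / (n_k * (n_k - 1))


# Hypothetical sample data: four users, four shared elements.
sample = {
    "pr-1": ["ana", "bob", "cy"],
    "pr-2": ["ana", "bob"],
    "issue-7": ["cy", "dee"],
    "file-a": ["ana", "dee"],
}
graph = build_collaboration_graph(sample)
for k in (2, 3):
    print(k, rich_club_phi(graph, k))
```

A supergraph in the sense of slide 9 would simply merge the PR, commit and issue mappings into one `interactions` dictionary before building the graph.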

Editor's notes

  1. GitHub is the most popular service to develop and maintain open source software. Each user interacts with many other users in the project development process (commits, issues, pull requests), defining collaboration networks. Studying collaboration networks helps in discovering properties and behaviors that influence the development, management and success of an OSS project.
  2. Developers collaborate mostly with the same fixed subset of other important colleagues, rather than spreading their cooperation across all members of the team.
  3. Formally, it can be measured by the so-called rich-club coefficient ϕ(k). Intuitively, ϕ(k) measures how far the set of nodes with degree at least k is from being a complete subgraph. The value of ϕ(k) ranges from 0 (all nodes are disconnected) to 1 (a clique), with higher values showing a stronger rich-club behavior in the network. It increases monotonically with k even for random networks, so a normalized coefficient has been introduced in the literature: ϕ(k) is divided by the coefficient calculated for a random network with the same degree distribution as the original one.
  4. The presence or absence of rich-clubs in open source projects has not been studied in a systematic way, and the analysis has not been applied to a dataset as large as the one that GitHub can now provide.
  5. Clashing: the same name used by different users. Duplicity: different names for the same user. Solution: use the SHA value to associate Git commits with GitHub users (if still present).
  6. Two users are connected in the PR graph if they commented/interacted on the same PR…
  7. The rich-club coefficient is calculated for each project's supergraph to get a global view of the effect. The maximum value for each project is shown: each of the 60 graphs presents rich-club behavior, even if most of them have values only slightly higher than 1. For this reason, we want to better understand the correspondence between the coefficient and the actual graphs.
  8. The first example that we take is the Materialize repository: the rich-club coefficient is plotted against node degree. It is possible to notice rich-club behavior for a range of degrees, with a peak at k = 49, which should correspond to groups of nodes with degree at least 49 connected to each other.
  9. This seems to go against the open source paradigm: the project is "owned" by a few users. It was established in 2014 by a team of 4 developers and has 3,853 commits and 252 contributors. Nevertheless, the project only has two top contributors (more than 1,000 commits), who belong to the original team, and no frequent contributors.
  10. Mixed behavior: the coefficient is slightly above 1, then dramatically lower. The overall intuition is that the graph does not present rich-clubs.
  11. It was publicly announced by Apple in 2014 and was later open sourced in December 2015. Currently, the project has more than 84k commits and 674 contributors, with 14 top contributors (more than 1,000 commits) and 44 frequent contributors (between 100 and 1,000 commits). Remarkably, 4 of the top contributors and 21 of the frequent contributors do not belong to Apple according to their GitHub profile. This is a sign that the project has successfully attracted and retained external talent.
  12. This table presents the 10 projects with the highest coefficient for the supergraph. Along with them, the coefficient for the other kinds of graphs is calculated when possible. In fact, these other graphs can also «hide» other club structures.
  13. Maximum coefficient distribution for each kind of graph as a further insight. The blue line is the one already discussed. The green and orange lines show the maximum coefficient distribution for commits and issues: the density peaks at 1, meaning that most of the graphs do not present strong rich-clubs. The red line has its peak around 1.4: most of the projects present evident rich-club structures. This behavior could be related to the fact that PR is the most critical level in open source software development and a few trustworthy developers are in charge of most of the tasks.
  14. We also focused our attention on the users: almost 50% of the projects (25 of 60) have users that belong to multiple clubs. The distribution presents the number of users shared across all the projects' clubs: this means that, on average, 7 developers are in the PR club, as well as in the commits and issues clubs. These developers form a sub-community inside the project that has strong influence at all of the project's levels.
  15. As the rich-club phenomenon is quite complex and its application to OSS communities is relatively new, there is plenty of further work to do. First, we want to apply the weighted version of the coefficient to check whether other patterns arise. Second, we want to extend the analysis to the module and ecosystem levels. Third, we want to introduce the time variable: in this work the graphs are built using the entire data as a 1-year snapshot, but it is possible to build monthly graphs and check whether temporal clubs show up.
  16. With this, I have concluded the presentation. Thank you for the attention.
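
As a companion to notes 7 and 14, the following sketch shows one way the peak normalized coefficient and the multi-club users might be computed with NetworkX's `rich_club_coefficient`. The helper names and the shape of the `clubs` dictionary are hypothetical, and this is not presented as the authors' implementation. One caveat: NetworkX indexes the coefficient by nodes of degree strictly greater than k, whereas slide 4 uses greater than or equal to k.

```python
import networkx as nx


def peak_rich_club(graph, seed=None):
    """Return (k, rho) at the maximum normalized coefficient, or None.

    Returning None mirrors the slides' rule of discarding networks for
    which the randomization fails. NetworkX's key k refers to nodes with
    degree > k, so the matching club under the slide 4 convention is
    club_members(graph, k + 1).
    """
    try:
        rho = nx.rich_club_coefficient(graph, normalized=True, seed=seed)
    except (nx.NetworkXException, ZeroDivisionError):
        # Edge swapping failed, or the randomized graph had a zero
        # coefficient for some k, so the ratio is undefined.
        return None
    if not rho:
        return None
    k = max(rho, key=rho.get)
    return k, rho[k]


def club_members(graph, k):
    """Users in the club at threshold k (degree >= k, as on slide 4)."""
    return {node for node, degree in graph.degree() if degree >= k}


def multi_club_users(clubs):
    """Users present in every club; `clubs` maps graph name -> set of users."""
    member_sets = list(clubs.values())
    return set.intersection(*member_sets) if member_sets else set()
```

With one club per graph type (PR, commits, issues), `multi_club_users` returns the developers who sit in all of them, which is the sub-community that note 14 describes.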