Prevalence and Evolution of License Violations
in npm and RubyGems Dependency Networks
Ilyas Saïd Makari Ilyas.Said.Makari@vub.be
Ahmed Zerouali Ahmed.Zerouali@vub.be
Coen De Roover Coen.De.Roover@vub.be
The International Conference on Software and Systems Reuse (ICSR)
Virtual (originally Montpellier, France) - June 15-17, 2022
Background
Open source software can be distributed with varying degrees of freedom
● 1. Public domain
○ All rights are granted with no conditions whatsoever
○ For example: “The Unlicense”
● 2. Permissive licenses
○ Few restrictions imposed
○ Must include the copyright notice from the original author
○ For example: MIT, Apache, BSD
● 3. Restrictive licenses (copyleft)
● Strong copyleft licenses
○ All derivatives of the original work should be released under the same license
○ For example: GNU General Public License (GPL)
● Weak copyleft licenses
○ Exception when work is used as independent building block
○ For example: GNU Lesser General Public License (LGPL)
Not all licenses may be legally combined in one software package.
- For example: MIT (permissive) is compatible with GPL (restrictive), but not vice versa.
We call license A “one-way compatible” with license B if software that
combines packages under both licenses may be legally licensed under
license B.
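The one-way compatibility definition above can be sketched as a lookup in a set of ordered license pairs. This is only an illustration, not the authors' tool: `COMPATIBLE` and `is_one_way_compatible` are hypothetical names, the only cross-license entry is the MIT/GPL example from the slide, and treating a license as compatible with itself is an assumption.

```python
# Ordered pairs (A, B): software combining code under A and B may be licensed under B.
# Only the MIT -> GPL entry comes from the slide; a real matrix would hold many more.
COMPATIBLE = {
    ("MIT", "GPL"),
}


def is_one_way_compatible(a: str, b: str) -> bool:
    """Return True if license `a` is one-way compatible with license `b`."""
    # Assumption: a license is always compatible with itself.
    return a == b or (a, b) in COMPATIBLE


print(is_one_way_compatible("MIT", "GPL"))  # True
print(is_one_way_compatible("GPL", "MIT"))  # False
```

Because the pairs are ordered, the asymmetry in the MIT/GPL example falls out of the data structure itself.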
https://exploring-data.com/vis/npm-packages-dependencies/
~2M packages
~86M releases
~250M dependencies
Dependency graph (generated with https://npm.anvaka.com/), showing a direct dependency (level 1) and an indirect dependency (level 4)
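The dependency levels labeled in the figure can be computed with a breadth-first traversal of the dependency graph. The sketch below is a minimal illustration under assumptions: the toy graph and the `dependency_levels` helper are hypothetical, and assigning each package the shallowest level at which it appears is one possible convention, not necessarily the one used in the study.

```python
from collections import deque


def dependency_levels(graph: dict[str, list[str]], root: str) -> dict[str, int]:
    """Map each reachable dependency to the shallowest level at which it appears."""
    levels: dict[str, int] = {}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        package, level = queue.popleft()
        for dep in graph.get(package, []):
            if dep not in seen:  # also guards against dependency cycles
                seen.add(dep)
                levels[dep] = level + 1
                queue.append((dep, level + 1))
    return levels


# Hypothetical toy graph: app -> a -> b -> c -> d
toy = {"app": ["a"], "a": ["b"], "b": ["c"], "c": ["d"]}
print(dependency_levels(toy, "app"))  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```

Here `a` is a direct dependency (level 1) and `d` is an indirect dependency at level 4.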
Problem
Q: How frequent are license violations in the dependency
networks of open source package repositories?
Figure legend: studied package, dependency
Case studies
License Violations in npm and RubyGems Dependency Networks
Method
A new license compatibility matrix, based on:
1. The compatibility graph from Kapitsaki et al. [*]
[*] Georgia M. Kapitsaki, Frederik Kramer, and Nikolaos D. Tselikas. Automating the license compatibility process in open source
software with SPDX. Journal of Systems and Software, 131:386–401, 2017.
Then, we manually included information from:
1. Free Software Foundation
2. The European Commission
=> an answer for 1,681 pairs of licenses, of which 205 (12.2%) are labeled “Unknown”
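The pair counts on this slide can be sanity-checked with a few lines of arithmetic. The sketch assumes the matrix is square over ordered pairs, self-pairs included; the slides do not state this, but it would make 1,681 pairs correspond to 41 licenses.

```python
import math

total_pairs = 1681    # from the slide
unknown_pairs = 205   # from the slide

# Assumption: a square matrix over ordered pairs, so pairs = licenses ** 2.
licenses = math.isqrt(total_pairs)
assert licenses * licenses == total_pairs

unknown_share = 100 * unknown_pairs / total_pairs
print(licenses)                 # 41
print(round(unknown_share, 1))  # 12.2
```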
Research questions
RQ1: What are the most prevalent licenses in package repositories?
○ What is the licensing climate of each ecosystem?
○ Permissive or restrictive climate?
○ Does it influence the number of incompatibilities?
RQ2: To what extent do packages rely on direct dependencies with incompatible
licenses?
○ How prevalent are license violations on the first dependency tree level?
RQ3: How does license incompatibility spread across package dependency
networks?
○ How prevalent are license violations on each dependency tree level?
Case studies
npm: ~750k packages (latest release), ~3.5M direct runtime dependencies
RubyGems: ~95k packages (latest release), ~211k direct runtime dependencies
Dataset
Open data:
- Libraries.io gathers data from 32 package managers and 3 source code
repositories.
- It monitors over 5.4M unique open source packages and more than 500M
interdependencies between them.
Case studies
package.json
Dependency resolution: January 12th, 2020
npm, on January 12th, 2020:
~750k packages
~3.5M direct dependencies
~66.4M (all) runtime dependencies
~7.3% of the packages have dependencies with incompatible licenses
RubyGems, on January 12th, 2020:
~95k packages
~211k direct dependencies
~1.2M (all) runtime dependencies
~13.9% of the packages have dependencies with incompatible licenses
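The figures above allow a few back-of-the-envelope ratios. The sketch assumes the first block of figures describes npm and the second RubyGems (the order used throughout the slides). All inputs are the rounded values from the slides, so the results are rough estimates, not numbers reported by the authors. The `summarize` helper is hypothetical.

```python
# Rounded figures copied from the slides (latest releases, January 12th, 2020).
ecosystems = {
    "npm": {"packages": 750_000, "direct_deps": 3_500_000, "affected_pct": 7.3},
    "RubyGems": {"packages": 95_000, "direct_deps": 211_000, "affected_pct": 13.9},
}


def summarize(stats: dict) -> tuple[float, int]:
    """Return (average direct dependencies per package, estimated affected packages)."""
    avg_direct = round(stats["direct_deps"] / stats["packages"], 1)
    affected = round(stats["packages"] * stats["affected_pct"] / 100)
    return avg_direct, affected


for name, stats in ecosystems.items():
    print(name, summarize(stats))
# npm (4.7, 54750)
# RubyGems (2.2, 13205)
```

Under these assumptions, npm packages declare roughly twice as many direct dependencies on average, while a larger share of RubyGems packages is affected by incompatible licenses.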
Research questions
RQ1: What are the most prevalent licenses in package repositories?
- MIT is the most popular license in both npm and RubyGems.
- npm: MIT has been popular since the ecosystem's beginning. ISC gained
popularity after it became the new default license for npm packages.
- RubyGems: MIT gradually became the most popular license. Over the last
few years, Apache has become the second most popular license choice.
RQ2: To what extent do packages rely on direct dependencies with incompatible
licenses?
- Only 0.9% of npm dependencies and 4.3% of RubyGems dependencies have licenses
that are incompatible with those of their dependents.
- The most common pair of incompatible licenses is MIT with GPL.
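A check for incompatible direct dependencies, in the spirit of RQ2, could look like the sketch below. It is not the authors' implementation. The package names, the `LICENSES` and `DIRECT_DEPS` tables, and both helper functions are hypothetical, and the only cross-license rule encoded is the MIT/GPL example from the slides.

```python
# Hypothetical data: package -> declared license, package -> direct dependencies.
LICENSES = {"app": "MIT", "lib-a": "MIT", "lib-b": "GPL"}
DIRECT_DEPS = {"app": ["lib-a", "lib-b"], "lib-a": [], "lib-b": ["lib-a"]}

# Ordered pairs (dependency license, dependent license) assumed to be allowed.
# Only MIT -> GPL is taken from the slides.
COMPATIBLE = {("MIT", "GPL")}


def compatible(dep_license: str, pkg_license: str) -> bool:
    """A dependency is acceptable if its license is one-way compatible with the dependent's."""
    return dep_license == pkg_license or (dep_license, pkg_license) in COMPATIBLE


def direct_violations(package: str) -> list[tuple[str, str, str]]:
    """List (dependency, dependency license, package license) for incompatible direct deps."""
    pkg_license = LICENSES[package]
    return [
        (dep, LICENSES[dep], pkg_license)
        for dep in DIRECT_DEPS.get(package, [])
        if not compatible(LICENSES[dep], pkg_license)
    ]


print(direct_violations("app"))    # [('lib-b', 'GPL', 'MIT')]
print(direct_violations("lib-b"))  # []
```

The MIT-licensed `app` depending on the GPL-licensed `lib-b` reproduces the MIT with GPL pair named above, while the GPL package depending on an MIT package is not flagged.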
RQ3: How does license incompatibility spread across package dependency networks?
- In absolute terms, npm packages have more indirect dependencies with incompatible
licenses than RubyGems packages, because npm packages include many more dependencies.
- Proportionally, however, RubyGems has more incompatible indirect dependencies
than npm.
- In both package repositories, the number of dependencies without a license
decreases from one level to the next as we go deeper.
Tooling
Screenshot of the license compatibility checking tool.
https://doi.org/10.5281/zenodo.5913761
Conclusion
● Deeper-level dependencies cause fewer incompatibilities than those at the shallow levels.
● GPL dependencies are the main cause of incompatibilities, and they are more common at the first
level of dependency trees.
● We found that a set of packages created by a single organization can influence an ecosystem when it
consistently releases useful packages under a particular license.
● Our results help in understanding the state of license incompatibilities in software package
ecosystems.
Threats to validity
● Libraries.io dataset.
● Various sources of information to construct our license compatibility matrix.
● We only considered the license of the latest release of each package.
● Many packages do not have any license.

Prevalence and Evolution of License Violations in npm and RubyGems Dependency Networks

  • 1. Prevalence and Evolution of License Violations in npm and RubyGems Dependency Networks Ilyas Saïd Makari Ilyas.Said.Makari@vub.be Ahmed Zerouali Ahmed.Zerouali@vub.be Coen De Roover Coen.De.Roover@vub.be The International Conference on Software and Systems Reuse (ICSR) Virtual (originally Montpellier, France) - June 15-17, 2022
  • 3. Open source software can be distributed with varying degrees of freedom ● 1. Public domain ○ All rights are granted with no conditions whatsoever ○ For example: “The Unlicense” ● 2. Permissive licenses ○ Little restrictions imposed ○ Must include copyright notice from original author ○ MIT, Apache, BSD, etc ● 3. Restrictive licenses (copyleft) Background
  • 4. ● Strong copyleft licenses ○ All derivatives of the original work should be released under the same license ○ For example: GNU General Public License (GPL) ● Weak copyleft licenses ○ Exception when work is used as independent building block ○ For example: GNU Lesser General Public License (LGPL) Background
  • 5. Background: Not all licenses may be legally combined in one software package. For example, MIT (permissive) is compatible with GPL (restrictive), but not vice versa. We call a license A “one-way compatible” with license B if software that contains packages under both licenses may be legally licensed under license B.
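The one-way compatibility definition on this slide can be read as a directed relation between licenses. The following is a minimal sketch under that reading, not the authors' implementation: the table `COMPATIBLE_INTO` and the function `is_one_way_compatible` are hypothetical names, and the table holds only the MIT/GPL example from the slide.

```python
# Hypothetical, illustrative table (not the authors' matrix):
# COMPATIBLE_INTO[A] is the set of licenses B such that A is
# one-way compatible with B.
COMPATIBLE_INTO = {
    "MIT": {"MIT", "GPL"},  # MIT code may end up in a GPL-licensed whole
    "GPL": {"GPL"},         # GPL code may not end up in an MIT-licensed whole
}


def is_one_way_compatible(a: str, b: str) -> bool:
    """Return True if software mixing packages under `a` and `b`
    may be licensed under `b`, according to the table above."""
    return b in COMPATIBLE_INTO.get(a, set())


print(is_one_way_compatible("MIT", "GPL"))  # True
print(is_one_way_compatible("GPL", "MIT"))  # False
```

The relation is deliberately asymmetric: swapping the arguments changes the answer, which is what "but not vice versa" expresses.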
  • 7. Background: [Figure: dependency graph marking a direct dependency (level 1) and an indirect dependency (level 4). Generated with https://npm.anvaka.com/]
  • 8. Problem. Q: How frequent are license violations in dependency networks of open source package repositories? [Figure legend: studied package, dependency]
  • 9. Case studies: License Violations in npm and RubyGems Dependency Networks
  • 10. Method: A new license compatibility matrix, based on:
      1. Kapitsaki et al. [*] (compatibility graph)
      Then, we manually included information from:
      1. Free Software Foundation
      2. The European Commission
      => answers for 1,681 pairs of licenses, of which 205 (12.2%) are labeled as “Unknown”
      [*] Georgia M. Kapitsaki, Frederik Kramer, and Nikolaos D. Tselikas. Automating the license compatibility process in open source software with SPDX. Journal of Systems and Software, 131:386–401, 2017.
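One way to picture the matrix described on this slide is a table over ordered license pairs that starts out as "Unknown" everywhere and is then filled in from several sources. The sketch below assumes that shape. The three-license subset, the label strings, the source dictionaries `base` and `manual`, and the function `build_matrix` are all illustrative assumptions rather than the authors' data or code. The last line only re-checks the percentage quoted on the slide.

```python
from collections import Counter
from itertools import product

LICENSES = ["MIT", "LGPL", "GPL"]  # illustrative subset, not the study's list


def build_matrix(licenses, *sources):
    """Start every ordered pair as 'Unknown', then apply each source in
    order; later sources override earlier ones (an assumed policy)."""
    matrix = {pair: "Unknown" for pair in product(licenses, repeat=2)}
    for source in sources:
        matrix.update(source)
    return matrix


# Hypothetical entries standing in for a published compatibility graph.
base = {
    ("MIT", "MIT"): "Compatible",
    ("LGPL", "LGPL"): "Compatible",
    ("GPL", "GPL"): "Compatible",
    ("MIT", "GPL"): "Compatible",
    ("GPL", "MIT"): "Incompatible",
}
# Hypothetical entries standing in for manually added information.
manual = {
    ("MIT", "LGPL"): "Compatible",
    ("LGPL", "GPL"): "Compatible",
}

matrix = build_matrix(LICENSES, base, manual)
counts = Counter(matrix.values())
print(len(matrix), counts["Unknown"])  # 9 2

# Re-check of the share reported on the slide: 205 of 1,681 pairs.
print(round(100 * 205 / 1681, 1))  # 12.2
```

Keeping "Unknown" as an explicit label, instead of defaulting to compatible or incompatible, lets later analysis report unresolved pairs separately.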
  • 11. Research questions
      RQ1: What are the most prevalent licenses in package repositories?
          ○ What is the license climate of each ecosystem?
          ○ Permissive or restrictive climate?
          ○ Does it influence the number of incompatibilities?
      RQ2: To what extent do packages rely on direct dependencies with incompatible licenses?
          ○ How prevalent are license violations on the first dependency tree level?
      RQ3: How does license incompatibility spread across package dependency networks?
          ○ How prevalent are license violations on each dependency tree level?
  • 12. Case studies
      npm: ~750k packages (latest release), ~3.5M direct runtime dependencies
      RubyGems: ~95k packages (latest release), ~211k direct runtime dependencies
  • 13. Dataset (open data)
      - Libraries.io gathers data from 32 package managers and 3 source code repositories.
      - They monitor over 5.4M unique open source packages, and more than 500M interdependencies between them.
  • 15. Case studies (on January 12th, 2020)
      npm: ~750k packages, ~3.5M direct dependencies, ~66.4M (all) runtime dependencies; ~7.3% of the packages have dependencies with incompatible licenses.
      RubyGems: ~95k packages, ~211k direct dependencies, ~1.2M (all) runtime dependencies; ~13.9% of the packages have dependencies with incompatible licenses.
  • 16. Research questions. RQ1: What are the most prevalent licenses in package repositories?
      - MIT is the most popular license in npm and RubyGems.
  • 17. Research questions. RQ1: What are the most prevalent licenses in package repositories?
      - MIT has been popular in npm since its beginning.
      - ISC gained popularity after it became the new default license for npm packages.
  • 18. Research questions. RQ1: What are the most prevalent licenses in package repositories?
      - MIT gradually evolved into the most popular license.
      - Over the last few years, Apache has become the second most popular license choice within the RubyGems ecosystem.
  • 19. Research questions. RQ2: To what extent do packages rely on direct dependencies with incompatible licenses?
      - Only 0.9% of npm dependencies and 4.3% of RubyGems dependencies have licenses that are incompatible with those of their dependents.
      - The most common pair of incompatible licenses is MIT with GPL.
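A direct-dependency check of the kind RQ2 asks about can be sketched as follows. This is an illustration under stated assumptions, not the study's tooling: the package names, the `licenses` and `direct_deps` mappings, and the `INCOMPATIBLE` set (holding only the GPL-dependency-in-MIT-package case) are hypothetical. A pair is written as (dependency license, dependent license).

```python
# Hypothetical set of (dependency license, dependent license) pairs
# treated as incompatible; only the MIT/GPL case from the slide.
INCOMPATIBLE = {("GPL", "MIT")}

# Hypothetical toy data: package -> license, package -> direct dependencies.
licenses = {"app": "MIT", "tool": "GPL", "lib-a": "MIT", "lib-b": "GPL"}
direct_deps = {"app": ["lib-a", "lib-b"], "tool": ["lib-a", "lib-b"]}


def incompatible_direct_deps(package, direct_deps, licenses):
    """List the direct dependencies of `package` whose license is
    incompatible with the license of `package` itself."""
    pkg_license = licenses[package]
    return [
        dep
        for dep in direct_deps.get(package, [])
        if (licenses.get(dep), pkg_license) in INCOMPATIBLE
    ]


def incompatible_edge_share(direct_deps, licenses):
    """Share of dependency edges whose two licenses are incompatible."""
    edges = [(pkg, dep) for pkg, deps in direct_deps.items() for dep in deps]
    if not edges:
        return 0.0
    bad = [
        (pkg, dep)
        for pkg, dep in edges
        if (licenses.get(dep), licenses.get(pkg)) in INCOMPATIBLE
    ]
    return len(bad) / len(edges)


print(incompatible_direct_deps("app", direct_deps, licenses))   # ['lib-b']
print(incompatible_direct_deps("tool", direct_deps, licenses))  # []
print(incompatible_edge_share(direct_deps, licenses))           # 0.25
```

In this toy data the MIT-licensed `app` is flagged for its GPL dependency, while the GPL-licensed `tool` can use the same two libraries without a violation.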
  • 20. Research questions. RQ3: How does license incompatibility spread across package dependency networks?
      - npm packages have more indirect dependencies with incompatible licenses than RubyGems (due to the high number of dependencies that npm packages include).
      - However, RubyGems has proportionally more incompatible indirect dependencies than npm.
  • 21. Research questions. RQ3: How does license incompatibility spread across package dependency networks?
      - The number of dependencies without a license decreases from one level to the next as we go deeper in both package repositories.
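Counting violations per dependency tree level, as RQ3 does, can be sketched with a breadth-first traversal. The sketch makes several assumptions that the slides do not confirm: a dependency's level is its shortest distance from the studied package, each dependency is counted once, and its license is compared against the license of the studied package. The function `violations_per_level` and the toy graph are hypothetical.

```python
from collections import defaultdict, deque


def violations_per_level(root, deps, licenses, incompatible):
    """Breadth-first walk from `root`; returns {level: count} of
    dependencies whose license is incompatible with the root's license.
    Level 1 holds direct dependencies; deeper levels are indirect."""
    root_license = licenses[root]
    seen = {root}                 # guards against cycles and repeats
    queue = deque([(root, 0)])
    counts = defaultdict(int)
    while queue:
        pkg, level = queue.popleft()
        for dep in deps.get(pkg, ()):
            if dep in seen:
                continue
            seen.add(dep)
            if (licenses.get(dep), root_license) in incompatible:
                counts[level + 1] += 1
            queue.append((dep, level + 1))
    return dict(counts)


# Hypothetical toy network, including a cycle (d -> a).
deps = {"app": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": ["a"]}
licenses = {"app": "MIT", "a": "MIT", "b": "GPL", "c": "MIT", "d": "GPL"}
incompatible = {("GPL", "MIT")}  # (dependency license, dependent license)

print(violations_per_level("app", deps, licenses, incompatible))
# {1: 1, 3: 1}
```

Here `b` is a level-1 violation and `d` a level-3 violation; the cycle back to `a` is skipped because `a` has already been visited.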
  • 22. Tooling: [Screenshot of the license compatibility checking tool.] https://doi.org/10.5281/zenodo.5913761
  • 23. Conclusion
      ● Deeper-level dependencies cause fewer incompatibilities than those at the shallow levels.
      ● GPL dependencies are the major cause of incompatibilities, and they are more present in the first level of dependency trees.
      ● We found that a set of packages created by a single organization can influence an ecosystem when it consistently releases useful packages under a particular license.
      ● Our results help in understanding the state of license incompatibilities in software package ecosystems.
  • 24. Threats to validity
      ● Libraries.io dataset.
      ● Various sources of information to construct our license compatibility matrix.
      ● We only considered the license of the latest release of each package.
      ● Many packages do not have any license.