An Empirical Study of Popularity and
Quality of NPM Packages
1
Ahmed Zerouali
Motivation
2
Number of libraries in the best-known OS package managers
- 206k libraries - Java
- 162k packages - PHP
- 600k packages - JavaScript
As of 10 Nov 2017
Motivation
3
Criteria for choosing the right OS software:
- Software Quality
- Software Features
- Software Support and Documentation
- Software Popularity
- …
- …
Motivation
5
Interviews with developers:
- C. Bogart, C. Kästner, J. Herbsleb, and F. Thung. How to break
an API: Cost negotiation and community values in three software
ecosystems. In Int’l Symp. Foundations of Software Engineering
(FSE), pages 109–120. ACM, 2016.
Popularity and community reputation are the most influential factors.
Motivation
6
Software Popularity and Quality
TOP 14
Motivation
7
Software Popularity and Quality
TOP 2
Method
8
Chosen software package manager
Method
9
libraries.io: an open source repository containing
package dependency metadata (size, dependents,
dependencies) extracted from 23
package managers.
npms.io: an open source search engine that
computes a normalized score between 0
and 1 for the popularity, quality and
maintenance of npm packages.
Method
10
Characteristics score
Popularity:
- # stars
- # forks
- # subscribers
- # contributors
- # dependents
- # downloads
- # downloads acceleration
Quality:
- README?
- License?
- .gitignore and friends?
- Has tests?
- # test coverage
- Is the build passing?
- # outdated deps & vulnerabilities?
- Has custom website?
- Has linters configured?
Maintenance:
- # ratio of open issues vs. total issues
- # time to close issues
- # commits frequency
- # release frequency
Data Extraction
11
- Download the prepared, publicly available metadata from 15 June 2017
- Use the API to get the latest information
(rate limit = 60 requests/minute)
- Use the API (no rate limit)
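A rate limit of 60 requests per minute means a crawl over many packages has to be paced. The following is a minimal sketch of one way to do that, not the code used in the study. `RateLimiter`, `crawl`, and the `fetch` callable are hypothetical names; `fetch` stands in for whatever HTTP call retrieves one package's metadata.

```python
import time
from typing import Callable, Dict, Iterable, Optional


class RateLimiter:
    """Spaces calls so that at most `max_per_minute` start per minute."""

    def __init__(self, max_per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        # Earliest time at which the next call may start (None = no call yet).
        self._next_allowed: Optional[float] = None

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval


def crawl(names: Iterable[str],
          fetch: Callable[[str], object],
          limiter: RateLimiter) -> Dict[str, object]:
    """Call `fetch` once per package name, pacing calls with `limiter`.

    `fetch` is a placeholder (assumption) for the real HTTP request.
    """
    results: Dict[str, object] = {}
    for name in names:
        limiter.wait()
        results[name] = fetch(name)
    return results
```

With `RateLimiter(60)` the calls are spaced one second apart. The clock and sleep functions are injectable so the pacing can be checked without real waiting.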
Method
12
DESCRIPTIVE STATISTICS OF THE CONSIDERED DATASET
Research Questions
13
RQ0(preliminary question): How are measures of package
popularity related to the use of a package?
RQ1: Is there a relationship between package quality
and package popularity?
RQ2: Is there a relationship between the maintainability
and popularity of packages?
RQ3: Are deprecated packages still being used?
RQ4: How different are packages used in web frontend
development in the context of all packages?
Data Analysis
14
- Data analysis and processing: import pandas
- Data visualization: import matplotlib, import seaborn
- Analytics: import scipy
RQ0 - How are measures of package popularity
related to the use of a package?
15
Pearson correlation coefficient R= 0.8
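A coefficient like this can be computed with the libraries named in the data analysis slide. This is a minimal sketch on made-up numbers. The slide does not state which two measures were correlated, so the column names `dependent_packages` and `dependent_repos` are assumptions, and the toy data is not meant to reproduce R = 0.8.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data; not taken from the study's dataset.
packages = pd.DataFrame({
    "dependent_packages": [0, 2, 5, 9, 14, 30],
    "dependent_repos":    [1, 3, 9, 20, 25, 70],
})

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = stats.pearsonr(packages["dependent_packages"],
                            packages["dependent_repos"])
print(f"Pearson r = {r:.2f} (p = {p_value:.3g})")
```

Pearson's coefficient measures linear association only. For heavily skewed count data such as downloads or dependents, a rank-based coefficient may be worth checking as well.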
RQ0 - How are measures of package popularity
related to the use of a package?
16
- Almost 4 out of 10 packages are not used by any other package or
external repository.
- 35% of packages have no direct dependencies.
RQ1 - Is there a relationship between package quality
and package popularity?
17
- Quality
  - Testing (tests, test coverage, build status)
  - Carefulness (license, README, .gitignore, ...)
  - Health (outdated dependencies and vulnerabilities)
  - Branding (badges and homepage)
- Popularity
  - Community interest (npms.io)
  - Dependent external repositories (libraries.io)
RQ1 - Is there a relationship between package quality
and package popularity?
18
Distribution of popularity in terms of community interest, number of dependent
repositories and quality score of npm packages, split into packages that have at
least one dependency and packages that don’t.
RQ1 - Is there a relationship between package quality
and package popularity?
19
Pearson correlation coefficient R < 0.33 for both testing and carefulness
RQ2 - Is there a relationship between the
maintainability and popularity of packages?
20
Distribution of maintenance characteristic scores, grouped into packages
with a commit score above the median (0.25) and packages with a commit
score below the median
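The median split described here can be sketched with pandas as follows. The column names and values are hypothetical, chosen so that the toy median is 0.25. Placing packages exactly at the median in the lower group is an assumption, since the slide does not say how ties were handled.

```python
import numpy as np
import pandas as pd

# Hypothetical scores in [0, 1]; not taken from the study's dataset.
scores = pd.DataFrame({
    "commits_frequency":  [0.05, 0.10, 0.20, 0.30, 0.80, 0.90],
    "releases_frequency": [0.10, 0.05, 0.30, 0.50, 0.70, 0.95],
    "popularity":         [0.02, 0.10, 0.08, 0.20, 0.35, 0.60],
})

median_commit = scores["commits_frequency"].median()

# Label each package by its position relative to the median commit score.
scores["commit_group"] = np.where(
    scores["commits_frequency"] > median_commit,
    "above median",
    "at or below median",
)

# Compare the other scores between the two groups.
summary = scores.groupby("commit_group")[["releases_frequency", "popularity"]].median()
print(summary)
```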
RQ2 - Is there a relationship between the
maintainability and popularity of packages?
21
RQ3: Are deprecated packages still being used?
22
- Packages declared ‘deprecated’ in the status: 768
- Packages declared ‘deprecated’ in the description: 1,522
Total deprecated packages found: 2,290 (out of all npm packages)
- Total deprecated packages found in npms.io: 836
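The two detection rules above can be sketched as a pandas filter. The columns `status` and `description` and the sample rows are hypothetical. The reported totals add up (768 + 1,522 = 2,290), which suggests the two groups were counted without overlap; the sketch follows that reading as an assumption.

```python
import pandas as pd

# Hypothetical package metadata; not taken from the study's dataset.
packages = pd.DataFrame({
    "name": ["pkg-a", "pkg-b", "pkg-c", "pkg-d"],
    "status": ["Deprecated", None, None, None],
    "description": [
        "deprecated, see pkg-x",
        "DEPRECATED: use pkg-y instead",
        "a date formatting library",
        None,
    ],
})

# Normalize case and treat missing values as empty strings.
status = packages["status"].fillna("").str.lower()
description = packages["description"].fillna("").str.lower()

# Rule 1: the status field says 'deprecated'.
by_status = status.eq("deprecated")
# Rule 2: the description mentions 'deprecated' and rule 1 did not already match.
by_description_only = description.str.contains("deprecated", regex=False) & ~by_status

deprecated = packages[by_status | by_description_only]
print(int(by_status.sum()), int(by_description_only.sum()), len(deprecated))
```

A plain substring match on the description can produce false positives, for example a package whose description says it replaces a deprecated one, so results from rule 2 would need manual checking.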
RQ3: Are deprecated packages still being used?
23
0.4% of all npm packages are deprecated, and they are used less than
other packages.
Deprecated packages are used by popular packages too.
Deprecated packages have the same characteristics as other
packages, except for size, release frequency, commit frequency and
issue fixing.
RQ4: How different are packages used in web
frontend development in the context of all packages?
24
- Packages on Bower: 65,397
- Packages on both Bower and npm: 25,203
- Total front-end packages found in npms.io: 20,210
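The narrowing from Bower packages to front-end packages with npms.io data amounts to two set intersections. The sketch below assumes packages are matched by name across the sources, which the slide does not state, and uses made-up package names.

```python
# Hypothetical package names; not taken from the study's dataset.
bower_names = {"ui-grid-x", "ui-slider-x", "ui-modal-x", "bower-only-x"}
npm_names = {"ui-grid-x", "ui-slider-x", "ui-modal-x", "server-x", "cli-x"}
npmsio_names = {"ui-grid-x", "ui-slider-x", "server-x", "cli-x"}

# Packages distributed through both Bower and npm.
on_both = bower_names & npm_names

# Of those, the packages for which npms.io scores are available.
front_end = on_both & npmsio_names

print(len(bower_names), len(on_both), len(front_end))
```

Name matching can mislabel unrelated packages that share a name across registries, so comparing repository URLs would be a stricter alternative.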
RQ4: How different are packages used in web
frontend development in the context of all packages?
25
Limitations
26
- npms.io and libraries.io metrics and their evaluation.
- Not generalizable to other packages.
- Package category.
Conclusion
27
- Investigated the relationship between software popularity and quality.
- Used npms.io and libraries.io.
- Found that:
  - Software popularity and quality are weakly correlated.
  - Maintenance has little impact on popularity.
  - Only a small number of packages are deprecated.
  - Front-end packages are more popular.
Future Work
28
- Cross-ecosystem comparisons: detect differences in the
relation between popularity and quality across ecosystems.
- Qualitative analysis: carry out interviews and surveys
with package developers.
29
Questions
30

On Popularity and Quality Metrics of npm Packages

  • 1. An Empirical Study of Popularity and Quality of NPM Packages. Ahmed Zerouali
  • 2. Motivation: Number of libraries in the best-known OS package managers: 206k libraries (Java), 162k packages (PHP), 600k packages (JavaScript), as of 10 Nov 2017
  • 3. Motivation: Reasons to choose the right OS software: Software Quality; Software Features; Software Support and Documentation; Software Popularity; …
  • 4. Motivation: Choosing the right OS software: Software Quality; Software Features; Software Support and Documentation; Software Popularity; …
  • 5. Motivation: Interviews with developers: C. Bogart, C. Kästner, J. Herbsleb, and F. Thung. How to break an API: Cost negotiation and community values in three software ecosystems. In Int’l Symp. Foundations of Software Engineering (FSE), pages 109–120. ACM, 2016. Popularity and community reputation are the most influential factors.
  • 9. Method: libraries.io: an open source repository containing metadata (size, dependents, dependencies) of package dependencies extracted from 23 package managers. npms.io: an open source search engine that computes a normalized score between 0 and 1 for the popularity, quality and maintenance of npm packages.
  • 10. Method: characteristics score. Popularity: # stars, # forks, # subscribers, # contributors, # dependents, # downloads, downloads acceleration. Quality: README? License? .gitignore and friends? Has tests? Test coverage; Is the build passing? Outdated deps and vulnerabilities? Has custom website? Has linters configured? Maintenance: ratio of open issues vs. total issues, time to close issues, commit frequency, release frequency.
  • 11. Data Extraction: libraries.io: download the prepared and available metadata from 15 June 2017, or use the API and get the latest information (rate limit = 60 requests/minute). npms.io: use the API (no rate limit).
  • 12. Method: Descriptive statistics of the considered dataset
  • 13. Research Questions: RQ0 (preliminary question): How are measures of package popularity related to the use of a package? RQ1: Is there a relationship between package quality and package popularity? RQ2: Is there a relationship between the maintainability and popularity of packages? RQ3: Are deprecated packages still being used? RQ4: How different are packages used in web frontend development in the context of all packages?
  • 14. Data Analysis: Data analysis and processing: import pandas. Data visualization: import matplotlib, import seaborn. Analytics: import scipy.
  • 15. RQ0 - How are measures of package popularity related to the use of a package? Pearson correlation coefficient R = 0.8
  • 16. RQ0 - How are measures of package popularity related to the use of a package? Almost 4 out of 10 packages are not used by any other package or external repository. 35% of packages don’t have any direct dependency package.
  • 17. RQ1 - Is there a relationship between package quality and package popularity? Quality: testing (tests, test coverage, build status); carefulness (license, readme, .gitignore, ...); health (outdated dependencies and vulnerabilities); branding (badges and homepage). Popularity: community interest (npms.io); dependent external repositories (libraries.io).
  • 18. RQ1 - Is there a relationship between package quality and package popularity? Distribution of popularity in terms of community interest, number of dependent repositories and quality score of npm packages, split into packages that have at least one dependency and packages that don’t.
  • 19. RQ1 - Is there a relationship between package quality and package popularity? Pearson correlation coefficient R < 0.33, for both testing and carefulness
  • 20. RQ2 - Is there a relationship between the maintainability and popularity of packages? Distribution of maintenance characteristics scores, grouped into packages that have a commit score above the median and packages that have a commit score under the median (0.25)
  • 21. RQ2 - Is there a relationship between the maintainability and popularity of packages?
  • 22. RQ3: Are deprecated packages still being used? Packages declared ‘deprecated’ in the status: 768. Packages declared ‘deprecated’ in the description: 1,522. Total deprecated packages found: 2,290 (out of all npm). Total deprecated packages found in npms.io: 836.
  • 23. RQ3: Are deprecated packages still being used? 0.4% of all npm packages are deprecated, and they are less used. Deprecated packages are used by popular packages too. Deprecated packages have the same characteristics as the other packages, except for size, release frequency, commit frequency and fixing issues.
  • 24. RQ4: How different are packages used in web frontend development in the context of all packages? Packages on Bower: 65,397. Packages on Bower and npm: 25,203. Total front-end packages found in npms.io: 20,210.
  • 25. RQ4: How different are packages used in web frontend development in the context of all packages?
  • 26. Limitations: npms.io and libraries.io metrics and their evaluation. Not generalizable to other packages. Package category.
  • 27. Conclusion: Investigated the relationship between software popularity and quality, using npms.io and libraries.io. Found that: software popularity and quality are weakly correlated; maintenance has little impact on popularity; only a small number of packages are deprecated; front-end packages are more popular.
  • 28. Future Work: Cross-ecosystem comparisons: detect differences in the relation between popularity and quality across ecosystems. Qualitative analysis: carrying out interviews and surveys with package developers.
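
The slides name pandas and scipy but show no code. As a rough illustration only, the RQ0 and RQ1 steps (combine the two sources, normalize the libraries.io count to the 0 to 1 range, compute a Pearson coefficient, split packages into popularity quintiles) might look like the sketch below. The column names, the toy values and the helper `min_max_normalize` are assumptions made for this example; they are not the real libraries.io or npms.io schemas, and this is not the authors' script.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data; column names are illustrative, not the real schemas.
npms = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "community_interest": [0.05, 0.20, 0.40, 0.70, 0.90],  # already in [0, 1]
})
libraries = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "dependent_repos": [0, 12, 30, 75, 100, 3],  # raw counts
})


def min_max_normalize(series: pd.Series) -> pd.Series:
    """Rescale a numeric series to [0, 1]; assumes it is not constant."""
    return (series - series.min()) / (series.max() - series.min())


# Keep only packages that appear in both sources.
merged = npms.merge(libraries, on="name", how="inner")

# Bring the raw count onto the same 0..1 scale as the npms.io score.
merged["dependent_repos_norm"] = min_max_normalize(merged["dependent_repos"])

# Pearson correlation between the two popularity measures.
r, p_value = stats.pearsonr(
    merged["community_interest"], merged["dependent_repos_norm"]
)

# Split packages into five equally sized groups by popularity score.
merged["popularity_quintile"] = pd.qcut(
    merged["community_interest"], q=5, labels=False
)

print(f"packages in both sources: {len(merged)}")
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
```

On real data, `pd.qcut` can fail when many packages share the same score (duplicate bin edges); passing `duplicates="drop"` is one way to handle that, at the cost of fewer than five groups.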

Editor's notes

  1. One of the most crippling choices new developers and even existing ones face is deciding what programming language to work in, which frameworks to use and which library to learn. Given there are literally thousands of libraries to choose from, and all have their own pros and cons, it can be difficult to decide what to learn.
  2. Why it is important to pick the right software: often, you can find many open-source choices that appear to fit your need, but picking the wrong software can have expensive consequences. A lot of time is required to learn new software and integrate it into your project, and time is money. Choosing the wrong software can be an expensive mistake. Among the reasons developers have when choosing new software are: SQ: Is this software library well tested and well written? SF: What functionality does it provide? SSD: Is it well documented? SP: For example, is it used by a lot of people?
  3. Out of all these reasons, popularity seems to be the most influential factor.
  4. Researchers interviewed developers involved in open source software ecosystems about their reasons for selecting software, and most answers were related to popularity and community reputation.
  5. But does this factor imply good software quality? Do popular software packages, for example in JavaScript, have good development quality? Let’s take an example: Sinon, a testing package, is ranked 14th, and it has good test coverage and all builds passing.
  6. Meanwhile Chai, which is also a testing library, has failing builds and lower test coverage than the package ranked 14th.
  7. To check whether this is the case for many libraries, and whether there is indeed a link between software quality and popularity, we investigated this issue for packages hosted on the npm package manager. We chose npm because it is now the largest package registry in the world, and because JavaScript is one of the most used programming languages.
  8. We used two open source package tracking tools: libraries.io, which contains the metadata of package dependencies extracted from 23 package managers, and npms.io, which is an open source …..
  9. The scores are calculated using many different metrics.
  10. For the data extraction, we had the choice between downloading the prepared and available data from 15 June 2017 or using the API to get the latest information. Since little time passed between June and October, and because libraries.io has a rate limit, we used the available metadata. For npms.io, we used the API; downloading the data from npms.io was also really fast.
  11. After combining the data from both sources, of the 516,705 packages in the libraries.io dataset extracted on 15 June 2017, we found 308,777 also in npms.io. We observed that all packages in npms.io are hosted on GitHub, which is of great value to us, since our purpose is to analyze packages that evolve in the same.
  12. To empirically study the relationship between software quality and popularity, we consider the following research questions.
  13. To answer these questions, we used only Python for extracting, cleaning and preparing the data, as well as for the analysis. To work with the data we used:
  14. Our aim with this preliminary question is to better understand the concept of popularity. To study popularity, we rely on the popularity score of npms.io. This score includes, among other metrics, the number of other npm packages directly depending on the package. We also rely on the number of dependent repositories, a metric extracted from libraries.io. It counts the number of Git repositories that do not correspond to an npm package yet depend on the npm package under consideration. The package scores computed by npms.io are values between 0 and 1. To facilitate comparison, we normalize the libraries.io metric to a value between 0 and 1 as well. As shown in this figure, the scatter plot of npm package popularity in terms of community interest against the number of dependent repositories reveals a correlation between both kinds of popularity. To confirm this we calculated the … and we found a strong correlation at R = 0.81.
  15. We also found that.
  16. For the first question, we checked whether there is quantitative evidence of a relation between popularity, in terms of community interest and dependent external repositories, and quality in terms of …..
  17. We observe that most npm packages have low popularity within the community and have very few external repositories depending on them, while most of them have a good quality score. We also checked statistically whether packages that do not use any dependency are different, but we couldn’t find a statistically significant difference.
  18. In calculating the quality scores, high weights were given to carefulness and testing. To take a closer look at how these two metrics are distributed, we divided packages into quintiles by their popularity score. We found statistically that, for most categories, the carefulness characteristic is higher than the testing characteristic. We also checked, for all packages, whether carefulness and testing correlate with popularity, and we found only a weak linear correlation.
  19. After that we studied the relation between maintenance activity and popularity. We expected that packages under active maintenance are more popular than packages that are no longer being maintained. When checking the source code of npms, we found that they had difficulties evaluating packages that have disabled or zero issues in their repository. That’s why, for this particular research question, we filtered them out. We investigated the relationship between releasing, committing and fixing issues. We grouped all packages considered for this analysis into two categories of equal size, based on the median value of the commit frequency. We found that npm packages that commit frequently have good issue-fixing scores and also release frequently.
  20. Using the maintenance score, we checked whether we could find different distributions of the number of dependent npm packages and repositories. Similar to what we did before, we divided packages into quintiles according to their maintenance score. As shown in the figure, we couldn’t find a relevant difference between the distributions, which means that maintenance does not have a large impact on the popularity of npm packages.
  21. To find out how deprecated npm packages are being handled, we identified all npm packages in the libraries.io dataset that have a “deprecated” status. Across all packages we found only 768. After that, we manually analyzed the descriptions of all packages that have the string ‘deprecat’ in their description. We filtered out packages intended to handle deprecation, and we found 1,522 more deprecated packages. Of these deprecated packages, only 836 were found in our dataset.
  22. We analyzed their scores and popularity
  23. After that, to find out how front-end packages differ, we extracted all packages hosted on Bower, the package manager dedicated to the front end. We then identified which of these packages are also on npm. Finally, we found 20,210 packages that are hosted on both Bower and npm and are in our dataset.
  24. For these packages and the other packages hosted on npm, we compared their maintenance, quality and popularity characteristics scores. We found that front-end packages differ in size, age and popularity. They are more popular than the other packages.
  25. Since we only used metrics already evaluated by npms.io and libraries.io, our results could be different when relying on other metrics that define and implement quality or popularity in a different way. We did not differentiate or classify the npm packages by their category or domain, which may impact our findings.
  26. This work presented an empirical analysis of software package popularity and quality in npm packages. Using the available data on libraries.io and npms.io, two open source services that provide software dependency tracking, we analyzed the characteristics of open source npm packages in order to investigate the relationship between quality and popularity within the npm ecosystem.