The (r)evolution of CI/CD on GitHub
Promises and Perils of the GitHub Actions ecosystem
Tom Mens
Software Engineering Lab
March 2023
SECO-ASSIST
secoassist.github.io
Collaborative software development
Commits
Issues
Pull Requests
Comments
Code Reviews
Discussions
Project Management
...
Continuous Integration
Quality analysis
Build Test Deploy
GitHub Actions
Examples of CI/CD tools
Specifying GitHub Actions workflows
• A repository stores its workflows in .github/workflows/
• Workflows (workflow 1, workflow 2, workflow 3): run in parallel
• Jobs (job 1, job 2, job 3): run in parallel by default, or sequentially; a job may define a strategy
• Steps (step 1, step 2, step 3): run sequentially
• Each step is either uses: (action) or run: (shell cmd)
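The workflow / job / step hierarchy can be illustrated with a small sketch. It assumes a workflow has already been parsed into a Python dictionary; the workflow content, the job names and the helper `step_kinds` are hypothetical and only serve to show that every step is either a `uses:` step or a `run:` step.

```python
from typing import Any

# Hypothetical workflow, written as the dictionary a YAML parser would produce
# for a file stored under .github/workflows/.
WORKFLOW: dict[str, Any] = {
    "name": "CI",
    "on": ["push", "pull_request"],
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v3"},    # reuse an Action
                {"uses": "actions/setup-node@v3"},  # reuse an Action
                {"run": "npm ci"},                  # run a shell command
            ],
        },
        "test": {
            "runs-on": "ubuntu-latest",
            "needs": "build",  # jobs are parallel by default; `needs` orders them
            "steps": [{"run": "npm test"}],
        },
    },
}


def step_kinds(workflow: dict[str, Any]) -> list[tuple[str, int, str]]:
    """Return (job id, step position, 'uses' or 'run') for every step."""
    kinds = []
    for job_id, job in workflow["jobs"].items():
        for position, step in enumerate(job["steps"], start=1):
            has_uses, has_run = "uses" in step, "run" in step
            if has_uses == has_run:  # neither key, or both keys
                raise ValueError(f"{job_id} step {position}: need exactly one of uses/run")
            kinds.append((job_id, position, "uses" if has_uses else "run"))
    return kinds


if __name__ == "__main__":
    for job_id, position, kind in step_kinds(WORKFLOW):
        print(job_id, position, kind)
```

Steps are listed in declaration order within each job, which mirrors their sequential execution.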
Running workflows
GitHub Marketplace
Reusing Actions from GitHub Marketplace
On the rise and fall of CI services in GitHub
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Abstract—Continuous integration (CI) services are used in
collaborative open source projects to automate parts of the
development workflow. Such services have been in widespread use
for over a decade, with new CIs being introduced over the years,
sometimes overtaking other CIs in popularity. We conducted a
longitudinal empirical study over a period of nine years, aiming
to better understand this rapidly evolving CI landscape. By
analysing the development history of 91,810 GitHub repositories
of active npm packages having used at least one CI service,
we quantitatively studied the evolution of seven popular CIs,
specifically focusing on their co-usage and migration in the
considered repositories. We provide statistical evidence of the rise
of GitHub Actions, which has become the dominant CI service in
less than 18 months. This coincides with the fall of Travis,
which has seen a substantial decrease in usage, likely due to a
combination of policy changes and migrations to GitHub Actions.
Index Terms—Continuous integration, distributed software
development, software repositories, GitHub
I. INTRODUCTION
Continuous integration (CI), deployment and delivery have
become the cornerstone of collaborative software development
and DevOps practices. CI automates the integration of code
changes from multiple contributors into a central repository
where automated builds, tests and code quality checks run.
Well-known examples of CI services are Jenkins, Travis,
CircleCI and AppVeyor. CI services can also be built into
social coding platforms such as GitHub and GitLab [1]. GitLab
has featured CI capabilities since November 2012. Based
on popular demand, and in response to CI support integrated
in GitLab, GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this article) in October 2018. In August 2019, they officially
began supporting Continuous Integration through GHA, and
the product was released publicly in November 2019.
GHA [2] automates a wide range of tasks based
on a variety of triggers such as commits, issues, pull requests,
comments and many more. GHA can be used to facilitate code
reviews, code quality analysis, communication, dependency
and security monitoring and management, testing, etc. GHA
facilitates integration with external services, and can even
obviate the need for such external services altogether.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 56 million users in
September 2020 [3]. Given its popularity and the ease with
which GHA automates the CI workflow, we hypothesise
that GHA has had a significant impact on today’s CI
landscape. More particularly, we believe that it has increased
the awareness of the need for CI, it has reduced the entry
barrier for projects to start using CI, and it may have led
projects to migrate from other CI services towards GHA.
This article aims to quantitatively and objectively verify
these hypotheses, and discusses their consequences, through a
longitudinal analysis of how different CIs have been used over
a nine-year period in 91,810 GitHub repositories corresponding
to the software development history of reusable Node.js
packages distributed through the npm package registry. This
empirical study focuses on four research questions:
RQ1 How did the CI landscape evolve? We identified 20
different CIs being used in the considered set of repositories,
some of which were considerably more prevalent than others.
Together with Travis, GHA covers more than 80% of all
usages. Moreover, in only 18 months GHA has overtaken all
other CIs in popularity.
RQ2 What are the most frequent combinations of CIs? We
observed that many repositories have used multiple CIs during
their lifetime. AppVeyor is nearly always used in combination
with some other CI. If a repository uses a CI simultaneously
with another one, it is mostly in combination with Travis,
GHA or CircleCI.
RQ3 How frequently are CIs being replaced by an alternative?
We observed a non-negligible number of CI migrations. GHA
attracted most of these migrations. The majority of migrations
were moving away from Travis and towards GHA.
RQ4 How has the CI landscape changed since GHA was
introduced? Based on a regression discontinuity design, we
found that the usage of Travis, Azure and CircleCI has been
negatively affected by the introduction of GHA.
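To make the idea behind RQ4 concrete, here is a minimal sketch of a sharp regression discontinuity estimate. It is not the model used in the study, whose specification is not given here; it assumes a simple local-linear variant in which one straight line is fitted before the cutoff and one after it, and the jump between the two at the cutoff is taken as the effect. The function names and the synthetic series are hypothetical.

```python
def fit_line(points: list[tuple[float, float]]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def discontinuity(series: list[tuple[float, float]], cutoff: float) -> float:
    """Estimated jump in the outcome at `cutoff` (value after minus value before).

    Time is centred on the cutoff, so each fitted intercept is the value
    predicted exactly at the cutoff by the corresponding side.
    """
    before = [(t - cutoff, y) for t, y in series if t < cutoff]
    after = [(t - cutoff, y) for t, y in series if t >= cutoff]
    intercept_before, _ = fit_line(before)
    intercept_after, _ = fit_line(after)
    return intercept_after - intercept_before


if __name__ == "__main__":
    # Synthetic monthly usage of a hypothetical CI service: growing before the
    # intervention at month 0, then dropping by 3 units and growing more slowly.
    usage = [(t, 10 + t) for t in range(-5, 0)] + [(t, 7 + 0.5 * t) for t in range(0, 6)]
    print(discontinuity(usage, cutoff=0))
```

A negative jump indicates that usage dropped at the cutoff relative to the pre-existing trend, which is the kind of effect RQ4 reports for Travis, Azure and CircleCI.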
This article is structured as follows. Section II motivates the
selected dataset and discusses the data extraction and cleaning
steps that were carried out. Sections III to VI provide answers
to each research question. Section VII discusses the ramifications
of these answers. Section VIII presents the threats to
validity of the conducted research. Section IX presents the
related work. Finally, Section X concludes.
II. DATA EXTRACTION
In order to analyse the use of CIs in software development
repositories on GitHub, we need a large dataset containing
On the Use of GitHub Actions in Software
Development Repositories
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Pooya Rostami Mazrae
Software Engineering Lab
University of Mons
Mons, Belgium
pooya.rostamimazrae@umons.ac.be
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Abstract—GitHub Actions was introduced in 2019 and con-
stitutes an integrated alternative to CI/CD services for GitHub
repositories. The deep integration with GitHub allows repositories
to easily automate software development workflows. This
paper empirically studies the use of GitHub Actions on a dataset
comprising 68K repositories on GitHub, of which 43.9% are using
GitHub Actions workflows. We analyse which workflows are
automated and identify the most frequent automation practices.
We show that reuse of actions is a common practice, even if
this reuse is concentrated in a limited number of actions. We
study which actions are most frequently used and how workflows
refer to them. Furthermore, we discuss the related security
and versioning aspects. As such, we provide an overview of
the use of GitHub Actions, constituting a necessary first step
towards a better understanding of this emerging ecosystem and
its implications on collaborative software development in the
GitHub social coding platform.
Index Terms—GitHub Actions, continuous integration, collab-
orative software development, workflow automation
I. INTRODUCTION
Open source software (OSS) development is a continuous,
highly distributed and collaborative endeavour [1]. Development
of OSS projects faces many socio-technical challenges
[2]–[4]. The multitude of tools (e.g., version control systems,
software distribution managers, bug and issue trackers) and
development-related activities makes it very challenging for
contributor communities to keep up with the rapid pace of
producing and maintaining high-quality software releases.
Automated workflows were introduced to automate numerous
repetitive social or technical activities that are inherently
part of the collaborative software development process. Continuous
integration, deployment and delivery (CI/CD) have
become the cornerstone of collaborative software development
and DevOps practices. Well-known examples of CI/CD
services are Travis, Jenkins, CircleCI and TeamCity. They
automate the integration of code changes from multiple contributors
into a central repository where automated builds, tests
and code quality checks run.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 73 million users in
2021 [5]. GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this paper) in October 2018 based on popular demand, and in
response to GitLab’s integrated CI/CD support [6]. In August
2019, GitHub officially began supporting CI through GHA,
and the product was released publicly in November 2019.
GHA [7] allows the automation of a wide range of tasks
based on a variety of triggers such as commits, issues, pull
requests, comments, schedules, and many more. Its deep
integration into GitHub implies that GHA can be used not
only for executing test suites or deploying new releases
as in traditional CI/CD services, but also to facilitate code
reviews, communication, dependency and security monitoring
and management, etc. GHA also promotes the use and sharing
of reusable components, called actions, in workflows. These
actions are distributed in public repositories and on the GitHub
Marketplace. They allow workflow developers to easily integrate
specific tasks (e.g., set up a specific programming
language environment, publish a release on a package registry,
run tests and check code quality) without having to write the
corresponding code.
Since its public release in November 2019, GHA has
become the dominant CI/CD service, only 18 months
after its introduction [8]. Its Marketplace of reusable actions
has been growing exponentially ever since, reaching 12K
reusable actions in February 2022. It is therefore fair to
say that GHA has become a software ecosystem of its own,
comparable to ecosystems of reusable software libraries (such
as npm, RubyGems, CRAN, Maven, and PyPI) that have been
empirically studied by many researchers in recent years (e.g.,
[9]–[14]).
The emerging GHA ecosystem is worthy of being empirically
studied in its own right since it is likely to suffer from
the same issues related to dependency management, security
vulnerabilities, outdated or obsolete components, backward
compatibility, and so on. This article therefore quantitatively
studies the use of GHA in 68K repositories on GitHub. We
analyse which workflows are automated and identify the most
frequent automation practices. We show that reuse of actions
is a common practice and identify which actions are reused
and how. As such, we provide an overview of the use of
GHA, a necessary first step towards a better understanding
of the emerging GHA ecosystem and its implications on
software development in GitHub repositories. More concretely,
we answer the following research questions:
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Pooya Rostami Mazrae¹ · Tom Mens¹ · Mehdi Golzadeh¹ · Alexandre Decan¹
Accepted: 28 December 2022
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
Abstract
Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative
software development process. CI/CD tools automate a wide range of activities in the
development workflow such as testing, linting, updating dependencies, creating and deploying
releases, and so on. Previous quantitative studies have revealed important changes in the
landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many
software projects migrating to other CI/CD tools. In order to understand the reasons behind
these changes in CI/CD usage, this paper presents a qualitative study based on in-depth
interviews with 22 experienced software practitioners reporting on their usage, co-usage and
migration of 31 different CI/CD tools. Following an inductive and deductive coding process,
we analysed the interviews and found a high degree of competition between CI/CD tools. We
observe multiple reasons for co-using different CI/CD tools within the same project, and we
identify the main reasons and detractors for migrating to different alternatives. Among all
reported migrations, we observe a clear trend of migrations away from Travis and migrations
towards GitHub Actions, and we identify the main reasons behind them.
Keywords CI/CD · Collaborative software development · Workflow automation ·
Qualitative analysis · Empirical software engineering
Communicated by: Alexander Serebrenik
Alexandre Decan (F.R.S.-FNRS Research Associate)
Pooya Rostami Mazrae
pooya.rostami.m@gmail.com; pooya.rostamimazrae@umons.ac.be
Tom Mens
tom.mens@umons.ac.be
Mehdi Golzadeh
golzadeh.mehdi@gmail.com
Alexandre Decan
alexandre.decan@umons.ac.be
¹ Software Engineering Lab, Université de Mons, Mons, Belgium
https://doi.org/10.1109/ICSME55016.2022.00029
https://doi.org/10.1109/SANER53432.2022.00084
On the rise and fall of CI services in GitHub
https://doi.org/10.1109/SANER53432.2022.00084
Dataset
1.6M+
Scoped packages
803K packages on GitHub
Cloned 676K
Excluded 11,557 forks
Excluded inactive repositories
201,403 repositories
Presence of CI configuration files
119,033 CI usages in 91,810 repositories
May 2021
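Detecting CI usage from the presence of configuration files can be sketched as follows. The file name patterns are the conventional default locations for each service and are an assumption; the exact patterns used to build the dataset are not listed here. `CI_CONFIG_PATTERNS` and `detect_cis` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Assumed default configuration file locations (not taken from the study).
CI_CONFIG_PATTERNS: dict[str, list[str]] = {
    "GitHub Actions": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Travis": [".travis.yml"],
    "CircleCI": [".circleci/config.yml"],
    "AppVeyor": ["appveyor.yml", ".appveyor.yml"],
    "Azure": ["azure-pipelines.yml"],
    "GitLab CI": [".gitlab-ci.yml"],
    "Jenkins": ["Jenkinsfile"],
}


def detect_cis(paths: list[str]) -> list[str]:
    """Return the sorted names of the CIs whose configuration files appear in `paths`.

    `paths` are file paths relative to the repository root, using '/' separators.
    """
    found = set()
    for path in paths:
        for ci, patterns in CI_CONFIG_PATTERNS.items():
            # fnmatchcase is case-sensitive on every platform
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                found.add(ci)
    return sorted(found)


if __name__ == "__main__":
    files = ["README.md", ".travis.yml", ".github/workflows/ci.yml", "package.json"]
    print(detect_cis(files))
```

Running the same check on each commit of a repository, rather than on the latest snapshot only, would give the per-repository history of CI usages needed to study co-usage and migration.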
How prevalent is CI usage
in GitHub repositories?
CI services are used in
more than half of all
considered repositories.
Evolution of GitHub CI/CD landscape
Since 2021, GitHub Actions has become
the dominant CI/CD tool in GitHub
Most frequent co-usage of CIs
Analysing
CI churn
in the last 3 years
Migrations
between CIs
Migrations
toward GitHub
Actions
Migrations
away from Travis
What happened to Travis?
Travis changed
its free plan
GHA was
introduced
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Methodology
Interview questionnaire
• Around 30 questions related to CI usage, co-usage and migration
Selection of respondents
• Selected candidates through Twitter, LinkedIn, email, direct messages
• Colleagues' referrals (snowballing)
Geographic diversity
• Using online video conferencing tool
Inclusion criteria
• Actively contributed to, or having been responsible for a software project relying on CI
• Sufficient knowledge about which CI tool is used in that software project and how
• Having been involved in setting up or maintaining the CI process of the project
Demographics of respondents
• 22 respondents
• 16 from 7 European countries
• 4 from North America
• 2 from Asia
• Software development experience: average of 12 years and 4 months
• Good mix of industrial and open source contributors
CI/CD tools being used
• 14 additional tools reported only once
• 3 custom-built in-house CI/CD solutions
The good ...
The bad ...
The ugly
CI/CD migrations
Reasons for
CI migration
Why is GitHub Actions so
popular?
• deep integration with GitHub
• ease of setup and use
• trendy
• speed
• reliability
• free tier for open source projects
• large marketplace of reusable Actions
• support for major operating systems
• company support (Microsoft)
• automation beyond CI/CD
Difficulties in CI migration
• Learning curve
• Fundamental differences between the
source and target of the migration
• Trial-and-error nature of configuring a
new CI tool
• Lack of familiarity with the new CI tool
• Important missing features
On the Use of GitHub Actions in Software
Development Repositories
https://doi.org/10.1109/ICSME55016.2022.00029
Research Questions
• What are the characteristics of repositories using workflows?
• Which kinds of workflows are automated?
• What are the most frequent jobs in workflows?
• What are the automation practices?
• Which types of Actions are reused?
Dataset
• 67,870 repositories
• 4 out of 10 repositories use GitHub Actions workflows
• 70,278 workflow files
• 108,500 jobs
• 567,352 steps
Quantification of jobs and workflows
Workflows in repositories
single workflow (49.3%)
more than one workflow (50.7%)
Jobs in workflows
single job (77.8%)
more than one job (22.2%)
Characteristics of GitHub repositories using GitHub Actions

Characteristic   Median (with workflows)   Median (without workflows)   Effect size
Pull Requests    124                       41                           medium
Contributors     20                        11                           small
Commits          598                       344                          small
Issues           105                       59                           small
Repos with GHA workflows tend to have more
contributors, pull requests, commits, and issues
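The "small" and "medium" labels are effect size interpretations. The slides do not name the statistic behind them, so the sketch below assumes Cliff's delta, a common non-parametric choice for comparing two groups of repositories, together with the widely used thresholds 0.147, 0.33 and 0.474. The sample values are made up.

```python
def cliffs_delta(xs: list[float], ys: list[float]) -> float:
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def interpret(delta: float) -> str:
    """Map |delta| to a conventional label (assumed thresholds)."""
    magnitude = abs(delta)
    if magnitude < 0.147:
        return "negligible"
    if magnitude < 0.33:
        return "small"
    if magnitude < 0.474:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Hypothetical pull request counts for two groups of repositories.
    with_workflows = [5, 6, 7]
    without_workflows = [1, 2, 6]
    delta = cliffs_delta(with_workflows, without_workflows)
    print(round(delta, 3), interpret(delta))
```

Because the statistic only compares ranks, it is insensitive to the heavy skew typical of repository metrics such as commit or issue counts.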
Most frequent event types triggering workflows

Event type          Share (%)
push                63.4
PR                  56.3
schedule            16.1
workflow_dispatch   15.4
release             6.2
others              8.6
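Such shares can be derived from the `on:` key of each workflow, which may be a single event name, a list of events, or a mapping from events to their configuration. The sketch below assumes workflows already parsed into dictionaries and counts each event once per workflow; whether the figures above are computed per workflow or per repository is not stated here. `trigger_events` and `event_shares` are hypothetical helpers.

```python
from collections import Counter
from typing import Any


def trigger_events(workflow: dict[Any, Any]) -> list[str]:
    """Return the event names declared under the workflow's `on:` key."""
    # YAML 1.1 parsers such as PyYAML read an unquoted `on` key as the
    # boolean True, so both spellings of the key are checked.
    on = workflow.get("on", workflow.get(True))
    if on is None:
        return []
    if isinstance(on, str):   # on: push
        return [on]
    return list(on)           # on: [push, pull_request]  or  on: {push: {...}}


def event_shares(workflows: list[dict[Any, Any]]) -> dict[str, float]:
    """Percentage of workflows triggered by each event type."""
    if not workflows:
        return {}
    counts: Counter[str] = Counter()
    for workflow in workflows:
        counts.update(set(trigger_events(workflow)))
    return {event: 100 * n / len(workflows) for event, n in counts.items()}


if __name__ == "__main__":
    parsed = [
        {"on": "push"},
        {"on": ["push", "pull_request"]},
        {True: {"schedule": [{"cron": "0 0 * * *"}], "workflow_dispatch": None}},
        {"on": {"push": {"branches": ["main"]}}},
    ]
    print(event_shares(parsed))
```

Since one workflow can declare several triggers, the shares are expected to sum to more than 100%.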
Different ways of executing code

Step type   Action target             % of steps   % of repositories
run:        --                        49.9%        93.5%
uses:       Local path                0.8%         2.0%
uses:       Docker image              0.1%         1.8%
uses:       Same repository           0.2%         0.4%
uses:       Same owner                0.7%         4.3%
uses:       Other public repository   48.3%        99.3%
Reusing Actions in steps is a common practice
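A step can be assigned to one of these categories from its `run:` or `uses:` key alone. The sketch below relies on the reference forms accepted by GitHub Actions (`./path` for a local action, `docker://image` for a Docker image, `owner/repo[/path]@ref` for an action in a repository); how the study itself separated "local path" from "same repository" is an assumption, and `classify_step` is a hypothetical helper.

```python
def classify_step(step: dict[str, str], repository: str) -> str:
    """Classify a step of a workflow belonging to `repository` ('owner/name')."""
    if "run" in step:
        return "run"
    target = step["uses"]
    if target.startswith("./"):
        return "local path"
    if target.startswith("docker://"):
        return "Docker image"
    # Remaining form: owner/repo[/path]@ref
    action, _, _ref = target.partition("@")
    parts = action.split("/")
    if len(parts) < 2:
        raise ValueError(f"unrecognised action reference: {target!r}")
    # GitHub owner and repository names are case-insensitive
    owner, name = parts[0].lower(), parts[1].lower()
    repo_owner, repo_name = repository.lower().split("/")
    if (owner, name) == (repo_owner, repo_name):
        return "same repository"
    if owner == repo_owner:
        return "same owner"
    return "other public repository"


if __name__ == "__main__":
    # Hypothetical steps from a workflow in the hypothetical repository acme/app.
    for step in (
        {"run": "make test"},
        {"uses": "./.github/actions/build"},
        {"uses": "docker://alpine:3.18"},
        {"uses": "acme/app/.github/actions/lint@main"},
        {"uses": "acme/tools@v1"},
        {"uses": "actions/checkout@v3"},
    ):
        print(classify_step(step, "acme/app"))
```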
Which Actions are reused?
Top 5 most frequent Actions in steps and repositories

Action                    % of steps   % of repositories
actions/checkout          35.5%        98%
actions/cache             7.2%         22%
actions/setup-node        6.6%         26%
actions/upload-artifact   5.9%         19%
actions/setup-python      5.8%         21%
• A few Actions concentrate
most of the reuse
• Most of them being
developed by GitHub
On the Outdatedness of Workflows
in the GitHub Actions Ecosystem
Alexandre Decan¹, Hassan Onsori Delicheh, Tom Mens
Software Engineering Lab, University of Mons, Mons, Belgium
Abstract
GitHub Actions was introduced as a way to automate CI/CD workflows in
GitHub, the largest social coding platform. Thanks to its deep integration into
GitHub, GitHub Actions can be used to automate a wide range of social and
technical activities. Among its main features, it allows automation workflows
to rely on reusable components – the so-called Actions – to enable developers to
focus on the tasks that should be automated rather than on how to automate
them. As any other kind of reusable software components, Actions are contin-
uously updated, causing many automation workflows to use outdated versions
of these Actions. Based on a dataset of nearly one million workflows obtained
from 22K+ repositories between November 2019 and September 2022, we pro-
vide quantitative empirical evidence that reusing Actions in GitHub workflows
is common practice, even if this reuse tends to concentrate on a limited number
of Actions. We show that Actions are frequently updated, and we quantify to
which extent automation workflows are outdated with respect to these Actions.
Using two complementary metrics, technical lag and opportunity lag, we found
that most of the workflows are using an outdated Action release, are lagging
behind the latest available release for at least 7 months, and had the
opportunity to be updated during at least 9 months. This calls for a more rigorous
management of Action outdatedness in automation workflows, as well as for
better policies and tooling to keep workflows up-to-date.
Keywords: software ecosystem, dependency management, continuous
integration, collaborative software development, workflow automation,
technical lag
Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan),
hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be
(Tom Mens)
¹ F.R.S.-FNRS Research Associate
Preprint submitted to Journal of Systems & Software March 21, 2023
Outdatedness in the
GitHub Actions ecosystem
• Four out of five workflows and nearly
two thirds of the steps are using an
outdated release of an Action.
• Steps using Actions provided by GitHub
are responsible for most of the
outdatedness.
• More than one third of the other steps
and nearly half of the other workflows
are using an outdated release of an
Action.
[Figure annotations: release of actions/checkout@v2, release of actions/checkout@v3, release of actions/setup-*@v2, release of actions/setup-*@v3]
[Figure: Action lifeline with releases v1, v2, v3, v4 (latest); labels: GitHub workflow, selected release, observation date, technical lag]
Outdatedness in the
GitHub Actions ecosystem
Technical lag of workflows / steps: the time period between the start of
reusing a selected Action and the latest release of that Action.
• Technical lag of outdated steps
tends to increase over time.
• Half of the outdated steps using
other Actions are using a version
that is lagging behind the latest one
for at least 7.3 months.
• Main cause of technical lag =
Actions provided by GitHub
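The technical lag of a single step can be sketched in a few lines of Python. This is an illustrative reading of the definition, not the authors' implementation: it assumes the lag is measured between the release date of the version selected by the step and the release date of the latest version available at the observation date. The function name and the release dates are hypothetical.

```python
from __future__ import annotations

from datetime import date

# Hypothetical release history of one Action (not real release dates).
RELEASES = {
    "v1": date(2020, 1, 1),
    "v2": date(2020, 7, 1),
    "v3": date(2021, 3, 1),
    "v4": date(2022, 2, 1),
}


def technical_lag(releases: dict[str, date], used: str, observed: date) -> int:
    """Days between the release used by a step and the latest release
    that was already available on the observation date."""
    available = {v: d for v, d in releases.items() if d <= observed}
    if used not in available:
        raise ValueError(f"{used!r} was not released by {observed}")
    latest = max(available.values())
    return (latest - available[used]).days


if __name__ == "__main__":
    # A step pinned to v2, observed at the end of 2021 (v4 does not exist yet).
    print(technical_lag(RELEASES, "v2", date(2021, 12, 31)), "days of technical lag")
```

A step that already uses the latest available release gets a lag of 0 under this reading, which matches the notion of an up-to-date step.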
Outdatedness in the
GitHub Actions ecosystem
Opportunity lag of workflows / steps: the time period during which a
workflow could have updated an outdated step to a more recent
version of an Action, but didn’t.
[Figure: Action lifeline with releases v1, v2, v3, v4; labels: GitHub workflow, selected release, first update opportunity, observation time, opportunity lag]
• The opportunity lag of outdated steps
tends to increase over time.
• On average, maintainers of outdated
steps have had the opportunity to
update them for 9 months, but have not
done so.
• Main cause of opportunity lag =
Actions provided by GitHub
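The opportunity lag can be sketched in the same way. Again this is an assumption-laden illustration rather than the study's code: it assumes that versions are ordered by release date, and that the first update opportunity is the first newer release, or the date the step adopted its version if a newer release already existed by then. The function name, parameters and dates are hypothetical.

```python
from __future__ import annotations

from datetime import date

# Hypothetical release history of one Action (not real release dates).
RELEASES = {
    "v1": date(2020, 1, 1),
    "v2": date(2020, 7, 1),
    "v3": date(2021, 3, 1),
    "v4": date(2022, 2, 1),
}


def opportunity_lag(releases: dict[str, date], used: str,
                    adopted: date, observed: date) -> int:
    """Days during which a newer release existed while the step kept using `used`."""
    used_date = releases[used]
    # Releases newer than the used one that already exist at observation time.
    newer = [d for d in releases.values() if used_date < d <= observed]
    if not newer:
        return 0  # nothing to update to: the step is not outdated
    # The opportunity starts at the first newer release, but not before
    # the step started using the Action.
    first_opportunity = max(min(newer), adopted)
    return (observed - first_opportunity).days


if __name__ == "__main__":
    lag = opportunity_lag(RELEASES, "v2", adopted=date(2020, 8, 1),
                          observed=date(2021, 12, 31))
    print(lag, "days of opportunity lag")
```

Unlike the technical lag, this value keeps growing every day the step stays outdated, even when no further release is published.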
[Figure annotation: new releases for docker/*]
Thank you for
your attention.
Any questions?

The (r)evolution of CI/CD on GitHub

  • 1. The (r)evolution of CI/CD on GitHub Promises and Perils of the GitHub Actions ecosystem Tom Mens Software Engineering Lab March 2023 SECO-ASSIST secoassist.github.io
  • 2. 2
  • 3. 3
  • 4. Collaborative software development: Commits, Issues, Pull Requests, Comments, Code Reviews, Discussions, Project Management, ... Continuous Integration: Quality analysis, Build, Test, Deploy. GitHub Actions
  • 6. Specifying GitHub Actions workflows. Hierarchy: repository (.github/workflows/) > workflows (parallel) > jobs (parallel by default / sequential, strategy) > steps (sequential). Example steps: step 1 uses: (action), step 2 run: (shell cmd), step 3 uses: (action)
  • 9. On the rise and fall of CI services in GitHub Mehdi Golzadeh Software Engineering Lab University of Mons Mons, Belgium mehdi.golzadeh@umons.ac.be Alexandre Decan Software Engineering Lab University of Mons Mons, Belgium alexandre.decan@umons.ac.be Tom Mens Software Engineering Lab University of Mons Mons, Belgium tom.mens@umons.ac.be Abstract—Continuous integration (CI) services are used in collaborative open source projects to automate parts of the development workflow. Such services have been in widespread use for over a decade, with new CIs being introduced over the years, sometimes overtaking other CIs in popularity. We conducted a longitudinal empirical study over a period of nine years, aiming to better understand this rapidly evolving CI landscape. By analysing the development history of 91,810 GitHub repositories of active npm packages having used at least one CI service, we quantitatively studied the evolution of seven popular CIs, specifically focusing on their co-usage and migration in the considered repositories. We provide statistical evidence of the rise of GitHub Actions, that has become the dominant CI service in less than 18 months time. This coincides with the fall of Travis that has seen an important decrease in usage, likely due to a combination of policy changes and migrations to GitHub Actions. Index Terms—Continuous integration, distributed software development, software repositories, GitHub I. INTRODUCTION Continuous integration (CI), deployment and delivery have become the cornerstone of collaborative software development and DevOps practices. CI automates the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run. Well-known examples of CI services are Jenkins, Travis, CircleCI and AppVeyor. CI services can also be built-in in social coding platforms such as GitHub and GitLab [1]. GitLab already featured CI capabilities since November 2012. 
Based on popular demand, and in response to CI support integrated in GitLab, GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this article) in October 2018. In August 2019, they officially began supporting Continuous Integration through GHA, and the product was released publicly in November 2019. GHA [2] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments and many more. GHA can be used to facilitate code reviews, code quality analysis, communication, dependency and security monitoring and management, testing, etc. GHA facilitates the integration with external services, and can even obviate the need for such external services altogether. GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 56 million users in September 2020 [3]. Given its popularity and the ease with which GHA automates the CI workflow, we hypothesise that GHA has had a significant impact on today’s CI landscape. More particularly, we believe that it has increased the awareness of the need for CI, it has reduced the entry barrier for projects to start using CI, and it may have led projects to migrate from other CI services towards GHA. This article aims to quantitatively and objectively verify these hypotheses, and discusses their consequences, through a longitudinal analysis of how different CIs have been used over a nine-year period in 91,810 GitHub repositories corresponding to the software development history of reusable Node.JS packages distributed through the npm package registry. This empirical study focuses on four research questions: RQ1 How did the CI landscape evolve? We identified 20 different CIs being used in the considered set of repositories, some of which were considerably more prevalent than others. 
Together with Travis, GHA covers more than 80% of all usages. Moreover, in only 18 months GHA has overtaken all other CIs in popularity. RQ2 What are the most frequent combinations of CIs? We observed that many repositories have used multiple CIs during their lifetime. AppVeyor is nearly always used in combination with some other CI. If a repository uses a CI simultaneously with another one, it is mostly in combination with Travis, GHA or CircleCI. RQ3 How frequently are CIs being replaced by an alternative? We observed a non-negligible number of CI migrations. GHA attracted most of these migrations. The majority of migrations were moving away from Travis and towards GHA. RQ4 How has the CI landscape changed since GHA was introduced? Based on a regression discontinuity design, we found that the usage of Travis, Azure and CircleCI has been negatively affected by the introduction of GHA. This article is structured as follows. Section II motivates the selected dataset and discusses the data extraction and cleaning steps that were carried out. Sections III to VI provide answers to each research question. Section VII discusses the ramifications of these answers. Section VIII presents the threats to validity of the conducted research. Section IX presents the related work. Finally, Section X concludes. II. 
DATA EXTRACTION In order to analyse the use of CIs in software development repositories on GitHub, we need a large dataset containing
On the Use of GitHub Actions in Software Development Repositories Alexandre Decan Software Engineering Lab University of Mons Mons, Belgium alexandre.decan@umons.ac.be Tom Mens Software Engineering Lab University of Mons Mons, Belgium tom.mens@umons.ac.be Pooya Rostami Mazrae Software Engineering Lab University of Mons Mons, Belgium pooya.rostamimazrae@umons.ac.be Mehdi Golzadeh Software Engineering Lab University of Mons Mons, Belgium mehdi.golzadeh@umons.ac.be Abstract: GitHub Actions was introduced in 2019 and constitutes an integrated alternative to CI/CD services for GitHub repositories. The deep integration with GitHub allows repositories to easily automate software development workflows. This paper empirically studies the use of GitHub Actions on a dataset comprising 68K repositories on GitHub, of which 43.9% are using GitHub Actions workflows. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice, even if this reuse is concentrated in a limited number of actions. We study which actions are most frequently used and how workflows refer to them. Furthermore, we discuss the related security and versioning aspects. As such, we provide an overview of the use of GitHub Actions, constituting a necessary first step towards a better understanding of this emerging ecosystem and its implications on collaborative software development in the GitHub social coding platform. Index Terms: GitHub Actions, continuous integration, collaborative software development, workflow automation I. INTRODUCTION Open source software (OSS) development is a continuous, highly distributed and collaborative endeavour [1]. Development of OSS projects faces many socio-technical challenges [2]–[4]. 
The multitude of tools (e.g., version control systems, software distribution managers, bug and issue trackers) and development-related activities makes it very challenging for contributor communities to keep up with the rapid pace of producing and maintaining high-quality software releases. Automated workflows were introduced to automate numerous repetitive social or technical activities that are inherently part of the collaborative software development process. Continuous integration, deployment and delivery (CI/CD) have become the cornerstone of collaborative software development and DevOps practices. Well-known examples of CI/CD services are Travis, Jenkins, CircleCI and TeamCity. They automate the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run. GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 73 million users in 2021 [5]. GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this paper) in October 2018 based on popular demand, and in response to GitLab’s integrated CI/CD support [6]. In August 2019, GitHub officially began supporting CI through GHA, and the product was released publicly in November 2019. GHA [7] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments, schedules, and many more. Its deep integration into GitHub implies that GHA can be used not only for executing test suites or deploying new releases as in traditional CI/CD services, but also to facilitate code reviews, communication, dependency and security monitoring and management, etc. GHA also promotes the use and sharing of reusable components, called actions, in workflows. These actions are distributed in public repositories and on the GitHub Marketplace. 
They allow workflow developers to easily integrate specific tasks (e.g., set up a specific programming language environment, publish a release on a package registry, run tests and check code quality) without having to write the corresponding code. Since its public release in November 2019, GHA has become the most dominant CI/CD service, only 18 months after its introduction [8]. Its Marketplace of reusable actions has been growing exponentially ever since, reaching 12K reusable actions in February 2022. It is therefore fair to say that GHA has become a software ecosystem of its own, comparable to ecosystems of reusable software libraries (such as npm, RubyGems, CRAN, Maven, and PyPI) that have been empirically studied by many researchers in recent years (e.g., [9]–[14]). The emerging GHA ecosystem is worthy of being empirically studied in its own right since it is likely to suffer from the same issues related to dependency management, security vulnerabilities, outdated or obsolete components, backward compatibility, and so on. This article therefore quantitatively studies the use of GHA in 68K repositories on GitHub. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice and identify which actions are reused and how. As such, we provide an overview of the use of GHA, a necessary first step towards a better understanding of the emerging GHA ecosystem and its implications on software development in GitHub repositories. 
More concretely, we answer the following research questions:
Empirical Software Engineering (2023) 28:52 https://doi.org/10.1007/s10664-022-10285-5 On the usage, co-usage and migration of CI/CD tools: A qualitative analysis Pooya Rostami Mazrae · Tom Mens · Mehdi Golzadeh · Alexandre Decan (Software Engineering Lab, Université de Mons, Mons, Belgium) Accepted: 28 December 2022 © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023 Abstract Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative software development process. CI/CD tools automate a wide range of activities in the development workflow such as testing, linting, updating dependencies, creating and deploying releases, and so on. Previous quantitative studies have revealed important changes in the landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many software projects migrating to other CI/CD tools. In order to understand the reasons behind these changes in CI/CD usage, this paper presents a qualitative study based on in-depth interviews with 22 experienced software practitioners reporting on their usage, co-usage and migration of 31 different CI/CD tools. Following an inductive and deductive coding process, we analysed the interviews and found a high amount of competition between CI/CD tools. We observe multiple reasons for co-using different CI/CD tools within the same project, and we identify the main reasons and detractors for migrating to different alternatives. Among all reported migrations, we observe a clear trend of migrations away from Travis and migrations towards GitHub Actions and we identify the main reasons behind them. Keywords CI/CD · Collaborative software development · Workflow automation · Qualitative analysis · Empirical software engineering Communicated by: Alexander Serebrenik. Alexandre Decan is an F.R.S.-FNRS Research Associate. Contact: Pooya Rostami Mazrae pooya.rostami.m@gmail.com, pooya.rostamimazrae@umons.ac.be; Tom Mens tom.mens@umons.ac.be; Mehdi Golzadeh golzadeh.mehdi@gmail.com; Alexandre Decan alexandre.decan@umons.ac.be
https://doi.org/10.1109/ICSME55016.2022.00029 https://doi.org/10.1109/SANER53432.2022.00084
  • 10. On the rise and fall of CI services in GitHub. Mehdi Golzadeh, Alexandre Decan, Tom Mens (Software Engineering Lab, University of Mons, Mons, Belgium). https://doi.org/10.1109/SANER53432.2022.00084
  • 11. Dataset (data-collection funnel, labels as shown on the slide): 1.6M+; scoped packages; 803K packages on GitHub; cloned 676K; excluded 11,557 forks; excluded inactive repositories; 201,403 repositories; presence of CI configuration files; 119,033 CI usages in 91,810 repositories (May 2021).
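The "presence of CI configuration files" step can be sketched as a lookup of conventional configuration paths. This is an illustration only, not the authors' extraction code: the pattern table below lists commonly used default file locations and is an assumption, as is the `detect_cis` helper.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List, Set

# Assumed mapping from CI service to its conventional configuration file(s).
# Real repositories may use other locations; this table is illustrative.
CI_CONFIG_PATTERNS: Dict[str, List[str]] = {
    "GitHub Actions": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Travis": [".travis.yml"],
    "CircleCI": [".circleci/config.yml"],
    "AppVeyor": ["appveyor.yml", ".appveyor.yml"],
    "Azure Pipelines": ["azure-pipelines.yml"],
    "GitLab CI": [".gitlab-ci.yml"],
    "Jenkins": ["Jenkinsfile"],
}


def detect_cis(paths: Iterable[str]) -> Set[str]:
    """Return the CI services whose configuration files appear in `paths`.

    `paths` are repository-relative file paths using forward slashes.
    """
    found: Set[str] = set()
    for path in paths:
        for ci, patterns in CI_CONFIG_PATTERNS.items():
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                found.add(ci)
    return found


if __name__ == "__main__":
    files = [".travis.yml", ".github/workflows/ci.yml", "README.md"]
    print(sorted(detect_cis(files)))
```

Running the same check on every commit of a repository, rather than on one snapshot, is what would turn it into a history of co-usage and migration.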
  • 12. How prevalent is CI usage in GitHub repositories? CI services are used in more than half of all considered repositories.
  • 13. Evolution of the GitHub CI/CD landscape. Since 2021, GitHub Actions has been the dominant CI/CD tool on GitHub.
  • 19. What happened to Travis? Two events are marked on the chart: Travis changed its free plan, and GHA was introduced.
  • 20. On the usage, co-usage and migration of CI/CD tools: A qualitative analysis. Pooya Rostami Mazrae, Tom Mens, Mehdi Golzadeh, Alexandre Decan (Software Engineering Lab, Université de Mons, Mons, Belgium). Empirical Software Engineering (2023) 28:52. https://doi.org/10.1007/s10664-022-10285-5
  • 21. Methodology. Interview questionnaire: around 30 questions related to CI usage, co-usage and migration. Selection of respondents: candidates selected through Twitter, LinkedIn, email and direct messages, plus colleagues' referrals (snowballing). Geographic diversity: interviews held with an online video conferencing tool. Inclusion criteria: actively contributed to, or was responsible for, a software project relying on CI; sufficient knowledge about which CI tool is used in that project and how; involved in setting up or maintaining the CI process of the project.
  • 22. Demographics of respondents • 22 respondents: 16 from 7 European countries, 4 from North America, 2 from Asia • Software development experience: average of 12 years and 4 months • Good mix of industrial and open source contributors
  • 23. CI/CD tools being used • 14 additional tools reported only once • 3 custom-built in-house CI/CD solutions
  • 29. Why is GitHub Actions so popular? • deep integration with GitHub • ease of setup and use • trendy • speed • reliability • free tier for open source projects • large marketplace of reusable Actions • support for major operating systems • company support (Microsoft) • automation beyond CI/CD
  • 30. Difficulties in CI migration • Learning curve • Fundamental differences between the source and target of the migration • Trial-and-error nature of configuring a new CI tool • Lack of familiarity with the new CI tool • Important missing features
  • 31. On the Use of GitHub Actions in Software Development Repositories. Alexandre Decan, Tom Mens, Pooya Rostami Mazrae, Mehdi Golzadeh (Software Engineering Lab, University of Mons, Mons, Belgium). https://doi.org/10.1109/ICSME55016.2022.00029
  • 32. Research Questions • What are the characteristics of repositories using workflows? • Which kinds of workflows are automated? • What are the most frequent jobs in workflows? • What are the automation practices? • Which types of Actions are reused?
  • 33. Dataset • 67,870 repositories • 4 out of 10 repositories use GitHub Actions workflows • 70,278 workflow files • 108,500 jobs • 567,352 steps
  • 34. Quantification of jobs and workflows • Workflows in repositories: single workflow (49.3%), more than one workflow (50.7%) • Jobs in workflows: single job (77.8%), more than one job (22.2%)
  • 35. Characteristics of GitHub repositories using GitHub Actions (median with workflows vs. median without workflows, and effect size interpretation) • Pull requests: 124 vs. 41 (medium) • Contributors: 20 vs. 11 (small) • Commits: 598 vs. 344 (small) • Issues: 105 vs. 59 (small). Repositories with GHA workflows tend to have more contributors, pull requests, commits, and issues.
  • 36. Most frequent event types triggering workflows (bar chart, values in %): push 63.4; PR 56.3; schedule 16.1; workflow_dispatch 15.4; release 6.2; others 8.6.
  • 37. Different ways of executing code (step type, Action target: % of steps, % of repositories) • run: 49.9% of steps, 93.5% of repositories • uses: local path: 0.8%, 2.0% • uses: Docker image: 0.1%, 1.8% • uses: same repository: 0.2%, 0.4% • uses: same owner: 0.7%, 4.3% • uses: other public repository: 48.3%, 99.3%. Reusing Actions in steps is a common practice.
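The categories in this table can be derived from the text of each step. The sketch below is one possible classification, not the authors' tooling: it assumes a step is a dictionary with either a `run` or a `uses` key, and that `uses` follows the usual forms (`./path`, `docker://image`, or `owner/repo[/path]@ref`). The `classify_step` helper and the repository name `octo/app` are hypothetical.

```python
from typing import Dict


def classify_step(step: Dict[str, str], current_repo: str) -> str:
    """Classify a workflow step by how it executes code.

    `current_repo` is the "owner/name" of the repository holding the workflow.
    """
    if "run" in step:
        return "run"
    ref = step["uses"]
    if ref.startswith("./"):
        return "local path"
    if ref.startswith("docker://"):
        return "docker image"
    # Remaining form: owner/repo[/path]@ref
    parts = ref.split("@", 1)[0].lower().split("/")
    if len(parts) < 2:
        raise ValueError(f"unrecognised action reference: {ref!r}")
    owner, repo = parts[0], parts[1]
    current_owner, current_name = current_repo.lower().split("/", 1)
    if (owner, repo) == (current_owner, current_name):
        return "same repository"
    if owner == current_owner:
        return "same owner"
    return "other public repository"


if __name__ == "__main__":
    steps = [
        {"uses": "actions/checkout@v3"},
        {"uses": "./.github/actions/build"},
        {"run": "npm test"},
    ]
    for step in steps:
        print(classify_step(step, "octo/app"))
```

Counting these labels over all steps, and over the repositories that contain them, would yield the two percentage columns of the table.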
  • 38. Which Actions are reused? Top 5 most frequent Actions (% of steps, % of repositories) • actions/checkout: 35.5%, 98% • actions/cache: 7.2%, 22% • actions/setup-node: 6.6%, 26% • actions/upload-artifact: 5.9%, 19% • actions/setup-python: 5.8%, 21%. A few Actions concentrate most of the reuse, and most of them are developed by GitHub.
  • 39. On the Outdatedness of Workflows in the GitHub Actions Ecosystem. Alexandre Decan (F.R.S.-FNRS Research Associate), Hassan Onsori Delicheh, Tom Mens. Software Engineering Lab, University of Mons, Mons, Belgium. Abstract: GitHub Actions was introduced as a way to automate CI/CD workflows in GitHub, the largest social coding platform. Thanks to its deep integration into GitHub, GitHub Actions can be used to automate a wide range of social and technical activities. Among its main features, it allows automation workflows to rely on reusable components (the so-called Actions) to enable developers to focus on the tasks that should be automated rather than on how to automate them. As any other kind of reusable software components, Actions are continuously updated, causing many automation workflows to use outdated versions of these Actions. Based on a dataset of nearly one million workflows obtained from 22K+ repositories between November 2019 and September 2022, we provide quantitative empirical evidence that reusing Actions in GitHub workflows is common practice, even if this reuse tends to concentrate on a limited number of Actions. We show that Actions are frequently updated, and we quantify to which extent automation workflows are outdated with respect to these Actions. Using two complementary metrics, technical lag and opportunity lag, we found that most of the workflows are using an outdated Action release, are lagging behind the latest available release for at least 7 months, and had the opportunity to be updated during at least 9 months. This calls for a more rigorous management of Action outdatedness in automation workflows, as well as for better policies and tooling to keep workflows up-to-date. Keywords: software ecosystem, dependency management, continuous integration, collaborative software development, workflow automation, technical lag. Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan), hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be (Tom Mens). Preprint submitted to Journal of Systems & Software, March 21, 2023.
  • 40. Outdatedness in the GitHub Actions ecosystem • Four out of five workflows and nearly two thirds of the steps are using an outdated release of an Action. • Steps using Actions provided by GitHub are responsible for most of the outdatedness. • More than one third of the other steps and nearly half of the other workflows are using an outdated release of an Action. Chart annotations: release of actions/checkout@v2; release of actions/checkout@v3; release of actions/setup-*@v2; release of actions/setup-*@v3.
  • 41. Outdatedness in the GitHub Actions ecosystem. Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action. [Figure: Action lifeline with releases v1, v2, v3 and v4 (latest); the release selected by the GitHub workflow; the technical lag; the observation date.]
  • 42. Outdatedness in the GitHub Actions ecosystem Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action. • Technical lag of outdated steps tends to increase over time. • Half of the outdated steps using other Actions are using a version that is lagging behind the latest one for at least 7.3 months. • Main cause of technical lag = Actions provided by GitHub
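A minimal sketch of how technical lag might be computed for one step follows. It rests on an assumption about the definition above: the lag is taken as the time between the release date of the selected version and the release date of the latest version available at the observation date. The release dates and the `technical_lag` helper are hypothetical, not data from the study.

```python
from datetime import date, timedelta
from typing import Dict

# Hypothetical release dates of one Action (illustrative, not real data).
releases: Dict[str, date] = {
    "v1": date(2020, 1, 10),
    "v2": date(2020, 9, 1),
    "v3": date(2021, 6, 15),
    "v4": date(2022, 3, 1),
}


def technical_lag(releases: Dict[str, date], selected: str, observed: date) -> timedelta:
    """Time between the selected release and the latest release at `observed`."""
    # Only releases that already existed at the observation date count.
    available = {v: d for v, d in releases.items() if d <= observed}
    if selected not in available:
        raise ValueError(f"{selected} was not released by {observed}")
    latest = max(available.values())
    return latest - available[selected]


if __name__ == "__main__":
    lag = technical_lag(releases, "v2", date(2022, 9, 1))
    print(f"technical lag: {lag.days} days")
```

A step on the latest available release has a lag of zero, so only outdated steps contribute to figures such as the 7.3-month median quoted above.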
  • 43. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t. [Figure: Action lifeline with releases v1, v2, v3 and v4; the release selected by the GitHub workflow; the first update opportunity; the opportunity lag; the observation time.]
  • 44. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t. • The opportunity lag of outdated steps tends to increase over time. • On average, maintainers of outdated steps have had the opportunity to update them for 9 months, but have not done so. • Main cause of opportunity lag = Actions provided by GitHub. Chart annotation: new releases for docker/*.
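Opportunity lag can be sketched in the same style. The assumption here is that the first update opportunity is the earliest release published after the selected one, and that the lag runs from that date to the observation time. The release dates and the `opportunity_lag` helper are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict

# Hypothetical release dates of one Action (illustrative, not real data).
releases: Dict[str, date] = {
    "v1": date(2020, 1, 10),
    "v2": date(2020, 9, 1),
    "v3": date(2021, 6, 15),
    "v4": date(2022, 3, 1),
}


def opportunity_lag(releases: Dict[str, date], selected: str, observed: date) -> timedelta:
    """Time since the first release newer than `selected` became available."""
    selected_date = releases[selected]
    # Releases published after the selected one and no later than `observed`.
    newer = [d for d in releases.values() if selected_date < d <= observed]
    if not newer:
        return timedelta(0)  # nothing newer existed, so no missed opportunity
    first_opportunity = min(newer)
    return observed - first_opportunity


if __name__ == "__main__":
    lag = opportunity_lag(releases, "v2", date(2022, 9, 1))
    print(f"opportunity lag: {lag.days} days")
```

Unlike technical lag, this value keeps growing with the observation time even when no further releases appear, which fits the observation that the lag of outdated steps tends to increase over time.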
  • 45. Thank you for your attention. Any questions?