Presentation by Natarajan Chidambaram during the International ICSE Workshop on Bots in Software Engineering (BotSE 2023) in Australia. Joint work with Mehdi Golzadeh, Tom Mens, Alexandre Decan of the Software Engineering Lab of the University of Mons and with Eleni Constantinou.
Recognising bot activity in collaborative software development
1. Recognizing bot activity in collaborative software development
Natarajan Chidambaram
Software Engineering Lab, University of Mons, Belgium
Supported by Service public de Wallonie – Recherche under grant n°2010235 “ARIAC BY DIGITALWALLONIA4.AI”
and Fonds de la Recherche Scientifique – FNRS under grant numbers F.4515.23, O.0157.18F-RG43 and T.0017.18
SECO-ASSIST
2. M. Golzadeh, T. Mens, A. Decan, E. Constantinou and N. Chidambaram, "Recognizing Bot Activity in Collaborative Software Development," in IEEE Software, vol. 39, no. 5, pp. 56-61, Sept.-Oct. 2022, doi: 10.1109/MS.2022.3178601.
6. Detecting bots, why?
• Recognise and accredit project contributors
• Which types of contributions to consider?
• How to identify the contributors?
• How to measure contribution effort?
• Find and hire experts
• Understand and improve the project development process
• Avoid bias in socio-technical and bot-based studies
7. Prevalence of Bots in GitHub
• 1 out of 10 top contributors is a bot
• 12 out of 21 are not marked as [bot] by GitHub
9. M. Golzadeh, A. Decan and N. Chidambaram, "On the Accuracy of Bot Detection Techniques," 2022 IEEE/ACM 4th International Workshop on Bots in Software Engineering (BotSE),
2022, pp. 1-5, doi: 10.1145/3528228.3528406.
10. Accuracy of Bot Identification
Contributor type for 540 contributors present in 27 GitHub projects
13. N. Chidambaram, A. Decan, T. Mens, A dataset of bot and human activities in GitHub, in: International Conference on Mining Software Repositories (MSR), IEEE, 2023.
14. GitHub Events API: can retrieve the latest 300 events in the last 90 days
Mapping from event types to activity types (examples):
• IssuesEvent, action "opened" → Opening issue; "closed" → Closing issue; "reopened" → Reopening issue
• IssuesEvent + IssueCommentEvent → Closing issue (when the issue is closed with a comment)
• CreateEvent → Creating repository, Creating branch or Creating tag
15.
                # contributors   # activities
Bot dataset          385            649,755
Human dataset        616            184,056
Total              1,001            833,811
• 834K activities obtained from 1M+ events
• 24 activity types
• 1K contributors
• 105 days (25 Nov 2022 - 9 Mar 2023)
{
  "date": "2022-11-26T14:13:19+00:00",
  "activity": "Commenting issue",
  "contributor": "kubevirt-bot",
  "repository": "kubevirt/kubevirt",
  "comment": {
    "length": 255,
    "GH_node": "IC_kwDOBJIk985PKH4s"
  },
  "issue": {
    "id": 8294,
    "title": "SRIOV VF interface not found in VM",
    "created_at": "2022-08-13T11:10:06+00:00",
    "status": "open",
    "closed_at": null,
    "resolved": false,
    "GH_node": "I_kwDOBJIk985Pvz5k"
  },
  "conversation": {
    "comments": 9
  }
}
JSON format
16. Usefulness of the Dataset
• Analyse most frequent activities
• Find differences in behaviour between bots and humans
• Forecast future contributor activities
• Distinguish bot and human contributors
• Develop a new bot detection technique
Hello everyone. I am Natarajan Chidambaram, a PhD student in the Software Engineering Lab at the University of Mons, Belgium. Welcome to this presentation on "Recognizing bot activity in collaborative software development".
In this presentation, I will explain the work we published in the IEEE Software special issue. The study's main message is that bot activities should be recognised in collaborative software development.
Beyond that paper, I will also present the results we obtained in our follow-up studies on this topic.
This is an example of a contributor creating a pull request, and we can see a GitHub App commenting under it. Here, the GitHub API marks this contributor as [bot].
So, this brings us to the importance of detecting bots in GitHub repositories. First, from the organisational point of view, we want to recognise and accredit project contributors; the challenges are which types of contributions to consider, how to identify the contributors, and how to measure contribution effort. We also want to find and hire experts at performing certain tasks. Second, from the researcher's point of view, we need to understand and improve the project development process, and to avoid bias in socio-technical and bot-based studies.
To analyse the prevalence of bots in GitHub, we considered 10 large open-source projects used for developing programming languages such as Java, JavaScript, Python and Rust. Each row in the figure represents a project, and we rank the contributors by the number of commits they made to the project. The blue boxes are bots identified by the GitHub API, the black boxes are bots that we detected manually and that are not identified by the GitHub API, and the other boxes are human contributors. Contributors highlighted with a black border are responsible for at least 1% of the total commits in the repository. We can draw two clear inferences. First, 10% of the top contributors are bots. Second, more than half of these bots are not marked as [bot] by GitHub. With this many bots among the top contributors to a repository, human contributors might lose motivation, since their efforts are not acknowledged; in the last row, for example, bots are the top two contributors. So, we use bot identification tools to identify these bots.
These are two bot identification techniques developed in our lab. BoDeGHa detects bots based on their issue and pull request comments within a repository, while BoDeGiC identifies bots based on their commit messages within a repository.
In further work, we evaluated the accuracy of existing bot identification techniques.
We considered the top 20 contributors, in terms of commits, in 27 popular GitHub projects. As mentioned earlier, BoDeGiC is a bot identification tool that works on commit messages; "list of bots" is a curated list of known bot accounts on GitHub; the "bot" suffix heuristic checks whether the contributor's name ends in "bot"; BoDeGHa is another bot identification tool; and "GitHub account type" is the account type reported by the GitHub API. None of the tools detects bots perfectly. The contributors on the left side of the line are bots, and those on the right are humans. The list of bots, the "bot" suffix and the GitHub account type did not classify any human as a bot, but they classified many bots as humans. Very few accounts were classified as bots by all the tools.
As none of the tools is perfect, we hypothesised that an ensemble model combining all these tools and methods would improve bot identification. We developed such a model, named EnsBod, and it performs better than each of the individual bot identification techniques. There are other bot identification tools, such as BIMAN, which considers the "bot" string at the end of the contributor's name, patterns in commit messages, and features related to the files changed in commits. Another tool, BotHunter, was not available at the time of this study; it is also an ensemble of BoDeGHa, BoDeGiC and BIMAN with some additional features.
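To give an idea of how an ensemble of bot detectors can combine the individual tools, here is a minimal majority-voting sketch. This is illustrative only: `ensemble_vote` and the per-tool labels are hypothetical names, and EnsBod's actual combination strategy is more sophisticated than plain voting.

```python
from collections import Counter

def ensemble_vote(predictions):
    """Combine per-tool predictions ('bot', 'human', or None when a tool
    has too little activity to decide) by majority vote over the tools
    that did not abstain."""
    votes = Counter(p for p in predictions.values() if p is not None)
    if not votes:
        return "unknown"  # every tool abstained
    # Break ties in favour of 'human' to keep false positives low.
    return "bot" if votes["bot"] > votes["human"] else "human"

# Hypothetical per-tool outputs for one contributor:
print(ensemble_vote({
    "BoDeGHa": "bot",
    "BoDeGiC": None,        # not enough commit messages
    "bot_suffix": "human",  # name does not end in 'bot'
    "list_of_bots": "bot",
}))  # → bot
```

Abstentions matter here: a tool that lacks data should not drag the vote toward "human", which is why `None` predictions are filtered out before counting.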
Some bot contributors could not be detected by any of the tools and methods: in the circle, we can see that these bots are either marked as humans or have too little activity to reach a decision. Even EnsBod cannot identify them. This is because the tools consider only a limited set of activities, such as commenting. These unknown contributors can be active in other activities, such as publishing a release or performing code reviews. So, by considering all the activities these contributors perform in software repositories, we might be able to detect bots more effectively.
So, we built such a dataset of contributor activities (forking a repository, creating a tag, deleting a branch, publishing a release, and so on) that can be used for further analysis.
To get the contributor activity data, we depend on GitHub events API. Through this API we can retrieve the latest 300 events that the contributor has performed in the last 90 days. So, to collect all the contributor events that can be used for the analysis, we queried the API at regular intervals. Here is an example of a GitHub event type IssuesEvent. The action value in the payload determines the activity that the contributor is performing. The action can be closed for Closing issue, opened for opening issue and reopened for reopening an issue. Although these are completely different activities, they are reported under the same Event type in GitHub Events. So, we created a list of contributors and created a dataset of all their ACTIVITIES in GtiHub. This is not a one to one mapping, we identified the activity types from a single or a combination of events. One event type, CreateEvent can lead to three different activities. Whereas on the other hand, for Closing Issue, If it is just closed, then only IssuesEvent will be triggered, but if it is closed with a comment then 2 events would be triggered, IssuesEvent and IssueCommentEvent. Also, depending on the payload the activity changes for the same combination of event types.
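The single-event part of this mapping can be sketched as a small dispatch function. The event shapes below match what the GitHub Events API returns (`type` plus a `payload` with `action` or `ref_type`), but the function name and the handling of only two event types are illustrative; the full mapping covers 24 activity types and also combines multiple events, which is not shown here.

```python
def activity_from_event(event):
    """Map one raw GitHub event (as returned by the Events API) to an
    activity type. Returns None for event types not covered by this
    sketch or for unrecognised payloads."""
    payload = event.get("payload", {})
    if event["type"] == "IssuesEvent":
        # The payload's "action" value decides which activity this is.
        return {"opened": "Opening issue",
                "closed": "Closing issue",
                "reopened": "Reopening issue"}.get(payload.get("action"))
    if event["type"] == "CreateEvent":
        # One event type can yield three different activities,
        # depending on what was created.
        return {"repository": "Creating repository",
                "branch": "Creating branch",
                "tag": "Creating tag"}.get(payload.get("ref_type"))
    return None

print(activity_from_event(
    {"type": "CreateEvent", "payload": {"ref_type": "tag"}}))
# → Creating tag
```

The combined cases (e.g. closing an issue with a comment, which fires both an IssuesEvent and an IssueCommentEvent) would require looking at a window of events rather than a single one.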
To quantify our dataset: we identified 834 thousand activities from more than 1 million events, covering 24 different activity types performed by 1,000 contributors over 105 days. Earlier I mentioned that the GitHub Events API only provides a contributor's latest 300 events from the last 90 days, but our dataset contains ALL THE ACTIVITIES performed by these contributors over 105 days, which can no longer be obtained through the API. We have two datasets: one for the activities performed by 385 bots and another for the activities performed by 616 human contributors. On the right, we can see an extract of one activity from our dataset, in JSON format. For each activity we provide the first four fields: the date of the activity, the activity type, the contributor who performed it and the repository in which it was performed. The other fields, such as comment, issue and conversation, are specific to the activity type "Commenting issue" and vary for other activity types.
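Working with the dataset is then a matter of iterating over its JSON records. A minimal sketch, assuming each record carries the four common fields shown in the extract (the abridged in-memory records and the function name are illustrative, not the dataset's exact file layout):

```python
from collections import Counter

# Two abridged records in the dataset's JSON shape (common fields only):
records = [
    {"date": "2022-11-26T14:13:19+00:00", "activity": "Commenting issue",
     "contributor": "kubevirt-bot", "repository": "kubevirt/kubevirt"},
    {"date": "2022-11-27T09:00:00+00:00", "activity": "Closing issue",
     "contributor": "kubevirt-bot", "repository": "kubevirt/kubevirt"},
]

def activities_per_type(records):
    """Tally how often each activity type occurs in a list of records."""
    return Counter(r["activity"] for r in records)

print(activities_per_type(records))
```

The same one-liner pattern extends to counting per contributor or per repository by changing the key extracted from each record.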
There are various use cases for this dataset. On the descriptive analysis side, one can analyse the most frequent activities and find behavioural differences between bots and humans based on their activities. On the machine learning side, one can forecast future contributor activities, train a model that distinguishes bot and human contributors, and develop a new bot identification technique.
In ongoing work using this dataset, we statistically identified some distinguishing features between bots and humans based on their activities: mainly the time these contributors take to switch between repositories, the dispersion of activity types across repositories, the number of activity types, and the variation in activity frequency.
These preliminary insights have been accepted at SATToSE. Next, we plan to identify more distinguishing features between bots and humans, train and validate a model that can identify bot contributors, and develop a tool for this purpose.
To summarise: first we saw the prevalence of bots in GitHub and highlighted the lack of bot identification by GitHub; then we evaluated the performance of existing bot identification techniques and found that they are imperfect because they do not consider all the activities contributors perform in software projects. So, we built an activity dataset and found some distinguishing features between bots and humans, which can be used to develop a new bot identification technique.