Production

Features   Code Changes
FA         CA1
FB         CB1

• If change CA1 implements FA and change CB1 implements FB
• If a change CA2 is added to modify FA and CA2 is dependent on CB1

Feature   Code Changes   Missing Dependencies
FA        CA1            CA2 CB1
FB        CB1

[Figure labels: CA1; Integrate FA; CA1 CA2 CB1; CB1; Integrate FB]
[Figure labels: CA1 CA2 CB1]

• Define Dissimilarity Metrics
• Calibrate the Metrics on Prior Versions
• Commit Assignment Algorithm
  – Automated Grouping (during Integration)
  – Developer Guided Grouping (during Development)

Given two commits characterized by files, developers, and change requests (CRs):

Metric                                  Description
File Dependency Distance (FD)           Captures source code dependencies between files involved in two commits
File Association Distance (FA)          Captures logical dependencies between files involved in two commits
Developer Dissimilarity Distance (DD)   Captures the working relation between two developers submitting commits
CR Dependency Distance (CRD)            Captures the dissimilarity between the CRs implemented by two commits

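As a rough illustration of the inputs these metrics work on, the sketch below models a commit as a set of files, a developer, and a set of CRs. The `Commit` class, the file names, and the Jaccard-style `jaccard_distance` are all assumptions made for this example; the slides do not give the formulas for FD, FA, DD, or CRD, so this is a stand-in dissimilarity, not the authors' definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A commit characterized by files, a developer, and change requests (CRs)."""
    commit_id: str
    files: frozenset
    developer: str
    crs: frozenset


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Stand-in dissimilarity (assumption): 0.0 for identical sets, 1.0 for disjoint sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical commits; file names, developer, and CR ids are invented for illustration.
ca1 = Commit("CA1", frozenset({"ui/menu.c", "ui/menu.h"}), "alice", frozenset({"CR-1"}))
ca2 = Commit("CA2", frozenset({"ui/menu.c", "net/sync.c"}), "alice", frozenset({"CR-1", "CR-2"}))

file_distance = jaccard_distance(ca1.files, ca2.files)  # one shared file out of three
cr_distance = jaccard_distance(ca1.crs, ca2.crs)        # one shared CR out of two
```

Any function that maps two commits to a number, with smaller values meaning "more closely related", could take the place of `jaccard_distance` here.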
For each of the four metrics:
• Min_Threshold = Avg(a)
• Max_Threshold = Avg(bmin)
• Silhouette = Avg{(bmin - a) / max(bmin, a)}

[Figure labels: a, b1, b2, b3]

A higher silhouette value is better

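The calibration formulas can be sketched in a few lines of Python. This assumes the standard silhouette reading of the symbols, which the slide does not spell out: for each commit, a is its average distance to the other commits in its own group, and bmin is the smallest of its average distances to each other group (b1, b2, b3 in the figure). The function name `calibrate`, the data layout, and the sample distances are hypothetical.

```python
from statistics import mean


def calibrate(distance, groups):
    """Return (min_threshold, max_threshold, silhouette) for one metric.

    distance: dict mapping frozenset({x, y}) -> dissimilarity between commits x and y
    groups:   dict mapping group name -> list of commit ids (grouping known from a prior version)
    """
    a_values, bmin_values, scores = [], [], []
    for name, members in groups.items():
        for c in members:
            own = [distance[frozenset((c, o))] for o in members if o != c]
            others = [
                mean(distance[frozenset((c, o))] for o in other_members)
                for other_name, other_members in groups.items()
                if other_name != name and other_members
            ]
            if not own or not others:
                continue  # a or bmin is undefined for this commit
            a, bmin = mean(own), min(others)
            a_values.append(a)
            bmin_values.append(bmin)
            denom = max(bmin, a)
            scores.append((bmin - a) / denom if denom else 0.0)
    return mean(a_values), mean(bmin_values), mean(scores)


# Invented distances for four commits in two groups.
groups = {"G1": ["c1", "c2"], "G2": ["c3", "c4"]}
distance = {
    frozenset(("c1", "c2")): 0.1, frozenset(("c3", "c4")): 0.2,
    frozenset(("c1", "c3")): 0.8, frozenset(("c1", "c4")): 0.9,
    frozenset(("c2", "c3")): 0.7, frozenset(("c2", "c4")): 0.8,
}
min_threshold, max_threshold, silhouette = calibrate(distance, groups)
```

With these sample distances the groups are tight and well separated, so the silhouette lands close to 1, which is the "higher is better" case on the slide.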
• Apply the dissimilarity metrics in order of their precedence
• If no suitable group is found for a commit, assign the commit to a new group

[Figure: Color > Shape]

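The two rules above can be sketched as a small assignment routine. The slide does not say what makes a group "suitable", so this sketch assumes one plausible reading: a group is suitable under a metric when the commit's average distance to its members is at or below that metric's threshold, and the highest-precedence metric that finds a suitable group decides. `assign_commit`, the thresholds, and the toy color/shape distances (echoing the "Color > Shape" figure) are hypothetical.

```python
from statistics import mean


def assign_commit(commit, groups, metrics):
    """Assign `commit` to an existing group or open a new one.

    groups:  list of non-empty lists of commits (mutated in place)
    metrics: list of (distance_fn, threshold) pairs, highest precedence first
    Returns the index of the group the commit ended up in.
    """
    for distance_fn, threshold in metrics:
        best_index, best_distance = None, None
        for index, members in enumerate(groups):
            d = mean(distance_fn(commit, m) for m in members)
            if d <= threshold and (best_distance is None or d < best_distance):
                best_index, best_distance = index, d
        if best_index is not None:
            groups[best_index].append(commit)
            return best_index
    groups.append([commit])  # no suitable group under any metric
    return len(groups) - 1


# Toy items are (color, shape) pairs standing in for commits.
def color_distance(x, y):
    return 0.0 if x[0] == y[0] else 1.0


def shape_distance(x, y):
    return 0.0 if x[1] == y[1] else 1.0


metrics = [(color_distance, 0.5), (shape_distance, 0.5)]  # Color > Shape
groups = [[("red", "circle")], [("blue", "square")]]
first = assign_commit(("red", "square"), groups, metrics)       # color decides before shape
second = assign_commit(("green", "triangle"), groups, metrics)  # nothing suitable: new group
```

The red square joins the red group even though its shape matches the blue group, because color is consulted first.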
Developer-guided grouping groups commits incrementally and uses developers’ feedback to improve the grouping during development.

Both approaches follow the k-means clustering method, which consists in assigning each item to the cluster with the nearest mean.

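The nearest-mean step, combined with incremental placement and developer feedback, might look like the sketch below. It is illustrative only: numeric points stand in for commits, squared Euclidean distance stands in for the dissimilarity metrics, and the `feedback` callback is a hypothetical way to model a developer accepting or overriding a suggestion. None of these details come from the slides.

```python
def nearest_mean(point, means):
    """Index of the cluster mean closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda k: sum((p - m) ** 2 for p, m in zip(point, means[k])),
    )


def add_incrementally(point, clusters, feedback=None):
    """Place `point` in the nearest cluster, letting a developer override the suggestion.

    clusters: non-empty list of non-empty lists of points (mutated in place)
    feedback: optional callable (point, suggested_index) -> final index
    """
    means = [[sum(axis) / len(members) for axis in zip(*members)] for members in clusters]
    suggested = nearest_mean(point, means)
    chosen = feedback(point, suggested) if feedback else suggested
    clusters[chosen].append(point)
    return chosen


clusters = [[(0.0, 0.0), (0.2, 0.0)], [(1.0, 1.0)]]
auto = add_incrementally((0.1, 0.1), clusters)                                  # suggestion accepted
overridden = add_incrementally((0.1, 0.2), clusters, feedback=lambda p, s: 1)  # developer picks cluster 1
```

Because the means are recomputed on every call, each accepted or corrected placement shifts later suggestions, which is one simple way feedback can improve the grouping over time.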
We analyzed three major versions of a family of mobile applications

• Validate the dissimilarity metrics
  Can the proposed metrics be used to identify commit dependencies?
• Validate the grouping approaches
  How efficient are our proposed grouping approaches?
• Value for Developers
  Can the proposed approaches identify commit dependencies missed by developers?

The four dissimilarity metrics display good abilities in grouping commits (i.e., high silhouette values)

[Figure: bar chart of Silhouette Value per version; legend: CRD, FA, DD, FD]

             Bar values, left to right
Version 1    0.94   0.76   0.63   0.46
Version 2    0.96   0.79   0.67   0.47
Version 3    0.96   0.67   0.57   0.49

CRD > FA > DD > FD

• Efficiency of the Grouping Approaches
  – 82% of commit dependencies were recovered by the automated grouping with a precision of 95%
  – The accuracy of the developer-guided grouping approach is 98%
  – We observed that precision/recall improves with longer history data
• Value for Developers
  – The automated grouping and developer-guided grouping approaches were able to reduce integration failures by 76% and 94%, respectively
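For readers unfamiliar with the two measures quoted above, the share of dependencies recovered is recall and the share of reported dependencies that are correct is precision. The sketch below computes both over sets of dependency pairs. The function name and the sample pairs are invented for illustration and are not data from the study.

```python
def precision_recall(recovered, actual):
    """Precision and recall of recovered dependency pairs against the true ones."""
    recovered, actual = set(recovered), set(actual)
    true_positives = len(recovered & actual)
    precision = true_positives / len(recovered) if recovered else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical (dependent, dependency) pairs; tuples are order-sensitive.
actual = {("CA2", "CB1"), ("CA1", "CA2"), ("CB1", "CB2"), ("CA3", "CB2")}
recovered = {("CA2", "CB1"), ("CA1", "CA2"), ("CB1", "CB2"), ("CA1", "CB2")}
precision, recall = precision_recall(recovered, actual)
```

In this toy case three of the four reported pairs are real and three of the four real pairs are found, so both values come out at 0.75.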


Recovering Commit Dependencies for Selective Code Integration in Software Product Lines


Editor's Notes

  1. Software product lines allow the development of similar products using common software components.
  2. Whenever developers modify the main branch, integrators selectively propagate the modifications to the respective products by picking the changes relevant to each product.
  3. To ensure the success of this selective integration, development teams attempt to maintain clear mappings between the code changes performed by developers and the features of the products. However, this mapping is not always maintained carefully, making the integration process very brittle.