PROMISE at ICSE’07

MODELING the EFFECT of SIZE on
DEFECT PRONENESS for OPEN-SOURCE
SOFTWARE
A. Güneş Koru1, Dongsong Zhang1, and Hongfang Liu2

1Department of Information Systems
UMBC
Baltimore, MD, USA

2Georgetown Medical Center
Department of Bioinformatics, Biostatistics, and Biomathematics
Georgetown University, Washington, D.C., USA

E-mails: gkoru@umbc.edu, zhangd@umbc.edu, hl224@georgetown.edu
UMBC

• UMBC, University of Maryland, Baltimore County (http://umbc.edu/~gkoru)
• Public research university with a focus on graduate education.
• Theoretically, all campuses belong to the University of Maryland, but
  in practice they operate like different universities.
  • UMBC is located in Baltimore, in a small suburban neighborhood called
    Catonsville. UMBC is not:
     • University of Maryland, College Park
     • University of Baltimore (Business school)
     • University of Maryland Baltimore (Medical School)
• Hongfang Liu is with Georgetown University, located in Washington,
  D.C., and is interested in bioinformatics and health care.
Size--Defect Relationship
• Size is perhaps the oldest software measure. It is mostly measured in
 lines of code (sometimes function points).
• Several studies found size to be associated with defect count. Earliest:
 A linear model in [Akiyama 71].
• Many other measures (e.g. cyclomatic complexity [McCabe 76],
 software science measures [Halstead 77]) are also correlated with size.
 There is some consensus that these are also size measures [Fenton and
 Pfleeger 96].
         “Maybe size does not explain everything, but it explains a lot.”
                                                 Bojan Cukic, PROMISE 2007
• The functional form of this relationship is still not well understood.
• Commonly, practitioners assume a linear relationship [El Emam 05].
• The only general conclusion is that there is a continuously increasing
 relationship between the two [Fenton and Ohlsson 00, El Emam et al.
 01].
Size--Defect Relationship: Alternative Forms
[Figure: three sketches of defects (y-axis) versus size (x-axis), labeled (a), (b), and (c)]

         “Things are linear is open to questions”
                                 Tim Menzies, PROMISE 2007
• Implications:
   • (a) Linear: Smaller and larger modules are proportionally equally
     problematic
   • (b) Quadratic: Larger modules are proportionally more problematic
   • (c) Logarithmic: Smaller modules are proportionally more problematic
• Theoretical and Practical Importance
   • Decomposition
   • Focused quality assurance
   • Functional Enhancements
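The implications of the three forms can be checked with a small sketch that compares defects per unit of size for a small and a large module. The coefficients and the function names below are hypothetical and chosen only for illustration; the linear form is assumed to pass through the origin, which is what makes it proportional.

```python
import math

# Hypothetical functional forms; coefficients are illustrative, not fitted.
def linear(size):
    return 0.01 * size              # (a) assumed to pass through the origin

def quadratic(size):
    return 0.0001 * size ** 2       # (b)

def logarithmic(size):
    return 2.0 * math.log(1 + size) # (c)

def defects_per_unit_size(form, size):
    """Expected defects divided by size: a proxy for how 'proportionally
    problematic' a module of the given size is under the given form."""
    return form(size) / size

for name, form in [("linear", linear),
                   ("quadratic", quadratic),
                   ("logarithmic", logarithmic)]:
    small = defects_per_unit_size(form, 100)
    large = defects_per_unit_size(form, 1000)
    print(f"{name:12s} small module: {small:.4f}  large module: {large:.4f}")
```

Under these assumed forms, the ratio stays constant for (a), grows with size for (b), and shrinks with size for (c), which matches the three implications listed above.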
Why the relationship is still unclear...
• Many earlier studies did not fully explore alternative functional forms or test
  whether the deviation from linearity was significant.
    • Linear models [Akiyama 71] or correlations [Andersson and Runeson 07] were
      found sufficient.
    • A study stated that linear models could be good as first approximations and
      there was better tool support [Shen 85]
• The number of data points was very limited in the earlier studies (e.g. [Akiyama 71]).
• Deriving models analytically and then fitting data to validate those models [Lipow
  82].
• Accepting correlations as a sign of a linear relationship [Schneidewind and
  Hoffman 78]. Correlations do not imply proportionality.
• Focus shifted to defect density. Observations of an optimal module size that
  minimizes defect density. U-shaped curve (Goldilocks conjecture) [Withrow 90,
  Hatton 97, Hatton 98, etc.]. See [El Emam 02] for a detailed review.
    • This approach can mask the plain size--defect relationship and mislead us. [El
      Emam 02, Fenton and Neil 00, and Rosenberg 97]
• The relationship gets more difficult to understand from multivariate and
  sophisticated machine learning models (e.g. from Neural Networks in [Khoshgoftaar 97]).
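The point that correlations do not imply proportionality can be illustrated with a minimal sketch. The data below are synthetic and hypothetical: defect counts are generated from a logarithmic form, so defects per unit of size fall sharply as size grows, yet the Pearson correlation between size and defects is still strongly positive.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic, hypothetical data: defects follow a logarithmic form of size.
sizes = list(range(1, 101))
defects = [math.log(1 + s) for s in sizes]

r = pearson(sizes, defects)
ratios = [d / s for d, s in zip(defects, sizes)]  # defects per unit of size

print(f"Pearson r = {r:.3f}")
print(f"defects/size for the smallest module = {ratios[0]:.3f}")
print(f"defects/size for the largest module  = {ratios[-1]:.3f}")
```

A high correlation here only says that defects increase with size; it says nothing about whether they increase in proportion to size.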
Conventional Approach to
Investigate Size--Defect Relationship

• All these studies share a common characteristic:

• A software system is measured at a snapshot time, then the
  obtained measurements are associated with the future defect count
  (note that this might be pre-release or post-release). For example:
  [Koru and Tian 03], [Khoshgoftaar 96]

• Usually, measurement and analysis are performed at the module level.

• A common problem is the availability of data [Fenton and Ohlsson
  02].

• Publicly available Open Source Software (OSS) repositories: Source
  code, change data, and defect data [Koru and Tian 04].
Challenges with Using Conventional Method in
 OSS Context
• Evolutionary aspects of OSS. Continuous and concurrent functional
  enhancements, defect fixes, and all other changes (perfective, adaptive, etc.).
  Bazaar model rather than cathedral model [Raymond 99].
• OSS is usually developed by volunteers, with little planning and no requirements
  or design documents; source code is the main artifact [Mockus et al. 00, Mockus
  et al. 02].
• Quality assurance activities are not systematic in OSS (see [Zhao and Elbaum 03,
  Koru et al. 07]).
• So far, research using conventional approach focused on relatively better
  planned, analyzed, designed, and tested closed source products.
• Internal validity problems caused by the dynamic OSS context:
   • Deleted classes
   • Size changes
• There might be closed source products developed in an evolutionary manner and
  vice versa. Such comparisons are outside the scope here (see [Paulson et al.
  04]).
In this study...
 “If developers play with a file, it can change its defect proneness”
                                           Elaine Weyuker, PROMISE 2007
• To gain a better understanding of the size--defect relationship, we
  used both
  • Novel approach that adopts Cox Proportional Hazards Modeling
    with Recurrent Events (Cox Modeling) [Cox 72].
• The data comes from a large-scale long-lived OSS product Mozilla
  (http://www.mozilla.org).
• The evolutionary aspects of the Mozilla project were shown in other
  studies:
  • Gyimóthy et al. [04] found that the size of Mozilla increased
    significantly during successive releases.
  • Mockus et al. [02] found that there was no particular development
    process in Mozilla.
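For readers unfamiliar with Cox modeling, the general form of the proportional hazards model from [Cox 72] is sketched below. The choice of a single, possibly time-dependent covariate $x_i(t)$ standing for a size measure of class $i$ is an assumption made here for illustration; the slides do not state the exact model specification at this point.

```latex
% Hazard of a defect event for class i at time t (illustrative notation):
%   h_0(t) : unspecified baseline hazard shared by all classes
%   x_i(t) : assumed covariate, e.g. a size measure of class i at time t
%   \beta  : regression coefficient estimated from the data
h_i(t) = h_0(t) \, \exp\bigl(\beta \, x_i(t)\bigr)

% Ratio of hazards for two classes whose covariate differs by one unit:
\frac{h_0(t)\,\exp\bigl(\beta\,(x + 1)\bigr)}{h_0(t)\,\exp(\beta\,x)} = \exp(\beta)
```

Because the baseline hazard cancels in the ratio, the effect of the covariate can be estimated without specifying $h_0(t)$.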
In the rest of this presentation...

• Methodology
  • Demonstrating the evolutionary aspects of Mozilla
  • Cox Modeling
  • Data Collection
• Modeling and Results
• Future Work
• Conclusion
Results:
Demonstrating Evolutionary Aspects of Mozilla

[Figure, panel (a): plot of the Cumulative Number of Deleted Classes (y-axis; ticks at 800 and 1000 visible)]
• For only Mozilla 1.0 classes
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                       600




                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                  ●
                                                                                                                                                                                                                                                                ●●
                                                                                                                                                                                                                                                                ●●
                                                                                                                                                                                                                                                                ●●
                                                                                                                                                                                                                                                             ●● ●
                                                                                                                                                                                                                                                            ●●
                                                                                                                                                                                                                                                             ●
                                                                                                                                                                                                                                                     ●●
                                                                                                                                                                                                                                   ●● ● ●
                                                                                                                                                                                                                                  ●●
                                                                                                                                                                                                                                   ●
                                                                                                                                                                                                                                 ●●
                                                                                                                                                                                                                            ● ● ●●
                                                                                                                                                                                                                       ● ● ●●
                                                                                                                                                                                                                          ●
                                                                                                                                                                                                            ●● ●
                                                                                                                                                                                                             ●●
                                                                                                                                                                                                            ●●
                                                                                                                                                                                                            ●
                                                                                                                                                                                                            ●
                                                                                                                                                                                                            ●
                                                                                                                                                                                                            ●
                                                                                                                                                                                                            ●
                                                                                                                                                                                                           ●●
                                                                                                                                                                                                           ●●
                                                                                                                                                                                                  ●●
                                                                                                                                                                                                ●● ●●
                                                                                                                                                                                                ●●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                       400




                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                                ●
                                                                                                                                                                                               ●●
                                                                                                                                                                                               ●●
                                                                                                                                                                                              ●
                                                                                                                                                                                              ●
                                                                                                                                                                                              ●
                                                                                                                                                                                         ● ●●
                                                                                                                                                                                        ●●
                                                                                                                                                                                        ●●
                                                                                                                                                                                        ●●●
                                                                                                                                                                                       ●
                                                                                                                                                                                       ●
                                                                                                                                                                                       ●
                                                                                                                                                                                       ●
                                                                                                                                                                                       ●
                                                                                                                                                                                   ●
                                                                                                                                                                             ●
                                                                                                                                                                           ●●
                                                                                                                                                                  ●
                                                                                                                                                                 ●●
                                                                                                                                                                 ●●
                                                                                                                                                            ●
                                                                                                                                                            ●
                                                                                                                                                            ●
                                                                                                                                                            ●
                                                                                                                                                       ●
                                                                                                                                                  ●●
                                                                                                                                                  ●●
                                                                                                                                              ●
                                                                                                                                             ●
                                                                                                                                        ●
                                                                                                                                        ●
                                                                                                                                        ●




                                                                                                                                                                                                                                                                                                                                                                                   • (a) Cumulative
                                                                                                                                       ●
                                                                                                                                     ●●
                                                                                                                                    ●●●
                                                                                                                                  ●●
                                                                                                                                  ●●
                                                                                                                                  ●●
                                                                                                                        ●
                                                                                                                        ●
                                                                                                                       ●
                                                                                                                       ●
                                                                                                                       ●
                                                                                                                       ●
                                                                                                                       ●
                                                                                                                       ●
                                       200




                                                                                                                       ●
                                                                                                                       ●
                                                                                                                      ●●
                                                                                                                   ● ●●
                                                                                                                   ●●●
                                                                                                                 ●●●
                                                                                                                ●●
                                                                                                                ●●
                                                                                                                ●
                                                                                                                ●
                                                                                                                ●
                                                                                                                ●
                                                                                                                ●
                                                                                                                ●
                                                                                                                ●
                                                                                                            ● ●●
                                                                                                            ●●
                                                                                                             ●
                                                                                                            ●
                                                                                                            ●
                                                                                                            ●
                                                                                                            ●
                                                                                                            ●
                                                                                                           ●●
                                                                                                           ●●
                                                                                                  ● ●●
                                                                                                     ●
                                                                                                    ●
                                                                                                   ●●
                                                                                              ●
                                                                                    ●●●
                                                                                    ●●




                                                                                                                                                                                                                                                                                                                                                                                     number of
                                                                               ●●
                                                                               ●
                                                                               ●
                                                                               ●
                                                                               ●
                                                                               ●
                                                                               ●
                                                                           ●●
                                                                          ●● ●
                                                                           ● ●●
                                                                          ●
                                                                          ●
                                                                    ● ●●
                                                                    ●● ● ●
                                                                  ●●●
                                                                 ●●●
                                                              ●●
                                                              ●●
                                                              ●
                                                          ●
                                                         ●
                                                    ●● ● ●
                                                 ● ●●●
                                       0




                                                     ●




                                                                                                                                                                                                                                                                                                                                                                                     deleted classes
                                                                                           2003                                                                                  2004                                                                                 2005                                                                                 2006
                                                                                                                                                                                                       Years




                                                                                                                                                                                                          (b)

• (b) Cumulative
                                                                                                                                                                                                                                                                                                                                                  ● ●●●
                                                                                                                                                                                                                                                                                                                                               ●●●●●●●
                                                                                                                                                                                                                                                                                                                                                  ● ●●●
                                                                                                                                                                                                                                                                                                                                                  ●●●●
                                                                                                                                                                                                                                                                                                                          ●●●●●●●● ●●●●●●●●●●●●●●
                                                                                                                                                                                                                                                                                                                           ●●●●● ● ●●●●● ●●●●●●●●●
                                                                                                                                                                                                                                                                                                                           ●● ●● ● ●●●● ●●●●●●●●
                                                                                                                                                                                                                                                                                                                       ●●●●●●●●●●●●●●●●●●●●●●
                                                                                                                                                                                                                                                                                                                        ●●●●●●●●●●●●●●●●●●●●●
                                                                                                                                                                                                                                                                                                                    ●●●●●
                                                                                                                                                                                                                                                                                                                      ●●●●
                                                                                                                                                                                                                                                                                                                     ●●●●
                                       1500000




                                                                                                                                                                                                                                                                                                        ●●●●●●●●●● ●●●●
                                                                                                                                                                                                                                                                                                         ●●●●●●●●●● ●●●
Cox Modeling
(A non-parametric approach)

• The instantaneous relative risk (hazard) of defect fix, also called event,
  becomes the response variable. Note that it can recur.
• A complete size history is obtained for each class by measuring size at each
  change and corrective changes are marked.
• Time of change is also noted. At each unique time, the hazard is calculated by
  dividing the events at that time by the classes at risk at that time.
• Hazard function:
  $$\lambda_i(t) = \lambda_0(t)\, e^{\beta x_i(t)} \qquad (1)$$
  $\beta$ is the regression coefficient for $x_i(t)$, and $\lambda_0(t)$ is an unspecified non-negative
  function of time called the baseline hazard function. It is the instantaneous
  hazard of having an event without any covariate effect (i.e., when $\beta = 0$).
• Relative hazard: $e^{\beta (x_j(t) - x_k(t))}$
• Note that the relative hazard depends only on the difference in covariate
  values, not on the baseline hazard. This is called the proportional hazards
  assumption and needs to be checked.
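The hazard calculation and the relative hazard described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the rows mirror the illustrative (start, end, event) layout shown in table (B) of these slides, and the function names `hazard_by_event_time` and `relative_hazard` are hypothetical. A row is assumed to be at risk at time t when start < t <= end.

```python
import math

# Counting-process rows: (class_name, start, end, event).
# Values mirror the illustrative table (B); they are not real Mozilla data.
ROWS = [
    ("Y", 0, 50, 0), ("Y", 50, 100, 1), ("Y", 100, 200, 0),
    ("Z", 0, 200, 1), ("Z", 200, 800, 0), ("Z", 800, 1400, 1),
    ("Z", 1400, 1800, 0),
]


def hazard_by_event_time(rows):
    """At each unique event time, divide events by observations at risk."""
    event_times = sorted({end for _, _, end, event in rows if event})
    hazards = {}
    for t in event_times:
        # Assumption: an interval (start, end] is at risk at t if start < t <= end.
        at_risk = sum(1 for _, start, end, _ in rows if start < t <= end)
        events = sum(1 for _, _, end, event in rows if event and end == t)
        hazards[t] = events / at_risk
    return hazards


def relative_hazard(beta, x_j, x_k):
    """Ratio of two hazards under Eq. (1); the baseline hazard cancels."""
    return math.exp(beta * (x_j - x_k))


if __name__ == "__main__":
    print(hazard_by_event_time(ROWS))
    print(relative_hazard(0.5, 2.0, 1.0))
```

Because the baseline hazard cancels in the ratio, `relative_hazard` needs only the coefficient and the two covariate values.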
Methodology

• Relative log risk is denoted by f(size) (for median size, it is set to zero).
• Examine the functional model with cubic spline functions using four knots:
  $$f(\text{size}) = \beta_0 + \beta_1\,\text{size} + \beta_2(\text{size}-k_1)_+^3 + \beta_3(\text{size}-k_2)_+^3 + \beta_4(\text{size}-k_3)_+^3 + \beta_5(\text{size}-k_4)_+^3 \qquad (1)$$
  where
  $$(\text{size}-k_n)_+ = \begin{cases} \text{size}-k_n, & \text{if } (\text{size}-k_n) > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

• Examined the alternative model visually.
• Tested whether the deviation from linearity was statistically significant:

  $$H_0 : \beta_2 = \beta_3 = \beta_4 = 0$$
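The spline form above can be evaluated directly from its definition. The sketch below assumes the truncated-power form exactly as written on the slide (one cubic term per knot); the knot positions and coefficients are made-up values for illustration only, and `positive_part` and `spline_log_risk` are hypothetical names.

```python
import math


def positive_part(u):
    """(u)_+ from Eq. (2): u if u > 0, otherwise 0."""
    return u if u > 0 else 0.0


def spline_log_risk(size, betas, knots):
    """Evaluate f(size) = b0 + b1*size + sum_n b_{n+1} * (size - k_n)_+^3."""
    b0, b1, *spline_betas = betas
    if len(spline_betas) != len(knots):
        raise ValueError("need one spline coefficient per knot")
    total = b0 + b1 * size
    for b, k in zip(spline_betas, knots):
        total += b * positive_part(size - k) ** 3
    return total


if __name__ == "__main__":
    # Hypothetical knots and coefficients, chosen only to exercise the formula.
    knots = [100.0, 300.0, 600.0, 1000.0]
    betas = [0.0, 0.01, -1e-7, 5e-8, 2e-8, 1e-8]
    for size in (50, 200, 800):
        print(size, spline_log_risk(size, betas, knots))
```

Below the first knot every truncated term is zero, so the function reduces to the straight line b0 + b1*size; each knot then adds one cubic correction.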
Methodology - Data Layout and Collection
• We developed Perl scripts to extract source code, analyze CVS changes, and
  find whether a class is affected or not.
• (a) What the data would look like if the conventional approach were used.
• (b) Novel approach: classes added to the system after the Mozilla 1.0
  release date were measured until Feb 22, 2006.
    • Each change resulted in an observation.
    • 15,545 observations.
    • Events were identified by searching the CVS logs for the words ‘bug’,
      ‘defect’, and ‘fix’. When we sampled 100 logs randomly, we saw that this
      automated approach was correct for 98 of them.

(A)
  class name   size   defect count
      A          75        0
      B         250        2
      C         300        2
      D         600        2
      E         800        3
      F         220        0
      G         300        0
      .           .        .
      .           .        .

(B)
  class name   start    end   event   size   state
      Y            0     50     0       75     0
      Y           50    100     1      200     1
      Y          100    200     0      300     1
      Z            0    200     1      250     0
      Z          200    800     0      180     1
      Z          800   1400     1      400     1
      Z         1400   1800     0      300     1
      .            .      .     .        .
      .            .      .     .        .
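As a rough illustration of layout (B), the sketch below turns a class's change history into (start, end, event, size) rows and flags corrective changes by keyword search. It is written in Python rather than the authors' Perl, and several details are assumptions: the word-boundary regular expression, the reading that a row's event flag describes the change that closes the interval, and the names `Change`, `is_corrective`, and `to_observations`. The `state` column is left out because the slides do not define it.

```python
import re
from typing import List, NamedTuple, Tuple

# Assumption: match the keywords at the start of a word, case-insensitively,
# so "fixed" and "bugs" count but "prefix" and "debug" do not.
CORRECTIVE = re.compile(r"\b(bug|defect|fix)", re.IGNORECASE)


class Change(NamedTuple):
    time: int   # time of the change (arbitrary units)
    size: int   # class size after the change
    log: str    # CVS log message


def is_corrective(log: str) -> bool:
    """Keyword heuristic for marking a change as a defect fix."""
    return CORRECTIVE.search(log) is not None


def to_observations(name: str, changes: List[Change],
                    end_of_study: int) -> List[Tuple[str, int, int, int, int]]:
    """Build (class, start, end, event, size) rows, one per interval."""
    ordered = sorted(changes, key=lambda c: c.time)
    rows = []
    for cur, nxt in zip(ordered, ordered[1:]):
        # The size in effect during the interval is the size after `cur`;
        # the event flag says whether the closing change `nxt` was corrective.
        rows.append((name, cur.time, nxt.time, int(is_corrective(nxt.log)), cur.size))
    last = ordered[-1]
    if last.time < end_of_study:
        rows.append((name, last.time, end_of_study, 0, last.size))  # censored
    return rows


if __name__ == "__main__":
    history = [Change(0, 75, "initial import"),
               Change(50, 200, "add feature"),
               Change(100, 300, "Fix bug 1234")]
    for row in to_observations("Y", history, end_of_study=200):
        print(row)
```

With this hypothetical history, the rows reproduce the start, end, event, and size values listed for class Y in table (B).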
Results - Functional Form
[Figure: instantaneous relative risk of defect fix (y-axis, −1.0 to 2.0) versus Size (LOC) (x-axis, 0 to 12000)]

   • With cubic spline functions, the logarithmic form is also evident. The
     downward curve at the end covers less than 0.3% of the data points. We can
     use log(size) directly in the Cox model.
Results -- Modeling results
  MANUSCRIPT SUBMITTED TO TSE


                  coef   exp(coef)   se(coef)   robust se      z   p
  log(size)      0.368        1.44    0.00732       0.018   20.4   0

  Rsquare = 0.152 (max possible = 1)
  Likelihood ratio test = 2565 on 1 df, p = 0
  Wald test = 416 on 1 df, p = 0
  Score (logrank) test = 2565 on 1 df, p = 0
  Robust Score = 142, p = 0

  Fig. 5. Modeling results using logarithmic transform of size
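The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that log is the natural logarithm and that z is the coefficient divided by the robust standard error; both are assumptions about how the output was produced, and the function names are hypothetical.

```python
import math

COEF = 0.368        # coefficient of log(size) from Fig. 5
ROBUST_SE = 0.018   # robust standard error from Fig. 5


def hazard_ratio_per_unit(coef):
    """Hazard multiplier for a one-unit increase in log(size): exp(coef)."""
    return math.exp(coef)


def z_statistic(coef, se):
    """Assumed definition: coefficient divided by its (robust) standard error."""
    return coef / se


def hazard_ratio_for_doubling(coef):
    """If size doubles, natural log(size) rises by ln 2, so the ratio is 2**coef."""
    return 2.0 ** coef


if __name__ == "__main__":
    print(round(hazard_ratio_per_unit(COEF), 2))      # matches exp(coef) in Fig. 5
    print(round(z_statistic(COEF, ROBUST_SE), 1))     # matches z in Fig. 5
    print(round(hazard_ratio_for_doubling(COEF), 2))
```

Under these assumptions, a one-unit increase in log(size) raises the hazard of a defect fix by about 44%, while doubling a class's size raises it by about 29%, so the risk grows more slowly than size itself.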
Outlier Analysis - Checking for overly influential data points

[Figure: Influence (y-axis, −0.0025 to 0.0005) versus Observation id (x-axis, 0 to 15000); a few points near the bottom are labeled "outliers"]

• Plotted Martingale residuals.
• Outliers are still valid observations.
• Removing them only brings the unit effect of log size to 45% (small change).
• Decided to keep them.
Test of Proportional Hazards

[Figure: Beta(t) for log(size) (y-axis, −20 to 20) versus Time (x-axis, 0 to 2000000)]

• Commonly, interaction with time is tested.
• Example: A drug only effective in the first hour.
• Note: This test can also become significant when a wrong functional form is
  used.
• Result: p = 0.835, highly insignificant.
• A smooth plot of Schoenfeld residuals shows an almost perfectly straight line.
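For readers unfamiliar with Schoenfeld residuals, the sketch below computes them for a single covariate from counting-process rows. At each event, the residual is the covariate value of the observation that had the event minus the risk-weighted mean of the covariate over everything at risk at that time. This is only an illustration of the definition, not the authors' procedure; the row format, the at-risk rule (start < t <= end), and the name `schoenfeld_residuals` are assumptions.

```python
import math


def schoenfeld_residuals(rows, beta):
    """rows: (start, end, event, x). Returns [(event_time, residual), ...]."""
    residuals = []
    for _, end, event, x in rows:
        if not event:
            continue
        # Risk set at the event time; it always contains the event row itself.
        risk_set = [r for r in rows if r[0] < end <= r[1]]
        weights = [math.exp(beta * r[3]) for r in risk_set]
        weighted_mean = sum(w * r[3] for w, r in zip(weights, risk_set)) / sum(weights)
        residuals.append((end, x - weighted_mean))
    return sorted(residuals)


if __name__ == "__main__":
    # Tiny made-up data set: x could stand for log(size).
    rows = [(0, 10, 1, 1.0), (0, 10, 0, 3.0), (10, 20, 1, 2.0)]
    for time, residual in schoenfeld_residuals(rows, beta=0.0):
        print(time, residual)
```

If the proportional hazards assumption holds, these residuals should show no trend when plotted against time, which is what the flat smoothed line on the slide indicates.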
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software
Modeling the Effect of Size of Defect Proneness for Open-Source Software

More Related Content

Similar to Modeling the Effect of Size of Defect Proneness for Open-Source Software

United Maps - Where 2.0 2009 and WhereCamp
United Maps - Where 2.0 2009 and WhereCampUnited Maps - Where 2.0 2009 and WhereCamp
United Maps - Where 2.0 2009 and WhereCampStefan Knecht
 
United Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichUnited Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichStefan Knecht
 
United Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichUnited Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichStefan Knecht
 
Viedome Presentation Eu
Viedome Presentation EuViedome Presentation Eu
Viedome Presentation Eumwdgielen
 
Converting Existing Port Terminals —How we make it work
Converting Existing Port Terminals —How we make it workConverting Existing Port Terminals —How we make it work
Converting Existing Port Terminals —How we make it workPortek International Pte Ltd
 
2013 tech trends_poster
2013 tech trends_poster2013 tech trends_poster
2013 tech trends_posterDaniel Ross
 
Avid Shorty Ultimate Instructions
Avid Shorty Ultimate InstructionsAvid Shorty Ultimate Instructions
Avid Shorty Ultimate InstructionsGreg Keller
 
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...Mark Brörkens
 
Lizzie Raby: Critical Positions Presentation
Lizzie Raby: Critical Positions PresentationLizzie Raby: Critical Positions Presentation
Lizzie Raby: Critical Positions Presentationelizabethraby
 
Moll nimble hive_book-s
Moll nimble hive_book-sMoll nimble hive_book-s
Moll nimble hive_book-sNimble Hive
 
Primitives And Design Patterns for Top-Down SOA Implementations
Primitives And Design Patterns for Top-Down SOA ImplementationsPrimitives And Design Patterns for Top-Down SOA Implementations
Primitives And Design Patterns for Top-Down SOA ImplementationsMichael zur Muehlen
 
Get Out! Research Relief where they least expect it!
Get Out! Research Relief where they least expect it!Get Out! Research Relief where they least expect it!
Get Out! Research Relief where they least expect it!Sara Arnold-Garza
 
Demar Brochure Final Jan 2012
Demar Brochure Final Jan 2012Demar Brochure Final Jan 2012
Demar Brochure Final Jan 2012ImyMartinez
 

Similar to Modeling the Effect of Size of Defect Proneness for Open-Source Software (20)

Proposed CapX2020 routes
Proposed CapX2020 routesProposed CapX2020 routes
Proposed CapX2020 routes
 
CapX2020 map
CapX2020 mapCapX2020 map
CapX2020 map
 
United Maps - Where 2.0 2009 and WhereCamp
United Maps - Where 2.0 2009 and WhereCampUnited Maps - Where 2.0 2009 and WhereCamp
United Maps - Where 2.0 2009 and WhereCamp
 
United Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichUnited Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples Munich
 
United Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples MunichUnited Maps - Pedestrian Map, Large Scale Samples Munich
United Maps - Pedestrian Map, Large Scale Samples Munich
 
6o kefalaio
6o kefalaio6o kefalaio
6o kefalaio
 
Viedome Presentation Eu
Viedome Presentation EuViedome Presentation Eu
Viedome Presentation Eu
 
Converting Existing Port Terminals —How we make it work
Converting Existing Port Terminals —How we make it workConverting Existing Port Terminals —How we make it work
Converting Existing Port Terminals —How we make it work
 
Net Metering in Retail Choice States
Net Metering in Retail Choice StatesNet Metering in Retail Choice States
Net Metering in Retail Choice States
 
2013 tech trends_poster
2013 tech trends_poster2013 tech trends_poster
2013 tech trends_poster
 
Network design
Network designNetwork design
Network design
 
Avid Shorty Ultimate Instructions
Avid Shorty Ultimate InstructionsAvid Shorty Ultimate Instructions
Avid Shorty Ultimate Instructions
 
TV Connect London 2013 Preview Brochure
TV Connect London 2013 Preview BrochureTV Connect London 2013 Preview Brochure
TV Connect London 2013 Preview Brochure
 
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...
Eclipse idd2012 broerkens_bridgingthegapbetweentextualrequirementsandmodelbas...
 
Lizzie Raby: Critical Positions Presentation
Lizzie Raby: Critical Positions PresentationLizzie Raby: Critical Positions Presentation
Lizzie Raby: Critical Positions Presentation
 
Moll nimble hive_book-s
Moll nimble hive_book-sMoll nimble hive_book-s
Moll nimble hive_book-s
 
Primitives And Design Patterns for Top-Down SOA Implementations
Primitives And Design Patterns for Top-Down SOA ImplementationsPrimitives And Design Patterns for Top-Down SOA Implementations
Primitives And Design Patterns for Top-Down SOA Implementations
 
Agile in Practice
Agile in PracticeAgile in Practice
Agile in Practice
 
Get Out! Research Relief where they least expect it!
Get Out! Research Relief where they least expect it!Get Out! Research Relief where they least expect it!
Get Out! Research Relief where they least expect it!
 
Demar Brochure Final Jan 2012
Demar Brochure Final Jan 2012Demar Brochure Final Jan 2012
Demar Brochure Final Jan 2012
 

Modeling the Effect of Size on Defect Proneness for Open-Source Software

  • 1. PROMISE at ICSE’07: MODELING the EFFECT of SIZE on DEFECT PRONENESS for OPEN-SOURCE SOFTWARE
      A. Güneş Koru (1), Donsong Zhang (1), and Hongfang Liu (2)
      (1) Department of Information Systems, UMBC, Baltimore, MD, USA
      (2) Georgetown Medical Center, Department of Bioinformatics, Biostatistics, and Biomathematics, Georgetown University, Washington, D.C., USA
      E-mails: gkoru@umbc.edu, zhangd@umbc.edu, hl224@georgetown.edu
  • 2. UMBC
      • UMBC: University of Maryland, Baltimore County (http://umbc.edu/~gkoru)
      • Public research university with a focus on graduate education.
      • In theory, all campuses belong to the University of Maryland, but in practice they look like different universities.
      • UMBC is located in Catonsville, a small suburban neighborhood of Baltimore. UMBC is not:
          • University of Maryland, College Park
          • University of Baltimore (business school)
          • University of Maryland, Baltimore (medical school)
      • Hongfang Liu is with Georgetown University in Washington, D.C., and is interested in bioinformatics and health care.
  • 3. Size--Defect Relationship
      • Size is perhaps the oldest measure. It is mostly measured in lines of code (sometimes in function points).
      • Several studies found size to be associated with defect count. The earliest is a linear model in [Akiyama 71].
      • Many other measures (e.g. cyclomatic complexity [McCabe 76], software science measures [Halstead 77]) are also correlated with size. There is some consensus that these are also size measures [Fenton and Pfleeger 96].
      • “Maybe size does not explain everything, but it explains a lot.” Bojan Cukic, PROMISE 2007
      • The functional form of this relationship is still not well understood.
      • Practitioners commonly assume a linear relationship [El Emam 05].
      • The only general conclusion is that there is a continuously increasing relationship between the two [Fenton and Ohlsson 00, El Emam et al. 01].
  • 4. Size--Defect Relationship: Alternative Forms
      [Figure: three panels (a), (b), (c), each plotting defects against size.]
      • “Things are linear is open to questions” Tim Menzies, PROMISE 2007
      • (a) Linear: Smaller and larger modules are proportionally equally problematic
      • (b) Quadratic: Larger modules are proportionally more problematic
      • (c) Logarithmic: Smaller modules are proportionally more problematic
      • Implications: Theoretical and Practical Importance
          • Decomposition
          • Focused quality assurance
          • Functional Enhancements
  • 5. Why the relationship is still unclear...
      • Many earlier studies did not fully explore alternative functional forms or test whether the deviation from linearity was statistically significant.
          • Linear models [Akiyama 71] or correlations [Andersson and Runeson 07] were found sufficient.
          • One study stated that linear models could be good first approximations and had better tool support [Shen 85].
          • The number of data points was very limited in the earlier studies (e.g. Akiyama 71).
      • Deriving models analytically and then fitting data to validate those models [Lipow 82].
      • Accepting correlations as a sign of a linear relationship [Schneidewind and Hoffman 78]. Correlations do not imply proportionality.
      • Focus shift to defect density: observations of an optimal module size that minimizes defect density, giving a U-shaped curve (the Goldilocks conjecture) [Withrow 90, Hatton 97, Hatton 98, etc.]. See [El Emam 02] for a detailed review.
          • This approach can mask the plain size--defect relationship and mislead us [El Emam 02, Fenton and Neil 00, and Rosenberg 97].
      • The relationship gets more difficult to understand from multivariate and sophisticated machine learning models (e.g. from neural networks in [Khoshgoftaar 97]).
  • 6. Conventional Approach to Investigate Size--Defect Relationship
      • All these studies share a common characteristic:
          • A software system is measured at a snapshot in time, and the obtained measurements are then associated with the future defect count (which might be pre-release or post-release). For example: [Koru and Tian 03], [Khoshgoftaar 96].
      • Usually, measurement and analysis are performed at the module level.
      • A common problem is the availability of data [Fenton and Ohlsson 02].
      • Publicly available Open Source Software (OSS) repositories provide source code, change data, and defect data [Koru and Tian 04].
  • 7. Challenges with Using Conventional Method in OSS Context
      • Evolutionary aspects of OSS: continuous and concurrent functional enhancements, defect fixes, and all other changes (perfective, adaptive, etc.). Bazaar model rather than cathedral model [Raymond 99].
      • OSS is usually developed by volunteers, with little planning and no requirements or design documents; source code is the main artifact [Mockus et al. 00, Mockus et al. 02].
      • Quality assurance activities are not systematic in OSS (see [Zhao and Elbaum 03, Koru et al. 07]).
      • So far, research using the conventional approach has focused on relatively better planned, analyzed, designed, and tested closed-source products.
      • Internal validity problems caused by the dynamic OSS context:
          • Deleted classes
          • Size changes
      • There might be closed-source products developed in an evolutionary manner and vice versa. Such comparisons are outside the scope here (see [Paulson et al. 04]).
  • 8. In this study...
      • “If developers play with a file, it can change its defect proneness” Elaine Weyuker, PROMISE 2007
      • To gain a better understanding of the size--defect relationship, we used both:
          • A novel approach that adopts Cox Proportional Hazards Modeling with Recurrent Events (Cox Modeling) [Cox 72].
          • Data from a large-scale, long-lived OSS product, Mozilla (http://www.mozilla.org).
      • The evolutionary aspects of the Mozilla project were shown in other studies:
          • Gyimóthy et al. [04] found that the size of Mozilla increased significantly during successive releases.
          • Mockus et al. [02] found that there was no particular development process in Mozilla.
  • 9. In the rest of this presentation...
      • Methodology
          • Demonstrating the evolutionary aspects of Mozilla
          • Cox Modeling
          • Data Collection
      • Modeling and Results
      • Future Work
      • Conclusion
  • 10. Results: Demonstrating Evolutionary Aspects of Mozilla
      • For only Mozilla 1.0 classes
      • (a) Cumulative number of deleted classes
      • (b) Cumulative ...
      [Figure: panel (a) plots the cumulative number of deleted classes (0 to 1000) against years (2003 to 2006); panel (b) follows.]
  • 11. Cox Modeling (A non-parametric approach)
      • The instantaneous relative risk (hazard) of a defect fix, also called an event, becomes the response variable. Note that it can recur.
      • A complete size history is obtained for each class by measuring size at each change; corrective changes are marked.
      • The time of each change is also noted. At each unique time, the hazard is calculated by dividing the number of events at that time by the number of classes at risk at that time.
      • Hazard function: $\lambda_i(t) = \lambda_0(t)\, e^{\beta x_i(t)}$ (1)
        Here $\beta$ is the regression coefficient for $x_i(t)$, and $\lambda_0(t)$ is an unspecified non-negative function of time called the baseline hazard function. It is the instantaneous hazard of having an event without any covariate effect (i.e., when $\beta = 0$).
      • Relative hazard: $e^{\beta (x_j(t) - x_k(t))}$
      • Note that the relative hazard depends only on the difference in covariate values (its logarithm is proportional to that difference), not on the baseline hazard. This is called the proportional hazards assumption and needs to be checked.
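The relative hazard expression on this slide can be illustrated with a few lines of Python. This is only a sketch: the coefficient and covariate values are hypothetical and chosen for illustration, not taken from the study. It shows that two pairs of classes with the same covariate difference have the same relative hazard, which is what the proportional hazards assumption states.

```python
import math


def relative_hazard(beta: float, x_j: float, x_k: float) -> float:
    """Hazard of class j relative to class k: exp(beta * (x_j - x_k)).

    The baseline hazard lambda_0(t) cancels in the ratio, so it does not appear.
    """
    return math.exp(beta * (x_j - x_k))


# Hypothetical coefficient and covariate values (illustration only).
beta = 0.5
rh_pair_1 = relative_hazard(beta, x_j=2.0, x_k=1.0)
rh_pair_2 = relative_hazard(beta, x_j=7.0, x_k=6.0)

# Same covariate difference, so the relative hazards agree.
print(rh_pair_1, rh_pair_2)
```

With beta = 0 the function returns 1 for any pair, matching the "no covariate effect" case described for the baseline hazard.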
  • 12. Methodology
      • The relative log risk is denoted by $f(size)$ (for the median size, it is set to zero).
      • Examine the functional form with cubic spline functions using four knots:
        $f(size) = \beta_0 + \beta_1\, size + \beta_2 (size - k_1)_+^3 + \beta_3 (size - k_2)_+^3 + \beta_4 (size - k_3)_+^3 + \beta_5 (size - k_4)_+^3$ (1)
        where
        $(size - k_n)_+ = \begin{cases} size - k_n, & \text{if } size - k_n > 0 \\ 0, & \text{otherwise} \end{cases}$ (2)
      • Examined the alternative model visually
      • Tested whether the deviation from linearity was statistically significant: $H_0: \beta_2 = \beta_3 = \beta_4 = 0$
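The spline in equations (1) and (2) of this slide can be evaluated directly. The sketch below assumes hypothetical knot positions and coefficients (the slides do not list them) and only demonstrates the truncated-power form: below the first knot every cubic term is zero, so the function is linear there.

```python
from typing import Sequence


def truncated_cube(size: float, knot: float) -> float:
    """(size - knot)_+^3: the cubed difference when positive, otherwise 0."""
    diff = size - knot
    return diff ** 3 if diff > 0 else 0.0


def spline_log_risk(size: float, betas: Sequence[float], knots: Sequence[float]) -> float:
    """f(size) = b0 + b1*size + sum of b_(n+1) * (size - k_n)_+^3 over the knots."""
    b0, b1, *knot_betas = betas
    if len(knot_betas) != len(knots):
        raise ValueError("need exactly one coefficient per knot")
    cubic_part = sum(b * truncated_cube(size, k) for b, k in zip(knot_betas, knots))
    return b0 + b1 * size + cubic_part


# Hypothetical knots (in LOC) and coefficients, for illustration only.
knots = [100.0, 500.0, 1000.0, 5000.0]
betas = [0.0, 1e-3, -1e-9, 1e-9, -5e-10, 1e-10]

below_first_knot = spline_log_risk(50.0, betas, knots)   # purely linear part
above_first_knot = spline_log_risk(300.0, betas, knots)  # first cubic term active
print(below_first_knot, above_first_knot)
```

Setting the knot coefficients to zero reduces the function to a straight line, which is the situation the null hypothesis on the slide describes.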
  • 13. Methodology - Data Layout and Collection
      • We developed Perl scripts to extract source code, analyze CVS changes, and find whether a class is affected or not.
      • (A) What the data would look like if the conventional approach was used:

          class name   size   defect count
          A              75   0
          B             250   2
          C             300   2
          D             600   2
          E             800   3
          F             220   0
          G             300   0
          ...           ...   ...

      • (B) Novel approach: classes added to the system after the Mozilla 1.0 release date were measured until Feb 22, 2006.

          class name   start    end   event   size   state
          Y                0     50   0         75   0
          Y               50    100   1        200   1
          Y              100    200   0        300   1
          Z                0    200   1        250   0
          Z              200    800   0        180   1
          Z              800   1400   1        400   1
          Z             1400   1800   0        300   1
          ...            ...    ...   ...      ...   ...

      • Each change resulted in an observation.
      • 15,545 observations
      • Events were identified by searching the CVS logs for the words ‘bug’, ‘defect’, and ‘fix’. When we sampled 100 logs randomly, we saw that this automated approach was correct for 98 of them.
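The layout in table (B) and the keyword-based event identification can be sketched as follows. This is not the authors' Perl code. The log messages are hypothetical, the case-insensitive substring match is an assumption (the slide only names the three words), and the sketch assumes that `event` flags whether the change closing an interval is corrective and that `size` is the size in effect during the interval. The `state` column is left out because the slide does not define it.

```python
import re
from typing import NamedTuple

# Assumption: case-insensitive substring match on the three words from the slide.
KEYWORDS = re.compile(r"bug|defect|fix", re.IGNORECASE)


class Change(NamedTuple):
    time: int        # time of the change, in the same units as start/end
    size_after: int  # class size (LOC) after the change
    log: str         # CVS log message (hypothetical text below)


def is_corrective(log: str) -> bool:
    """Flag a change as a defect fix when its log mentions bug, defect, or fix."""
    return KEYWORDS.search(log) is not None


def to_intervals(initial_size: int, changes: list[Change]) -> list[dict]:
    """Turn a class's change history into (start, end, event, size) rows."""
    rows = []
    start, size = 0, initial_size
    for change in sorted(changes, key=lambda c: c.time):
        rows.append({"start": start, "end": change.time,
                     "event": int(is_corrective(change.log)), "size": size})
        start, size = change.time, change.size_after
    return rows


# Hypothetical history chosen to reproduce the start/end/event/size columns of class Y.
rows_y = to_intervals(75, [
    Change(50, 200, "add new feature"),
    Change(100, 300, "fix bug in parser"),
    Change(200, 320, "cleanup"),
])
for row in rows_y:
    print(row)
```

A substring match also flags words such as "prefix", which is one way an automated classification like this can be wrong for a small share of logs.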
  • 14. Results - Functional Form
      [Figure: instantaneous relative risk of defect fix (−1.0 to 2.0) plotted against size in LOC (0 to 12000).]
      • With cubic spline functions, the logarithmic form is also evident. The downward curve at the end covers less than 0.3% of the data points. We can use log(size) directly in the Cox model.
  • 15. Results -- Modeling results
      (From the manuscript submitted to TSE.)

                      coef    exp(coef)   se(coef)   robust se   z      p
          log(size)   0.368   1.44        0.00732    0.018       20.4   0

          Rsquare = 0.152 (max possible = 1)
          Likelihood ratio test = 2565 on 1 df, p = 0
          Wald test = 416 on 1 df, p = 0
          Score (logrank) test = 2565 on 1 df, p = 0, Robust Score = 142, p = 0

      Fig. 5. Modeling results using logarithmic transform of size
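The reported numbers can be cross-checked with a little arithmetic. The sketch below uses only the coefficient (0.368) and robust standard error (0.018) from the table. Treating log as the natural logarithm is an assumption, and the "doubling" interpretation is an illustration derived from the model form rather than a result stated in the slides.

```python
import math

coef = 0.368       # reported coefficient for log(size)
robust_se = 0.018  # reported robust standard error

# exp(coef) is the hazard ratio for a one-unit increase in log(size).
hazard_ratio = math.exp(coef)

# z statistic from the robust standard error; the Wald statistic is z squared.
# The table reports Wald = 416; recomputing from the rounded inputs gives a
# nearby value, since the displayed coefficient and standard error are rounded.
z = coef / robust_se
wald = z ** 2

# Assumption: natural log. Then hazard is proportional to size**coef,
# so doubling the size multiplies the hazard by 2**coef.
doubling_factor = 2 ** coef

print(round(hazard_ratio, 2), round(z, 1), round(wald), round(doubling_factor, 2))
```

Under that assumption, a class twice as large has a hazard that is less than twice as high, which is the sense in which smaller classes are proportionally more problematic in the logarithmic form.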
  • 16. Outlier Analysis - Checking for overly influential data points
      [Figure: influence (−0.0025 to 0.0005) plotted against observation id (0 to 15000), with the outliers marked.]
      • Plotted martingale residuals.
      • Outliers are still valid observations.
      • Removing them only brings the unit effect of log size to 45% (a small change).
      • Decided to keep them.
  • 17. Test of Proportional Hazards
      [Figure: Beta(t) for log(size) (−20 to 20) plotted against time (0 to 2000000).]
      • Commonly, interaction with time is tested.
      • Example: a drug that is only effective in the first hour.
      • Note: This test can also become significant when a wrong functional form is used.
      • Result: p = 0.835, far from significant.
      • A smooth plot of the Schoenfeld residuals shows an almost perfectly straight line.