SlideShare una empresa de Scribd logo
1 de 7
Descargar para leer sin conexión
Mining Java Class Naming Conventions

           Simon Butler, Michel Wermelinger, Yijun Yu & Helen Sharp

                                      Centre for Research in Computing
                                            The Open University


                                          27 September 2011




           Centre for
           Research in Computing                                 m.a.wermelinger@open.ac.uk


Butler et al. (The Open University)     Mining Java Class Naming Conventions   27 September 2011   1/7
Class Identifier Names

         Despite the importance of
         class identifier names                                AbstractCollection           Set
         knowledge of their structure
         is limited
          adjective ∗ noun +
         approximation found to be                                           AbstractSet
         useful, but not universal
         What other part-of-speech
         patterns are commonly used?
         How are component words
                                                             EnumSet          HashSet         TreeSet
         repeated? How often?
         Are there project-specific
         naming conventions?


Butler et al. (The Open University)   Mining Java Class Naming Conventions          27 September 2011   2/7
Distribution of Java Classes in Inheritance Categories



                                                                     0.7
                                                                     0.6
                  Proportion of inheritance categories per project

                                                                     0.5
                                                                     0.4
                                                                     0.3
                                                                     0.2
                                                                     0.1
                                                                     0.0




                                                                           E0I0       E0I1       E0In       E1I0         E1I1   E1In



Butler et al. (The Open University)                                               Mining Java Class Naming Conventions             27 September 2011   3/7
Part-of-Speech Patterns
                Relative frequency of most common PoS patterns
                                             noun +
                                adjective +              verb +
                     noun +           +      adjective +
                                noun                     noun +
                                             noun +
             E0 I 0             0.85                 0.08                     0.01        0.01
             E0 I 1             0.73                 0.15                     0.02        0.02
             E0 I n             0.75                 0.15                     0.03        0.01
             E1 I 0             0.68                 0.12                     0.04        0.03
             E1 I 1             0.70                 0.15                     0.04        0.02
             E1 I n             0.75                 0.14                     0.04        0.02

       4 basic patterns account for 90% of class identifier names
       85% of E0 I0 class identifier names are composed of nouns
       The adjective ∗ noun + approximation includes 85% of class
       identifier names
Butler et al. (The Open University)    Mining Java Class Naming Conventions          27 September 2011   4/7
Component Word Inheritance
              Relative frequency distribution of name inheritance
                               Super Class Name                Interface Name
           Category             All    Fragment                 All Fragment        Both
           E0 I1                  -                    -       0.39          0.37        -
           E0 In                  -                    -       0.38          0.40        -
           E1 I0               0.23                 0.58          -             -        -
           E1 I1               0.14                 0.53       0.24          0.21     0.27
           E1 In               0.11                 0.50       0.15          0.25     0.18


       Fragments of super class name most commonly repeated
       Most common patterns:
               E0 I1 & E0 I1 : noun + interface name , noun + interface fragment
               E1 I0 : noun + super class fragment , noun + super class name
               E1 I1 & E1 In : noun + super class fragment ,
                interface name super class fragment , noun + super class name
Butler et al. (The Open University)   Mining Java Class Naming Conventions     27 September 2011   5/7
Case Study - Freemind

       652 class identifier names
       53 (8%) with uncommon PoS patterns
       Each class inspected with questions:
          1. Is the class identifier name a clear description of the class?
          2. Can the class identifier name be refactored to a more common PoS
             pattern?
          3. Can the class be refactored into classes that could be more
             conventionally named?
       We found:
               Class identifier names describing GUI actions initiated by the user, e.g.
               SelectAllAction ( verb determiner noun )
               Class identifier names that conform to local naming conventions
               7 class identifier names were candidates for name refactoring
               1 class was a candidate for refactoring



Butler et al. (The Open University)   Mining Java Class Naming Conventions   27 September 2011   6/7
Conclusions


       Contributions
               Identification of common PoS structures found in praxis
               Identification of common patterns of component word repetition
               Unconventional class names:
                       may conform to local naming conventions
                       may be candidates for refactoring
                       may indicate smells

       Practical Applications
               Recovery of class naming conventions
               Identification of unconventionally named classes
               Class identifier name recommendation systems




Butler et al. (The Open University)   Mining Java Class Naming Conventions   27 September 2011   7/7

Más contenido relacionado

Destacado

ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ICSM 2011
 
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsICSM 2011
 
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ICSM 2011
 
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ICSM 2011
 
Faults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsFaults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsICSM 2011
 
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...ICSM 2011
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...ICSM 2011
 
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...ICSM 2011
 
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...ICSM 2011
 
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskICSM 2011
 
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationImpact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationICSM 2011
 
Postdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindlePostdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindleICSM 2011
 
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...ICSM 2011
 
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...ICSM 2011
 
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...ICSM 2011
 
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...ICSM 2011
 
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...ICSM 2011
 
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...ICSM 2011
 
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...ICSM 2011
 
ERA - Tracking Technical Debt
ERA - Tracking Technical DebtERA - Tracking Technical Debt
ERA - Tracking Technical DebtICSM 2011
 

Destacado (20)

ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
 
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API Limitations
 
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild
 
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
 
Faults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsFaults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussions
 
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...
 
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
 
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
 
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to Task
 
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationImpact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
 
Postdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindlePostdoc Symposium - Abram Hindle
Postdoc Symposium - Abram Hindle
 
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...
 
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
 
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
 
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
 
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
 
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...
 
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
 
ERA - Tracking Technical Debt
ERA - Tracking Technical DebtERA - Tracking Technical Debt
ERA - Tracking Technical Debt
 

Último

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Último (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Natural Language Analysis - Mining Java Class Naming Conventions

  • 1. Mining Java Class Naming Conventions Simon Butler, Michel Wermelinger, Yijun Yu & Helen Sharp Centre for Research in Computing The Open University 27 September 2011 Centre for Research in Computing m.a.wermelinger@open.ac.uk Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 1/7
  • 2. Class Identifier Names Despite the importance of class identifier names AbstractCollection Set knowledge of their structure is limited adjective ∗ noun + approximation found to be AbstractSet useful, but not universal What other part-of-speech patterns are commonly used? How are component words EnumSet HashSet TreeSet repeated? How often? Are there project-specific naming conventions? Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 2/7
  • 3. Distribution of Java Classes in Inheritance Categories 0.7 0.6 Proportion of inheritance categories per project 0.5 0.4 0.3 0.2 0.1 0.0 E0I0 E0I1 E0In E1I0 E1I1 E1In Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 3/7
  • 4. Part-of-Speech Patterns Relative frequency of most common PoS patterns noun + adjective + verb + noun + + adjective + noun noun + noun + E0 I 0 0.85 0.08 0.01 0.01 E0 I 1 0.73 0.15 0.02 0.02 E0 I n 0.75 0.15 0.03 0.01 E1 I 0 0.68 0.12 0.04 0.03 E1 I 1 0.70 0.15 0.04 0.02 E1 I n 0.75 0.14 0.04 0.02 4 basic patterns account for 90% of class identifier names 85% of E0 I0 class identifier names are composed of nouns The adjective ∗ noun + approximation includes 85% of class identifier names Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 4/7
  • 5. Component Word Inheritance Relative frequency distribution of name inheritance Super Class Name Interface Name Category All Fragment All Fragment Both E0 I1 - - 0.39 0.37 - E0 In - - 0.38 0.40 - E1 I0 0.23 0.58 - - - E1 I1 0.14 0.53 0.24 0.21 0.27 E1 In 0.11 0.50 0.15 0.25 0.18 Fragments of super class name most commonly repeated Most common patterns: E0 I1 & E0 I1 : noun + interface name , noun + interface fragment E1 I0 : noun + super class fragment , noun + super class name E1 I1 & E1 In : noun + super class fragment , interface name super class fragment , noun + super class name Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 5/7
  • 6. Case Study - Freemind 652 class identifier names 53 (8%) with uncommon PoS patterns Each class inspected with questions: 1. Is the class identifier name a clear description of the class? 2. Can the class identifier name be refactored to a more common PoS pattern? 3. Can the class be refactored into classes that could be more conventionally named? We found: Class identifier names describing GUI actions initiated by the user, e.g. SelectAllAction ( verb determiner noun ) Class identifier names that conform to local naming conventions 7 class identifier names were candidates for name refactoring 1 class was a candidate for refactoring Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 6/7
  • 7. Conclusions Contributions Identification of common PoS structures found in praxis Identification of common patterns of component word repetition Unconventional class names: may conform to local naming conventions may be candidates for refactoring may indicate smells Practical Applications Recovery of class naming conventions Identification of unconventionally named classes Class identifier name recommendation systems Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 7/7