LSE


         A Portfolio of
         Software Evolution
         Expertise
                   Stéphane Ducasse
                   stephane.ducasse@inria.fr
                   http://stephane.ducasse.free.fr/



Stéphane Ducasse                                        1
A word of introduction
            Co-author of Object-Oriented Reengineering Patterns
            Co-developer of Moose (reengineering platform)
            10 PhD Theses in reengineering
            50+ articles
            Grounded in reality
            Was maintainer of Squeak 3.9

            Worked with:
              Harman-Becker AG
              Bedag AG
              Nokia, Daimler



                                                                  LSE
S.Ducasse                              2
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose: an open platform
        •   Some visual examples
        •   Conclusion




                                             LSE
S.Ducasse                                3
Software is complex.

                 29% Succeeded

                    18% Failed



                 53% Challenged



               The Standish Group, 2004
                                          LSE
S.Ducasse                   4
How large is your project?


                1’000’000 lines of code
               * 2 seconds per line = 2’000’000 seconds
                  / 3600 ≈ 560 hours
                      / 8 ≈ 70 days
                    / 20 ≈ 3.5 months
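
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic: the function name and parameter names are illustrative, and the figures of 2 seconds per line, 8 hours per day, and 20 working days per month are the slide's own rough assumptions.

```python
def reading_effort(lines, seconds_per_line=2, hours_per_day=8, days_per_month=20):
    """Return (seconds, hours, days, months) needed to read `lines` lines once."""
    seconds = lines * seconds_per_line
    hours = seconds / 3600
    days = hours / hours_per_day
    months = days / days_per_month
    return seconds, hours, days, months


seconds, hours, days, months = reading_effort(1_000_000)
print(f"{seconds} s, {hours:.0f} h, {days:.0f} days, {months:.1f} months")
```

The slide rounds each intermediate result upward (560 hours, 70 days), which is why its figures sit slightly above the exact values.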




                                          LSE
S.Ducasse                  5
Maintenance is Continuous Development

            Between 50% and 75% of global effort is spent on “maintenance”!

            Relative Maintenance Effort
              60.3% Perfective (new functionality)
              18.2% Adaptive (new platforms or OS)
              17.4% Corrective (fixing reported errors)
               4.1% Other

            The bulk of the maintenance cost is due to new functionality:
              even with better requirements, it is hard to predict new functions


                                                                                      LSE
S.Ducasse                                       6
Lehman’s Software Evolution Laws
            Continuous Change: “A program that is used in a
            real-world environment must change, or become
            progressively less useful in that environment.”

            Software Entropy: “As a program evolves, it becomes
            more complex, and extra resources are needed to
            preserve and simplify its structure.”




                                                                  LSE
S.Ducasse                            7
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose: an open platform
        •   Some visual examples
        •   Conclusion




                                             LSE
S.Ducasse                                8
Supporting the evolution of applications
            A research goal and agenda grounded in reality

            How can we help companies maintain their large
            software systems?
            What is the X-ray for software?
              code, people, practices
            Which analyses?
            How can you monitor your system (dashboards...)?
            How to present extracted information?




S.Ducasse                               9
Covered topics
            Topics
              Metamodeling, Software metrics,
              Program understanding,
              Visualization, Evolution analysis,
              Duplicated code detection,
              Code Analysis, Refactorings,
              Tests
            [Figure: Reverse Engineering at the center of Analyses, Representation, Transformations, and Evolution]
            Contributions
              Moose: an open-source extensible reengineering
              environment (Lugano, Bern, Annecy, Antwerp, Louvain-la-Neuve,
              ULB, UTSL)
            Contacts
              Harman-Becker (3 million lines of C++), Bedag (Cobol), Nokia,
              ABB, IMEC

S.Ducasse                                          10
Reverse Engineering
     Analyses
       Software Metrics [LMO99, OOPSLA00]
       Understanding Large Systems [WCRE99, TSI00, TSE03]
       Static/Dynamic Information [ICSM99]
       Feature Analysis [JSME 06]
       Class Understanding [OOPSLA01, TSE04]
       Package Blueprints [ICSM 07]
       Distribution Maps [ICSM 06]
       Duplicated Code Identification [ICSM99, ICSM02]
       Group Identification [ASE03]
       Test Generation [CSMR 06]
       Concept Identification [WCRE 06]
     Representation
       Language Independent Meta Model (FAMIX) [UML99]
       An Extensible Reengineering Environment (Moose) [Models 06]
     Transformations
       Language Independent Refactorings [IWPSE 00]
     Evolution
       Reengineering Patterns
       Version Analyses [ICSM 05]
       HISMO metamodel [JSME 05]


                                                                                   LSE
S.Ducasse                                    11
One Example: who is responsible for what?

            (1) Extraction
            (2) Model
            (3) Analyses
            (4) Visualization

            [Figure: Distribution Map of authors on JBoss]
S.Ducasse                                  12
Moose is a reengineering tool that integrates
     multiple techniques
            Metrics (Number of classes = 382, Number of methods = 4268, …)
            Visualization
            Queries and Navigation
            Semantic Analysis (word1, word2)
            Evolution Analysis
            …


                                                                         LSE
S.Ducasse                                  13
Moose is open and open-source
            meta-described
            meta-model aware


                          Method          Class


                                        Inheritance




                                                      LSE
S.Ducasse                          14
Designed to be extensible
                            Class
                            History

            Duplication     Class
                                        Author
                           Version

             Method         Class        File


              Event       Inheritance


              Trace




                                                 LSE
S.Ducasse                       15
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose: an open platform
        •   Some visual examples
        •   Conclusion




                                              LSE
S.Ducasse                                16
Understanding large systems
            Understanding code is difficult!
            Systems are large
            Code is abstract
            Should I really convince you?

            Some existing approaches
              Metrics: the problem is that you often get meaningless results
              once they are combined
              Visualization: often beautiful but without meaning




                                                                         LSE
S.Ducasse                                 17
Polymetric views




                             W: # fields
                             H: # methods
                             C: # lines of code
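
The mapping above (width, height, and color each driven by one metric) can be sketched in Python as follows. This is a minimal illustration, not Moose's implementation: `ClassMetrics`, `Node`, `polymetric_nodes`, the `min_size` padding, and the choice of a gray scale where darker means more lines of code are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    name: str
    fields: int
    methods: int
    lines_of_code: int


@dataclass
class Node:
    name: str
    width: int   # driven by the number of fields
    height: int  # driven by the number of methods
    gray: int    # 0 (black) .. 255 (white), driven by lines of code


def polymetric_nodes(classes, min_size=4):
    """Map each class to a rectangle whose shape and shade encode three metrics."""
    max_loc = max((c.lines_of_code for c in classes), default=0) or 1
    nodes = []
    for c in classes:
        # Assumed color scale: the class with the most lines of code is black.
        gray = round(255 * (1 - c.lines_of_code / max_loc))
        nodes.append(Node(c.name, min_size + c.fields, min_size + c.methods, gray))
    return nodes


for node in polymetric_nodes([ClassMetrics("Parser", 3, 10, 200),
                              ClassMetrics("Token", 1, 2, 50)]):
    print(node)
```

Because all three metrics share one glyph, a tall, narrow, dark node (many methods, few fields, much code) stands out at a glance.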


                                                  LSE
S.Ducasse               18
Polymetric views condense information
    To get a feel of the inheritance
    semantics: adding vs. reusing




                                                Classes+Inheritance
                                                   W: # of Added Methods
                                                   H: # of Overridden Methods
                                                   C: # of Extended Methods



                                methods
                                 LOC
                                 # statements
                                 # parameters

                                                                     LSE
S.Ducasse                               19
Navigating Views...




                                LSE
S.Ducasse                  20
Understanding classes
            Understanding even a class is difficult!




                                                      LSE
S.Ducasse                                 21
Class Blueprint
       Enriched call flow annotated with
       metrics to give semantics
            Initialization   External Interface   Internal Implementation   Accessor   Attribute




                                                     Invocation Sequence




                                                                                                   LSE
S.Ducasse                                                       22
Class Blueprint




                            LSE
S.Ducasse              23
Large delegating interface




                                  LSE
S.Ducasse             24
Sharing Flows




                          LSE
S.Ducasse            25
Regular Subclasses




                          LSE
S.Ducasse            26
Patterns




                     LSE
S.Ducasse       27
How can we predict changes?
            Common wisdom stresses that what changed yesterday
            will change today, but is it true?


            In the Sahara the weather is constant,
            tomorrow: 90% chance that it is the same as today



            In Belgium, the weather is changing really fast (sea
            influence), 30% chance that it is the same as today


                                                                   LSE
S.Ducasse                            28
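With history analysis we can get the
     climate of a software system

            [Figure: past versions, present version, future versions; a "hit" is scored when the Past Late Changers overlap the Future Early Changers]

            \[
            YW_i(S, t_1, t_2) =
            \begin{cases}
              1, & TopLENOM_{1..i}(S, t_1) \cap TopEENOM_{i..n}(S, t_2) \neq \emptyset \\
              0, & TopLENOM_{1..i}(S, t_1) \cap TopEENOM_{i..n}(S, t_2) = \emptyset
            \end{cases}
            \]

            \[
            YW(S, t_1, t_2) = \frac{\sum_{i} YW_i(S, t_1, t_2)}{n - 2}
            \]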
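
The measure can be sketched in Python under some loudly stated assumptions. The slide does not define LENOM or EENOM, so the sketch assumes they are sums of absolute changes in the number of methods, weighted so that LENOM favors changes close to the present version and EENOM favors changes right after it. It also assumes the sum runs over the n - 2 interior versions (i = 2 .. n - 1), which matches the denominator. All function names are illustrative.

```python
def lenom(nom, j, k):
    """Assumed 'latest' change score over versions j..k (1-based); recent changes weigh more."""
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (i - k) for i in range(j + 1, k + 1))


def eenom(nom, j, k):
    """Assumed 'earliest' change score over versions j..k (1-based); early changes weigh more."""
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (j - i + 1) for i in range(j + 1, k + 1))


def top(scores, threshold):
    """Names of the `threshold` best-scoring classes, ignoring classes that never changed."""
    changed = [name for name, score in scores.items() if score > 0]
    changed.sort(key=lambda name: -scores[name])
    return set(changed[:threshold])


def yesterdays_weather(history, t1, t2):
    """history maps a class name to its number of methods in versions 1..n."""
    n = len(next(iter(history.values())))
    if n < 3:
        raise ValueError("at least three versions are needed")
    hits = 0
    for i in range(2, n):  # each interior version plays the role of 'present'
        past_late = top({c: lenom(nom, 1, i) for c, nom in history.items()}, t1)
        future_early = top({c: eenom(nom, i, n) for c, nom in history.items()}, t2)
        hits += 1 if past_late & future_early else 0
    return hits / (n - 2)


history = {"A": [1, 2, 3, 4, 5], "B": [5, 5, 5, 5, 5]}
print(yesterdays_weather(history, t1=1, t2=1))
```

A value close to 1 suggests a "Sahara" system, where recent changers keep changing; a value close to 0 suggests a "Belgium" system, where the past says little about the near future.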




                                                                                              LSE
S.Ducasse                    29
How do developers develop?
        •   Is it more efficient to have people work together in the
            same office?
        •   How can we optimize software development?




                                                                  LSE
S.Ducasse                             30
Who did that?




Files




               Time

                           LSE
S.Ducasse             31
Line colors show which author owned
     which files in which period

                     Green author    Green author
                     large commit    ownership


            File A


            File B


                     Blue author
                     small commit
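
One way to derive such ownership periods from a commit log is sketched below. The slides do not say how ownership is computed; the sketch assumes that the owner of a file at a given time is the author with the largest cumulative number of changed lines so far. The tuple layout and the function name are hypothetical.

```python
from collections import defaultdict


def ownership_timeline(commits):
    """commits: iterable of (timestamp, author, file, lines_changed).

    Returns, per file, the list of (timestamp, owner) points where ownership changes.
    """
    totals = defaultdict(lambda: defaultdict(int))  # file -> author -> lines
    timeline = defaultdict(list)
    for timestamp, author, path, lines in sorted(commits):
        totals[path][author] += lines
        # Assumed rule: the owner is whoever has contributed the most lines so far.
        owner = max(totals[path], key=totals[path].get)
        if not timeline[path] or timeline[path][-1][1] != owner:
            timeline[path].append((timestamp, owner))
    return dict(timeline)


commits = [
    (1, "green", "File A", 100),  # large commit: green takes ownership
    (2, "blue", "File A", 5),     # small commit: ownership is unchanged
    (3, "blue", "File B", 10),
    (4, "green", "File B", 50),
]
print(ownership_timeline(commits))
```

Under this rule a small commit by another author shows up as a mark on the line without changing its color, while a large commit starts a new colored segment.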




                                                    LSE
S.Ducasse                       32
Which author “possesses” which files?




                                        LSE
S.Ducasse              33
Alphabetical order is no order!




                                       LSE
S.Ducasse               34
Based on similar commit signature


                                               Edit       Takeover




            Monologue   Familiarization        Dialogue

                                                                     LSE
S.Ducasse                                 35
Understanding evolution of large systems
        •   How old are the hierarchies?
        •   How did the classes change?
        •   How did the inheritance change?




                                              LSE
S.Ducasse                             36
Evolution holds useful information

            A           A              A              A                  A

                BC          BC              BC            B


                              D                 D             D

                                                                      time



                     A is persistent       C was removed
                     B is stable           E is newborn
                     D inherited from C and then from A           …

                                                                             LSE
S.Ducasse                                  37
Hierarchy Evolution Complexity View
     characterizes class hierarchy histories

            [Figure: hierarchy of the class histories A, B, C, D, E]
            Class History: ENOM, ENOS, Age, Removed
            Inheritance History: Age, Removed

            A is persistent      C was removed
            B is stable          E is newborn
            D inherited from C and then from A       …


                                                                      LSE
S.Ducasse                                 38
Class hierarchies over 40 versions of
     Jun, a 3D framework with 740 classes




                                             LSE
S.Ducasse               39
Identifying Duplicated Code
            “Parsing the program suite of interest requires a parser for the
            language dialect of interest. While this is nominally an easy task, in
            practice one must acquire a tested grammar for the dialect of the
            language at hand. Often for legacy codes, the dialect is unique and the
            developing organization will need to build their own parser. Worse,
            legacy systems often have a number of languages and a parser is
            needed for each. Standard tools such as Lex and Yacc are rather a
            disappointment for this purpose, as they deal poorly with lexical
            hiccups and language ambiguities.” [Baxter 98]

            Problems
               Unknown Duplicated Code
               Scalability
               Understanding


                                                                                      LSE
S.Ducasse                                       40
Language Independent

            Language independent, textual
            [ICSM’99], M. Rieger’s PhD thesis

            Duploc handled
              Pascal, Java, Smalltalk, Python,
              Cobol, C++, PDP-11, C
            Slower than other approaches but...
            Max 45 min to adapt our approach to a new language
            Between 3% and 10% less identification than parametrized match

            [Figure: Exact Copies (a b c d e f / a b c d e f) and Copies with Variations (a b c d e f / a b x y e f)]

                                                                               LSE
S.Ducasse                                    41
A Conceptual Matrix

            [Figure: comparison matrix with the lines of File A and File B on both axes]
              Exact Copies:            a b c d e f  /  a b c d e f
              Copies with Variations:  a b c d e f  /  a b x y e f
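
The matrix idea can be sketched in a few lines of Python. This is an illustration of line-based textual comparison, not Duploc's actual code: the whitespace normalization, the function names, and the `min_length` parameter are assumptions made for the example.

```python
def normalize(line):
    """Remove all whitespace so that layout differences do not hide a copy."""
    return "".join(line.split())


def comparison_matrix(file_a, file_b):
    """matrix[i][j] is True when line i of file_a equals line j of file_b."""
    a = [normalize(line) for line in file_a]
    b = [normalize(line) for line in file_b]
    return [[bool(x) and x == y for y in b] for x in a]


def diagonal_runs(matrix, min_length=2):
    """Return (start_a, start_b, length) for each diagonal of consecutive matches."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for i in range(rows):
        for j in range(cols):
            starts_run = matrix[i][j] and (i == 0 or j == 0 or not matrix[i - 1][j - 1])
            if not starts_run:
                continue
            length = 0
            while i + length < rows and j + length < cols and matrix[i + length][j + length]:
                length += 1
            if length >= min_length:
                runs.append((i, j, length))
    return runs


original = list("abcdef")
print(diagonal_runs(comparison_matrix(original, list("abcdef"))))  # exact copy
print(diagonal_runs(comparison_matrix(original, list("abxyef"))))  # copy with variations
```

An exact copy appears as one unbroken diagonal; a copy with variations appears as a diagonal with a gap where the lines differ. Only string comparison is needed, so no parser has to be written for each language.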

                                                   LSE
S.Ducasse                                     42
Entities that change together can reveal hidden
     dependencies

            v1   v2   v3   v4   v5   v6
       A     2    3    3    3    4    6
       B     6    6    6    5    6    7
       C     3    3    5    5    8    9
       D     1    3    3    4    4    6
       E     4    5    5    6    6    6

       Concept lattice: (entities) (versions in which they all changed)
         (A,B,C,D,E)  ()
         (A,D,E)      (v2)
         (A,B,C,D)    (v6)
         (D,E)        (v2,v4)
         (A,D)        (v2,v6)
         (A,B,C)      (v5,v6)
         (D)          (v2,v4,v6)
         (A)          (v2,v5,v6)
         (C)          (v3,v5,v6)
         ()           (v1,v2,v3,v4,v5,v6)
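
The pairs in the lattice can be recomputed from the table with a short Python sketch. It assumes that an entity "changes in version vk" when its value differs from the value in the previous version, which agrees with the pairs on the slide. The brute-force enumeration of entity groups is a simplification that only suits small examples, and the function names are illustrative. The sketch may also report groups that the slide does not draw.

```python
from itertools import combinations

history = {
    "A": [2, 3, 3, 3, 4, 6],
    "B": [6, 6, 6, 5, 6, 7],
    "C": [3, 3, 5, 5, 8, 9],
    "D": [1, 3, 3, 4, 4, 6],
    "E": [4, 5, 5, 6, 6, 6],
}


def changed_versions(values):
    """Versions (named v1, v2, ...) whose value differs from the previous version."""
    return frozenset(f"v{i + 1}" for i in range(1, len(values)) if values[i] != values[i - 1])


def co_change_concepts(history):
    """Return the set of (entities, versions) pairs that are maximal on both sides."""
    changes = {entity: changed_versions(values) for entity, values in history.items()}
    n_versions = len(next(iter(history.values())))
    all_versions = frozenset(f"v{i + 1}" for i in range(n_versions))
    concepts = set()
    for size in range(len(changes) + 1):
        for group in combinations(sorted(changes), size):
            shared = all_versions
            for entity in group:
                shared &= changes[entity]
            # Close the group: every entity that changed in all shared versions.
            extent = frozenset(e for e, versions in changes.items() if shared <= versions)
            concepts.add((extent, shared))
    return concepts


for entities, versions in sorted(co_change_concepts(history), key=lambda c: -len(c[0])):
    print(sorted(entities), sorted(versions))
```

Entities that repeatedly appear in the same group, such as A, B, and C in v5 and v6, are candidates for a hidden dependency even if the code shows no explicit link between them.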




                                                                                     LSE
S.Ducasse                              43
How do properties spread in large systems?
            Properties:
              Metrics
              People
              Symbols/Concepts

            Spread = how many packages does it touch?
            Focus = do packages and properties match?

            Distribution Map:
            a generic visualization
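
Spread and focus can be made concrete with a small Python sketch. Spread follows the slide directly: it counts the packages that a property touches. The slide gives no formula for focus, so the sketch assumes one based on overlap ratios: for each package, the fraction of the package covered by the property multiplied by the fraction of the property that lies in the package. The names `touch`, `spread`, and `focus` are used here for illustration.

```python
def touch(a, b):
    """Fraction of set b that is also in set a."""
    return len(a & b) / len(b) if b else 0.0


def spread(prop, packages):
    """Number of packages with at least one class that has the property."""
    return sum(1 for classes in packages.values() if prop & classes)


def focus(prop, packages):
    """Assumed measure: near 1 when the property matches whole packages, near 0 when scattered."""
    return sum(touch(prop, classes) * touch(classes, prop) for classes in packages.values())


packages = {"ui": {"A", "B"}, "core": {"C", "D"}}
well_encapsulated = {"A", "B"}   # exactly one package
cross_cutting = {"A", "C"}       # half of each package
print(spread(well_encapsulated, packages), focus(well_encapsulated, packages))
print(spread(cross_cutting, packages), focus(cross_cutting, packages))
```

A property with low spread and high focus is well encapsulated; one with high spread and low focus cuts across the package structure.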




                                                        LSE
S.Ducasse                              44
Distribution Map




                             LSE
S.Ducasse               45
Ownership
        •   Authors in JBoss




                                    LSE
S.Ducasse                      46
Characterizing Packages
            Butterflies [Metrics05]
              Kind of Radar




                                          LSE
S.Ducasse                            47
Relative version




                             LSE
S.Ducasse               48
How to understand Packages
            Packages are key structuring elements
            But complex:
              import
              classes....


            Package Blueprints
            [ICSM 2007]




                                                    LSE
S.Ducasse                               49
Surfaces represent package communication

            [Figure: the analyzed package P1 (classes A1, B1, C1, D1, E1) references P2 (A2), P3 (A3), and P4 (A4, B4)]

            P1 blueprint: each row lists a referenced class, then the classes in P1 that reference it
              P4 surface    B4: D1, E1
                            A4: C1
              P2 surface    A2: A1
              P3 surface    A3: B1
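
Grouping references into surfaces can be sketched as follows. The example uses the classes and packages from the figure; the function name, the `(source, target)` pair format, and the `package_of` lookup are assumptions made for the illustration, not the tool's API.

```python
from collections import defaultdict


def package_surfaces(references, package_of, analyzed):
    """Group outgoing references of `analyzed` by target package, then by referenced class.

    references: iterable of (source_class, target_class) pairs.
    Returns {target_package: {referenced_class: [referencing classes]}}.
    """
    surfaces = defaultdict(lambda: defaultdict(set))
    for source, target in references:
        if package_of[source] == analyzed and package_of[target] != analyzed:
            surfaces[package_of[target]][target].add(source)
    return {
        package: {target: sorted(sources) for target, sources in rows.items()}
        for package, rows in surfaces.items()
    }


package_of = {
    "A1": "P1", "B1": "P1", "C1": "P1", "D1": "P1", "E1": "P1",
    "A2": "P2", "A3": "P3", "A4": "P4", "B4": "P4",
}
references = [("A1", "A2"), ("B1", "A3"), ("C1", "A4"), ("D1", "B4"), ("E1", "B4")]
print(package_surfaces(references, package_of, "P1"))
```

Each key of the result corresponds to one surface, and each inner entry to one row of that surface.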




                                                                                        LSE
S.Ducasse                                      50
Principle

            [Figure: package under analysis P1 (classes A1 to I1) and the packages it references: P2 (A2, B2), P3 (A3, B3), P4 (A4)]

            P1 blueprint
              Columns: internal referencing classes, ordered most to least
                D1 E1 F1 G1 C1 A1 B1 H1 I1
              Head: internal references (rows: internal referenced classes)
                A1: C1, B1
                G1: H1, I1
              Body: external references (rows: external referenced classes)
                B3: D1, E1, F1, G1
                A3: D1, E1, C1
                A2: A1
                B2: D1
                A4: E1, F1, G1
                                                                                                                                   LSE
S.Ducasse                                                        51
Example




                    LSE
S.Ducasse      52
Symbols contain domain information
        •   What are the concepts used in an application?
        •   How can we use symbolic information?




                                                            LSE
S.Ducasse                              53
Looking at the Symbols
        •   Developers use meaningful names, which capture
            the domain knowledge.




                                                             LSE
S.Ducasse                          54
A cluster is a group of documents
     which use the same terms
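
A toy version of this idea is sketched below. The slides do not name the clustering technique, so the sketch uses a deliberately simple stand-in: identifiers are split into terms, documents are compared with Jaccard similarity, and documents above a threshold are linked into the same cluster. The function names and the threshold value are assumptions.

```python
import re


def terms(source):
    """Split identifiers (camelCase or snake_case) into lowercase terms of 3+ letters."""
    words = re.findall(r"[A-Za-z][a-z]*", source)
    return {word.lower() for word in words if len(word) > 2}


def jaccard(a, b):
    """Shared terms divided by all terms used by either document."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(documents, threshold=0.3):
    """Single-link grouping: a document joins every cluster holding a similar document."""
    vocabulary = {name: terms(text) for name, text in documents.items()}
    clusters = []
    for name in documents:
        linked = [c for c in clusters
                  if any(jaccard(vocabulary[name], vocabulary[m]) >= threshold for m in c)]
        merged = {name}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters


documents = {
    "HttpServer": "openSocket readRequest sendResponse",
    "HttpClient": "openSocket sendRequest readResponse",
    "Invoice": "computeTotal addTax printInvoice",
}
print(cluster(documents))
```

The two networking classes end up together because they share their vocabulary, while the billing class forms its own cluster.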




                                         LSE
S.Ducasse               55
Moose has been validated on real-life systems
            Several large, industrial case studies (NDA)
              Harman-Becker
              Nokia
              Daimler
              Siemens
            Different implementation languages (C++, Java, Smalltalk,
            Cobol)
              We use external C++ parsers
            Different sizes
            Moose is used in several research groups




                                                                        LSE
S.Ducasse                                56
Possible New Research Directions

        •   Remodularization
            •   Clustering analysis
            •   Open and Modular modules
        •   Service Identification in Service Oriented Architecture
        •   Architecture Extraction/Validation
        •   Software Quality
        •   Cost/Bugs prediction
        •   EJB evaluation
        •   Business rules extraction
        •   Model transformation
        •   Test


                                                                     LSE
S.Ducasse                              57
Evolution/Maintenance is a challenge

            Understanding and maintaining large and complex
            applications needs better tools/analyses
              Moose is a platform for developing new analyses
              Transfer to tool vendors




                                                                LSE
S.Ducasse                                58

Más contenido relacionado

Similar a Ducasse's Maintenance Expertise

20080115 04 - La qualimétrie pour comprendre et appréhender les SI
Ducasse's Maintenance Expertise

  • 1. LSE A Portfolio of Software Evolution Expertise Stéphane Ducasse stephane.ducasse@inria.fr http://stephane.ducasse.free.fr/ Stéphane Ducasse 1
  • 2. A word of introduction: co-author of Object-Oriented Reengineering Patterns; co-developer of Moose (reengineering platform); 10 PhD theses in reengineering; 50+ articles; grounded in reality; former maintainer of Squeak 3.9. Worked with: Harman-Becker AG, Bedag AG, Nokia, Daimler.
  • 3. Roadmap: Some facts; Our approach (Supporting maintenance; Moose, an open platform); Some visual examples; Conclusion.
  • 4. Software is complex. 29% succeeded, 18% failed, 53% challenged (The Standish Group, 2004).
  • 5. How large is your project?
  • 13. How large is your project? 1’000’000 lines of code * 2 seconds per line = 2’000’000 seconds; / 3600 ≈ 560 hours; / 8 hours per day ≈ 70 days; / 20 working days per month ≈ 3 months.
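The estimate on this slide can be reproduced as a small calculation. This is only a sketch: the function name and parameter names are illustrative, and the 2 seconds per line, 8 hours per day, and 20 days per month are the figures used on the slide. The unrounded results (about 556 hours, 69 days, 3.5 months) show that the slide rounds each step.

```python
def reading_effort(lines, seconds_per_line=2, hours_per_day=8, days_per_month=20):
    """Return (seconds, hours, days, months) needed to read `lines` lines once."""
    seconds = lines * seconds_per_line
    hours = seconds / 3600          # 3600 seconds in an hour
    days = hours / hours_per_day    # working days
    months = days / days_per_month  # working months
    return seconds, hours, days, months


seconds, hours, days, months = reading_effort(1_000_000)
print(f"{seconds} s, {hours:.0f} h, {days:.0f} days, {months:.1f} months")
```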
  • 14. Maintenance is Continuous Development. Relative maintenance effort: 60.3% perfective (new functionality), 18.2% adaptive (new platforms or OS), 17.4% corrective (fixing reported errors), 4.1% other. Between 50% and 75% of global effort is spent on “maintenance”! The bulk of the maintenance cost is due to new functionality: even with better requirements, it is hard to predict new functions.
  • 15. Lehman’s Software Evolution Laws. Continuous Change: “A program that is used in a real-world environment must change, or become progressively less useful in that environment.” Software Entropy: “As a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure.”
  • 16. Roadmap: Some facts; Our approach (Supporting maintenance; Moose, an open platform); Some visual examples; Conclusion.
  • 17. Supporting the evolution of applications. A research goal and agenda grounded in reality: How can we help companies maintain their large software systems? What is the X-ray for software (code, people, practices)? Which analyses? How can you monitor your system (dashboards...)? How should extracted information be presented?
  • 18. Covered topics. Areas: Reverse Engineering, Analyses, Representation, Transformations, Evolution. Topics: Metamodeling, Software metrics, Program understanding, Visualization, Evolution analysis, Duplicated code detection, Code Analysis, Refactorings, Tests. Contributions: Moose, an open-source extensible reengineering environment (Lugano, Bern, Annecy, Antwerp, Louvain-la-Neuve, ULB, UTSL). Contacts: Harman-Becker (3 million lines of C++), Bedag (Cobol), Nokia, ABB, IMEC.
  • 19. Software Metrics [LMO99, OOPSLA00] Duplicated Code Identification Understanding Large Systems [ICSM99, ICSM02] Group Identification [WCRE99, TSI00, TSE03] Static/Dynamic Information [ASE03] Test Generation [ICSM99] Feature Analysis [CSMR 06] Concept Identification [JSME 06] Analyses [WCRE 06] Class Understanding [OOPSLA01,TSE04] Package Blueprints Reverse [ICSM 07] Engineering Distribution Maps [ICSM 06] Representation Transformations Language Independent Refactorings [IWPSE 00] Evolution Language Independent Meta Model (FAMIX) Reengineering Patterns [UML99] Version Analyses An Extensible Reengineering [ICSM 05] Environment (Moose) HISMO metamodel [Models 06] [JSME 05]
  • 20. One example: who is responsible for what? (1) Extraction, (2) Model, (3) Analyses, (4) Visualisation. Distribution Map of authors on JBoss.
  • 21. Moose is a reengineering tool which integrates multiple techniques: Metrics (number of classes = 382, number of methods = 4268, …), Visualization, Queries and Navigation, Semantic Analysis (word1, word2, …), Evolution Analysis.
  • 22. Moose is open and open-source, meta-described, and meta-model aware. (Diagram labels: Method, Class, Inheritance.)
  • 23. Designed to be extensible. (Diagram labels: Class History, Duplication, Class, Author, Version, Method, Class, File, Event, Inheritance, Trace.)
  • 24. Roadmap: Some facts; Our approach (Supporting maintenance; Moose, an open platform); Some visual examples; Conclusion.
  • 25. Understanding large systems. Understanding code is difficult! Systems are large. Code is abstract. Should I really convince you? Some existing approaches: Metrics: once combined, they often give meaningless results. Visualization: often beautiful but without meaning.
  • 26. Polymetric views. W (width): # fields; H (height): # methods; C (color): # lines of code.
  • 27. Polymetric views condense information. To get a feel of the inheritance semantics: adding vs. reusing. Classes + Inheritance view: W: # of added methods; H: # of overridden methods; C: # of extended methods. Method view: LOC, # statements, # parameters.
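The mapping from metrics to shape on slide 26 (width = number of fields, height = number of methods, color = lines of code) can be sketched as follows. This is not Moose's implementation: `ClassMetrics`, `polymetric_box`, the pixel `scale`, and the gray-level encoding (darker means more code) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    name: str
    fields: int
    methods: int
    lines_of_code: int


def polymetric_box(metrics, max_loc, scale=4):
    """Map three metrics of one class to (width, height, gray level 0..255)."""
    width = max(1, metrics.fields) * scale    # W: number of fields
    height = max(1, metrics.methods) * scale  # H: number of methods
    if max_loc <= 0:
        gray = 255                            # no code anywhere: draw white
    else:
        # C: lines of code, 255 = white (little code), 0 = black (most code)
        gray = 255 - (255 * metrics.lines_of_code) // max_loc
    return width, height, gray


box = polymetric_box(ClassMetrics("Parser", fields=2, methods=10, lines_of_code=50),
                     max_loc=100)
print(box)
```

A tall, narrow, dark box then reads as a class with many methods, few fields, and a lot of code, which is the kind of outlier the view is meant to expose.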
  • 28. Navigating Views...
  • 29. Understanding classes. Understanding even a class is difficult!
  • 30. Class Blueprint. Enriched call flow annotated with metrics to give semantics. Layers: Initialization, External Interface, Internal Implementation, Accessor, Attribute. (Axis: Invocation Sequence.)
  • 31. Class Blueprint
  • 32. Large delegating interface
  • 33. Sharing Flows
  • 34. Regular Subclasses
  • 35. Patterns
  • 36. How can we predict changes? Common wisdom holds that what changed yesterday will change today, but is it true? In the Sahara the weather is constant: there is a 90% chance that tomorrow's weather is the same as today's. In Belgium the weather changes really fast (sea influence): there is a 30% chance that it is the same as today's.
  • 37. With history analysis we can get the climate of a software system. (Timeline: past versions, late changers; present version; future versions, early changers; a non-empty overlap is a hit.) $YW_i(S, t_1, t_2) = 1$ if $TopLENOM_{1..i}(S, t_1) \cap TopEENOM_{i..n}(S, t_2) \neq \emptyset$, and $YW_i(S, t_1, t_2) = 0$ if $TopLENOM_{1..i}(S, t_1) \cap TopEENOM_{i..n}(S, t_2) = \emptyset$. $YW(S, t_1, t_2) = \frac{\sum_i YW_i(S, t_1, t_2)}{n-2}$.
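The hit-counting idea behind the formula on slide 37 can be sketched in a few lines. This is a simplification under stated assumptions, not the published definition: instead of the LENOM and EENOM measures it ranks classes by an unweighted sum of absolute changes in a metric, it reads the thresholds t1 and t2 as "how many top changers to keep", and the data layout (one dict per version, mapping class name to number of methods) is hypothetical.

```python
def total_change(history, name, start, end):
    """Sum of absolute metric changes of `name` between version indices start..end."""
    return sum(
        abs(history[k].get(name, 0) - history[k - 1].get(name, 0))
        for k in range(start + 1, end + 1)
    )


def top_changers(history, start, end, top):
    """The `top` classes that changed most between versions start..end."""
    names = {name for version in history[start:end + 1] for name in version}
    changes = {name: total_change(history, name, start, end) for name in names}
    ranked = sorted((n for n in names if changes[n] > 0),
                    key=lambda n: (-changes[n], n))
    return set(ranked[:top])


def yesterdays_weather(history, top_past, top_future):
    """Fraction of versions where a top past changer is also a top future changer."""
    n = len(history)
    if n < 3:
        raise ValueError("need at least three versions")
    hits = 0
    for i in range(1, n - 1):  # every version that has both a past and a future
        past = top_changers(history, 0, i, top_past)
        future = top_changers(history, i, n - 1, top_future)
        if past & future:      # non-empty intersection counts as a hit
            hits += 1
    return hits / (n - 2)


steady = [{"A": 1, "B": 5}, {"A": 3, "B": 5}, {"A": 6, "B": 5}, {"A": 9, "B": 5}]
shifting = [{"A": 1, "B": 1}, {"A": 2, "B": 1}, {"A": 2, "B": 2}, {"A": 2, "B": 3}]
print(yesterdays_weather(steady, 1, 1), yesterdays_weather(shifting, 1, 1))
```

In the first history the same class keeps changing (a "Sahara" system, value 1.0); in the second the changes move from A to B, so past changes predict nothing (value 0.0).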
  • 38. How do developers develop? Is it more efficient to have people work together in the same office? How can we optimize software development?
  • 39. Who did that? (Axes: Files, Time.)
  • 40. Line colors show which author owned which files in which period. (Figure labels: File A, File B; green author: large commit, ownership; blue author: small commit.)
  • 41. Which author “possesses” which files?
  • 42. Alphabetical order is no order!
  • 43. Based on similar commit signature: Edit, Takeover, Monologue, Familiarization, Dialogue.
  • 44. Understanding evolution of large systems. How old are the hierarchies? How did the classes change? How did the inheritance change?
  • 45. Evolution holds useful information. (Diagram: a class hierarchy with classes A to E drawn at successive points in time.) A is persistent; B is stable; C was removed; D inherited from C and then from A; E is newborn; …
  • 46. Hierarchy Evolution Complexity View characterizes class hierarchy histories. (Legend labels: Class History: ENOM, ENOS, Age, Removed; Inheritance History: Age, Removed.) A is persistent; B is stable; C was removed; D inherited from C and then from A; E is newborn; …
  • 47. Class hierarchies over 40 versions of Jun, a 3D framework with 740 classes.
  • 48. Identifying Duplicated Code. “Parsing the program suite of interest requires a parser for the language dialect of interest. While this is nominally an easy task, in practice one must acquire a tested grammar for the dialect of the language at hand. Often for legacy codes, the dialect is unique and the developing organization will need to build their own parser. Worse, legacy systems often have a number of languages and a parser is needed for each. Standard tools such as Lex and Yacc are rather a disappointment for this purpose, as they deal poorly with lexical hiccups and language ambiguities.” [Baxter 98] Problems: Unknown Duplicated Code, Scalability, Understanding.
  • 49. Language Independent. Language independent, textual [ICSM’99], M. Rieger’s PhD thesis. Duploc handled Pascal, Java, Smalltalk, Python, Cobol, C++, PDP-11, C. Slower than other approaches but... max 45 min to adapt our approach to a new language; between 3% and 10% less identification than parametrized match. (Figure: exact copies, a b c d e f against a b c d e f; copies with variations, a b c d e f against a b x y e f.)
  • 50. A Conceptual Matrix. (Figure: File A compared line by line against File B; exact copies, a b c d e f against a b c d e f; copies with variations, a b c d e f against a b x y e f.)
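The comparison matrix on slide 50 can be sketched as a line-by-line equality test, where copied fragments show up as diagonal runs. This is a minimal illustration, not Duploc's code: the function names are hypothetical, and normalizing a line by dropping all whitespace is an assumed, deliberately language-independent choice.

```python
def normalize(line):
    """Language-independent normalization: drop all whitespace."""
    return "".join(line.split())


def comparison_matrix(lines_a, lines_b):
    """matrix[i][j] is True when line i of A equals line j of B (ignoring blanks)."""
    a = [normalize(line) for line in lines_a]
    b = [normalize(line) for line in lines_b]
    return [[bool(x) and x == y for y in b] for x in a]


def copied_sequences(matrix, min_length=2):
    """Return (start_in_a, start_in_b, length) for each diagonal run of matches."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    found = []
    for i in range(rows):
        for j in range(cols):
            continues_run = i > 0 and j > 0 and matrix[i - 1][j - 1]
            if not matrix[i][j] or continues_run:
                continue  # only start counting at the first cell of a run
            length = 0
            while (i + length < rows and j + length < cols
                   and matrix[i + length][j + length]):
                length += 1
            if length >= min_length:
                found.append((i, j, length))
    return found


file_a = ["a", "b", "c", "d", "e", "f"]
exact = copied_sequences(comparison_matrix(file_a, ["a", "b", "c", "d", "e", "f"]))
varied = copied_sequences(comparison_matrix(file_a, ["a", "b", "x", "y", "e", "f"]))
print(exact, varied)
```

An exact copy gives one unbroken diagonal, while a copy with variations gives a diagonal with a hole where the lines differ, matching the two cases shown on the slide.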
  • 51. Entities that change together can reveal hidden dependencies. Metric values per version (v1 v2 v3 v4 v5 v6): A: 2 3 3 3 4 6; B: 6 6 6 5 6 7; C: 3 3 5 5 8 9; D: 1 3 3 4 4 6; E: 4 5 5 6 6 6. Lattice of co-changing groups: (A,B,C,D,E) (); (A,B,C,D) (v6); (A,D,E) (v2); (A,B,C) (v5,v6); (D,E) (v2,v4); (A,D) (v2,v6); (D) (v2,v4,v6); (C) (v3,v5,v6); (A) (v2,v5,v6); () (v1,v2,v3,v4,v5,v6).
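The groups on slide 51 can be recomputed from the metric values. The sketch below is an illustration under one assumption: an entity "changes" in a version when its metric grows compared with the previous version (counting decreases as well would add groups that the slide does not show). The brute-force enumeration over subsets is only meant for a handful of entities, and the function names are hypothetical.

```python
from itertools import combinations

history = {
    "A": [2, 3, 3, 3, 4, 6],
    "B": [6, 6, 6, 5, 6, 7],
    "C": [3, 3, 5, 5, 8, 9],
    "D": [1, 3, 3, 4, 4, 6],
    "E": [4, 5, 5, 6, 6, 6],
}
versions = [f"v{i}" for i in range(1, 7)]


def change_sets(history):
    """Map each entity to the versions in which its metric increased."""
    return {
        name: frozenset(f"v{i + 1}" for i in range(1, len(values))
                        if values[i] > values[i - 1])
        for name, values in history.items()
    }


def co_change_groups(changes, all_versions):
    """All (entities, shared versions) pairs that cannot be extended on either side."""
    names = sorted(changes)
    groups = set()
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            shared = frozenset(all_versions)
            for name in subset:
                shared &= changes[name]      # versions where all of them changed
            entities = frozenset(n for n in names if shared <= changes[n])
            groups.add((entities, shared))   # closed group: a lattice node
    return groups


groups = co_change_groups(change_sets(history), versions)
for entities, shared in sorted(groups, key=lambda g: (-len(g[0]), sorted(g[0]))):
    print(sorted(entities), sorted(shared))
```

For example, D and E both change in v2 and v4, which hints at a dependency between them even if the source code shows no explicit reference.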
  • 52. How do properties spread in large systems? Properties: metrics, people, symbols/concepts. Spread = how many packages does a property touch? Focus = do packages and properties match? Distribution Map: a generic visualization.
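The spread measure from slide 52 is simple enough to sketch directly: count the packages in which at least one class carries the property. The data layout (package name to a mapping from class name to property, here an author) and all names in it are made up for illustration; focus is left out because the slide does not give a formula for it.

```python
def spread(packages, prop):
    """Number of packages that contain at least one class with property `prop`."""
    return sum(1 for classes in packages.values() if prop in classes.values())


# Hypothetical system: each class is tagged with the author who owns it.
packages = {
    "core": {"Parser": "alice", "Lexer": "alice"},
    "ui": {"Window": "bob", "Button": "alice"},
    "net": {"Socket": "carol"},
}

for author in ("alice", "bob", "carol"):
    print(author, spread(packages, author))
```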
  • 53. Distribution Map
  • 54. Ownership: authors in JBoss
  • 55. Characterizing Packages. Butterflies [Metrics05]: a kind of radar.
  • 56. Relative version
  • 57. How to understand packages? Packages are key structuring elements, but complex: import classes.... Package Blueprints [ICSM 2007].
  • 58. Surfaces represent package communication. (Figure: P1 blueprint, where P1 is the analyzed package; one surface each for P2, P3 and P4; each surface shows the classes in P1 that do references and the referenced classes.)
  • 59. Principle. (Figure labels: package under analysis P1; packages P2, P3, P4; head: internal references, internal referenced classes; body: external references, internal referencing classes, external referenced classes; ordering: most to least.)
  • 60. Example
  • 61. Symbols contain domain information. What are the concepts used in an application? How can we use symbolic information?
  • 62. Looking at the symbols: developers use meaningful names, which capture the domain knowledge.
  • 63. A cluster is a group of documents which use the same terms.
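The idea of grouping documents by shared terms can be illustrated with a small similarity measure. This is a swapped-in, simplified technique (Jaccard similarity over the words found in identifiers), not the analysis used in the talk; the function names and the example identifiers are hypothetical.

```python
import re


def terms(identifiers):
    """Split camelCase or snake_case identifiers into a set of lowercase words."""
    words = set()
    for identifier in identifiers:
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier):
            words.add(part.lower())
    return words


def jaccard(terms_a, terms_b):
    """Shared terms divided by all terms: 1.0 means identical vocabularies."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


parser_doc = terms(["parseRequest", "requestHeader"])
sender_doc = terms(["sendRequest", "requestBody"])
similarity = jaccard(parser_doc, sender_doc)
print(sorted(parser_doc), sorted(sender_doc), similarity)
```

Documents whose pairwise similarity is high would end up in the same cluster; which threshold or clustering algorithm to use is a separate design choice not covered here.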
  • 64. Moose has been validated on real-life systems. Several large industrial case studies (NDA): Harman-Becker, Nokia, Daimler, Siemens. Different implementation languages (C++, Java, Smalltalk, Cobol); we use external C++ parsers. Different sizes. Moose is used in several research groups.
  • 65. Possible New Research Directions: Remodularization; Clustering analysis; Open and modular modules; Service identification in Service Oriented Architecture; Architecture extraction/validation; Software quality; Cost/bugs prediction; EJB evaluation; Business rules extraction; Model transformation; Test.
  • 66. Evolution/Maintenance is a challenge. Understanding and maintaining large and complex applications needs better tools/analyses. Moose is a platform for developing new analyses. Transfer to tool vendors.