SlideShare una empresa de Scribd logo
1 de 76
“Next Generation” Version Control Systems
       ...or why bzr, hg, and git rock so hard


                       S. Vilain

                 Catalyst IT (NZ) Limited


          Open Source Conference, 2007
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
RNA - Information and Function
c. 2-3bn. BCE




         Basic Copying of
     ◮
         Information
         No protection or
     ◮
         validation
Encapsulation of Information
c. 1-2bn. BCE




         Protected Copying of
     ◮
         Information
         Now fully distributed
     ◮
Neural Networks
c. 1bn. BCE-today




         Transmission of
     ◮
         references to
         information
         Distributed in clusters
     ◮
         (organisms)
         Generally support
     ◮
         forking well
         Facilitated
     ◮
         development of
         Mathematics and
         Computer Science
Synchronous Editing
SCCS (c.1970), RCS (c.1980)




         Source and
     ◮
         repository kept
         together
         Free (GNU) version –
     ◮
         RCS
         Practice - locking for
     ◮
         edit, “lock stealing”
         RCS still best
     ◮
         practice for SysAdmin
         use
Detached Repository - Concurrent Versions System
1991-c.2001




         Shell scripts to
     ◮
         separate source from
         checkout
         Network separation
     ◮
         via rsh
         Used branching
     ◮
         support in RCS for
         concurrent
         development
Sub- Version System
2001-




            C programs to
        ◮
            separate source from
            checkout
            Network separation
        ◮
            via binary protocol,
            WebDAV
            Flattened branching,
        ◮
            some copy efficiency,
            new “dimension” of
            properties
Distributed Development - Patch-based systems
from 1985




         patch: automatic
     ◮
         application of context
         diffs
         Unified diffs - allow
     ◮
         changes to be
         reviewed
         “tags” simply
     ◮
         snapshots of source
         many tools based on
     ◮
         patches - arch,
         bazaar, darcs
“Next Generation” tools
c.2002-




          Fully distributed
     ◮

          Every revision
     ◮
          trace-able
          Efficient
     ◮
          packs/bundles (sets
          of revisions)
          Complete, uniform
     ◮
          distribution of history
          Many peripheral
     ◮
          benefits
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Distribution Benefits 1 of many
Central point of failure



     Central point of failure:
            centralisation requires a “master” to, at the very least,
        ◮
            assign commit IDs
            decentralisation assigns commit IDs in unique ways
        ◮
            (content hashing, UUIDs)
     So:
            Central servers become points of failure (for the services
        ◮
            they provide) and contention. ie, your server goes down,
            people are interrupted
            collaboration between disconnected people impeded
        ◮
Distribution Benefits 1 of many
Central point of failure



     Central point of failure:
            centralisation requires a “master” to, at the very least,
        ◮
            assign commit IDs
            decentralisation assigns commit IDs in unique ways
        ◮
            (content hashing, UUIDs)
     So:
            Central servers become points of failure (for the services
        ◮
            they provide) and contention. ie, your server goes down,
            people are interrupted
            collaboration between disconnected people impeded
        ◮
Distribution Benefits 2 of many
transactions or atomic commits?



    So, your Centralised VCS gives you “Atomic Updates”
    Unix guarantees write ordering on filehandles, but that does not
    make it a database.
    “A” is only one letter out of “ACID”.
    So,
           centralisation is inherantly non-transactional - “dirty read” -
       ◮
           changes all in the same place
           decentralisation is inherantly transactional - “consistent
       ◮
           read” - your changes don’t affect others
Distribution Benefits 2 of many
transactions or atomic commits?



    So, your Centralised VCS gives you “Atomic Updates”
    Unix guarantees write ordering on filehandles, but that does not
    make it a database.
    “A” is only one letter out of “ACID”.
    So,
           centralisation is inherantly non-transactional - “dirty read” -
       ◮
           changes all in the same place
           decentralisation is inherantly transactional - “consistent
       ◮
           read” - your changes don’t affect others
Distribution Benefits 3 of many
any to any merge pattern




         centralisation requires
     ◮
         the “Star” pattern
               one big cluster of
           ◮

               develpment
         decentralisation makes
     ◮
         such constructs optional
         or notional
               self-forming clusters
           ◮

               of develpment
Distribution Benefits 3 of many
any to any merge pattern




         centralisation requires
     ◮
         the “Star” pattern
               one big cluster of
           ◮

               develpment
         decentralisation makes
     ◮
         such constructs optional
         or notional
               self-forming clusters
           ◮

               of develpment
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Revision Model requirements for Distribution
The Revision DAG




         must represent
     ◮
         merges to work
         versions must not
     ◮
         change by location
         branching is in the
     ◮
         direction of changes
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #1
merge tracking can refer to other repositories




         initial state
     ◮

         sync from mirror
     ◮

         perform Merge
     ◮

         another change
     ◮

         push - whoops
     ◮

         push
     ◮
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
When Revision Models go Wrong #2
merge tracking within a repository

   (as in SVN 1.5+)
         initial state
     ◮

         make a change and new
     ◮
         branch
         make a change to trunk
     ◮

         merge trunk to branch
     ◮

         merge branch to trunk. Note
     ◮
         r4, r5 have (or should have)
         identical content and set of
         changes.
         what will merge from trunk to
     ◮
         branch do?
         distributed model resists the
     ◮
         problem
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Revision Data Warehouse
On-Line Analytical Processing of your source




           Goal: find more information on the reasons for changes
       ◮

           Observation: ‘log’ is only one approach of many
       ◮

           git-log -S’string’ : find changes that introduced
       ◮
           “string” (also hg grep, hgrep plug-in for bzr)
           git-annotate -C : follows lines moving between source
       ◮
           files
           visualization tools: gitk, hgk, olive, that allow advanced
       ◮
           inspection of history
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Stable Development Model
It’s not just a slogan




            Perhaps the major benefit of DSCM
        ◮

            Using DSCM doesn’t give you a stable development model
        ◮
            for free
            This development practice is possible with SVK or even
        ◮
            Subversion, in princple
            ...but with DSCM it’s actually achievable and even common
        ◮
            practice, because bad changes are more easily dropped
Stable Development Model
It’s not just a slogan




            Perhaps the major benefit of DSCM
        ◮

            Using DSCM doesn’t give you a stable development model
        ◮
            for free
            This development practice is possible with SVK or even
        ◮
            Subversion, in princple
            ...but with DSCM it’s actually achievable and even common
        ◮
            practice, because bad changes are more easily dropped
Stable Development Model
It’s not just a slogan




            Perhaps the major benefit of DSCM
        ◮

            Using DSCM doesn’t give you a stable development model
        ◮
            for free
            This development practice is possible with SVK or even
        ◮
            Subversion, in princple
            ...but with DSCM it’s actually achievable and even common
        ◮
            practice, because bad changes are more easily dropped
Stable Development Model
What you pay for what pay-off



    Discipline:
           no single commit can break the (build/test suite/etc)
       ◮

           every single commit is well described
       ◮

    Benefits:
           bisect: finding exactly which commit ruined your day, as
       ◮
           every revision should build and work
           review: much easier for third parties to comment
       ◮

           stability: if done right, even people tracking bleeding edge
       ◮
           don’t get put off working on the project by romping
           instability
Stable Development Model
What you pay for what pay-off



    Discipline:
           no single commit can break the (build/test suite/etc)
       ◮

           every single commit is well described
       ◮

    Benefits:
           bisect: finding exactly which commit ruined your day, as
       ◮
           every revision should build and work
           review: much easier for third parties to comment
       ◮

           stability: if done right, even people tracking bleeding edge
       ◮
           don’t get put off working on the project by romping
           instability
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
Version Control: The Next Generation
Enterprise-ready




           Bazaar-NG (bzr) - python-based. Longest runner, not as
       ◮
           fast as the others but still keeping pace with features.
           git - Unix-style CL-API to internals. Blinding fast at almost
       ◮
           everything. Extremely active. Sports the “Content Hashed
           Filesystem” idea stolen from Monotone.
           Mercurial (hg) - python and C. Also extremely fast, with
       ◮
           progress and features defying their relatively small
           community.
Version Control: The Next Generation
Enterprise-ready




           Bazaar-NG (bzr) - python-based. Longest runner, not as
       ◮
           fast as the others but still keeping pace with features.
           git - Unix-style CL-API to internals. Blinding fast at almost
       ◮
           everything. Extremely active. Sports the “Content Hashed
           Filesystem” idea stolen from Monotone.
           Mercurial (hg) - python and C. Also extremely fast, with
       ◮
           progress and features defying their relatively small
           community.
Version Control: The Next Generation
Enterprise-ready




           Bazaar-NG (bzr) - python-based. Longest runner, not as
       ◮
           fast as the others but still keeping pace with features.
           git - Unix-style CL-API to internals. Blinding fast at almost
       ◮
           everything. Extremely active. Sports the “Content Hashed
           Filesystem” idea stolen from Monotone.
           Mercurial (hg) - python and C. Also extremely fast, with
       ◮
           progress and features defying their relatively small
           community.
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
NG Tools: Inodes vs Content
who do you trust?



           git - considers inode history uninteresting, derivable from
       ◮
           the content. No conventions for explicitly recording inode
           movement history (renames etc)
                 pros: history mining more advanced by necessity, no scope
             ◮

                 for recording such information incorrectly.
                 cons: occasionally doesn’t detect renames, any
             ◮

                 inode-based operation relatively slow
           hg, bzr - file-based backing store means that inode history
       ◮
           is the primary approach
                 pros: “lossless” (or, GIGO if you prefer) storing of file history
             ◮

                 cons: commit files the wrong way and you might not see the
             ◮

                 real history.
NG Tools: Inodes vs Content
who do you trust?



           git - considers inode history uninteresting, derivable from
       ◮
           the content. No conventions for explicitly recording inode
           movement history (renames etc)
                 pros: history mining more advanced by necessity, no scope
             ◮

                 for recording such information incorrectly.
                 cons: occasionally doesn’t detect renames, any
             ◮

                 inode-based operation relatively slow
           hg, bzr - file-based backing store means that inode history
       ◮
           is the primary approach
                 pros: “lossless” (or, GIGO if you prefer) storing of file history
             ◮

                 cons: commit files the wrong way and you might not see the
             ◮

                 real history.
NG Tools: UUID generation
random or just pseudo-random?




          hg, git - revision IDs checksum the content and revision
      ◮
          history, therefore offer lossless revisions
          bzr - patch IDs are purely UUIDs. Technically could
      ◮
          therefore lose data back to the last signed tag (“testament”)
    UUIDs:
          pros: when cherry picking, give you a token to refer to.
      ◮
          Also, partial imports when dealing with foreign VCSes
          easier
          cons: such tokens don’t guarantee anything about the
      ◮
          delivered change
NG Tools: UUID generation
random or just pseudo-random?




          hg, git - revision IDs checksum the content and revision
      ◮
          history, therefore offer lossless revisions
          bzr - patch IDs are purely UUIDs. Technically could
      ◮
          therefore lose data back to the last signed tag (“testament”)
    UUIDs:
          pros: when cherry picking, give you a token to refer to.
      ◮
          Also, partial imports when dealing with foreign VCSes
          easier
          cons: such tokens don’t guarantee anything about the
      ◮
          delivered change
NG Tools: UUID generation
random or just pseudo-random?




          hg, git - revision IDs checksum the content and revision
      ◮
          history, therefore offer lossless revisions
          bzr - patch IDs are purely UUIDs. Technically could
      ◮
          therefore lose data back to the last signed tag (“testament”)
    UUIDs:
          pros: when cherry picking, give you a token to refer to.
      ◮
          Also, partial imports when dealing with foreign VCSes
          easier
          cons: such tokens don’t guarantee anything about the
      ◮
          delivered change
NG Tools: Lightweight Branches

        important to encourage frequent topic branching
    ◮

        ideal: every bug, feature, etc can get its own branch from
    ◮
        last stable release until it is fully reviewed and known to be
        good.
        git - virtually pioneered lightweight branches
    ◮

        hg - now supports lightweight branches well, though
    ◮
        repositories still accumulate deleted branches. Use mq
        bzr - not directly supported, so less efficient (but branching
    ◮
        still common practice)
  Advanced add-ons to manage refining changes - Stacked Git
  (stg) and guilt in git and Mercurial Queues (mq) in hg.
  For bzr there is also Patch Queue Manager (PQM) - branch
  dashboard, and rebase plug-in
NG Tools: Lightweight Branches

        important to encourage frequent topic branching
    ◮

        ideal: every bug, feature, etc can get its own branch from
    ◮
        last stable release until it is fully reviewed and known to be
        good.
        git - virtually pioneered lightweight branches
    ◮

        hg - now supports lightweight branches well, though
    ◮
        repositories still accumulate deleted branches. Use mq
        bzr - not directly supported, so less efficient (but branching
    ◮
        still common practice)
  Advanced add-ons to manage refining changes - Stacked Git
  (stg) and guilt in git and Mercurial Queues (mq) in hg.
  For bzr there is also Patch Queue Manager (PQM) - branch
  dashboard, and rebase plug-in
NG Tools: Lightweight Branches

        important to encourage frequent topic branching
    ◮

        ideal: every bug, feature, etc can get its own branch from
    ◮
        last stable release until it is fully reviewed and known to be
        good.
        git - virtually pioneered lightweight branches
    ◮

        hg - now supports lightweight branches well, though
    ◮
        repositories still accumulate deleted branches. Use mq
        bzr - not directly supported, so less efficient (but branching
    ◮
        still common practice)
  Advanced add-ons to manage refining changes - Stacked Git
  (stg) and guilt in git and Mercurial Queues (mq) in hg.
  For bzr there is also Patch Queue Manager (PQM) - branch
  dashboard, and rebase plug-in
Outline


   Historical Context

   Distributed >> Centralised
      Benefits of Distribution
      The Model of Distribution
      Revision Data Warehousing
      Stable Development Model

   The Variety of NG VCS Experience
      The Players
      Differentiating Factors
      Pragmatic Concerns
NG SCMs: ease of use
nanna-ready




          bzr and hg - ease of use and simplicity always considered
      ◮
          a driving focus.
          git - “If you want a usability feature, go implement it you
      ◮
          lazy user, this is Open Source, scratch your own itch would
          you?”
    ...however these days it’s much of a muchness
    ...however also remember I prefer git, so I would say that
NG SCMs: ease of use
nanna-ready




          bzr and hg - ease of use and simplicity always considered
      ◮
          a driving focus.
          git - “If you want a usability feature, go implement it you
      ◮
          lazy user, this is Open Source, scratch your own itch would
          you?”
    ...however these days it’s much of a muchness
    ...however also remember I prefer git, so I would say that
NG SCMs: ease of use
nanna-ready




          bzr and hg - ease of use and simplicity always considered
      ◮
          a driving focus.
          git - “If you want a usability feature, go implement it you
      ◮
          lazy user, this is Open Source, scratch your own itch would
          you?”
    ...however these days it’s much of a muchness
    ...however also remember I prefer git, so I would say that
NG SCMs: ease of use
nanna-ready




          bzr and hg - ease of use and simplicity always considered
      ◮
          a driving focus.
          git - “If you want a usability feature, go implement it you
      ◮
          lazy user, this is Open Source, scratch your own itch would
          you?”
    ...however these days it’s much of a muchness
    ...however also remember I prefer git, so I would say that
NG SCMs: portability




        hg - performs well on Windows and Unix
    ◮

        bzr - performs and installs well on both
    ◮

        git - written by Linus Torvalds
    ◮
NG SCMs: portability




        hg - performs well on Windows and Unix
    ◮

        bzr - performs and installs well on both
    ◮

        git - written by Linus Torvalds
    ◮
NG SCMs: portability




        hg - performs well on Windows and Unix
    ◮

        bzr - performs and installs well on both
    ◮

        git - written by Linus Torvalds
    ◮
NG SCMs: portability




       hg - performs well on
   ◮
                                “You want it to work on Windows?
       Windows and Unix
                                This is Open Source, scratch your
       bzr - performs and
   ◮
                                own itch would you you lazy
       installs well on both
                                Windows user?!”
       git - written by Linus
   ◮
       Torvalds
NG SCMs: portability




                                    Cygwin works today (slowly)
                                ◮
       hg - performs well on
   ◮
       Windows and Unix             MinGW port shows promise
                                ◮

       bzr - performs and           Minimal pure-Java
   ◮                            ◮
       installs well on both        implementation (for Eclipse)
       git - written by Linus       C# .NET implementation
   ◮                            ◮
       Torvalds                     underway by Mono crew
NG SCMs: speed




        bzr - not a primary focus. “Fast enough for most users”.
    ◮
        Good to many thousands of changes.
        hg - optimised for the “cold cache” case. Very fast.
    ◮

        git - optimises for the “warm cache” case. Very fast.
    ◮

  git and hg have certain operations one or the other is faster or
  slower at, but they are both far beyond the performance of
  virtually everything else.
NG SCMs: speed




        bzr - not a primary focus. “Fast enough for most users”.
    ◮
        Good to many thousands of changes.
        hg - optimised for the “cold cache” case. Very fast.
    ◮

        git - optimises for the “warm cache” case. Very fast.
    ◮

  git and hg have certain operations one or the other is faster or
  slower at, but they are both far beyond the performance of
  virtually everything else.
NG SCMs: speed




        bzr - not a primary focus. “Fast enough for most users”.
    ◮
        Good to many thousands of changes.
        hg - optimised for the “cold cache” case. Very fast.
    ◮

        git - optimises for the “warm cache” case. Very fast.
    ◮

  git and hg have certain operations one or the other is faster or
  slower at, but they are both far beyond the performance of
  virtually everything else.
NG SCMs: speed




        bzr - not a primary focus. “Fast enough for most users”.
    ◮
        Good to many thousands of changes.
        hg - optimised for the “cold cache” case. Very fast.
    ◮

        git - optimises for the “warm cache” case. Very fast.
    ◮

  git and hg have certain operations one or the other is faster or
  slower at, but they are both far beyond the performance of
  virtually everything else.
NG SCMs: repository size


        bzr - not a primary focus, though still quite efficient.
    ◮

        hg - very tight packs and repositories. Can “repack” via hg
    ◮
        bundle
        git - very tight packs and repositories. in principle more
    ◮
        space efficient than hg, but only rarely bourne out in
        practice. Typically 10 times smaller repositories than SVN
        fsfs.
  Both git and hg often get the entire project history and head
  checkout into a space smaller than a single svn HEAD
  checkout. GCC is one extreme example of this - 1.1GB SVN
  head checkout, 280MB git repository (vs approx. 18GB for the
  full SVN repository).
NG SCMs: repository size


        bzr - not a primary focus, though still quite efficient.
    ◮

        hg - very tight packs and repositories. Can “repack” via hg
    ◮
        bundle
        git - very tight packs and repositories. in principle more
    ◮
        space efficient than hg, but only rarely bourne out in
        practice. Typically 10 times smaller repositories than SVN
        fsfs.
  Both git and hg often get the entire project history and head
  checkout into a space smaller than a single svn HEAD
  checkout. GCC is one extreme example of this - 1.1GB SVN
  head checkout, 280MB git repository (vs approx. 18GB for the
  full SVN repository).
NG SCMs: repository size


        bzr - not a primary focus, though still quite efficient.
    ◮

        hg - very tight packs and repositories. Can “repack” via hg
    ◮
        bundle
        git - very tight packs and repositories. in principle more
    ◮
        space efficient than hg, but only rarely bourne out in
        practice. Typically 10 times smaller repositories than SVN
        fsfs.
  Both git and hg often get the entire project history and head
  checkout into a space smaller than a single svn HEAD
  checkout. GCC is one extreme example of this - 1.1GB SVN
  head checkout, 280MB git repository (vs approx. 18GB for the
  full SVN repository).
NG SCMs: repository size


        bzr - not a primary focus, though still quite efficient.
    ◮

        hg - very tight packs and repositories. Can “repack” via hg
    ◮
        bundle
        git - very tight packs and repositories. in principle more
    ◮
        space efficient than hg, but only rarely bourne out in
        practice. Typically 10 times smaller repositories than SVN
        fsfs.
  Both git and hg often get the entire project history and head
  checkout into a space smaller than a single svn HEAD
  checkout. GCC is one extreme example of this - 1.1GB SVN
  head checkout, 280MB git repository (vs approx. 18GB for the
  full SVN repository).
Summary




       The three NG tools covered differ chiefly in efficiency and
   ◮
       user interface
       The revision graph concept is seen in all these tools.
   ◮

       A location independent revision model is paramount to
   ◮
       successfully achieving the stable development model
For Further Reading I




      Various sections of relevance on Wikipedia articles
      http://en.wikipedia.org/wiki/Comparison of revision control software

      http://en.wikipedia.org/wiki/Git %28software%29#References


      S. Vilain.
      An Introduction to git-svn for Subversion/SVK users and
      deserters (advocacy and limitations sections)
      http://utsl.gen.nz/talks/git-svn/intro.html#wtf-why
      http://utsl.gen.nz/talks/git-svn/intro.html#sux

Más contenido relacionado

Destacado (12)

33(2)
33(2)33(2)
33(2)
 
Historiatecnologia
HistoriatecnologiaHistoriatecnologia
Historiatecnologia
 
Iddc
IddcIddc
Iddc
 
Λατινικά, ενότητα 6
Λατινικά, ενότητα 6Λατινικά, ενότητα 6
Λατινικά, ενότητα 6
 
ARCHITECTURAL PORTFOLIO- ECE ADIGÜZEL
ARCHITECTURAL PORTFOLIO-  ECE ADIGÜZELARCHITECTURAL PORTFOLIO-  ECE ADIGÜZEL
ARCHITECTURAL PORTFOLIO- ECE ADIGÜZEL
 
Intervento pagano 25 giugno 2015
Intervento pagano 25 giugno 2015Intervento pagano 25 giugno 2015
Intervento pagano 25 giugno 2015
 
TV projekat Mixer (talk show)
TV projekat Mixer (talk show)TV projekat Mixer (talk show)
TV projekat Mixer (talk show)
 
Estadisticas 2016 acumulado 30.01.2017
Estadisticas 2016 acumulado 30.01.2017Estadisticas 2016 acumulado 30.01.2017
Estadisticas 2016 acumulado 30.01.2017
 
Palestr vest pdf
Palestr vest pdfPalestr vest pdf
Palestr vest pdf
 
Diablos danzantes de yare
Diablos danzantes de yareDiablos danzantes de yare
Diablos danzantes de yare
 
Chico's fas investor presentation (july 2016) v-f
Chico's fas investor presentation (july 2016) v-fChico's fas investor presentation (july 2016) v-f
Chico's fas investor presentation (july 2016) v-f
 
Tipos de licencias
Tipos de licenciasTipos de licencias
Tipos de licencias
 

Similar a Os Vilain

Galera Multi Master Synchronous My S Q L Replication Clusters
Galera  Multi Master  Synchronous  My S Q L  Replication  ClustersGalera  Multi Master  Synchronous  My S Q L  Replication  Clusters
Galera Multi Master Synchronous My S Q L Replication Clusters
PerconaPerformance
 
Deployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna MitchellDeployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna Mitchell
dpc
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand
WSO2
 

Similar a Os Vilain (20)

Galera Multi Master Synchronous My S Q L Replication Clusters
Galera  Multi Master  Synchronous  My S Q L  Replication  ClustersGalera  Multi Master  Synchronous  My S Q L  Replication  Clusters
Galera Multi Master Synchronous My S Q L Replication Clusters
 
Version Control with SVN
Version Control with SVNVersion Control with SVN
Version Control with SVN
 
Scylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent DatabasesScylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent Databases
 
PHP Deployment With SVN
PHP Deployment With SVNPHP Deployment With SVN
PHP Deployment With SVN
 
Deployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna MitchellDeployment With Subversion - Lorna Mitchell
Deployment With Subversion - Lorna Mitchell
 
VMware vSphere 6.0 - Troubleshooting Training - Day 3
VMware vSphere 6.0 - Troubleshooting Training - Day 3 VMware vSphere 6.0 - Troubleshooting Training - Day 3
VMware vSphere 6.0 - Troubleshooting Training - Day 3
 
VMware Advance Troubleshooting Workshop - Day 3
VMware Advance Troubleshooting Workshop - Day 3VMware Advance Troubleshooting Workshop - Day 3
VMware Advance Troubleshooting Workshop - Day 3
 
The Very Very Latest in Database Development - Oracle Open World 2012
The Very Very Latest in Database Development - Oracle Open World 2012The Very Very Latest in Database Development - Oracle Open World 2012
The Very Very Latest in Database Development - Oracle Open World 2012
 
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
The Very Very Latest In Database Development - Lucas Jellema - Oracle OpenWor...
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand
 
The Good, The Bad, and The Avro (Graham Stirling, Saxo Bank and David Navalho...
The Good, The Bad, and The Avro (Graham Stirling, Saxo Bank and David Navalho...The Good, The Bad, and The Avro (Graham Stirling, Saxo Bank and David Navalho...
The Good, The Bad, and The Avro (Graham Stirling, Saxo Bank and David Navalho...
 
Distributed Version Control (DVCS) With Mercurial
Distributed Version Control (DVCS) With MercurialDistributed Version Control (DVCS) With Mercurial
Distributed Version Control (DVCS) With Mercurial
 
JavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control systemJavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control system
 
Disaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High AvailabilityDisaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High Availability
 
Development Tools - Maven
Development Tools - MavenDevelopment Tools - Maven
Development Tools - Maven
 
Microsoft SQL Server Testing Frameworks
Microsoft SQL Server Testing FrameworksMicrosoft SQL Server Testing Frameworks
Microsoft SQL Server Testing Frameworks
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
Smart Client Development
Smart Client DevelopmentSmart Client Development
Smart Client Development
 
ESXpert strategies VMware vSphere
ESXpert strategies VMware vSphereESXpert strategies VMware vSphere
ESXpert strategies VMware vSphere
 
Implementing dr w. hyper v clustering
Implementing dr w. hyper v clusteringImplementing dr w. hyper v clustering
Implementing dr w. hyper v clustering
 

Más de oscon2007

J Ruby Whirlwind Tour
J Ruby Whirlwind TourJ Ruby Whirlwind Tour
J Ruby Whirlwind Tour
oscon2007
 
Solr Presentation5
Solr Presentation5Solr Presentation5
Solr Presentation5
oscon2007
 
Os Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman WiifmOs Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman Wiifm
oscon2007
 
Performance Whack A Mole
Performance Whack A MolePerformance Whack A Mole
Performance Whack A Mole
oscon2007
 
Os Lanphier Brashears
Os Lanphier BrashearsOs Lanphier Brashears
Os Lanphier Brashears
oscon2007
 
Os Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman SwpOs Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman Swp
oscon2007
 
Os Berlin Dispelling Myths
Os Berlin Dispelling MythsOs Berlin Dispelling Myths
Os Berlin Dispelling Myths
oscon2007
 
Os Keysholistic
Os KeysholisticOs Keysholistic
Os Keysholistic
oscon2007
 
Os Jonphillips
Os JonphillipsOs Jonphillips
Os Jonphillips
oscon2007
 
Os Urnerupdated
Os UrnerupdatedOs Urnerupdated
Os Urnerupdated
oscon2007
 

Más de oscon2007 (20)

J Ruby Whirlwind Tour
J Ruby Whirlwind TourJ Ruby Whirlwind Tour
J Ruby Whirlwind Tour
 
Solr Presentation5
Solr Presentation5Solr Presentation5
Solr Presentation5
 
Os Borger
Os BorgerOs Borger
Os Borger
 
Os Harkins
Os HarkinsOs Harkins
Os Harkins
 
Os Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman WiifmOs Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman Wiifm
 
Os Bunce
Os BunceOs Bunce
Os Bunce
 
Yuicss R7
Yuicss R7Yuicss R7
Yuicss R7
 
Performance Whack A Mole
Performance Whack A MolePerformance Whack A Mole
Performance Whack A Mole
 
Os Fogel
Os FogelOs Fogel
Os Fogel
 
Os Lanphier Brashears
Os Lanphier BrashearsOs Lanphier Brashears
Os Lanphier Brashears
 
Os Tucker
Os TuckerOs Tucker
Os Tucker
 
Os Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman SwpOs Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman Swp
 
Os Furlong
Os FurlongOs Furlong
Os Furlong
 
Os Berlin Dispelling Myths
Os Berlin Dispelling MythsOs Berlin Dispelling Myths
Os Berlin Dispelling Myths
 
Os Kimsal
Os KimsalOs Kimsal
Os Kimsal
 
Os Pruett
Os PruettOs Pruett
Os Pruett
 
Os Alrubaie
Os AlrubaieOs Alrubaie
Os Alrubaie
 
Os Keysholistic
Os KeysholisticOs Keysholistic
Os Keysholistic
 
Os Jonphillips
Os JonphillipsOs Jonphillips
Os Jonphillips
 
Os Urnerupdated
Os UrnerupdatedOs Urnerupdated
Os Urnerupdated
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 

Os Vilain

  • 1. “Next Generation” Version Control Systems ...or why bzr, hg, and git rock so hard S. Vilain Catalyst IT (NZ) Limited Open Source Conference, 2007
  • 2. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 3. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 4. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 5. RNA - Information and Function c. 2-3bn. BCE Basic Copying of ◮ Information No protection or ◮ validation
  • 6. Encapsulation of Information c. 1-2bn. BCE Protected Copying of ◮ Information Now fully distributed ◮
  • 7. Neural Networks c. 1bn. BCE-today Transmission of ◮ references to information Distributed in clusters ◮ (organisms) Generally support ◮ forking well Facilitated ◮ development of Mathematics and Computer Science
  • 8. Synchronous Editing SCCS (c.1970), RCS (c.1980) Source and ◮ repository kept together Free (GNU) version – ◮ RCS Practice - locking for ◮ edit, “lock stealing” RCS still best ◮ practice for SysAdmin use
  • 9. Detached Repository - Concurrent Versions System 1991-c.2001 Shell scripts to ◮ separate source from checkout Network separation ◮ via rsh Used branching ◮ support in RCS for concurrent development
  • 10. Sub- Version System 2001- C programs to ◮ separate source from checkout Network separation ◮ via binary protocol, WebDAV Flattened branching, ◮ some copy efficiency, new “dimension” of properties
  • 11. Distributed Development - Patch-based systems from 1985 patch: automatic ◮ application of context diffs Unified diffs - allow ◮ changes to be reviewed “tags” simply ◮ snapshots of source many tools based on ◮ patches - arch, bazaar, darcs
  • 12. “Next Generation” tools c.2002- Fully distributed ◮ Every revision ◮ trace-able Efficient ◮ packs/bundles (sets of revisions) Complete, uniform ◮ distribution of history Many peripheral ◮ benefits
  • 13. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 14. Distribution Benefits 1 of many Central point of failure Central point of failure: centralisation requires a “master” to, at the very least, ◮ assign commit IDs decentralisation assigns commit IDs in unique ways ◮ (content hashing, UUIDs) So: Central servers become points of failure (for the services ◮ they provide) and contention. ie, your server goes down, people are interrupted collaboration between disconnected people impeded ◮
  • 15. Distribution Benefits 1 of many Central point of failure Central point of failure: centralisation requires a “master” to, at the very least, ◮ assign commit IDs decentralisation assigns commit IDs in unique ways ◮ (content hashing, UUIDs) So: Central servers become points of failure (for the services ◮ they provide) and contention. ie, your server goes down, people are interrupted collaboration between disconnected people impeded ◮
  • 16. Distribution Benefits 2 of many transactions or atomic commits? So, your Centralised VCS gives you “Atomic Updates” Unix guarantees write ordering on filehandles, but that does not make it a database. “A” is only one letter out of “ACID”. So, centralisation is inherantly non-transactional - “dirty read” - ◮ changes all in the same place decentralisation is inherantly transactional - “consistent ◮ read” - your changes don’t affect others
  • 17. Distribution Benefits 2 of many transactions or atomic commits? So, your Centralised VCS gives you “Atomic Updates” Unix guarantees write ordering on filehandles, but that does not make it a database. “A” is only one letter out of “ACID”. So, centralisation is inherantly non-transactional - “dirty read” - ◮ changes all in the same place decentralisation is inherantly transactional - “consistent ◮ read” - your changes don’t affect others
  • 18. Distribution Benefits 3 of many any to any merge pattern centralisation requires ◮ the “Star” pattern one big cluster of ◮ develpment decentralisation makes ◮ such constructs optional or notional self-forming clusters ◮ of develpment
  • 19. Distribution Benefits 3 of many any to any merge pattern centralisation requires ◮ the “Star” pattern one big cluster of ◮ develpment decentralisation makes ◮ such constructs optional or notional self-forming clusters ◮ of develpment
  • 20. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 21. Revision Model requirements for Distribution The Revision DAG must represent ◮ merges to work versions must not ◮ change by location branching is in the ◮ direction of changes
  • 22. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 23. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 24. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 25. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 26. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 27. When Revision Models go Wrong #1 merge tracking can refer to other repositories initial state ◮ sync from mirror ◮ perform Merge ◮ another change ◮ push - whoops ◮ push ◮
  • 28. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 29. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 30. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 31. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 32. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 33. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 34. When Revision Models go Wrong #2 merge tracking within a repository (as in SVN 1.5+) initial state ◮ make a change and new ◮ branch make a change to trunk ◮ merge trunk to branch ◮ merge branch to trunk. Note ◮ r4, r5 have (or should have) identical content and set of changes. what will merge from trunk to ◮ branch do? distributed model resists the ◮ problem
  • 35. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 36. Revision Data Warehouse On-Line Analytical Processing of your source Goal: find more information on the reasons for changes ◮ Observation: ‘log’ is only one approach of many ◮ git-log -S’string’ : find changes that introduced ◮ “string” (also hg grep, hgrep plug-in for bzr) git-annotate -C : follows lines moving between source ◮ files visualization tools: gitk, hgk, olive, that allow advanced ◮ inspection of history
  • 37. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 38. Stable Development Model It’s not just a slogan Perhaps the major benefit of DSCM ◮ Using DSCM doesn’t give you a stable development model ◮ for free This development practice is possible with SVK or even ◮ Subversion, in princple ...but with DSCM it’s actually achievable and even common ◮ practice, because bad changes are more easily dropped
  • 39. Stable Development Model It’s not just a slogan Perhaps the major benefit of DSCM ◮ Using DSCM doesn’t give you a stable development model ◮ for free This development practice is possible with SVK or even ◮ Subversion, in princple ...but with DSCM it’s actually achievable and even common ◮ practice, because bad changes are more easily dropped
  • 40. Stable Development Model It’s not just a slogan Perhaps the major benefit of DSCM ◮ Using DSCM doesn’t give you a stable development model ◮ for free This development practice is possible with SVK or even ◮ Subversion, in princple ...but with DSCM it’s actually achievable and even common ◮ practice, because bad changes are more easily dropped
  • 41. Stable Development Model What you pay for what pay-off Discipline: no single commit can break the (build/test suite/etc) ◮ every single commit is well described ◮ Benefits: bisect: finding exactly which commit ruined your day, as ◮ every revision should build and work review: much easier for third parties to comment ◮ stability: if done right, even people tracking bleeding edge ◮ don’t get put off working on the project by romping instability
  • 42. Stable Development Model What you pay for what pay-off Discipline: no single commit can break the (build/test suite/etc) ◮ every single commit is well described ◮ Benefits: bisect: finding exactly which commit ruined your day, as ◮ every revision should build and work review: much easier for third parties to comment ◮ stability: if done right, even people tracking bleeding edge ◮ don’t get put off working on the project by romping instability
  • 43. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 44.
  • 45. Version Control: The Next Generation Enterprise-ready Bazaar-NG (bzr) - python-based. Longest runner, not as ◮ fast as the others but still keeping pace with features. git - Unix-style CL-API to internals. Blinding fast at almost ◮ everything. Extremely active. Sports the “Content Hashed Filesystem” idea stolen from Monotone. Mercurial (hg) - python and C. Also extremely fast, with ◮ progress and features defying their relatively small community.
  • 46. Version Control: The Next Generation Enterprise-ready Bazaar-NG (bzr) - python-based. Longest runner, not as ◮ fast as the others but still keeping pace with features. git - Unix-style CL-API to internals. Blinding fast at almost ◮ everything. Extremely active. Sports the “Content Hashed Filesystem” idea stolen from Monotone. Mercurial (hg) - python and C. Also extremely fast, with ◮ progress and features defying their relatively small community.
  • 47. Version Control: The Next Generation Enterprise-ready Bazaar-NG (bzr) - python-based. Longest runner, not as ◮ fast as the others but still keeping pace with features. git - Unix-style CL-API to internals. Blinding fast at almost ◮ everything. Extremely active. Sports the “Content Hashed Filesystem” idea stolen from Monotone. Mercurial (hg) - python and C. Also extremely fast, with ◮ progress and features defying their relatively small community.
  • 48. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 49. NG Tools: Inodes vs Content who do you trust? git - considers inode history uninteresting, derivable from ◮ the content. No conventions for explicitly recording inode movement history (renames etc) pros: history mining more advanced by necessity, no scope ◮ for recording such information incorrectly. cons: occasionally doesn’t detect renames, any ◮ inode-based operation relatively slow hg, bzr - file-based backing store means that inode history ◮ is the primary approach pros: “lossless” (or, GIGO if you prefer) storing of file history ◮ cons: commit files the wrong way and you might not see the ◮ real history.
  • 50. NG Tools: Inodes vs Content who do you trust? git - considers inode history uninteresting, derivable from ◮ the content. No conventions for explicitly recording inode movement history (renames etc) pros: history mining more advanced by necessity, no scope ◮ for recording such information incorrectly. cons: occasionally doesn’t detect renames, any ◮ inode-based operation relatively slow hg, bzr - file-based backing store means that inode history ◮ is the primary approach pros: “lossless” (or, GIGO if you prefer) storing of file history ◮ cons: commit files the wrong way and you might not see the ◮ real history.
  • 51. NG Tools: UUID generation random or just pseudo-random? hg, git - revision IDs checksum the content and revision ◮ history, therefore offer lossless revisions bzr - patch IDs are purely UUIDs. Technically could ◮ therefore lose data back to the last signed tag (“testament”) UUIDs: pros: when cherry picking, give you a token to refer to. ◮ Also, partial imports when dealing with foreign VCSes easier cons: such tokens don’t guarantee anything about the ◮ delivered change
  • 52. NG Tools: UUID generation random or just pseudo-random? hg, git - revision IDs checksum the content and revision ◮ history, therefore offer lossless revisions bzr - patch IDs are purely UUIDs. Technically could ◮ therefore lose data back to the last signed tag (“testament”) UUIDs: pros: when cherry picking, give you a token to refer to. ◮ Also, partial imports when dealing with foreign VCSes easier cons: such tokens don’t guarantee anything about the ◮ delivered change
  • 53. NG Tools: UUID generation random or just pseudo-random? hg, git - revision IDs checksum the content and revision ◮ history, therefore offer lossless revisions bzr - patch IDs are purely UUIDs. Technically could ◮ therefore lose data back to the last signed tag (“testament”) UUIDs: pros: when cherry picking, give you a token to refer to. ◮ Also, partial imports when dealing with foreign VCSes easier cons: such tokens don’t guarantee anything about the ◮ delivered change
  • 54. NG Tools: Lightweight Branches important to encourage frequent topic branching ◮ ideal: every bug, feature, etc can get its own branch from ◮ last stable release until it is fully reviewed and known to be good. git - virtually pioneered lightweight branches ◮ hg - now supports lightweight branches well, though ◮ repositories still accumulate deleted branches. Use mq bzr - not directly supported, so less efficient (but branching ◮ still common practice) Advanced add-ons to manage refining changes - Stacked Git (stg) and guilt in git and Mercurial Queues (mq) in hg. For bzr there is also Patch Queue Manager (PQM) - branch dashboard, and rebase plug-in
  • 55. NG Tools: Lightweight Branches important to encourage frequent topic branching ◮ ideal: every bug, feature, etc can get its own branch from ◮ last stable release until it is fully reviewed and known to be good. git - virtually pioneered lightweight branches ◮ hg - now supports lightweight branches well, though ◮ repositories still accumulate deleted branches. Use mq bzr - not directly supported, so less efficient (but branching ◮ still common practice) Advanced add-ons to manage refining changes - Stacked Git (stg) and guilt in git and Mercurial Queues (mq) in hg. For bzr there is also Patch Queue Manager (PQM) - branch dashboard, and rebase plug-in
  • 56. NG Tools: Lightweight Branches important to encourage frequent topic branching ◮ ideal: every bug, feature, etc can get its own branch from ◮ last stable release until it is fully reviewed and known to be good. git - virtually pioneered lightweight branches ◮ hg - now supports lightweight branches well, though ◮ repositories still accumulate deleted branches. Use mq bzr - not directly supported, so less efficient (but branching ◮ still common practice) Advanced add-ons to manage refining changes - Stacked Git (stg) and guilt in git and Mercurial Queues (mq) in hg. For bzr there is also Patch Queue Manager (PQM) - branch dashboard, and rebase plug-in
  • 57. Outline Historical Context Distributed >> Centralised Benefits of Distribution The Model of Distribution Revision Data Warehousing Stable Development Model The Variety of NG VCS Experience The Players Differentiating Factors Pragmatic Concerns
  • 58. NG SCMs: ease of use nanna-ready bzr and hg - ease of use and simplicity always considered ◮ a driving focus. git - “If you want a usability feature, go implement it you ◮ lazy user, this is Open Source, scratch your own itch would you?” ...however these days it’s much of a muchness ...however also remember I prefer git, so I would say that
  • 59. NG SCMs: ease of use nanna-ready bzr and hg - ease of use and simplicity always considered ◮ a driving focus. git - “If you want a usability feature, go implement it you ◮ lazy user, this is Open Source, scratch your own itch would you?” ...however these days it’s much of a muchness ...however also remember I prefer git, so I would say that
  • 60. NG SCMs: ease of use nanna-ready bzr and hg - ease of use and simplicity always considered ◮ a driving focus. git - “If you want a usability feature, go implement it you ◮ lazy user, this is Open Source, scratch your own itch would you?” ...however these days it’s much of a muchness ...however also remember I prefer git, so I would say that
  • 61. NG SCMs: ease of use nanna-ready bzr and hg - ease of use and simplicity always considered ◮ a driving focus. git - “If you want a usability feature, go implement it you ◮ lazy user, this is Open Source, scratch your own itch would you?” ...however these days it’s much of a muchness ...however also remember I prefer git, so I would say that
  • 62. NG SCMs: portability hg - performs well on Windows and Unix ◮ bzr - performs and installs well on both ◮ git - written by Linus Torvalds ◮
  • 63. NG SCMs: portability hg - performs well on Windows and Unix ◮ bzr - performs and installs well on both ◮ git - written by Linus Torvalds ◮
  • 64. NG SCMs: portability hg - performs well on Windows and Unix ◮ bzr - performs and installs well on both ◮ git - written by Linus Torvalds ◮
  • 65. NG SCMs: portability hg - performs well on ◮ “You want it to work on Windows? Windows and Unix This is Open Source, scratch your bzr - performs and ◮ own itch would you you lazy installs well on both Windows user?!” git - written by Linus ◮ Torvalds
  • 66. NG SCMs: portability Cygwin works today (slowly) ◮ hg - performs well on ◮ Windows and Unix MinGW port shows promise ◮ bzr - performs and Minimal pure-Java ◮ ◮ installs well on both implementation (for Eclipse) git - written by Linus C# .NET implementation ◮ ◮ Torvalds underway by Mono crew
  • 67. NG SCMs: speed bzr - not a primary focus. “Fast enough for most users”. ◮ Good to many thousands of changes. hg - optimised for the “cold cache” case. Very fast. ◮ git - optimises for the “warm cache” case. Very fast. ◮ git and hg have certain operations one or the other is faster or slower at, but they are both far beyond the performance of virtually everything else.
  • 68. NG SCMs: speed bzr - not a primary focus. “Fast enough for most users”. ◮ Good to many thousands of changes. hg - optimised for the “cold cache” case. Very fast. ◮ git - optimises for the “warm cache” case. Very fast. ◮ git and hg have certain operations one or the other is faster or slower at, but they are both far beyond the performance of virtually everything else.
  • 69. NG SCMs: speed bzr - not a primary focus. “Fast enough for most users”. ◮ Good to many thousands of changes. hg - optimised for the “cold cache” case. Very fast. ◮ git - optimises for the “warm cache” case. Very fast. ◮ git and hg have certain operations one or the other is faster or slower at, but they are both far beyond the performance of virtually everything else.
  • 70. NG SCMs: speed bzr - not a primary focus. “Fast enough for most users”. ◮ Good to many thousands of changes. hg - optimised for the “cold cache” case. Very fast. ◮ git - optimises for the “warm cache” case. Very fast. ◮ git and hg have certain operations one or the other is faster or slower at, but they are both far beyond the performance of virtually everything else.
  • 71. NG SCMs: repository size bzr - not a primary focus, though still quite efficient. ◮ hg - very tight packs and repositories. Can “repack” via hg ◮ bundle git - very tight packs and repositories. in principle more ◮ space efficient than hg, but only rarely bourne out in practice. Typically 10 times smaller repositories than SVN fsfs. Both git and hg often get the entire project history and head checkout into a space smaller than a single svn HEAD checkout. GCC is one extreme example of this - 1.1GB SVN head checkout, 280MB git repository (vs approx. 18GB for the full SVN repository).
  • 72. NG SCMs: repository size bzr - not a primary focus, though still quite efficient. ◮ hg - very tight packs and repositories. Can “repack” via hg ◮ bundle git - very tight packs and repositories. in principle more ◮ space efficient than hg, but only rarely bourne out in practice. Typically 10 times smaller repositories than SVN fsfs. Both git and hg often get the entire project history and head checkout into a space smaller than a single svn HEAD checkout. GCC is one extreme example of this - 1.1GB SVN head checkout, 280MB git repository (vs approx. 18GB for the full SVN repository).
  • 73. NG SCMs: repository size bzr - not a primary focus, though still quite efficient. ◮ hg - very tight packs and repositories. Can “repack” via hg ◮ bundle git - very tight packs and repositories. in principle more ◮ space efficient than hg, but only rarely bourne out in practice. Typically 10 times smaller repositories than SVN fsfs. Both git and hg often get the entire project history and head checkout into a space smaller than a single svn HEAD checkout. GCC is one extreme example of this - 1.1GB SVN head checkout, 280MB git repository (vs approx. 18GB for the full SVN repository).
  • 74. NG SCMs: repository size bzr - not a primary focus, though still quite efficient. ◮ hg - very tight packs and repositories. Can “repack” via hg ◮ bundle git - very tight packs and repositories. in principle more ◮ space efficient than hg, but only rarely bourne out in practice. Typically 10 times smaller repositories than SVN fsfs. Both git and hg often get the entire project history and head checkout into a space smaller than a single svn HEAD checkout. GCC is one extreme example of this - 1.1GB SVN head checkout, 280MB git repository (vs approx. 18GB for the full SVN repository).
  • 75. Summary The three NG tools covered differ chiefly in efficiency and ◮ user interface The revision graph concept is seen in all these tools. ◮ A location independent revision model is paramount to ◮ successfully achieving the stable development model
  • 76. For Further Reading I Various sections of relevance on Wikipedia articles http://en.wikipedia.org/wiki/Comparison of revision control software http://en.wikipedia.org/wiki/Git %28software%29#References S. Vilain. An Introduction to git-svn for Subversion/SVK users and deserters (advocacy and limitations sections) http://utsl.gen.nz/talks/git-svn/intro.html#wtf-why http://utsl.gen.nz/talks/git-svn/intro.html#sux