Git: The Lean, Mean, Distributed Machine

Chris Wanstrath


  1. 1. Chris Wanstrath hi everyone, i’m chris wanstrath. how many people here use git?
  2. 2. i play guitar i have a schecter classic similar to this. mine is prettier.
  3. 3. i’m from cincinnati ohio
  4. 4. but live in san francisco
  5. 5. i started as a lowly paid consultant
  6. 6. Then I worked at CNET for a few years
  7. 7. Then I worked at CNET (which is now owned by CBS)
  8. 8. Then I was a highly paid consultant
  9. 9. before co-founding github
  10. 10. git: The Lean, Mean, Distributed Machine, by Chris Wanstrath. anyway, i want to talk a bit about git today
  11. 11. if we’re going to talk about git, we need to start by talking about source control management (SCM) or version control
  12. 12. basically version control is like wikipedia
  13. 13. for your code
  14. 14. you use it to see what changes others made
  15. 15. inspect those changes
  16. 16. and contribute your own
  17. 17. who uses git? (token promo slide)
  18. 18. these companies use git
  19. 19. and so do these open source projects
  20. 20. let’s briefly go through the history of open source SCMs (at least the ones we care about)
  21. 21. Revision Control System (RCS) was written in the early 80s and stored history on a file-by-file basis. A directory of source code could contain many RCS repositories, each concerning itself with a single file. it’s like a hut: very basic, primitive even, but works great when all you need is shelter. Later that decade, a professor began working with two grad students on a C compiler (in the name of scholarly pursuits). As they began using RCS, the professor noted a number of limitations: it was difficult to share files, and even more difficult to share entire projects.
  22. 22. So they wrote CVS, the Concurrent Versions System, and released it as open source in the early 1990s. "Concurrent" because it allowed multiple individuals to collaborate on a project without stepping on each other's toes, and "Versions System" because it was initially a collection of RCS repositories with network awareness. CVS was like a cabin: better than a hut, but still pretty crappy. it worked well for a while, but there were limitations. Dealing with directories was difficult, and much different from the way one would normally deal with directories in Unix.
  23. 23. Ten years later (see a pattern?) a new revision control system was released, called Subversion (or SVN). Subversion was intended to replace CVS by improving on it. History, directories, deletions, and other CVS warts were fixed. mod_dav integration was included, as well as anonymous checkout. (Anonymous checkout in CVS was literally a hack added by the OpenBSD project.) Subversion was not subversive, but it did work well enough. Many felt it a welcome relief and hurriedly switched. Big, lumbering organizations spent years converting their repositories to SVN. IDEs and editors included Subversion integration. it was like a house: same idea, but much better than a cabin.
  24. 24. Server Committer Committer Committer this is the rcs / cvs / svn model.
  25. 25. Server Committer Committer Committer someone commits to the server
  26. 26. Server Committer Committer Committer and everyone else pulls down the changes
  27. 27. this is bad why?
  28. 28. Server first off, the server is the babysitter
  29. 29. “The Subversion server’s down” SVN’s down! you can’t do anything without the server’s permission
  30. 30. second, low visibility into your coworkers and subordinates’ activity.
  31. 31. at cnet, we used bugzilla. it was awesome (as you can see)
  32. 32. we’d also get diffs emailed to us after each commit
  33. 33. because our group was large, and fluid, i’d often get commits emailed to me i didn’t care about
  34. 34. or understand
  35. 35. this meant i spent extra time throwing away junk
  36. 36. emails that come to me should be for me
  37. 37. Trac if you’re lucky, you use trac to watch what everyone is doing. it has rss and is a bit smarter. usually you have to set it up yourself
  38. 38. another problem: your subversion workflow is single threaded. it’s hard to stop working on a feature to quickly fix a bug without losing your feature’s changes. changed files are either committed or discarded. well, not entirely true...
  39. 39. you could always use a branch, but that sucks, and big changes are usually disasters. i once worked on a project for 3 months where we worked on a massive new feature
  40. 40. as it neared completion, we had a big meeting
  41. 41. a 2 hour meeting
  42. 42. there we decided how to merge in the changes from our branch to trunk. it was one of the worst meetings ever
  43. 43. and i’ve been in some bad meetings
  44. 44. one person was assigned the task of merging the branch and trunk
  45. 45. he was merging and fixing bugs in code he did not write
  46. 46. or understand
  47. 47. it did not go well
  48. 48. another problem with the ‘babysitter’ model is that experimentation is difficult
  49. 49. all your experiments are public
  50. 50. everyone sees everything you commit
  51. 51. solution? don’t commit. at least, that was my solution
  52. 52. and that’s just work stuff. the centralized model, when applied to open source, is a huge pain. it’s difficult to maintain patched versions of open source projects independently. if a project dies on the internet, does anyone care?
  53. 53. what’s the answer? dvcs! git! happy!
  54. 54. git, a DVCS built as an efficient content addressable file system, was started by linus torvalds
  55. 55. the guy who started the linux kernel
  56. 56. if rcs is a hut
  57. 57. cvs is a cabin
  58. 58. and subversion is a house
  59. 59. git is a castle
  60. 60. or a ninja
  61. 61. or shaq
  62. 62. Server Committer Committer Committer the thing that makes git, and all distributed version control systems like it, different is the idea that it’s “distributed”. take this centralized model
  63. 63. Server Committer Committer Committer and make every copy of the code its own full-fledged repository. any copy can accept or create commits. anyone can pull commits from any copy.
  64. 64. Server Committer Committer Committer now you can push to the server
  65. 65. Server Committer Committer Committer or committers can push and pull from each other
  66. 66. Server Committer instead of “checking out” code
  67. 67. Server Committer you “clone” or copy a repository
  68. 68. if github explodes, you don’t lose any code
  69. 69. Server Server Server Committer in fact, because you have a full copy of your repository at all times, you don’t need to tie yourself to a single remote repository
  70. 70. you can push to as many servers as you want
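
The clone-then-push-anywhere idea from the last few slides can be sketched roughly as follows. This is a minimal sketch, not a recommended setup: the two bare repositories on local disk only stand in for real servers, and every name here (`server_a.git`, `server_b.git`, `committer`, `backup`) is a hypothetical chosen for illustration.

```shell
set -e
# Scratch area; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Two bare repositories stand in for two independent servers.
git init -q --bare server_a.git
git init -q --bare server_b.git

# "clone" instead of "check out": the clone is a full repository.
# (git warns that the repository is empty; that is expected here.)
git clone -q server_a.git committer
cd committer
git config user.name "Example Committer"
git config user.email "committer@example.com"

# Commit locally; no server is involved at this point.
echo "hello" > README
git add README
git commit -q -m "first commit"

# Add a second remote and push the same history to both servers.
git remote add backup ../server_b.git
git push -q origin HEAD
git push -q backup HEAD

# Compare the commit each server holds with the local one.
branch=$(git rev-parse --abbrev-ref HEAD)
local_head=$(git rev-parse HEAD)
head_a=$(git --git-dir=../server_a.git rev-parse "refs/heads/$branch")
head_b=$(git --git-dir=../server_b.git rev-parse "refs/heads/$branch")
echo "local:    $local_head"
echo "server a: $head_a"
echo "server b: $head_b"
```

If either server disappears, the other server and the local clone each still hold the complete history.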
  71. 71. another word for ‘clone’ or ‘copy’ is ‘fork’
  72. 72. it may seem like anarchy at first but sane and useful workflows have evolved
  73. 73. #1 in fact, the first workflow i want to talk about is called Anarchy
  74. 74. Server Committer Committer Committer you remove the server
  75. 75. Committer Committer Committer then make everyone a peer
  76. 76. Committer Committer Committer then take away commit access from each other
  77. 77. Coder Coder Coder and you end up with repositories floating in the void. this is how the internet works, or how git works by default
  78. 78. Coder Coder Coder everyone pushes and pulls from each other, managing their own version of the code. for small projects with very few contributors, it works fine, and it would work great on small, experimental projects inside of any organization (you just need a place to publish your changes)
  79. 79. an example of this, let’s say i was on github and i wanted to add a patch to schacon’s ticgit
  80. 80. i’d click the fork button
  81. 81. i now have a copy of schacon’s ticgit called defunkt’s ticgit. now i can make changes and ask scott to check them out. if he likes them, he’ll merge them in. if he doesn’t like them, oh well: i can still use them in my project and keep up to date with his changes. someone can come along and fork from me if they want, too
  82. 82. #2 blessed. but anarchy doesn’t scale. the second workflow is called Blessed
  83. 83. Coder Coder Coder Coder the blessed workflow has the same basic idea as Anarchy
  84. 84. Blessed Coder Coder Coder but one of the repositories is the Blessed repo
  85. 85. Blessed Coder Coder Coder everyone takes their cues from the blessed repository. its development is considered the mainline, or trunk. deploys and packages are pushed from the blessed repo
  86. 86. Blessed Coder Coder Coder others can still push and pull from each other, remember
  87. 87. in the business world this works great for dealing with contractors
  88. 88. Blessed Coder Coder Coder don’t give them push access, just pull access
  89. 89. Blessed Coder they pull down your code, make their changes, then tell you when the changes are ready
  90. 90. Blessed Coder if you like what you see, you merge in the contractor’s changes. the contractor never has direct write access to your company’s code
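
The contractor flow above might look something like this on the command line. It is only a sketch under stated assumptions: both repositories live in one scratch directory, and the names `blessed`, `contractor`, and `contractor_fix` are hypothetical. The point it illustrates is that the maintainer fetches and merges, while the contractor never pushes anywhere.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# The blessed repository, owned by the maintainer (hypothetical names).
git init -q blessed
cd blessed
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial version"
cd ..

# The contractor only needs read access: clone, branch, commit locally.
git clone -q blessed contractor
cd contractor
git config user.name "Contractor"
git config user.email "contractor@example.com"
git checkout -q -b contractor_fix
echo "v2" > app.txt
git commit -q -am "contractor change"
cd ..

# The maintainer pulls the work in: fetch, review, then merge.
cd blessed
git remote add contractor ../contractor
git fetch -q contractor
git log --oneline HEAD..contractor/contractor_fix   # review what would come in
git merge -q contractor/contractor_fix
merged=$(cat app.txt)
echo "blessed app.txt now contains: $merged"
```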
  91. 91. if there were a bunch of us working on ticgit, scott’s may be the Blessed repository. he started the project and is in charge of merging in all changes. we all watch his changes
  92. 92. this is how rails works. rails/rails is the Blessed repository, which david controls and from which the gems are built. we all follow this repo’s development and treat it as “official” by convention only
  93. 93. this is also how rentzsch’s click to flash works. click to flash is an amazing safari plugin that disables flash, similar to the firefox extension
  94. 94. it was forked from google code and has been given a life of its own on github, under rentzsch’s guidance. contributors fork his repository and he merges in good changes. the plugin’s development has been a perfect example of how distributed version control puts the power in the hands of the developer, not the server
  95. 95. #3 lieutenant the next workflow is called ‘lieutenant’ - great for massive projects, like the kernel
  96. 96. Blessed Lieutenant Lieutenant Coder Coder Coder Coder in this model, there is a blessed repository and a few designated lieutenants. the lieutenants are people trusted by the blessed repository
  97. 97. Blessed Lieutenant Lieutenant Coder Coder Coder Coder coders will pull from a lieutenant, make their changes, then request the lieutenant merge in their changes
  98. 98. Blessed Lieutenant Lieutenant Coder Coder Coder Coder lieutenants are usually in charge of a specific subsystem or part of the large system. if they like the change, they will pull it in
  99. 99. Blessed Lieutenant Lieutenant Coder Coder Coder Coder they’ll then inform the blessed repository that they have changes which need to be merged in. the blessed repo, trusting the lieutenant, pulls in the changes
  100. 100. Blessed Lieutenant Lieutenant Coder Coder Coder Coder this is typically coordinated over a mailing list
  101. 101. this is the kernel’s model
  102. 102. #4 centralized. finally, the centralized model: one repository acts as the ‘server’
  103. 103. Server Committer Committer Committer this mimics the traditional babysitter model but you only need your babysitter to pull and push changes - branches and commits can still be created locally whenever
  104. 104. Server Committer Committer Committer the server being down does not dramatically hamper your work
  105. 105. Server Committer Committer Committer yet the flow stays mostly the same
  106. 106. Deploy Server Committer Committer Committer the central server, as in the old model, can also be used to deploy
  107. 107. Staging Production Committer Committer Committer and staging servers can easily be set up
  108. 108. git isn’t all about distributed servers. in fact, one of the best parts about git is its branching support
  109. 109. branches are local, incredibly lightweight, and easy to switch between. it’s easy to devote each branch to a single feature or bug. we call these ‘topic branches’
  110. 110. $ git checkout -b bug_2342 (from your working directory: you just made a new branch)
  111. 111. buckets of different things. because branches are so cheap, you can keep around buckets filled with experiments, new ideas, or new features. no one will ever see them unless you want them to be seen
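
As a rough illustration of topic branches, here is one way to pause a half-finished feature, fix a bug on its own branch, and come back. The branch name `bug_2342` is from the slide; the repository, the file names, and `new_feature` are hypothetical, and this is a sketch rather than a prescribed workflow.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git config user.name "Example Coder"
git config user.email "coder@example.com"

echo "stable" > code.txt
git add code.txt
git commit -q -m "stable release"
# Remember whatever the default branch is called on this machine.
mainline=$(git rev-parse --abbrev-ref HEAD)

# Start a feature on its own topic branch and commit work in progress.
git checkout -q -b new_feature
echo "half-finished feature" >> code.txt
git commit -q -am "feature: work in progress"

# An urgent bug arrives: branch off the mainline, not off the feature.
git checkout -q -b bug_2342 "$mainline"
echo "bug fixed" > fix.txt
git add fix.txt
git commit -q -m "fix bug 2342"

# Merge the fix into the mainline, then go back to the feature.
git checkout -q "$mainline"
git merge -q bug_2342
git checkout -q new_feature

fix_on_mainline=$(git show "$mainline:fix.txt")
wip_on_feature=$(grep -c "half-finished" code.txt)
echo "mainline has the fix: $fix_on_mainline"
echo "feature still has its work in progress: $wip_on_feature line"
```

The feature branch never sees `fix.txt` until it is merged or rebased, and nothing here touches a server.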
  112. 112. Staging Production Committer Committer Committer this changes the staging server idea
  113. 113. Feature A Production Feature B Committer Committer Committer you may start to have topic staging servers where you boot up staging for a single branch and test out a new feature, no more generic ‘staging’ branch - each person may even have their own staging server
  114. 114. Coder because every copy is its own repository, we are given the freedom to structure our workflows socially rather than technically
  115. 115. Team A Team B it’s easy to have multiple small teams, move people between projects, and monitor multiple projects. no need for one monolithic subversion server - git repositories are a breeze to set up
  116. 116. with something like github, watching your team’s development is trivial. do it with rss...
  117. 117. or with a service you’re already comfortable using: integration with campfire, email, fogbugz, lighthouse, friendfeed, twitter, etc
  118. 118. the site also lets you comment on commits, providing dead simple and effective code review. git and github are what we use in our private client work and on our own websites, as well as for our open source
  119. 119. as far as git IDE support, the textmate ProjectPlus extension shows you the status of tracked files right in the drawer
  120. 120. there’s also a git textmate bundle available on github
  121. 121. if you’re an eclipse user, the egit plugin lets you commit to, manage, and track git repositories from within eclipse. it’s written using jgit, a pure-java implementation of git
  122. 122. emacs people can use DVC, which aims to provide a common interface for all distributed version control systems. there’s also a git mode
  123. 123. or my personal favorite, magit
  124. 124. if you use os x, an open source program called GitX is under active development
  125. 125. which is based on the cross platform Git-GUI
  126. 126. for OS X there’s also GitNub which isn’t as actively developed
  127. 127. as far as libraries go, a search for ‘git’ on github returns almost 3000 unique repositories: darcs or hg to git converters, git vim projects, git in .NET, even blogs and wikis based on git
  128. 128. remember when i said git was a content addressable file system? well, it’s true. this is gist. it’s a git powered pastie
  129. 129. you paste in code and share it with coworkers or friends
  130. 130. but the best part is these clone URLs. i can check out a pastie i made, make changes, then push a new version
  131. 131. these are the revisions. you can’t tell the difference between changes i made on the web and changes i made locally then pushed
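
For a feel of what "content addressable file system" means, the sketch below stores a piece of content and reads it back by the ID git derives from it. It uses the plumbing commands `git hash-object` and `git cat-file`; the scratch repository name `store` is hypothetical, and this is an illustration only, not a description of how gist itself is built.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q store
cd store

# Write some content into the object store; git prints the ID it is filed under.
id=$(printf 'hello\n' | git hash-object -w --stdin)

# The ID is derived from the content, so the same content gives the same ID...
same_id=$(printf 'hello\n' | git hash-object --stdin)
# ...and different content gives a different one.
other_id=$(printf 'goodbye\n' | git hash-object --stdin)

# Look the content up again using nothing but its ID.
content=$(git cat-file -p "$id")

echo "id:       $id"
echo "same id:  $same_id"
echo "other id: $other_id"
echo "content:  $content"
```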
  132. 132. this kind of stuff is the future. imagine a distributed, versioned wiki or documentation project or book, a distributed, versioned bug tracker, a distributed, versioned chat application
  133. 133. in fact, a number of books are already being written on github
  134. 134. scott’s book is being translated right now
  135. 135. a distributed, versioned everything
  136. 136. this has been a fairly basic overview of git. for more information, check out
  137. 137. thanks. questions?
  138. 138. flickr
  139. 139. flickr