An overview of how the UC San Diego Library development team uses Git, GitHub, and Stash with the Git Flow workflow for development, continuous integration, and release management.
2. Questions
• Who is using a version control system in their current position?
• Who is using GitHub, either personally or professionally?
• Who develops code entirely on their own?
3. Background
• UC San Diego Library Development Team
• 6 Developers
• Open-Source repositories: 3
• Internal repositories: 46
5. GitHub
• Social Coding – 6 mil+ people
• Collaboration with community projects
– https://github.com/projecthydra
– https://github.com/projectblacklight/
– https://github.com/fcrepo4
• Best Practices are inherited
13. Other Workflows
• Git Tags and Branches
• Spencer and Tom’s workflows
• Subversion has equivalents
• Goal: Consistent workflow that works for your team
I’m Matt Critchlow and I’m the manager of Development and Web Services for the UC San Diego Library
I manage a team of 6 developers who work on various projects ranging from website design and development to Ruby and Java web applications
I’m going to focus my part of this session on how we’re using Git for our version control system, and how it’s incorporated into our development and release workflow
But first, I’d like to ask a few questions, to gauge where we are with the folks in the room
So…
So, to answer some of those questions from our end. As I said, our development team has 6 developers, so while some projects may have leads, nobody develops in isolation
At the moment we have 3 open source repositories
And we have 46 internal repositories
So as a result, while we use Git for every code repository, we have two systems for managing those repositories
We use Atlassian’s Stash for managing our internal repositories
And we use GitHub for our open source projects that can be made publicly available.
So why GitHub?
It is the open source development standard. It is the world’s largest code host with over 6 million users. It’s pretty amazing.
There is a lot to be gained by working out in the open. It’s an opportunity to create a bigger “team” than you have locally
In the Library, we’re building our new Digital Asset Management System on top of an open source framework called Project Hydra. It, in turn, depends on a number of other open source projects
We’re also contributing development time directly to a critical component of the Hydra architecture, namely the Fedora repository layer.
The way we’re looking at this goes back to the statement “If you want to go fast, go alone. If you want to go far, go together”
In the Hydra project, we’re working with over 20 partners ranging from Stanford, Notre Dame and other universities, to groups like WGBH Boston and the Rock and Roll Hall of Fame. All told, that’s a team of dozens of developers working together on the core underlying framework that we’re all leveraging.
So we are big fans of GitHub, but as you saw, only a few of our projects are public on GitHub. The rest are in either private GitHub repositories or in Atlassian Stash
Stash runs on your own internal servers, behind your firewall.
Stash runs on Git. It has most of the goodies of the GitHub UI, such as Pull Requests with in-line code comments, auto-merging, etc., that facilitate Code Review and team collaboration. I’ll talk a bit more about how we do that on the next few slides.
You don’t get things like Gists, which we use often in GitHub. But that may not be an issue for you.
Also, if you happen to use JIRA, as we do, there is even more integration in how you manage JIRA tickets with Pull Requests and Commits. I think Tom will talk about that more in his demo, so I’ll leave it at that.
So regardless of whether we’re working on a repository in Stash or on GitHub, our team has been using the Git Flow workflow
If you can’t read this, I apologize, but if you search for Git flow in Google, the blog post that contains this image should be your first hit.
We use git flow for managing our day to day code development workflow, and it is integrated into our CI and release management workflow as well
So I’m going to walk through a few examples of how we use Git flow, but first I’ll note two things. In Git flow, there are two primary code branches that always exist for your project, develop and master.
Master should always reflect what is your current production-ready state. Active development never happens on this branch. Master should be “clean”. If something is merged into master, that is a release.
The Develop branch is where integration happens. This is where continuous integration builds happen for the development environment. We have our CI system, Atlassian’s Bamboo product, respond to commits to the remote develop branch by automatically deploying to the development environment
Before talking about the rest of Git Flow, I want to mention that there is a project on GitHub that adds a number of git extensions that allow the examples I’m about to describe to be managed with single commands, rather than a set of commands.
Installation is shown on the link provided, and then if you want to initiate git flow for a particular project, you navigate to the project’s main folder and type git flow init -d. The -d flag uses all the defaults, but you can leave it out if you want to walk through the process yourself.
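For anyone without the extensions installed, here is a minimal sketch of the two-branch starting point in plain Git. It assumes the default branch names master and develop described above; the scratch directory and the committer identity are placeholders, and this is only an approximation of the state that git flow init -d leaves you in, not what the extension runs internally.

```shell
set -e

# Placeholder scratch repository; in practice this is your project's main folder.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Point the unborn initial branch at master, whatever the local default name is.
git symbolic-ref HEAD refs/heads/master

# Placeholder identity so the commit below works on a fresh machine.
git config user.email "dev@example.com"
git config user.name "Example Developer"

# master needs at least one commit before develop can branch from it.
git commit -q --allow-empty -m "Initial commit"

# Create the long-lived develop branch and switch to it.
git checkout -q -b develop

# List the two primary branches.
git branch --list
```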
The screenshot here lists the available subcommands for managing the workflow
OK, let’s get to the supporting branches and how they fit into this.
The first is a feature branch. The primary idea here is that new features should be developed in isolation from the develop branch to keep the develop branch stable. Features are generally large enough efforts that they are going to take a significant amount of time. We don’t know when they’ll be done, and we don’t want them to break anything.
So on the right you can see how this works graphically. On the left, you can see how the branch is created and managed.
In this example, I’m working in a dashboard project, and I want to create a new widget. So I invoke the git flow subcommand on the left which does a few things.
It creates a new feature branch using the name I provide and bases this branch off of what is currently the most up-to-date version of the develop branch. Then it checks out that branch automatically for me, so I can start work on my feature in isolation.
After making a number of commits and feeling confident the feature is working as expected, one can run the command listed in the terminal screenshot to finish the feature branch. This does a few things.
It merges the feature back into the develop branch. If any merge conflicts arise, you’ll be prompted to resolve them. Then the feature branch is deleted automatically, since the entire commit history now exists in the develop branch.
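The start and finish steps just described can be sketched with plain Git commands. This is a rough equivalent under stated assumptions, not the extension's exact behavior: the branch name new-widget and the file widget.txt are made up for illustration, and the explicit merge commit (--no-ff) is my choice for keeping the feature visible in the history.

```shell
set -e

# Placeholder scratch repository with master and develop already in place.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Developer"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# Roughly "git flow feature start new-widget":
# branch off the current develop and check the new branch out.
git checkout -q -b feature/new-widget develop

# Work on the feature in isolation (hypothetical file).
echo "widget" > widget.txt
git add widget.txt
git commit -q -m "Add new widget"

# Roughly "git flow feature finish new-widget":
# merge back into develop, then delete the feature branch.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/new-widget into develop" feature/new-widget
git branch -d feature/new-widget
```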
We generally do a slight variation on this with our team. We have developers create Pull Requests, rather than merge their feature branch into the develop branch on their own. Another developer is then assigned to review the Pull Request and either accept or reject it. I’m happy to talk more about that process if anyone is interested.
When it’s time for a release, a release branch is created, generally named with the version number you intend to release.
Like a feature branch, this creates a branch off of the develop branch and checks out that branch automatically
A developer can then push that release branch to a remote git branch
We have our Atlassian Bamboo CI server set up to look for new remote branches with the release/ prefix, and if it encounters one, it automatically builds and deploys that release to our test environment.
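The prefix check itself is simple to picture. Here is a minimal sketch in shell; it is not Bamboo configuration, and the function name is_release_branch and the sample branch names are hypothetical.

```shell
# Succeeds only for branch names of the form release/<something>.
is_release_branch() {
  case "$1" in
    release/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: decide which of a few sample branches would trigger a test deployment.
for branch in develop feature/new-widget release/1.0.0; do
  if is_release_branch "$branch"; then
    echo "$branch: build and deploy to test"
  else
    echo "$branch: no release deployment"
  fi
done
```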
Our Product Owner, and other stakeholders, generally review the release and verify whether the release can be deployed to production.
If the release is approved, the developer can run the release finish command, which will do a number of things.
It will attempt to merge the release branch into both the develop branch and the master branch.
It merges into the develop branch to make sure develop is up to date and that any changes made on the release branch are preserved
It merges into the master branch as a signal that a new release should be deployed to production off of master
Then the release branch, again, is safely deleted
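Put together, the release steps look roughly like this in plain Git. Treat it as a sketch under assumptions: the version number 1.0.0 and the VERSION file are made up, the push is commented out because the scratch repository has no remote, and the tag on master reflects my understanding of what the extension does rather than something shown on the slides.

```shell
set -e

# Placeholder scratch repository with master and develop already in place.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Developer"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# Roughly "git flow release start 1.0.0": branch off develop.
git checkout -q -b release/1.0.0 develop

# Last-minute release changes land on the release branch (hypothetical file).
echo "1.0.0" > VERSION
git add VERSION
git commit -q -m "Set version to 1.0.0"

# Publishing the branch is what the CI server would react to:
# git push origin release/1.0.0

# Roughly "git flow release finish 1.0.0": merge into master...
git checkout -q master
git merge -q --no-ff -m "Merge release/1.0.0 into master" release/1.0.0
# (assumption: the release is also tagged on master)
git tag -a 1.0.0 -m "Release 1.0.0"

# ...merge into develop so release-branch changes are preserved...
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.0.0 into develop" release/1.0.0

# ...and delete the release branch.
git branch -d release/1.0.0
```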
I do want to note that while we could have a similar trigger for production as we do for our test environment deployments, we choose to have our Operations group manually handle deploying our applications to production. This is a separation of responsibility that we’re all comfortable with for now, but it’s worth noting that you could absolutely use a system like Bamboo, Travis, or Jenkins to deploy automatically to production for you.
The last supporting branch is the hotfix, or the oh crap something leaked into production and we need to fix it yesterday branch
This is invoked with the same command pattern as the release and feature branches, so I won’t show that again here.
The key differences are the following:
When the hotfix branch is created, it is branched off of master
When the hotfix branch is finished, it merges into both master and develop
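Those two differences can be sketched in plain Git as follows. The hotfix name 1.0.1 and the file names are hypothetical, and this is an approximation of the workflow rather than the extension's exact commands.

```shell
set -e

# Placeholder scratch repository: master is "production", develop has moved ahead.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Example Developer"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop
echo "in progress" > feature.txt
git add feature.txt
git commit -q -m "Unreleased work on develop"

# Difference 1: the hotfix branches off master, not develop.
git checkout -q -b hotfix/1.0.1 master
echo "fixed" > fix.txt
git add fix.txt
git commit -q -m "Fix production bug"

# Difference 2: finishing merges into both master and develop.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix/1.0.1 into master" hotfix/1.0.1
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.0.1 into develop" hotfix/1.0.1
git branch -d hotfix/1.0.1
```

Note that master picks up only the fix; the unreleased work on develop stays out of production.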
I’m happy to report that we’ve rarely needed to use this, as I think this workflow as a whole, together with a solid test suite, helps avoid unseen bugs of this magnitude. But stuff does happen, and this is a nice mechanism for addressing it
So while git flow has worked well for us, it’s certainly not the only option if you’re using Git.
Using tags in git, with some combination of separate branches, can give you the same flexibility and safety that git flow provides
Subversion of course has equivalents to this workflow as well, with its own options
The goal should really be to have a consistent workflow that works for your team and the technology stack you’re working with
And I’d be very interested to hear what the rest of you are doing, what you like about it, what you don’t and what we might be able to learn from each other.
So thanks for listening, here are a few additional links