Moving from SVN to Git in 1,000 easy steps!
This past summer we completed a project that spanned several months of planning and preparation – moving our source control from Subversion to Git. The code that runs our search engine, front-end web stack, support/admin tools, API, configuration management, and more is now stored in and deployed from Git. We thought some of you might find our approach to migrating an 80-100 person engineering team interesting and possibly instructive.
We went through three phases:
- Preparation
- Execution
- Follow through
Preparation was the longest and most difficult phase. We had to figure out when and how to move, how to educate our team, and how to make the transition smooth for everyone. The execution phase had to be done quickly, because at the rate we commit and release changes it would be counter-productive for the cutover to take more than a few hours. We spent a lot of the preparation phase making sure that was possible. The follow-through phase refers to supporting our team from the point we cut over and into the future.
This post isn’t prescriptive, but before getting into details here is our only piece of advice:
If you can deal with your current source code system, do not go through this pain. Seriously. This was a long, painful process for us. Over the years, many tools, systems, and processes had become deeply intertwined with our Subversion installation. That said, if your team is small, or your source control system isn’t tied into anything, go for it! Just do it as soon as possible – the only time better than today was yesterday.
Moving to Git is something we’d been talking about for at least two years. It’s also something we put off for a lot of very good reasons. When the idea first came up, we were introducing the culture of continuous deployment, which included the mind shift of moving away from long-lived branches and instead branching in code with feature flags, making small frequent deploys, and using percentage ramp-ups to slowly roll out features. At the time we didn’t want to introduce any other roadblocks to instilling this into our engineering culture.
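For readers unfamiliar with percentage ramp-ups, here is a minimal Python sketch of the idea. It is an illustration only, not the system described in this post: the function names, the flag name, and the choice of SHA-256 for bucketing are all assumptions.

```python
import hashlib


def bucket_for(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99.

    Hashing the flag name together with the user id means the same user
    can land in different buckets for different flags.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the feature is on for this user at the given percentage."""
    if rollout_percent <= 0:
        return False
    if rollout_percent >= 100:
        return True
    # A user stays in the same bucket, so raising the percentage only ever
    # adds users; nobody who already has the feature loses it.
    return bucket_for(flag_name, user_id) < rollout_percent
```

With something like this in place, unfinished code can live on trunk behind `is_enabled("new_search", user_id, 5)` (a hypothetical flag) and be ramped from 5% to 100% without a long-lived branch.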
What we found happening more and more was that new engineers were coming in already familiar with Git, using things like git-svn and writing tools to make SVN act more like Git. While evaluating the options, it was clear that for our team, Git was a better fit than SVN (and a better fit than any other distributed version control system as well). One of the biggest reasons is github.com and its popularity for open source collaboration. Not only do we put our open source contributions on there, but so do Twitter, Facebook and many others.
Though we did not move to Git for its branching capability, our tools weren’t capturing some of the work we were already doing with patches and pushing changes directly between team members for review and testing. We also felt that re-examining and adding new tools to the mix seemed like a healthy trait to have in our culture, and felt the switch to Git would increase engineer happiness.
After we committed to the decision, we handled the move to Git slowly and delicately for a few reasons. One is that we deploy around 30 times a day across an engineering organization that was about 80 people at the time. We didn’t want to lose any of that velocity (we knew we might lose some in the beginning, but wanted it to be as seamless as possible). Another was that Git familiarity varied widely across the team, from Git experts to people who had never touched it. Education played a huge part in our successful transition. It was also important for us to keep flagging code on and off and to keep a continuously deployed trunk mentality even after the switch to Git.
The first few months of prep consisted of slowly reorganizing our SVN repositories to be more in line with how we were working. Our code had become spread across many different repositories, and we wanted to make sure that when you were ready to work on the main website, you only needed to clone one repository. This would have been a good thing for us even if we hadn’t moved to Git, because it introduced a logical organization that was more in line with how our site was laid out.
We were also starting to decide which tools we could use around Git. We wanted a front-end tool with a good UI, so naturally we contacted GitHub for a trial version of GitHub Enterprise. At the same time we tried out some tools such as gitweb with something like gitosis or gitolite underneath. We also took a look at Gitorious, which is probably the next closest thing to GitHub if you are looking for a free solution.
We started to look around for training, and it became obvious that GitHub’s training program was the best fit for us. We wanted to make sure everyone was well prepared, and they offer training that you can go through online, or the instructors can come to you for a more hands-on experience. We also wanted to examine our workflow and integration issues, and found that bringing an instructor on site was the best use of our time.
Since we were training people with various skill levels, we decided to split the training into separate sessions based on experience. We surveyed our team, and grouped people into beginner and intermediate buckets. We split about three quarters of our team across two beginner days, and put the rest in a more advanced session on the last day. We also spent some time after the training each day discussing our workflow and the integration with our current tooling.
Our instructor, Matthew McCullough, couldn’t have been better at explaining Git in a sensible way to the team. As a bonus, since we already had GitHub Enterprise installed, we were able to use that for the hands-on training to get people acclimated to using it, and by the end of the sessions people were creating and hacking on their own repositories in our private installation. It quickly became apparent that this was going to be a great tool for collaboration, with a fantastic UI and all the benefits of the public GitHub, while maintaining the privacy of our codebase that we required.
After the training, we had to attempt to move fairly quickly so that all that was learned in training was not forgotten. The best way to learn a tool is to use it, and we had to plan how to carry out the actual migration.
There were a few key things we did at this stage. First, we created an Engineering organization within our GitHub Enterprise installation, and created a repository that held our web code. We then created a cron job that mirrored our SVN commits into the GitHub repo. We were able to use that mirror to update our deploy and testing tools in the background without affecting our current flow.
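The post does not say how the mirroring job was built, so the following is only a sketch of one common way to do it. It assumes a local clone created with `git svn clone` that has the GitHub repository configured as a remote; the remote name, branch name, and function names are hypothetical.

```python
import subprocess
from typing import List


def mirror_commands(remote: str = "origin", branch: str = "master") -> List[List[str]]:
    """Build the git commands for one mirroring pass.

    Assumes the working directory is a git-svn clone with no local commits,
    so `git svn rebase` simply fast-forwards to the latest SVN revisions.
    """
    return [
        ["git", "svn", "rebase"],          # pull new SVN revisions into the local branch
        ["git", "push", remote, branch],   # publish them to the Git mirror
    ]


def run_mirror(repo_path: str, remote: str = "origin", branch: str = "master") -> None:
    """Run one mirroring pass inside repo_path, stopping at the first failure."""
    for command in mirror_commands(remote, branch):
        subprocess.run(command, cwd=repo_path, check=True)
```

A cron entry would then call `run_mirror` on the clone every few minutes. Because commits only flow from SVN to Git, tools can be pointed at the mirror and tested without touching the SVN workflow.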
We also created documents in our wiki that described our workflow, including an explanation of how to perform common SVN tasks in Git. We announced our plans a few weeks ahead of the move so that everyone was mentally prepared for the switch, and even did a few in-house training sessions specifically on our new workflow. In the end, we kept our workflow similar to SVN to ease our transition. We still don’t use branches (most of the time), and we still deploy from trunk (…well, master).
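The wiki pages themselves are not reproduced here, but an SVN-to-Git cheat sheet for a trunk-style workflow typically covers a mapping like the one sketched below. The table is a general illustration built from standard commands, not a copy of our documentation, and the names `SVN_TO_GIT` and `git_equivalent` are hypothetical.

```python
from typing import Dict, List

# Rough equivalents for everyday commands. Git splits "commit" into a local
# commit plus a push, which is the biggest habit change coming from SVN.
SVN_TO_GIT: Dict[str, List[str]] = {
    "svn checkout": ["git clone"],
    "svn update": ["git pull"],
    "svn status": ["git status"],
    "svn diff": ["git diff"],
    "svn add": ["git add"],
    "svn commit": ["git commit -a", "git push"],
    "svn log": ["git log"],
    "svn blame": ["git blame"],
}


def git_equivalent(svn_command: str) -> List[str]:
    """Return the Git command(s) roughly matching an SVN command, or [] if unknown."""
    return SVN_TO_GIT.get(svn_command.strip().lower(), [])
```

For example, `git_equivalent("svn commit")` returns two steps, which makes the point that work is not shared with the team until it is pushed.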
The next step was actually flipping the switch. At this point we’d done so much preparation that we were just ready to make this happen and get it over with. We had a code freeze (no commits) one evening in late June, and migrated our deployment and testing tools to use the new Git repo. Our commits were already mirrored into Git, so the new repository was up to date. But we also had to be sure the Git repository was getting chef’d out to each developer’s VM, with the web configuration in place, so that engineers could sit down the next day ready to code and deploy. We had to make sure that our hooks were working, and that our commit emails and IRC notifications were uninterrupted. All in all the code freeze lasted about 12 hours, and we were ready to go for the next day.
As part of our preparation, we made sure to identify members across the team who could help others with the transition to this new tool. It certainly helps to have a few people on hand who are familiar with Git and its distributed model to help people get acclimated. We set up a #git IRC channel (we use IRC across the entire company for communication) and we also had our documentation to point to, which people were able to add to if they encountered any new problems or needed clarification on the new workflow.
The first day on Git our velocity was above average – we wanted to make sure everyone was comfortable and able to work in this new system, and the migration didn’t end up slowing us down at all. We stated that everyone had to clone-pull-add-commit-push on that first day so that no one stayed lost for weeks. In our opinion this was one of the most successful aspects of the move. Just like we have people deploy on their first day here, overcoming the fear is a big part of adapting to a new process.
Overall we can say the Git migration was a success. It turned out to be an immense task with a maze of dependencies, but in the end we’re on a current version control system that should last for years to come. It opens up many new workflow possibilities and solves some of our existing problems, not to mention it’s blazing fast. If you’re more interested in the technical rather than the social side of migrating from SVN to Git, I wrote a post a few years ago on my personal blog that you may be interested in, and there are also a couple of pointers over on GitHub on how to make the conversion.