Did you ‘try’ it before you committed?

Posted by on October 11, 2011

At Etsy, we deploy often, from head: everything committed goes to trunk and must be ready for production immediately. This makes it very important to test your code before committing; otherwise, you hold up everyone else from committing and deploying, since trunk always needs to be clean.

Over a year ago, everyone had at least one common gripe: you could not run the entire test suite on a developer virtual machine (VM), and anyone who believed they could expected it to take at least three hours.

Fortunately, on the Continuous Integration (CI) cluster the tests took roughly 30 minutes. But the chances that the tests would pass were so low that ‘rebuilding reds’ (rebuilding a test failure to make sure it was not just a flaky test) and re-running the tests once there was a ‘fix’ would stretch that 30 minutes into an hour, ninety minutes, two hours, or more.

Why were the chances of the tests passing so low on CI? Because almost no one ever ran the full test suite before the integration step.

The problem was: developers could not run tests in a reasonable amount of time with the resources they were given, so they would not run the tests until they had access to the shared resources that had the ‘Oomph’ to run the tests in a more reasonable time.

So one day, someone told one of the developers that he could test on one of the behemoths in the CI cluster. That developer would SSH into the machine, edit some code, run the tests on the much more robust hardware, and PROFIT! But then another developer came along and began to do the same on the machine. Then another, and most likely a couple more. There were maybe a handful of developers, and the machine was spec’d high enough to handle the load, but that was not the issue.

The problem was: developers were testing, but they had no way to coordinate who was using the machine at any given time. Also, the test suite was architected around a lot of shared fixtures, which would inevitably cause collisions during concurrent test runs. Sigh…

While this situation was brewing, I was working with one developer on how to get his changes onto a machine in the CI cluster. We figured out which options we needed for svn diff to create the patch file we wanted. Then we figured out which options patch needed to apply that patch file to the svn working copy on the CI machine. Eureka!
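That diff-and-patch round trip is the heart of the workflow. Here is a minimal sketch of it, simulated with plain diff and patch on a scratch directory (svn diff emits the same unified-diff format; all paths and file contents here are illustrative):

```shell
# Make a scratch "working copy" in two states: pristine and locally modified.
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir original modified
printf 'hello\n'      > original/app.php
printf 'hello, try\n' > modified/app.php

# Produce a unified diff -- the same format `svn diff` writes.
# (diff exits 1 when the trees differ, so tolerate that under `set -e`.)
diff -ru original modified > patch.diff || true

# Apply the patch to a fresh copy of the pristine tree, the way the CI-side
# working copy gets patched; -p1 strips the leading path component.
cp -r original target
patch -p1 -d target < patch.diff
cat target/app.php    # now contains the modified version
```

The same two commands, pointed at a real svn working copy, are all the client and server halves of the workflow need.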

Here’s the simple solution to allowing the developers to utilize the awesome resources in the CI cluster without stepping on each other’s toes:

  1. Create a new Jenkins Freestyle Project (or copy an existing one) and
    • Select Parameterized Build
      • File Parameter: patch.diff
      • String Parameter: username
    • Set up the SCM as usual
    • Add an Execute Shell build step: Apply the patch.diff
    • Use $username in the recipient list of the e-mail publisher

  2. Write a short bash script that
    • Creates a patch
    • Sends a cURL request with the patch and $USERNAME to start a build of the Jenkins job

The original script looked something like this:


    # Create a patch of the local changes against the working copy
    cd $LOCATION
    svn diff > $PATCH

    # Jenkins parameterized-build JSON: the file parameter and the submitting user
    file_param="{'name': 'patch.diff', 'file': 'file0'}"
    user_param="{'name': 'executor', 'value': '$USER'}"

    # Kick off a try build for each requested test job
    for ((i=0;i<${#args[@]};i++)); do
      curl -F file0=@$LOCATION/$PATCH \
           -F json="{'parameter': [$file_param, $user_param]}" \
           -F Submit=Build http://$HUDSON/job/try-${args[i]}/build
    done
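On the receiving end, the try job’s ‘Execute Shell’ step applies the uploaded patch to its freshly checked-out working copy before running tests. A hypothetical sketch of that step, simulated here with a scratch directory and a hand-written unified diff (the real job would run patch inside the svn checkout; file names and the -p level are assumptions):

```shell
set -e
checkout=$(mktemp -d)            # stand-in for the job's svn working copy
cd "$checkout"
printf 'v1\n' > config.ini

# A unified diff like the one the try client uploads as patch.diff.
printf -- '--- config.ini\n+++ config.ini\n@@ -1 +1 @@\n-v1\n+v2\n' > patch.diff

# Fail the build early if the patch does not apply cleanly, then apply it.
patch -p0 --dry-run < patch.diff
patch -p0 < patch.diff
```

The --dry-run check is the cheap insurance: a patch that no longer applies against trunk fails the build immediately instead of producing confusing test errors.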

We called this new service try. The day we introduced try to the team, the number of deploys went from maybe a handful a day to more than 20 deploys a day, and we have not really looked back.
[Graph: Increase in deploys since the introduction of ‘try’]

Every new deployer at Etsy deploys code on his or her first day, and each one is told by someone, “Make sure you use try before you commit.”

try has evolved with the rest of our CI infrastructure. If you read Divide and Concur, you can probably imagine how we handle so many test jobs. try has also been our guinea pig for integrating Jenkins and Deployinator. We will save the details of all of this for another post.

In the meantime, please take a gander at other ‘try’ implementations:

Category: engineering, infrastructure


[…] out my post on Code as Craft: Did you ‘try’ before you committed? […]

Nice idea

Btw, what kinds of tests (unit, functional, acceptance, …) do you have that they take so much time?

P.S.: the Jenkins URL has a typo.

    Last year, when we walked in the door, there was a predominantly functional test suite. That test suite ran via BuildBot on EC2. Each BuildBot slave would have the entire web stack on it to execute the Selenium tests against. With enough hardware it would take 30 minutes. We then moved to PHPUnit as the primary developer testing source, but those were integration tests that depended on a very expensive setup as well. Those could take 30 minutes on a developer VM and 7-10 minutes on Jenkins. We have improved our testing practices since: no suite runs over 5 minutes on Jenkins, with maybe a worst case of 7-10 minutes per suite on a developer VM.

    The main advantage of ‘try’ now is parallel execution of test suites and extra verification that the code will pass in CI because sometimes your local environment gets dirty or you forget to track new files.

Interesting stuff. We run builds nightly and often end up with broken ones. Having CI in place surely sounds like a practice we don’t use at this time. Even if our suite is significantly different from others, I would guess that there has got to be a way to run build tests continuously, speed up build processes, and avoid inconveniencing other developers. Hmmm.

[…] The Triggering User and Master Project plugins are both integral to our latest version of Try. […]

[…] DNS hostname we just created. Our various tools can immediately be run from the new VM, such as the try server, which is a cluster of around 60 LXC based instances that spawn tests in parallel on your upcoming […]

How do you keep developer’s “try” builds from confusing the test reporter? Do you just have copies of your different Jenkins jobs, one for automated testing and another for developer-induced try builds?

    We have a ‘try’ copy of each test job. It sort of stinks keeping them in sync, but after a while they stop changing so much. The only difference between the two jobs is that the ‘try’ job has parameters (one for a patch, another for a guid) and, in the ‘Execute Shell’ builder, the logic to patch in the patch file.