Quantum of Deployment

Posted by efkastner | Filed under infrastructure

aka. Deployinating the Country Side

UPDATE 2011-07-29:
Deployinator is now Open Source!
Grab it on github: https://github.com/etsy/deployinator

We deploy a lot of code. Deployinator is the tool we created to make that as easy and painless as possible. It is a one-button, web-based deployment app: hit the button, and code goes to our web servers and is serving requests in almost no time. Using Deployinator, we’ve brought a typical web push from 3 developers, 1 operations engineer, everyone else on standby and over an hour (when things went smoothly) down to 1 person and under 2 minutes.

At Etsy, we’re doing what’s come to be called Continuous Deployment. However, what we’ve learned is that having a tool like Deployinator is useful for more than just enabling that. This post is about those benefits – for anyone deploying web code.

Why

Our job as engineers (and ops, dev-ops, QA, support, everyone in the company actually) is to enable the business goals. We strongly feel that in order to do that you must have the ability to deploy code quickly and safely. Even if the business goals call for deploying heavily QA’d code once a month at 3am (ours don’t; we push all the time), having a reliable and easy deployment should be non-negotiable.

It’s a metric I’ve come to call the “quantum of deployment”: what’s the smallest number of steps, with the smallest number of people and the smallest amount of ceremony required to get new code running on your servers? This isn’t a trivial question. Even if you’re on a slow release cycle, and have a push engineer, what happens if there’s an emergency push needed? Does it go through your normal process, or is there a fast-lane? Do your fast-lane deployments get logged? Are they measured for speed? Is everyone as aware of them as they would be of a normal deployment, or are they your dirty little secret?

It’s not hard to get started. If you currently have a bunch of shell scripts that move everything in place, wrap those up with a single shell script. The most important thing is that it’s ONE easy step. This might require changing your process. Try to remove or replace special cases. The less thought it takes to deploy, the more you can focus on getting stuff done.
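
The "one easy step" idea can be sketched in a few lines. This is only an illustration, written in Python rather than shell, and the three steps are hypothetical stand-ins for whatever scripts you already have:

```python
import subprocess
import sys

# Hypothetical steps: replace each one with a script you already run by hand.
STEPS = [
    [sys.executable, "-c", "print('update code')"],
    [sys.executable, "-c", "print('build assets')"],
    [sys.executable, "-c", "print('sync to web servers')"],
]


def deploy(steps):
    """Run each step in order and stop at the first one that fails."""
    for number, command in enumerate(steps, start=1):
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"step {number} failed with exit code {result.returncode}")
            return False
    print("deploy finished")
    return True
```

Calling `deploy(STEPS)` is then the whole deployment. Stopping at the first failure keeps a half-finished push from quietly continuing.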

Once deploying is No Big Deal, a lot of things can change. Features can go out a piece at a time instead of one all-or-nothing push. Your app configuration options can be in code – and changed quickly. Your hair can grow back. Puppies will lick your face!

What

Custom software? There’s a lot of choice out there, and we generally try to not reinvent the wheel whenever possible (and this wheel has been invented again and again). After comparing our requirements with the available software, we decided to roll our own. Here’s what we were looking for:

  • Web based
  • Logged (When, What and Who)
  • Adaptable to our network
  • Run from a central location
  • Announced in our IRC and email
  • Transparent about its actions
  • Integrated with our graphing/monitoring tools

A lot of our requirements are inspired by the way Flickr deploys, as documented in Building Scalable Websites (Written by Friend of Etsy, Cal Henderson).

    For anyone deploying code (engineers, designers… anyone really), it’s an easy process. Once your code is ready to go, you go to Deployinator and push the button to get it on QA. From there it visits Princess (see the sidebar). Then, when it’s ready to go live, you hit the “Prod” button and soon your code is live, and everyone in IRC knows who pushed what code, complete with a link to the diff. For anyone not on IRC, there’s the email that everyone gets with the same information.

    What it looks like

    This is the main Deployinator screen. Here is how we deploy the “web stack”.

    Deployinator

    Here’s our IRC bot telling everyone that something went out. It also includes a link to the commits that went live.

    announcing a deployment
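
As a rough sketch of the "When, What and Who" record behind an announcement like this, assuming a hypothetical `Deploy` record and a made-up diff URL pattern (neither is taken from Deployinator itself):

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Deploy:
    """One logged deployment: who pushed what, where, and when (hypothetical fields)."""
    who: str
    stack: str
    environment: str
    old_revision: int
    new_revision: int
    when: datetime


def announcement(deploy, diff_url_template):
    """Build the one-line message sent to IRC and email."""
    diff_url = diff_url_template.format(old=deploy.old_revision, new=deploy.new_revision)
    return (
        f"{deploy.who} pushed {deploy.stack} to {deploy.environment} "
        f"(r{deploy.old_revision} -> r{deploy.new_revision}) "
        f"at {deploy.when:%Y-%m-%d %H:%M} UTC: {diff_url}"
    )
```

The same record can be written to a log and sent to both channels, so the announcement and the audit trail never disagree.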

    How

    When we first brought Deployinator online, it was just a web frontend to the shell scripts that moved everything into the right place. What we gained by putting a screen in front of it was the ability to iterate on the backend without changing the experience for people deploying. Deployinator currently uses svn to update the code, then rsync to move it between environments.

    Another important part of Deployinator is that the environments are a one-way street. Code going from Princess to production is unaffected by any commits that have happened since it got onto Princess. This creates something of a “mantrap”, so that we know exactly what we’re deploying. No surprises!
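
The one-way street can be modelled in a few lines. This is a toy model with hypothetical names, not how Deployinator tracks revisions; the point is only that production is fed from Princess and never from trunk:

```python
class Pipeline:
    """Tracks which revision each environment is running (hypothetical model)."""

    def __init__(self):
        self.princess_revision = None
        self.production_revision = None

    def push_to_princess(self, trunk_revision):
        # This is the only step that looks at trunk.
        self.princess_revision = trunk_revision

    def push_to_production(self):
        # Production takes exactly what is on Princess, whatever trunk says now.
        if self.princess_revision is None:
            raise RuntimeError("nothing on Princess to promote")
        self.production_revision = self.princess_revision
        return self.production_revision
```

If revision 100 goes to Princess and two more commits land on trunk afterwards, `push_to_production()` still promotes revision 100.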

    Deployinator isn’t used just for our web stack either. With the simple architecture we’ve built, we can add all kinds of stuff to it easily. It’s currently used for many different things such as the API, Lists service, internal admin-only tools and others. Having a single deployment process has removed a lot of complexity.

    When

    This isn’t a post about continuous deployment. Having a very simple deployment procedure is something you should do even if the thought of deploying your code 20 times a day scares you. Deployment can be a contentious subject with many stakeholders. Getting it simple and repeatable allows everyone to share a common vocabulary.

    For the nerds…

    Transporting the bits and bytes

    Here’s a rundown of some of the interesting parts of how Deployinator actually moves bits around. As mentioned above, this has changed and will change again. We analyze our entire process and have some low-hanging performance fruit to pick. As of today, an API push takes about 18 seconds, a Princess push takes about the same, and a production web push is 70-150 seconds. Here are the steps that a web push goes through:

    From the repo of truth

    We’re deploying directly from trunk (that’s a whole other post!). So the first step of deploying is to update the code on the deploy host.

    Builda what now?

    After the code is updated, we run “builda”, an app we wrote to take our assets and bundle them up with lots of magic. We use a combination of JavaScript async loading, Google’s Closure and versioned directories to make things like CSS, JavaScript and sprites as fast as possible.
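
To make "versioned directories" concrete, here is a minimal sketch. How builda actually names its versions isn't described here, so the content hash and the `/assets/<version>/` layout below are assumptions for illustration only:

```python
import hashlib


def asset_version(contents):
    """Derive a short version string from the name and bytes of every asset.

    `contents` maps an asset name to its bytes. Hashing is an assumed scheme;
    a revision number or timestamp would serve the same purpose.
    """
    digest = hashlib.sha1()
    for name in sorted(contents):
        digest.update(name.encode("utf-8"))
        digest.update(contents[name])
    return digest.hexdigest()[:8]


def asset_url(version, name):
    """Build the URL a page would reference for one asset (hypothetical layout)."""
    return f"/assets/{version}/{name}"
```

Because every build gets its own directory, old and new pages can be served side by side, and each URL can be cached aggressively since its contents never change.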

    Rsync

    At this point, we have a bunch of directories on our deploy host that represent our web tree. We then rsync to a staging area on each web box.

    Fanout

    Past some number of boxes, rsyncing everything against a single push host stops working well, so we’re employing a strategy called fanout: we rsync in chunks of 10 hosts at a time. This is one area where a lot of speedups will be happening soon.
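
A minimal sketch of pushing in chunks of 10. The `push_one` callback is a hypothetical stand-in for the rsync to one host, and running the hosts within a chunk in parallel is an assumption, not something stated above:

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(hosts, size=10):
    """Yield consecutive slices of at most `size` hosts."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def push_in_chunks(hosts, push_one, size=10):
    """Call push_one(host) for every host, one chunk at a time.

    Hosts inside a chunk are pushed concurrently; the next chunk starts
    only after the current one has finished.
    """
    results = {}
    for chunk in chunks(hosts, size):
        with ThreadPoolExecutor(max_workers=len(chunk)) as pool:
            for host, ok in zip(chunk, pool.map(push_one, chunk)):
                results[host] = ok
    return results
```

The chunk size caps how many transfers hit the push host at once, which is the knob to tune as the number of boxes grows.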

    First they came for our assets…

    Pop quiz, hotshot: Someone visits the site during a deployment and box 1 (the one they randomly get) has the new code. The HTML they get back refers to a new image. When they request that image, they end up on box 451, which doesn’t have that asset yet. What do you do? WHAT DO YOU DO?

    We’ve solved this with two steps. The first (mentioned above) is versioned asset directories. The next is to get those assets on ALL hosts before ANY host has the new code referring to them.
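
The ordering rule can be sketched as a two-phase rollout. `sync_assets` and `sync_code` are hypothetical callbacks standing in for the real transfers:

```python
def rollout(hosts, sync_assets, sync_code):
    """Phase 1: assets to every host. Phase 2: code, only if phase 1 fully succeeded.

    sync_assets(host) should return True on success.
    """
    failed = [host for host in hosts if not sync_assets(host)]
    if failed:
        # No host gets new code while any host is missing the new assets.
        raise RuntimeError(f"assets missing on {failed}; code not pushed")
    for host in hosts:
        sync_code(host)
```

Since the assets live in a new versioned directory, phase 1 is invisible to visitors: nothing refers to the new files until the code from phase 2 arrives.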

    Graceful

    We’re using APC user caching, and expect it to have fresh data after each deployment. Things like some application settings, database fields and routes are all cached for super fast lookups. If a deployment changes any of them, the cached copies have to be refreshed too, so we’re currently issuing a graceful restart to our Apache servers on each deployment.

    Deployinator itself

    One of the design goals of Deployinator has been to be as simple as possible. It’s a Sinatra web app, but could really be written in anything. The script that does the svn updating (and checking out for new stacks) is written in PHP and is some of the straight-up simplest code possible.

    The commands that Deployinator runs through for each different stack are listed in Ruby methods, and are mostly strings (with servers and such interpolated). It’s easy for anyone to come in and change how something works. Simple, understandable software that gets the job done.
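
To show the shape of "mostly strings with servers interpolated", here is a sketch in Python rather than Ruby. The function name, paths and rsync flags are assumptions for illustration, not Deployinator's actual commands:

```python
def web_push_commands(checkout_dir, staging_dir, hosts):
    """Return the shell commands for one push, in the order they would run."""
    # Step 1: bring the checkout on the deploy host up to date.
    commands = [f"svn update {checkout_dir}"]
    # Step 2: copy the tree to a staging area on each web box.
    for host in hosts:
        commands.append(f"rsync -a --delete {checkout_dir}/ {host}:{staging_dir}/")
    return commands
```

Keeping the commands as plain strings means the list can be printed, logged or reviewed before anything runs, which fits the goal of being transparent about what a push does.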

    The one fancy bit of Deployinator is the streaming Rack middleware that powers the live-updating code window:

    Command window
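
The middleware itself isn't shown here, but the underlying idea, sending each line of command output to the browser as soon as it is produced, can be sketched with a Python generator. This is an analogy under that assumption, not the Rack code:

```python
import subprocess
import sys


def stream_command(command):
    """Yield each line of the command's output as soon as it is written."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave errors with normal output
        text=True,
        bufsize=1,  # line-buffered, so lines arrive as they are produced
    )
    for line in process.stdout:
        yield line.rstrip("\n")
    process.wait()
    yield f"[exit code {process.returncode}]"


# Example: a stand-in command that prints two lines.
for line in stream_command([sys.executable, "-c", "print('svn update'); print('rsync')"]):
    print(line)
```

A web framework that accepts an iterable response body can hand a generator like this straight to the client, so the person deploying watches the push happen instead of waiting for a final page.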

    Database DDLs aren’t code

    An awesome feature of Capistrano is the ability to run schema migrations as part of your deployment. At a certain scale, however, database changes become more time consuming and dangerous. All of our schema changes go through a stringent process with several checks in place. However, not all schema is defined in the database. Whenever we have schema that’s defined in code, or inside the data itself, it’s just a normal code push.

    Conclusion

    Our deployment process is a very important part of how we work at Etsy. We treat it just like our web code, databases or other “serious” things. Deployinator has helped us to get more features out faster with fewer defects and less drama. As we triple our engineering team in 2010 (we’re hiring!), tools like this are what make it possible for us to change the world.


    55 responses to Quantum of Deployment

  • Great stuff Erik! Is the streaming rack middleware bit public?

  • Kathy says:

    I love the GUI. The Save the Princess section is awesome.

  • Zach says:

    Great article, Erik! I may have to steal your princess naming scheme. ;)

  • kastner says:

    Jesse: We’re discussing open sourcing all of Deployinator at some point. Maybe we can OSS just that middleware first though!

    Kathy: Thank you… I didn’t mention the Easter Eggs either ;)

    Zach: Thanks. The princess has saved our castle many times already (*groan*)

  • Thanks for a wonderful article. I’d love to hear more about how you handle database migrations while keeping all data intact, avoiding conflicts and avoiding downtime.

  • [...] lets us peek under the hood of their code deployment practices – save the [...]

  • jorge says:

    Awesome!!

    But please, if you OSS it, pack the UI too ;)

  • [...] Code as Craft » Quantum of Deployment (tags: build webdev code data development tools programming deployment etsy deploy) [...]

  • Michael says:

    What about Murder, Twitter’s BitTorrent-based software distribution tool, for fanout? http://github.com/lg/murder

  • kastner says:

    Michael: We’ve looked at “Murder”. It’s *really* cool. However, we like to start with the simplest thing that could possibly work and iterate (and iterate, and iterate). When we have the need for ultra-fast deployments to thousands of nodes in geographically diverse datacenters, I suspect we’ll be talking about SOME peer-to-peer solution ;)

  • [...] Code as Craft » Quantum of Deployment Notes on how etsy does deployments. (categories: deployment cloud continuous-integration ops ) [...]

  • Oli says:

    I’ve started writing about our deployment methods time and time again, and never managed to write it all down. I will however, inspired by this, have another stab.

    Lovely to get some tech nuggets – always appreciated.

  • [...] Quantum of Deployment at Etsy's Code as Craft blog — an awesome (and thorough) run-down of their push-button deployment process. Good readin'. (tagged: architecture build dev webdev essay work ) [...]

  • Håkan Bruce says:

    Great text and good inspiration! Very close to the way we do it as well, though we’re on a smaller scale. Love the Princess as well. :)

  • [...] Shared Quantum of Deployment. [...]

  • [...] every engineer has access to deploy code to our production site.  We use a tool called “Deployinator” to do this quickly and easily.  It’s one button.  A culture of unit and functional [...]

  • [...] wait, I haven’t even talked about the work! How many places are there where you can go from zero to production deploy on day one? Or find yourself the subject of CTO tweet and team leader blog [...]

  • [...] The Deployinator: Continuous Deployment at Etsy Using Deployinator we’ve brought a typical web push from 3 developers, 1 operations engineer, everyone else on standby and over an hour (when things went smoothly) down to 1 person and under 2 minutes. [...]

  • [...] Code as Craft » Quantum of Deployment (tags: deployment sysadmin architecture continuousintegration) [...]

  • [...] for their practices, see: The Technical Side of Commons v1.0 (August 24, 2010).  Also see ETSY’s Quantum of Deployment (May 20, 2010) for more information regarding enterprise, robust development [...]

  • [...] Etsy and Continuous Deployment – Etsy, the indie online craft store, has produced numerous interesting stories. What caught my attention here is the discussion of how they implement a continuous deployment. A cornerstone is a tool called deployinator: “…what’s the smallest number of steps, with the smallest number of people and the smallest amount of ceremony required to get new code running on your servers? This isn’t a trivial question. Even if you’re on a slow release cycle, and have a push engineer, what happens if there’s an emergency push needed? Does it go through your normal process, or is there a fast-lane? Do your fast-lane deployments get logged? Are they measured for speed? Is everyone aware that it happened the way they would for a normal deployment, or is it your dirty little secret?” [...]

  • Adam Rosien says:

    Great stuff! You can go even farther and have a zero-click deploy: tag your commits with something like #deploy and then if (and only if) the build passes then deployinator will auto-deploy that build. We do this at Wealthfront and it works out great.

  • [...] anything they can count or time into a graph with almost no effort. (And, because we can push code anytime, anywhere, it’s easy to deploy the code too, so we can go from “how often does X happen?” to a graph of [...]

  • [...] be presenting on Deployinator. I’ll also be presenting alongside my co-workers and friends Chad, Kellan and Mike Brittain. [...]

  • [...] operations, is covered very well at an Etsy blog. It builds on principles established by Flickr.http://codeascraft.etsy.com/2010http://codeascraft.etsy.com/2011…Most of us are running much more modest operations, though. One [...]

  • Fritz says:

    Hey guys, nice work. Have you considered dropping rsync from your process? We are phasing out rsync where possible to get even lower deploy times. For instance, instead of doing a fan-out approach, we are employing DFS replication techniques (i.e. GlusterFS) on our geo-distributed NFS masters, which in turn are mounted across the entire www tier. The actual deploy push happens in near real-time, typically with zero wait time, as opposed to waiting for the fan-out to complete (via 10-host chunks).

    e.g.
    xyzfile => NFS master at any geo-location(GlusterFS, near real-time propagation) => location specific NFS slave bank(GlusterFS via master for that location, near real-time) => www tier(NFS ro mount off the NFS slave bank)

    from code-repo to entire globally dist. www tier to green light on live site = under 30sec.

    Just some food for thought we are bouncing around here — btw nice work on the graphite setup (will definitely put some of that to work here)

    • efkastner says:

      No we haven’t! We did discuss using git to update the bits (ala github’s post on the subject) but stuck with rsync.
      I’ll check out the DFS ideas for sure, thanks!

  • [...] Etsy, deployments are managed by engineers, not by ops or a release management team. The idea is: you write code, and then you deploy it to [...]

  • [...] are countless success stories out there: Etsy, Heyo, IMVU & Atlassian all do [...]

  • [...] Quantum of Deployment. A helpful tool for continuous deployment. [...]

  • Weight Loss says:

    We have had issues regarding data-base migration when using certain tools – tell me more on how you handle that please. Thanks for a great blog!

  • [...] I laugh in the face of such a situation because continuous deployment is where it’s at, pushing code live with the flip of a switch and a press of a [...]

  • [...] não quer dizer que a implantação contínua não é possível. Muitas empresas como Etsy, Heyo , IMVU e Atlassian, realizam a implantação contínua e, alegam fazê-la de forma [...]

  • OSchmi says:

    Thanks for that great post!
    Could you please give a few more details on how you handle DB changes? Are they part of the deployment? How often do you push DB changes? I could use some hints on how to convince our DB guys. Thanks

  • Asim says:

    Very interesting insights on your deployment process. I am wondering how do you guys manage roll-backs in the deployinator?

    I had the joy of building a deployment engine myself a few years ago. Although back then I wasn’t involved in dealing with systems of your scale, it did work like a charm for smaller projects. Unlike Deployinator, it used a pull mechanism on the host itself, where the host pulls all the sources directly from specified locations and then deploys them. Here’s the link if you’re interested:

    http://fuzedbulb.com/projects/depro-pl-php-deployment-configuration-engine/

  • If you haven’t had the time to play with this, you must. It’s an amazingly simple application with a very clean interface.

  • Quora says:

    Why does Etsy care so much about automated software testing?…

    At Etsy a key part of our process is that we make many deploys at a high velocity. We’ve found by experience that writing and running tests enables us to ship faster and more often. Tests help us to communicate with each other and having tests for new…

  • [...] how we handle so many test jobs. try has also been our guinea pig for integrating Jenkins and Deployinator. We will save the details of all of this for another [...]

  • Christo says:

    We have a set of tooling similar to this at Betfair, but not quite as advanced. Good article though. Thanks for sharing.

    christo

  • Ori says:

    Great article, inspired a lot of changes to our internal deployment system.

    How do you integrate QA into this flow? Or are engineers responsible for testing their own branches fully before merging into the master?

  • [...] to tag English content so it can be extracted and then translated. Since we are a company with a continuous deployment development process, we do this on a daily basis and as an result get a significant number of new [...]

  • [...] Quantum of Deployment (Erik Kastner). We deployed code to production more than 10,000 times in 2011. If you wonder “how did they do that?” this post will tell you all you need to know. [...]

  • [...] User Plugin for determining the user who triggered the build and a Deployinator Plugin to link key Deployinator information to particular Jenkins [...]

  • [...] new Engineers on their first day: deploy to production. We’ve talked a lot in the past about our deployment, metrics, and testing processes. But how does the development environment facilitate someone coming [...]

  • [...] resulting file is pushed out in the same way all our other code is. Using this mini WHOIS database is just a PHP function call without any database or network access. [...]

  • [...] Wales, UKEtsy have some very good information about their infrastructure on their engineering blog.http://codeascraft.etsy.com/2010… deals with how they actually deploy the code. They've also open sourced their Deployinator [...]

  • [...] from errors made while doing schema changes. This incident helped crystalize the “quantum of deployment concept as applied to schema changes” for us. We asked ourselves “What’s the [...]

  • [...] automated. To paraphrase Beyoncé: “If you like it, put a button on it”. Etsy’s Deployinator is a very good example of what an automated release manager should be, but you particular solution [...]

    © Copyright 2014 Etsy