How does Etsy manage development and operations?

Posted by on February 4, 2011

I’ve been loving using Quora these past few months, and have been amazed at the level of behind-the-scenes detail people are providing about really complex and specific things (like how Facebook does automated testing).

Recently, someone asked, “How does Etsy manage development and operations?” with these comments: “Etsy seems to have scaled far and fast, whilst continuing to add new features; how is all this managed – is there a strictly-defined process within which engineers operate, or is it a case of hiring clever people and letting them get on with it (Facebook-style)?”

First of all, I love the team and am proud of the work that they do.  It’s an amazing group and none of this would work or be as fun as it is without them.

So, here’s the answer I just posted:

In 2010, we grew the engineering team pretty fast, going from 20 to about 70, and the rest of the company grew quickly, too. As we’ve grown, overall speed has been really important to us, and we’ve continually tuned our processes, tools, and culture to support that. I wrote about some of the principles behind all of it on my blog over the summer.

Right now, developers are divided up into a number of small teams, usually 3-7 engineers. These teams are paired with a product manager and a designer, and there is some movement across teams as needed. All designers at Etsy code, and product managers code at various levels, too. Ops and dev work really closely together, and we have one development team that is very ops-like and straddles both domains. Everyone in the company uses IRC. Lots of ideas are worked out on a wiki (we use Confluence), and people around the company comment on those ideas and plans. Some projects form organically, and others are more top-down.

We generally plan in 60-day chunks and divide the deliverables up into 2-week periods (though we’re not officially using capital-A Agile).  The 60-day cycle has no special significance — we just felt like it was a reasonable timeframe for planning near-term deliverables.  The 60-day plans go through a review, we set goals, and we publish the plans on the wiki.  Our founder, CEO, and head of product (Rob Kalin) participates in these reviews and stays in close contact with the product and engineering teams throughout.  In general, the teams have a lot of autonomy in how they get their work done within a set of architectural principles we’ve established (a subject for another post) and our overall design approach.  Specs are typically very light, and the focus is on building working features.

We onboard engineers quickly and their first goal is simple: deploy on your first day. The goal here is to constantly emphasize shipping, and get over any deployment fears early. Engineers get productive very quickly. The level of cooperation between developers and ops is also really high (see our engineering blog for more).

We practice continuous deployment and make small changes frequently to the site. We use what we call “config flags,” which are more or less an exact copy of what Flickr does (see the Flickr engineering blog). A lot of the code for features runs “dark” for days or weeks, and feature launches mean flipping a switch in the code. We have a lot of Flickr DNA in the company (John Allspaw, our VP of Ops, ran ops at Flickr, and Kellan Elliott-McCrea was architect at Flickr).

In January (a month in which we did over a billion page views), code committed by 76 unique individuals was deployed to production by 63 different folks a total of 517 times. Product managers make changes and do deploys (Jenn Vargas, one of our newest product managers, tweeted about it), and we have trained aspiring developers on our support team to make small changes with our help and guidance, too. Our deployment environment requires a lot of trust, transparency, communication, coordination, and discipline across the team.

We’ve invested a lot in our automated unit and functional testing (we have a team devoted just to this), tooling for deployment (see our blog post about Deployinator), and metrics and monitoring (see “Tracking Every Release”). Key system-level and business-level metrics (like checkout/listing/registration/sign-in rates) are projected on screens in the office, and we have a number of internal dashboards that the team uses (we mainly use Ganglia and Graphite).

We also have lots of switches and knobs to help us roll features out to percentages of users and ramp them up slowly, or quickly. Features are used and tested by us here at Etsy for some period of time before they are rolled out publicly.
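To make the config-flag idea a little more concrete, here is a minimal Python sketch of a flag check with a percentage ramp-up. It is an illustration only, not Etsy’s or Flickr’s actual code: the `CONFIG_FLAGS` table, the flag names, the `staff_only` field, and the `bucket` and `is_enabled` helpers are all assumptions made up for this example.

```python
import hashlib

# Hypothetical flag table; the names and fields are illustrative only.
CONFIG_FLAGS = {
    "new_checkout": {"enabled": True, "percent": 10, "staff_only": False},
    "search_redesign": {"enabled": True, "percent": 0, "staff_only": True},
    "old_experiment": {"enabled": False, "percent": 100, "staff_only": False},
}


def bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(flag_name, user_id, is_staff=False, flags=CONFIG_FLAGS):
    """Return True if the feature behind flag_name should run for this user."""
    flag = flags.get(flag_name)
    if flag is None or not flag["enabled"]:
        # Unknown or switched-off flags keep the code path dark for everyone.
        return False
    if is_staff:
        # Staff see any enabled feature before it is rolled out publicly.
        return True
    if flag["staff_only"]:
        return False
    # Public ramp-up: the same user always lands in the same bucket, so
    # raising "percent" only ever adds users to the rollout.
    return bucket(flag_name, user_id) < flag["percent"]


if __name__ == "__main__":
    for user_id in (1, 2, 3):
        print(user_id, is_enabled("new_checkout", user_id))
```

In this sketch, launching a feature means changing `enabled` or `percent` in the table rather than deploying new code, and hashing the flag name together with the user ID keeps each user’s experience consistent while the percentage is ramped up.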

When we make mistakes, we conduct blameless post-mortems and assign remediation items to the appropriate team members. Engineers frequently post in our community forums when we have any issues, and we maintain a status blog. I think that interacting with Etsy members gives everyone a deeper sense of responsibility for the code we’re writing. We also write about the mistakes we make pretty openly.

Overall, engineers are treated as creative collaborators with design and product, and products are worked out and iterated on with engineers instead of simply being handed to them for implementation. Rob (our founder and head of product) likes working with engineers, and the engineers spend a lot of time interacting with Rob. Our ability to work this way has as much to do with the personalities of the people involved and the culture as with the technologies involved. We’re always learning and adjusting, and we’ll continue to evolve as time goes on.

Category: operations, people, philosophy


[…] I posted my answer, and cross-posted it to the Etsy engineering blog, too. […]

[…] schedule. This part also depends on how agile the whole process is. Companies like and believes in continuous deployment, they would have no problem fixing a reported security problem in […]

[…] How does Etsy manage development and operations? « Code as Craft – […]

Hi – I was interested by the fact that everyone uses IRC. Are the channels private in some way?

Jonah: Private within the company? We don’t have any private channels. We’re generally pretty transparent and it’s not necessary to keep people out of the “wrong” channels. Usually a team’s channel is the fastest way to get something to a team. Lots of people will “hang out” in another team’s channel just to keep up with what they’re working on.

Oh I see – you run internal IRC !! silly me

what IRC server do you use ?

Guys, will you accept a Russian project manager? I will relocate to Brooklyn from Moscow just to work with you

Andrey, we’re not looking for any project managers at the moment, but we’re honored that you’re willing to move to Brooklyn!

[…] currently use some form of Continuous Deployment to manage their operations, such as Flickr, Etsy, Netflix and more who I’ve forgotten. Continuous Delivery is a topic that impacts all areas […]

[…] For more success stories on hiring methodologies, check out Etsy’s Quora answer on how they train their engineers to manage development and operations. […]

[…] we’ve been deploying code 25 times a day or more, and running automated tests every time we deploy. Conservatively, we run our tests about 25 times […]

[…] How does Etsy manage development and operations? […]

[…] the other side of the gulf there is Amazon (a deployment every 11.6 seconds), Etsy (a new developer commits to production on day 1), and Flickr (they deployed 97 times this week (scroll to the […]

[…] Bjelic (@bojanbjelic) April 5, 2012Amazon (a deployment every 11.6 seconds), Etsy (a new developer commits to production on day 1), and Flickr (they deployed 97 times this week) (!!)RT @OdeToCode: The Bar Is Even Higher Now […]

[…] Continuous Deployment shops are more careful and follow Etsy/Flickr’s approach of dark launching: deploying changes continuously, but testing and reviewing them before turning them on […]

[…] uncommon for development to ship code into production several times a day! Examples include IMVU, Etsy, and Flickr and many others. Even in large scale organizations like Facebook, a daily deployment is […]

[…] HSTS during rollout to ensure things operate smoothly, and we’ll be increasing this value via a config push over time as we’re confident there will be no […]

[…] Software-as-a-Service, whereby programs are delivered over the internet, makes continuous deployment possible (continuous deployment doesn’t work for installed software). With time, continuous deployment will become more prominent, especially when firms like Etsy espouse its benefits. […]

[…] Chad Dickerson, How does Etsy manage development and operations?, 04 février […]

[…] How Does Etsy Manage Development and Operations […]

[…] to have a limited number of customers on a beta program, and enable features based on matrices. Etsy described how they manage Operations and Development in a very awesome blog post, and they also talk about having config flags to enable modules. The […]

[…] we were sending 100% of our project search traffic to Elasticsearch. There are some great posts by Etsy and Flickr that go into more detail on config flags and rolling out features gradually for more […]

[…] doing, having looked online and not found much about what other companies are doing (other than a brief mention of Confluence by […]

[…] Etsy CEO Chad Dickerson has remarked that, “our goal here is to constantly emphasize shipping, and get over any deployment fears.” For Etsy, the regular addition of new features means the team has to stay ahead of their users’. In order to move fast, they’ve adopted a testing process with a heavy emphasis on performance monitoring and analytics. They A/B test whenever they push new features or make any infrastructure upgrades. They do side-by-side comparisons of before and after system performance readouts. They leverage the metrics produced by deep performance monitoring as much as possible. Diligent performance management is core to Etsy’s operational model, and it results in fast, confident updates and quick fixes, whenever necessary. […]