Reducing Domain Sharding

Posted by Jonathan Klein | Filed under performance

This post originally appeared on the Perf Planet Performance Calendar on December 7th, 2013.

Domain sharding has long been considered a best practice for pages with lots of images.  The number of domains that you should shard across depends on how many HTTP requests the page makes, how many connections the client makes to each domain, and the available bandwidth.  Since it can be challenging to change this dynamically (and can cause browser caching issues), people typically settle on a fixed number of shards – usually two.

An article published earlier this year by Chromium contributor William Chan outlined the risks of sharding across too many domains, and Etsy was called out as an example of a site that was doing this wrong.  To quote the article: “Etsy’s sharding causes so much congestion related spurious retransmissions that it dramatically impacts page load time.”  At Etsy we’re pretty open with our performance work, and we’re always happy to serve as an example.  That said, getting publicly shamed in this manner definitely motivated us to bump the priority of reinvestigating our sharding strategy.

Making The Change

The code changes to support fewer domains were fairly simple, since we have abstracted away the process that adds a hostname to an image path in our codebase.  Additionally, we had the foresight to exclude the hostname from the cache key at our CDNs, so there was no risk of a massive cache purge as we switched which domain our images were served on.  We were aware that this would expire the cache in browsers, since they do include hostname in their cache key, but this was not a blocker for us because of the improved end result.  To ensure that we ended up with the right final number, we created variants for two, three, and four domains.  We were able to rule out the option to remove domain sharding entirely through synthetic tests.  We activated the experiment in June using our A/B framework, and ran it for about a month.

Results

After looking at all of the data, the variant that sharded across two domains was the clear winner.  Given how easy this change was to make, the results were impressive:

  • 50-80ms faster page load times for image heavy pages (e.g. search), 30-50ms faster overall.
  • Up to 500ms faster load times on mobile.
  • 0.27% increase in pages per visit.

As it turns out, William’s article was spot on – we were sharding across too many domains, and network congestion was hurting page load times.  The new CloudShark graph supported this conclusion as well, showing a peak throughput improvement of 33% and radically reduced spurious retransmissions:

Before – Four Shards

before

After – Two Shards

after

Lessons Learned

This story had a happy ending, even though in the beginning it was a little embarrassing.  We had a few takeaways from the experience:

  • The recommendation to shard across two domains still holds for us (YMMV depending on page composition).
  • Make sure that your CDN is configured to leave hostname out of the cache key.  This was integral to making this change painless.
  • Abstract away the code that adds a hostname to an image URI in your code.
  • Measure everything, and question assumptions about existing decisions.
  • Tie performance improvements to business metrics – we were able to tell a great story about the win we had with this change, and feel confident that we made the right call.
  • Segment your data across desktop and mobile, and ideally international if you can.  The dramatic impact on mobile was a huge improvement which would have been lost in an aggregate number.

Until SPDY/HTTP 2.0 comes along, domain sharding can still be a win for your site, as long as you test and optimize the number of domains to shard across for your content.


3 responses to Reducing Domain Sharding

  • Joachim says:

    Wow, a team willing to admit that their current technical implementation is not perfect. Usually you only see this when a new team has taken over “legacy” code.

    Refreshing!

    Love these posts, keep ‘em coming!

  • tscanausa says:

    Has Etsy looked into the technique of pushing multiple image files into 1 http request?

    It is at least an interesting read from dropbox.
    https://tech.dropbox.com/2014/01/retrieving-thumbnails/

    • We have talked about it, but have no plans to do it at the moment. There’s a lot of complexity inherent in that strategy for Etsy, since we have a huge, dynamic working set of images. That doesn’t make it impossible, just complicated enough that we feel like we have lower hanging fruit on the performance tree at the moment.

  • Leave a Response

    Recent Posts

    About

    Etsy At Etsy, our mission is to enable people to make a living making things, and to reconnect makers with buyers. The engineers who make Etsy make our living with a craft we love: software. This is where we'll write about our craft and our collective experience building and running the world's most vibrant handmade marketplace.

    Code as Craft is proudly powered by WordPress.com VIP and the SubtleFlux theme.

    © Copyright 2014 Etsy