October 2012 Site Performance Report

Posted on November 9, 2012

It’s been about four months since our last performance report, and we wanted to provide an update on where things stand as we go into the holiday season and our busiest time of the year.  Overall the news is very good!

Server Side Performance

Here are the median and 95th percentile load times for core pages, Wednesday 10/24/12:

As you can see, load times declined significantly across all pages.  A portion of this improvement is due to ongoing efforts we are making in Engineering to improve performance in our application code.  The majority of this dip, however, resulted from upgrading all of our webservers to new machines using Sandy Bridge processors.  With Sandy Bridge we saw not only a significant drop in load time across the board, but also a dramatic increase in the amount of traffic that a given server can handle before performance degrades.  You can clearly see when the cutover happened in the graph below:

This improvement is a great example of how operational changes can have a dramatic impact on performance.  We tend to focus heavily on making software changes to reduce load time, but it is important to remember that sometimes vertically scaling your infrastructure and buying faster hardware is the quickest and most effective way to speed up your site.  It’s a good reminder that when working on performance projects you should be willing to make changes in any layer of the stack.
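For readers who want to reproduce the two statistics used in this section, here is a minimal sketch in Python of how a median and a 95th percentile could be computed from raw load-time samples. The sample values and the `percentile` helper are hypothetical and exist only for illustration, and the nearest-rank method shown is just one of several common percentile definitions.

```python
import math
import statistics


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical server-side load times in milliseconds (not real measurements).
load_times_ms = [180, 195, 210, 220, 240, 260, 300, 350, 420, 900]

median_ms = statistics.median(load_times_ms)
p95_ms = percentile(load_times_ms, 95)
print(f"median={median_ms} ms, p95={p95_ms} ms")
```

Reporting both numbers is useful because the median describes the typical request, while the 95th percentile exposes the slow tail that a median hides.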

Front-end Performance

Since our last update we have adopted a more scientific way of measuring front-end performance, using a hosted version of WebPagetest.  This enables us to run many synthetic tests a day, and slice the data however we want.  Here are the latest numbers, gathered with IE8 from Virginia over a DSL connection as a signed-in Etsy user:

These are median numbers across all of the runs on 10/24/12, and we run tests every 30 minutes.  Most of the pages are slower than in the last update, and we believe that this is because we now use our hosted version of WebPagetest and aggregate many tests instead of looking at single tests on the public instance.  By design, our new method of measurement should be more stable over the long term, so our next update should give a more realistic view of trends over time.
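As a rough illustration of why aggregating many runs is more stable than reading a single test, the sketch below groups synthetic test results by page and takes the median for each. It is written in Python under stated assumptions: the page names, the timings, and the `median_by_page` helper are all made up for this example and do not reflect our actual tooling or data.

```python
from collections import defaultdict
from statistics import median

# One test every 30 minutes works out to 48 runs per page per day.
RUNS_PER_DAY = 24 * 60 // 30


def median_by_page(runs):
    """Group (page, load_time_seconds) pairs by page and return each page's median."""
    by_page = defaultdict(list)
    for page, seconds in runs:
        by_page[page].append(seconds)
    return {page: median(times) for page, times in by_page.items()}


# Hypothetical results; a real day would have RUNS_PER_DAY entries per page.
runs = [
    ("home", 2.1), ("home", 2.4), ("home", 6.0),  # the 6.0 s run is an outlier
    ("listing", 3.0), ("listing", 3.2), ("listing", 3.1),
]
print(median_by_page(runs))
```

In this made-up data the single 6.0 second run does not move the median for the home page, whereas a one-off test on a public instance could easily have landed on that outlier.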

You might be surprised that we are using synthetic tests for this front-end report instead of Real User Monitoring (RUM) data.  RUM is a big part of performance monitoring at Etsy, but when we are looking at trends in front-end performance over time, synthetic testing allows us to eliminate much of the network variability that is inherent in real user data.  This helps us tie performance regressions to specific code changes, and get a more stable view of performance overall.  We believe that this approach highlights elements of page load time that developers can impact, instead of things like CDN performance and last mile connectivity which are beyond our control.

New Baseline Performance Measurements

Another new thing we created is an extremely basic page that allows us to track the lower limit on load time for Etsy.com.  This page just includes our standard header and footer, with no additional code or assets.  We generate some artificial load on this page and monitor its performance.  This page represents the overhead of our application framework, which includes things like our application configuration (config flag system), translation architecture, security filtering and input sanitization, ORM, and our templating layer.  Having visibility into these numbers is important, since improving them impacts every page on the site.  Here is the current data on that page:

Over the next few months we hope to bring these numbers down while at the same time bringing the performance of the rest of the site closer to our baseline.
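One way to think about the gap between a page and the baseline is to subtract the baseline time from each page's median, which leaves the portion of load time that is specific to that page. The following Python sketch shows the arithmetic; the numbers, page names, and the `time_above_baseline` helper are hypothetical and are not the data from this report.

```python
def time_above_baseline(page_medians_ms, baseline_ms):
    """Return, for each page, the median time spent beyond the framework baseline."""
    return {page: ms - baseline_ms for page, ms in page_medians_ms.items()}


# Hypothetical medians in milliseconds, for illustration only.
baseline_ms = 100
page_medians_ms = {"home": 250, "search": 400, "listing": 180}

gaps = time_above_baseline(page_medians_ms, baseline_ms)

# List the pages furthest from the baseline first.
for page, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    share = baseline_ms / page_medians_ms[page]
    print(f"{page}: {gap} ms above baseline (framework is {share:.0%} of total)")
```

Lowering the baseline shrinks every page's total, while lowering a page's gap only helps that page, which is why both numbers are worth tracking.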

Category: performance

24 Comments

Hey Jonathan,

Thanks for posting about the Sandy Bridge stuff. We have seen similar results at WordPress.com. Could you elaborate a little bit about the hardware you were using before and the hardware you are using now? Not only the CPU type but also the chassis. For example, we are switching from Dell R410s running Xeon 5620s to R420s running E5-2430s and see a similar difference to what you posted. We have also tested R620s and Supermicros with E5-2630 CPUs and the results have been pretty surprising – not as much difference as I would have thought and the Supermicro performance seems to be significantly worse than Dell – I am not sure why. (We of course have already made all the necessary changes to the power scheduling plans in the BIOS). Thanks!

    Hey Barry,
    Have you seen my previous post regarding our hardware setup? At the time of writing we weren’t using Sandy Bridge for the webs, but after our experiences with Solr we outright replaced the pool. We’re still using Supermicros for that job; X9DRT-HF+ specifically. Before that we used the earlier version of the same setup, the X8DTT-H. Basically they’re both 4 nodes that slot into a 2U chassis, giving pretty great density (more on that in the previous post). We went from 2x E5645 @ 2.4GHz (24 threads total) to 2x E5-2690 @ 2.9GHz (32 threads total), and according to our estimates, not only does that give huge performance wins, but each server can also handle 3x the traffic.

    Interesting that you’re seeing less performance on the Supermicro; we happened to buy one HP box with a Sandy Bridge for a particularly CPU hungry app but haven’t benchmarked how that performs against the Supermicro.

Thanks again for a fascinating look into your infrastructure and how closely you track everything. We are a few days away from a major product launch at my small startup, and we look to your blog posts a lot for guidance on engineering and monitoring practices especially.

Have you guys done any comparisons of your workload with different speeds of RAM? I’m trying to decide between more RAM per server at 1333MHz or less RAM at 1600MHz.

[…] Last week my good friend Jonathan Klein at Etsy published a blog about the latest performance achievements at Etsy. […]

[…] A RUM-free Recipe Hardware and code improvements along with using an automated version of webpagetest.org has led to a better, faster Etsy! […]

What software did you use to generate charts?

    The bar charts were actually made by hand; one of our designers here (Jason Huff) put them together for this post. The graph of performance after the Sandy Bridge rollout was generated with Graphite.

Great post. Thanks for sharing. I wish you would include the RUM aggregate stats. People would benefit from seeing how RUM compares to synthetic, so showing IE8 RUM vs WebPagetest would be very useful. You guys rock!

    That’s a good suggestion, we’ll try to put our RUM data into the next update. It’s a little noisy, but I agree that the comparison would be beneficial. Thanks Steve!

Do you run these tests as part of your automated builds?

I’m currently looking into how we can add webpagetest test runs to our Jenkins builds and was wondering what Etsy does around this (e.g. using an internal instance of webpagetest + its APIs)

    We don’t run them as part of our build process, but we do run a test with our private instance on every code push, and graphs of site performance appear on our deployment dashboard. Engineers scan the dashboard for anomalies after they push, and if people see a performance regression then we dig into why it happened.

    We find that this works fairly well, and I imagine that putting it into Jenkins and alerting on it might cause some false positives. Still, we are always looking for ways to make our automated testing more robust, and to catch problems before they occur. It’s a good idea, and this might be something we will look into in the new year. Let us know if you have success with it in your environment!

Hi Jonathan!

Are you using WPT in conjunction with WPTMonitor? If so what have your experiences been so far?

    We actually aren’t using WPT Monitor – we wrote some software internally to make the submission of jobs easy, and we are using our existing monitoring (statsD/graphite and splunk) to capture the data. I’ll take a look at WPT Monitor and see if it makes sense to add that into the mix as well. Thanks!

Hi Jonathan, I wouldn’t advise using it. I had some problems with it and was curious whether you had used it and run into the same issues. I ended up writing my own application to schedule jobs and make reports as well 🙂

@GARNSER
Could you share your own application?
I used WPT Monitor and encountered some problems.
I am trying to use httparchive’s private instance, but have had no luck yet.

[…] we published our last performance update, we got a comment about the lack of RUM data for front-end performance from Steve Souders.  Steve […]

[…] like Google, Etsy, and Betfair have gone so far as to publish their commitment to performance. This is a win for […]

Hi- You all sound very smart, dedicated, and transparent…..but. What about all the middle aged ladies (and men- but they aren’t as quick to admit it, let’s be honest) who have only a passing grasp of what you’re talking about a lot of the time. So many words become part of everyday language without any mainstream path of learning- we’re suddenly supposed to know what ‘wireframe’ ‘push’ ‘application’ ‘build’ etc is, not to mention all the acronyms and initials that mean all different things. Is this just a way to get us to accept soylent green? (JK) but seriously what is one to do? Thanks, Terre

    Hello,

    You bring up a good point, and this is why we occasionally post about technical topics on our main Etsy News Blog, but keep the terminology and concepts more accessible to a general audience. In fact, we just recently published a post about site speed on that blog, and I hope it will be an easier read! You can check it out here:

    http://www.etsy.com/blog/news/2013/etsy-gets-faster/

    This particular blog is where we dive deep into the technical aspects of running Etsy, and it is targeted primarily at an engineering audience. We try to keep a clear separation between the two, so we can provide relevant content to people of all technical backgrounds.

@IOIIOI I’ll need to put down another few hours into it and I’ll have it released, although I will probably not have time for it this month. I’ll keep you posted.

Have you considered using FEO services from strangeloop or torbit? (Not affiliated with either but they look promising)

thanks

    We have considered it, and we looked into mod_pagespeed as well, but the idea of running all of our traffic through a proxy that we have limited control over has stopped us so far. It might be something we look into more in the future, since these options keep getting better and safer.