Web Experimentation with New Visitors

Posted on April 3, 2014

We strive to build Etsy with science, and therefore love how web experimentation and A/B testing help us drive our product development process. Several months ago we started a series of web experiments in order to improve Etsy’s homepage experience for first-time visitors. Testing against a specific population, like first-time visitors, allowed us to find issues and improve our variants without raising concerns in our community. This is how the page used to look for new visitors:

[Image: old homepage]

We established both qualitative and quantitative goals to measure improvements for the redesign. On the qualitative side, our main goal was to successfully communicate to new buyers that Etsy is a global marketplace made by people. On the quantitative side, we primarily cared about three metrics: bounce rate, conversion rate, and retention over time. Our aim was to reduce bounce rate (the percentage of visits that leave the site after viewing only the homepage) without hurting conversion rate (the proportion of visits that result in a purchase) or visit frequency. After conducting user surveys and usability tests and analyzing our target web metrics, we reached those goals and launched a better homepage for new visitors. Here’s what the new homepage looks like:

[Image: new homepage]

Bucketing New Visitors

This series of web experiments marked the first time at Etsy that we tried to consistently run an experiment only for first-time visitors over a period of time. While identifying a new visitor is relatively straightforward, the logic to present that user with the same experience on subsequent visits is less trivial.

Bucketing a Visitor

At Etsy we use our open source Feature API for A/B testing. Every visitor is assigned a unique ID when they arrive at the website for the first time. To determine which bucket of a test the visitor belongs to, we generate a deterministic hash using the visitor’s unique ID and the experiment identifier. The main advantage of using this hash for bucketing is that we don’t have to worry about creating or managing multiple cookies every time we bucket a visitor into an experiment.
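To make the hashing idea concrete, here is a minimal sketch of deterministic bucketing. It is not the Feature API's actual implementation: the function names, the key format, and the choice of md5 are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: maps a visitor to a bucket in [0, 99] for one experiment.
// The key format and the use of md5 are assumptions, not the Feature API's code.
function bucketForVisitor(string $visitorId, string $experimentName): int
{
    // The same (experiment, visitor) pair always produces the same hash,
    // so no per-experiment cookie is needed to remember the assignment.
    $key = $experimentName . ':' . $visitorId;
    // Take 7 hex digits (28 bits) so the value fits in an int on any platform.
    return hexdec(substr(md5($key), 0, 7)) % 100;
}

// Hypothetical helper: true if the visitor falls inside the enabled percentage.
function isInExperiment(string $visitorId, string $experimentName, int $enabledPercent): bool
{
    return bucketForVisitor($visitorId, $experimentName) < $enabledPercent;
}
```

Because the experiment name is part of the hashed key, a visitor's bucket in one experiment is independent of their bucket in another.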

Identifying New Visitors

One simple way to identify a new visitor is by the absence of etsy.com cookies in the browser. In our first set of experiments we checked for the existence of the __utma cookie from Google Analytics, which we also used to define visits in our internal analytics stack.

Returning New Visitors

Before we define a returning new visitor, we first need to describe the concept of a visit. We use the Google Analytics visit definition, where a visit is a group of user interactions on our website within a given time frame. One visitor can produce multiple visits on the same day, or over the following days, weeks, or months. In a web experiment, the difference between a returning visitor and a returning new visitor is the relationship between the experiment start time and the visitor’s first landing time on the website. To put it simply, every visitor who landed on the website for the first time after the experiment start date will be treated as a new visitor, and will consistently see the same test variant on their first and subsequent visits.

As I mentioned before, we used the __utma cookie to identify visitors. One advantage of this cookie is that it tracks the first time a visitor landed on the website. Since we have access to the first visit start time and the experiment start time, we can determine if a visitor is eligible to see an experiment variant. In the following diagram we show two visitors and their relation to the experiment start time.

[Image: visits diagram]
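The comparison between first landing time and experiment start time can be sketched as follows. This assumes the classic Google Analytics __utma layout of six dot-separated fields, with the first-visit Unix timestamp in the third position. The function names and the handling of malformed cookies are hypothetical, not taken from our codebase.

```php
<?php
// Extracts the first-visit Unix timestamp from a __utma cookie value.
// Assumed layout:
//   domainHash.visitorId.firstVisit.previousVisit.currentVisit.sessionCount
function firstVisitTimestamp(string $utma): ?int
{
    $parts = explode('.', $utma);
    if (count($parts) !== 6 || preg_match('/^\d+$/', $parts[2]) !== 1) {
        return null; // malformed or unexpected cookie value
    }
    return (int) $parts[2];
}

// Hypothetical eligibility rule: a visitor with no cookie is brand new;
// otherwise the first visit must fall after the experiment start time.
function isEligibleNewVisitor(?string $utma, int $experimentStart): bool
{
    if ($utma === null || $utma === '') {
        return true;
    }
    $firstVisit = firstVisitTimestamp($utma);
    // Assumption: a cookie we cannot parse is treated as not eligible.
    return $firstVisit !== null && $firstVisit > $experimentStart;
}
```

Treating an unparseable cookie as ineligible is a conservative choice in this sketch: it keeps visitors of unknown age out of the experiment rather than risking the inclusion of returning users.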

Feature API

We added the logic to compare a visitor’s first landing time against an experiment start time as part of our internal Feature API. This way it’s really simple to set up web experiments targeting new visitors. Here is an example of how we set up an experiment configuration and an API entry point.

Configuration Set-up:

$server_config['new_homepage'] = [
   'enabled' => 50,              // percentage of eligible visitors who see the variant
   'eligibility' => [
       'first_visit' => [
           'after' => TIMESTAMP  // experiment start time
       ]
   ]
];

API Entry Point:

if (Feature::isEnabled('new_homepage')) {
   $controller = new Homepage_Controller();
   $controller->renderNewHomepage();
}

Unforeseen Events

When we first started analyzing the test results, we found that more than 10% of the visitors in the experiment had first-visit landing times prior to our experiment start day. This suggested that old, seasoned Etsy users were being bucketed into this experiment. After investigating, we were able to correlate those visits to a specific browser: Safari 4+. The visits were a result of the browser making requests to generate thumbnail images for the Top Sites feature. These types of requests are generated any time a user has the browser open, even without visiting Etsy. On the web analytics side, this created a visit with a homepage view followed by an exit event. Fortunately, Safari provides a way to identify these requests using the additional HTTP header “X-Purpose: preview”. After filtering these requests, we were able to correct this anomaly in our data. Below you can see that the experiment’s bounce rates decreased significantly after getting rid of these automated visits.

[Image: bounce rates]
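Detecting the Top Sites preview requests described above could look roughly like this. The function name and the shape of the headers array (header name mapped to value) are assumptions for illustration. Only the “X-Purpose: preview” header itself comes from what we observed.

```php
<?php
// Returns true when the request headers mark the request as a browser preview,
// such as Safari's Top Sites thumbnail fetch ("X-Purpose: preview").
// $headers is assumed to be a map of header name => header value.
function isPreviewRequest(array $headers): bool
{
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive, so compare without regard to case.
        if (strcasecmp((string) $name, 'X-Purpose') === 0) {
            return strcasecmp(trim((string) $value), 'preview') === 0;
        }
    }
    return false;
}
```

In a typical PHP setup the same header is also exposed as `$_SERVER['HTTP_X_PURPOSE']`. A request flagged this way can be excluded from experiment bucketing and from analytics events.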

Although verifying the existence of cookies to determine whether a visitor is new may seem trivial, it is hard to be completely certain that a visitor has never been to your website before based on this signal alone. One person can use multiple browsers and devices to view the same website: a phone, a tablet, a work or personal computer, or even a device borrowed from a friend. This is where deeper analysis can come in handy, like filtering visits using attributes such as user registration and signed-in events.

Conclusions

We are confident that web experimentation with new visitors is a good way to collect unbiased results and to reduce product development concerns such as disrupting existing users’ experiences with experimental features. Overall, this approach allows us to drive change. Going forward, we will use what we learned from these experiments as we develop new iterations of the homepage for other subsets of our members. Now that all the preparatory work is done, we can ramp up this experiment, for instance, to all signed-out visitors.

You can follow Diego on Twitter at @gofordiego

Category: data, engineering

4 Comments

Excluding visitors who _may_ have seen an old version of a feature is effective but a little heavy-handed. Experiments that would be biased by testing only brand new visitors must be common as well, though perhaps less common than those biased by people who have seen “the old version”.

I’m also intrigued by maybe using the __utmb cookie to prevent rollout of features mid-session.

I’m a little confused by this part:

“When we first started analyzing the test results, we found that more than 10% of the visitors in the experiment had first visit landing times prior to our experiment start day. This suggested that old, seasoned Etsy users were being bucketed into this experiment.”

I thought your code was preventing visitors whose first visit was before the experiment start time from being bucketed in the experiment. Are you saying that their cookies made it seem like they were new visitors but you knew that they weren’t for some reason?

    I hope the following clarifies that part. When Safari Top Sites made a request to generate a thumbnail for Etsy’s homepage, it didn’t send any cookies in the HTTP request headers. However, while the page was loading in that process, JavaScript did have access to the browser cookies and triggered a false analytics event. The lack of cookies made these requests eligible to see the new homepage, but their analytics event properties indicated that they weren’t coming from new visitors.