Engineering Social Commerce
This holiday season we launched a redesigned version of a product we call “Gift Ideas for Your Friends”. The product works by connecting with your Facebook account, analyzing thousands (or more) of your friends’ likes and interests, and then making recommendations across millions of items in Etsy’s marketplace. Social commerce has been something of a hot topic lately, and the gift recommender is a social commerce feature in that it provides a new shopping experience for buying gifts for your friends and family. In this post we explore some of the engineering challenges we faced in building a social commerce feature like the gift recommender.
The gift recommender is social in that it brings your friends to Etsy. We all know shopping for friends and family is hard, particularly around the holidays when shopping lists grow quite large. Building a responsive experience that allows navigation across your friends and their recommendations requires a tight coupling between client and server components as well as with Facebook’s API. On the commerce side of things, the product is powered by data mining algorithms that analyze contexts in both Facebook’s social setting and Etsy’s marketplace to make relevant recommendations. While developing these algorithms is a challenge in itself, the coupling between these algorithms and the end design and user interaction is equally critical.
Tight integration with Facebook
The gift ideas product works by analyzing each of your friends. For each friend, we request various attributes, including name, education history, likes, interests, and activities. Facebook has a limit of 5,000 friends, and friend counts above 1,000 are not uncommon. It is also common for people to have hundreds or even thousands of likes and interests. So, for a typical Facebook power user (read: your average graduate student), we can end up requesting upwards of 100,000 attributes.
So how do we pull this amount of data back from Facebook? First, you may have noticed that each friend is featured in a separate UI component, which allows us to compute recommendations independently. When creating recommendations, we split up friends into groups of 50 and use our asynchronous job queueing system (powered by Gearman) to create recommendations in parallel. Each Facebook request is constructed using a series of Facebook Query Language (FQL) queries; FQL is a SQL-like language supported by Facebook’s API. Some of these queries are quite complex. For example, the query to fetch a user’s page likes looks something like this:
select page_id, name, type
from page
where page_id in (
    select page_id from page_fan
    where uid in (
        select uid2 from friend where uid1 = me() limit 50
    )
)
Requesting data from Facebook is the slowest component of the recommendation creation process: some of our larger Facebook queries take multiple seconds to respond.
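The batching step described above can be sketched roughly as follows. This is a minimal illustration, not our production code: Python's ThreadPoolExecutor stands in for the Gearman job queue, and `build_recommendations` and `recommend_for_all` are hypothetical names. Only the group size of 50 comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 50  # group size stated in the post


def chunk(items, size=BATCH_SIZE):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_recommendations(friend_batch):
    """Hypothetical worker that stands in for one queued job.

    A real job would query Facebook for this batch and compute
    recommendations; here we only return an empty list per friend.
    """
    return {friend_id: [] for friend_id in friend_batch}


def recommend_for_all(friend_ids, max_workers=8):
    """Run one job per batch in parallel and merge the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch_result in pool.map(build_recommendations, chunk(friend_ids)):
            results.update(batch_result)
    return results
```

Because each batch is independent, a slow Facebook response for one group of friends does not hold up the others.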
Performance: Caching, Caching, and Caching
The new design for the product displays many friends and their recommendations on the primary splash page. This is in contrast to the old design, which allowed viewing only one friend’s recommendations at a time. The new layout presented several performance challenges.
Each Facebook attribute triggers a recommendation, and each recommendation shows items from the marketplace by issuing a search query. The new product displays four recommendations per friend in batches of 20 friends, so each batch can require as many as 80 search queries. Assuming an average response time of ~200ms per search, this could add up to load times in excess of 15 seconds. (!)
Luckily, the distribution of Facebook likes (and corresponding gift recommendations) is heavily skewed: the most popular 5,000 recommendations represent over 90% of all recommendations made by the product. Therefore, caching listing results at per-recommendation granularity provides us with tremendous speedups: 200ms search requests become ~2ms memcache requests.
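To make the effect concrete, here is a small cache-aside sketch with a back-of-the-envelope latency estimate. It rests on several assumptions: a plain dict stands in for memcache, `search_fn` is a hypothetical callable for the marketplace search, lookups are assumed to run one after another, and the 90% hit rate is an illustrative figure based on the skew described above.

```python
class RecommendationCache:
    """Cache-aside lookup keyed by recommendation term.

    A plain dict stands in for memcache; `search_fn` is a hypothetical
    callable that runs the (slow) marketplace search.
    """

    def __init__(self, search_fn):
        self._search_fn = search_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    def listings_for(self, recommendation):
        """Return cached listings, running the search only on a miss."""
        if recommendation in self._store:
            self.hits += 1
            return self._store[recommendation]
        self.misses += 1
        listings = self._search_fn(recommendation)
        self._store[recommendation] = listings
        return listings


def expected_batch_ms(queries, hit_rate, hit_ms=2.0, miss_ms=200.0):
    """Expected total time if lookups run one after another."""
    per_query = hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms
    return queries * per_query
```

Under these assumptions, `expected_batch_ms(80, 0.0)` gives 16,000 ms for a completely cold batch of 80 queries, while `expected_batch_ms(80, 0.9)` comes to roughly 1,744 ms.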
Client and Server-side Facebook API
Complicating matters, we also recently released a feature that allows you to connect your Etsy account with your Facebook account. Managing tokens and authentication across the two systems while also allowing users to shop for gift ideas without an Etsy account presents several technical challenges.
Dealing with Backend Latency on the Frontend
Perhaps the biggest improvements made this year stem from a tighter coupling between the backend recommendation generation process and the frontend display. The initial creation process can take 3 or more seconds, and providing user feedback and context throughout is critical.
You may have noticed that your recommendations fill in “on the fly” as they’re created. As each of the asynchronous Gearman job workers completes its recommendation task, we stream results back to the client via Ajax, and the client renders them immediately. The end goal here is to let the user see recommendations as soon as possible, providing a more immediate shopping experience.
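The "fill in as they complete" pattern can be illustrated with a short sketch. This is not our Gearman or Ajax code: `recommendation_job` is a hypothetical stand-in that sleeps to mimic variable backend latency, and Python's `as_completed` plays the role of the job completion notifications.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def recommendation_job(batch_id):
    """Hypothetical job: sleeps briefly to mimic variable backend latency."""
    time.sleep(random.uniform(0.01, 0.05))
    return {"batch": batch_id, "recommendations": []}


def stream_results(batch_ids, max_workers=4):
    """Yield each job's result as soon as that job finishes.

    Results arrive in completion order, not submission order, so a
    caller can push each one to the client immediately.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(recommendation_job, b) for b in batch_ids]
        for future in as_completed(futures):
            yield future.result()
```

A caller would iterate over `stream_results(...)` and forward each result to the page as it arrives, rather than waiting for the slowest batch.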
Of course, the core of the recommender system is the recommender algorithm and the supporting data. The core algorithm is responsible for understanding the meaning of a given Facebook attribute in an Etsy context. For example, the artist “Pink” is a popular musician on Facebook. However, a query for “pink” returns substantially different results on Etsy.
The core gift recommendation algorithm is outlined in a post from earlier this year. We’ve made several improvements since then. We’re smarter about analyzing gender when retrieving appropriate listing suggestions, and we’ve taken another pass at removing bad listings based on data from the first year of the product.
Precision vs Recall, and the End Experience
“Gift Ideas for Your Friends” provides a different experience from traditional recommender algorithms. For example, Netflix’s algorithms take a collaborative approach in which your entire profile is analyzed in aggregate, and recommendations are created by comparing your favorite movies with those of other users.
In contrast, “Gift Ideas for Your Friends” makes point-based recommendations off of a single attribute of your friend’s Facebook profile. Jim likes Burning Man. Kurt likes video games. Chad likes Brooklyn.
In information retrieval terms, the goal of the gift recommender is to optimize for precision: to make a handful of good recommendations based on a given set of attributes. There are lots of things that your mother likes on Etsy that aren’t represented in her Facebook profile, and the gift recommender will “miss” these recommendation opportunities. Compare this with Netflix-style recommendations, where the goal is to optimize for recall: given your entire movie history, provide recommendations that capture your taste as a whole.
In fact, for the general gift giving problem, optimizing for precision is a more natural objective: you generally buy your mother only one or two gifts each year. Your mother might appreciate gifts like vintage glassware, amethyst jewelry, raku pottery, etc. A successful holiday gift really only requires buying her one of these items. Netflix-style recommendations are aimed at capturing the various aspects of your taste and carry stronger expectations of movie recommendations across all the genres and styles that you may like.
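For readers less familiar with the terms, the standard definitions of precision and recall can be written as a short sketch. The item names and counts below are made up purely for illustration; they are not data from the product.

```python
def precision(recommended, relevant):
    """Fraction of recommended items that are actually relevant."""
    recommended, relevant = set(recommended), set(relevant)
    if not recommended:
        return 0.0
    return len(recommended & relevant) / len(recommended)


def recall(recommended, relevant):
    """Fraction of relevant items that were recommended."""
    recommended, relevant = set(recommended), set(relevant)
    if not relevant:
        return 0.0
    return len(recommended & relevant) / len(relevant)


# Made-up example: ten kinds of gift a friend would like, two of them shown.
liked = {"gift%d" % i for i in range(10)}
shown = {"gift0", "gift1"}
```

Here `precision(shown, liked)` is 1.0 while `recall(shown, liked)` is only 0.2: both suggestions are good gifts, even though most of what the friend would like was never shown. That is the trade-off the gift recommender accepts.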
The ultimate goal of the product is to provide a glimpse of Etsy through your friends and their existence on Facebook. We view the recommendations and sample results not as the final word in what to buy, but rather as a landing pad to dive into the marketplace. Diving into Chad’s recommendations for “Brooklyn” could then lead to a search for “brooklyn bridge” and purchasing an 8×10 photo.
At Etsy, we build our systems in a continuously deployed environment, which allows us to iterate and experiment quickly. We view everything we build as something of an experiment, and the Facebook gifter is no exception. We look forward to the future of “Gift Ideas for Your Friends” and social commerce in general on Etsy.