Internationalizing Etsy: Fostering a Global Marketplace

Posted by on October 8, 2011

There’s nothing more frustrating than not understanding what’s right in front of you, as the non-German speakers among us might surmise after watching the clip from comedy programme Loriot above.  At Etsy, we’ve been working to build the world’s first truly global marketplace, allowing buyers and sellers to connect and have meaningful exchanges, regardless of which languages they speak.  As a first step, we launched Etsy in German in September.

As Chad points out, Etsy has always been a global marketplace, but it’s only recently that we’ve really focused on better supporting our international (non-US, non-English) members. Internationalization at Etsy means preparing our business to support international members and transactions, and also preparing the website—everything from registration, to searching and discovering, to buying and selling, to support.

On the technical side, we typically divide internationalization into two broad categories: localization and translation.

Over the coming weeks, we’ll be sharing some of the engineering work we’ve done to internationalize Etsy.  Check back here on Code as Craft as we dig into some of the interesting bits of our internationalization stack (on both the localization and translation sides), discuss some of the challenges we’ve faced and share the solutions we’ve developed.

Some of the topics we plan on covering:

Localization logic: combining cues such as Geo IP and browser language preferences with cookie and user preferences to display appropriately localized language, region and currency content.

Translation stack: how we tag templates and database content for translation, then translate it and serve it to our visitors. This will include our PHP-based message extractor, our Smarty prefilter for precompiling translated content, and our SQLite translation bundler.

Translator tools, used by translators to translate and test the translated site.  

In Translation Memory, we’ll discuss one of the key pieces of our Translation Tools: our translation memory built on Lucene, and how we extended CodeMirror to handle HTML and Smarty entities.

Translating user-generated content: allowing for multilanguage user-generated content (such as shops and item listings) comes with data model challenges.

Search query translation to allow for performant cross-language search and discovery of shops and listings, using a translated taxonomy and machine translation services.

Multilingual User Generated Content and SEO: cues for search engines to understand translated content, translated site UI, and how to play nice with sitemaps and robots.txt.

Continuous deployment = continuous translation.  We’ll talk about our translation workflow, and how we’re dealing with common translation and linguistic QA problems.

Localization gotchas: addresses, date formats, possessives, plurals, gender, cache keys, stemming, compound words.
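To make the first topic in the list a little more concrete, here is a minimal sketch of how cues like a cookie preference, the browser's Accept-Language header and a Geo IP lookup might be combined into a single language choice. It is not Etsy's implementation (the post says the stack is PHP-based and does not describe the actual logic). The priority order (explicit preference first, then browser, then Geo IP), the names `resolve_language` and `parse_accept_language`, and the country-to-language table are all assumptions made for illustration.

```python
from typing import List, Optional

# Assumption: only the two languages mentioned in the post are supported.
SUPPORTED_LANGUAGES = ("en", "de")
DEFAULT_LANGUAGE = "en"

# Hypothetical mapping from a Geo IP country code to a likely language.
COUNTRY_TO_LANGUAGE = {"DE": "de", "AT": "de", "US": "en", "GB": "en"}


def parse_accept_language(header: str) -> List[str]:
    """Return primary language subtags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        primary = tag.strip().split("-")[0].lower()  # "de-AT" -> "de"
        if primary and primary != "*" and quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, primary))
    weighted.sort()
    return [language for _, _, language in weighted]


def resolve_language(
    cookie_language: Optional[str],
    accept_language_header: Optional[str],
    geoip_country: Optional[str],
) -> str:
    """Pick a display language. The priority order here is an assumption."""
    # 1. An explicit choice stored in a cookie or user profile wins.
    if cookie_language in SUPPORTED_LANGUAGES:
        return cookie_language
    # 2. Otherwise use the first supported browser language.
    for language in parse_accept_language(accept_language_header or ""):
        if language in SUPPORTED_LANGUAGES:
            return language
    # 3. Otherwise guess from the visitor's country.
    guessed = COUNTRY_TO_LANGUAGE.get((geoip_country or "").upper())
    if guessed in SUPPORTED_LANGUAGES:
        return guessed
    # 4. Fall back to the site default.
    return DEFAULT_LANGUAGE
```

For example, `resolve_language(None, "fr-FR,de;q=0.8,en;q=0.5", "US")` skips the unsupported French tag and picks German from the browser header before the Geo IP hint is ever consulted. Putting the explicit preference first means a visitor who has chosen a language never has it overridden by where they happen to be browsing from.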

If you haven’t already, take a test drive of Etsy in German, and comment below with any questions or feedback.

Category: engineering, infrastructure, internationalization

5 Comments

This is great stuff. I’m looking forward to reading more about your strategy to provide sellers with a possible translation tool to assist them with their listings. As of right now, I believe all the work is up to Etsy members which is OK and understandable. But, I would personally love to be able to translate my shop’s content to all languages Etsy will target.

Also, I’m curious about the ‘Localization gotchas’. At my current job, we release language packs every two years or so. One of the biggest ‘unnoticed’ issues I’ve seen is that translation is mostly performed by US translation services which often fail to consider things such as culture, genre, and plurals when they do their work. Translation by an expert may give you the correct ‘translated’ label but not necessarily the most ‘appropriate’ based on its context and common usage.

Also, QA teams usually fail to pick up on these types of errors, as the scope of their work is usually to verify that a specific term appears in the expected language, that text labels are big enough to hold the translated text, etc.

Another localization issue I see very often is that the language preferences set through the website, browser, and OS can be completely different. When this happens, some logic needs to be in place to determine which one takes priority. (E.g. an Etsy buyer in Germany using a work laptop, with browser language settings set to English by the IT guy, to browse Etsy in French.)

Oh yeah, and let’s not forget about non-Unicode character sets.

I also assume that buyer/seller customer support will be a biggie as you continue to add languages. Anyhow, good luck.

I’m really excited for this series. I’ve not had to venture into internationalization yet, but it is on the foreseeable horizon. Your timing is perfect for me! Thank you!

[…] Internationalizing Etsy: Fostering a Global Marketplace (codeascraft.etsy.com) […]

[…] you’ve come across, PHP tricks/solutions?  Share with us below. Stay tuned for more about Etsy’s internationalization. Category: engineering, internationalization, Uncategorized Tags: currency, language, locale, […]

[…] offers a lot of items from a lot of sellers worldwide. Now since we’ve started to better support our international members, this means Etsy also comes in a variety of languages. We rely heavily on search engines to bring […]