Developing in a Monorepo While Still Using Webpack

Posted on April 6, 2020

When I talk to friends and relatives about what I do at Etsy, I have to come up with an analogy for what Frontend Infrastructure is. It’s a bit tricky to describe because it’s something that you don’t see as an end user; the web pages that people interact with are several steps removed from the actual work that a frontend infrastructure engineer does. The analogy that I usually fall back on is that of a restaurant: the meal is a fully formed web page, the chefs are product engineers, and the kitchen is the infrastructure. A good kitchen should make it easy to cook a bunch of different meals quickly and deliciously.

Recently, my team and I spent over a year swapping out our home-grown, Require.js-based JavaScript build system for Webpack. Running with this analogy a bit, this project is like trading out our kitchen without customers noticing, and without bothering the chefs too much. Large projects tend to be full of unique problems and unexpected hurdles, and this one was no exception. This post is the second in a short series on all the things that we learned during the migration, and is adapted in part from a talk I gave at JSConf 2019. The first post can be found here.


The state of JavaScript at Etsy last year.

At Etsy, we have a whole lot of JavaScript. This alone doesn’t make us unique, but we have something that not every other company has: a monorepo. When we deploy our web code, we need to build and deploy over 1200 different JavaScript assets made up of over twelve thousand different JavaScript files or modules. Like the rest of the industry, we find ourselves relying more and more on JavaScript, which means that a good bit more of our code base ends in “.js” this year than last.

When we started to adopt Webpack, one of the first places we saw an early win was in our development experience. Up until this point, our engineers had been using a development server that we had written in-house. We ran a copy of it on every developer machine, where it built files as they were requested. This approach meant that you could reliably navigate around Etsy.com in development without needing to think about a build system at all. It also meant that we could start and restart an instance of the development server without worrying about losing state or interrupting developers much. Conceptually, this made things very simple to maintain.

This is a diagram showing the browser requesting an asset, our build system building that asset synchronously, and that asset being served back to the browser.
You truly couldn’t have asked for a simpler diagram.

In practice, however, developers were asking for more from JavaScript and from their build systems. We started adopting React a few years prior using the then-available JSXTransform tool, which we added to our build system with a fair amount of wailing and gnashing of teeth. The result was a server that successfully, yet sluggishly, supported JSX. Because it wasn’t designed with large applications in mind, our development server didn’t do things like cache transpiled JSX between builds. Building some of our weightier JavaScript code often took the better part of a minute, and most of our developers grew increasingly frustrated with the long iteration cycles it produced. Worse yet, because we were using JSXTransform, rather than something like Babel, our developers could use JSX but weren’t able to use any ES6 syntax like arrow functions or classes.

Bending Webpack to our will.

Clearly, there was a lot about our development environment that could be improved. To be worth the effort of adopting, any new build system would at least have to transpile syntaxes like JSX while still allowing for fast rebuild times for developers. Webpack seemed like a pretty safe bet: it was widely adopted; it was actively developed and funded; and everyone who had experience with it seemed to like it (in spite of its intimidating configuration).

So, we spent a good bit of time configuring Webpack to work with our codebase (and vice versa). This involved writing some custom loaders for things like templates and translations, and it meant updating some of the older parts of our codebase that relied on the specifics of Require.js to work properly. After a lot of planning, testing, and editing, we were able to get Webpack to fully build our entire codebase. It took half an hour, and that was only when it didn’t fill all 16 gigabytes of our development server’s memory. Clearly, we had a lot more work on our plates.

This is a screenshot of a performance monitoring tool for a server in which 32 processors are maxed out and 20 gigs of ram are used up.
This is one of our beefiest machines maxing out all 32 of its processors and eating up over 20 gigs of memory trying to run Webpack once.

When Webpack typically runs in development mode, it behaves much differently than our old development server did. It starts by compiling all your code as it would for a production build, leaving out optimizations that don’t make sense in development (like minification and compression). It then switches to “watch mode”, where it listens to your source files for changes and kicks off partial recompilations when any of your source code updates. This keeps it from starting from scratch every time an asset updates, and watching the filesystem lets builds start a few seconds before the assets are requested by the browser. Webpack is very effective at partial rebuilds, which is how it’s able to remain fast and effective, even for larger projects.

…and maybe bending our will to Webpack’s.

Although Webpack was designed for large projects, it wasn’t designed for a whole company’s worth of large projects. Our monorepo contains JavaScript code from every part of Etsy. Making Webpack try to build everything at once was a fool’s errand, even after playing with plugins like HardSource, CacheLoader, and HappyPack to either speed up the build time or reduce its resource footprint.

We ended up admitting to ourselves that building everything at once was impossible. If your solution to a problem just barely works today, it’s not going to be very useful when your problem doubles in size in a few years’ time. A pretty straightforward next step would be to split up our codebase into logical regions and make a Webpack config for each one, rather than using one big config to build everything. Splitting things up would allow each individual build to be reasonably sized, cutting back on both build times and resource utilization. Plus, production builds wouldn’t need to change much, since Webpack is perfectly happy accepting either a single configuration or an array of them.

There was one problem with this approach though: if we only built one slice of the site at a time, we wouldn’t be able to allow developers to easily browse around Etsy.com in development unless they manually started and stopped multiple instances of Webpack. There are a lot of features in Etsy that touch multiple parts of the site; a change to how a listing appears could mean a change for our search page, the seller dashboard, and our internal tools as well. We needed a solution that would both allow us to build only the parts of the site that made sense and maintain the “it just works!” behavior of our old system.

So, we wrote something we’re calling Kevin.

This is Kevin.

This is a screenshot of an overlay rendered by kevin-middleware. It shows a message that says "Your code is out for delivery" as well as a loading bar.

Kevin (technically “kevin-middleware”) is an Express-style middleware that manages multiple instances of Webpack for you. Its job is to make it easier to build a monorepo’s worth of JavaScript while maintaining the resource footprint of something much smaller. It was both inspired by and meant as a replacement for webpack-dev-middleware, which is what Webpack’s own development server uses to manage a single instance of Webpack under the hood. If you happen to be using that, Kevin will probably feel a bit familiar.

Kevin works by reading in a list of Webpack configurations and determining all of the assets that each one could be responsible for. It then listens for requests for those assets, determines the config that is responsible for that asset, and then starts an instance of Webpack with that config. It’ll keep a few instances around in memory based on a simple frecency algorithm, and will monitor your source files in order to eagerly rebuild any changes. When there are more instances than a configured limit, the least used compiler is shut down and cleaned up.
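
The flow described above can be sketched in plain JavaScript. This is a simplified stand-in and not Kevin's actual implementation: the `CompilerPool` class, its method names, and the scoring formula are all assumptions made for illustration, and the "compilers" are just bookkeeping records rather than running Webpack instances.

```javascript
// A simplified stand-in for a pool of compilers; not Kevin's real code.
class CompilerPool {
    constructor(configs, { maxCompilers = 2 } = {}) {
        this.maxCompilers = maxCompilers;
        // Map each entry (asset) name to the config responsible for it.
        this.assetToConfig = new Map();
        for (const config of configs) {
            for (const assetName of Object.keys(config.entry)) {
                this.assetToConfig.set(assetName, config.name);
            }
        }
        // Active compilers keyed by config name: { hits, lastUsed }.
        this.active = new Map();
        // Logical clock that ticks once per handled request.
        this.clock = 0;
    }

    // Crude frecency: more hits and more recent use both raise the score.
    score({ hits, lastUsed }) {
        return hits / (1 + (this.clock - lastUsed));
    }

    // Returns the name of the config serving the asset, or null if unknown.
    handleRequest(assetName) {
        const configName = this.assetToConfig.get(assetName);
        if (configName === undefined) {
            return null; // Not ours; another middleware should handle it.
        }
        this.clock += 1;
        let compiler = this.active.get(configName);
        if (!compiler) {
            this.evictIfFull();
            // A real implementation would start Webpack in watch mode here.
            compiler = { hits: 0, lastUsed: this.clock };
            this.active.set(configName, compiler);
        }
        compiler.hits += 1;
        compiler.lastUsed = this.clock;
        return configName;
    }

    // Drop the lowest-scoring compiler when the pool is at its limit.
    evictIfFull() {
        if (this.active.size < this.maxCompilers) {
            return;
        }
        let worstName = null;
        let worstScore = Infinity;
        for (const [name, stats] of this.active) {
            const current = this.score(stats);
            if (current < worstScore) {
                worstScore = current;
                worstName = name;
            }
        }
        // A real implementation would also close the compiler's watcher.
        this.active.delete(worstName);
    }
}
```

With a limit of two compilers, requesting assets from a third config evicts whichever active compiler has the lowest score, so a config that is hit often and recently tends to stay warm.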

This is a diagram that attempts to visualize the flow that Kevin goes through when a request comes in, as described in the previous paragraph.
While otherwise being a lot cooler in every respect, Kevin has an objectively more complicated diagram.

Webpack’s first build often takes a while. Like I mentioned before, it has to do a first pass of all the assets it needs to build before it’s able to do fast, iterative rebuilds. If a developer requests an asset from a config that isn’t being built by an active compiler, that request might time out before a fresh compiler finishes its first build. Kevin tries to offset this problem by serving some static code that renders an overlay whenever an asset is requested from a compiler that’s still running its first build. The overlay code communicates back with your development server to check on the status of your builds, and automatically reloads the page once everything is complete.
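
One way to picture the overlay mechanism is as a middleware that swaps in a small polling script while a first build is in progress. The sketch below is an assumption-heavy illustration, not Kevin's code: the `/build-status` endpoint, the `done` field in its response, and the `isFirstBuildRunning` callback are all hypothetical names.

```javascript
// Browser-side code served in place of an asset that is still building.
// The "/build-status" endpoint and its { done } response are hypothetical.
// A real overlay would also render a message and a progress bar.
const overlayScript = `
(function poll() {
    fetch("/build-status")
        .then((response) => response.json())
        .then((status) => {
            if (status.done) {
                window.location.reload();
            } else {
                setTimeout(poll, 1000);
            }
        })
        .catch(() => setTimeout(poll, 1000));
})();
`;

// Express-style middleware factory. isFirstBuildRunning is a hypothetical
// callback that reports whether the compiler responsible for the requested
// path is still working on its first build.
function createOverlayMiddleware(isFirstBuildRunning) {
    return function overlayMiddleware(req, res, next) {
        if (!isFirstBuildRunning(req.path)) {
            // The asset is ready; let the normal handler serve it.
            return next();
        }
        res.setHeader("Content-Type", "application/javascript");
        res.end(overlayScript);
    };
}
```

Responding immediately with a placeholder script keeps the request from timing out, and the reload swaps in the real asset once the build has finished.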

Using Kevin is meant to be really straightforward. If you don’t already have a development server of some sort, creating one with Kevin and Express is maybe a dozen lines of code. Here’s a snippet taken from Kevin’s documentation:

const express = require("express");
const Kevin = require("kevin-middleware");

// This is an array of webpack configs. Each config **must** be named so that we
// can uniquely identify each one consistently. A regular ol' webpack config
// should work just fine as well.
const webpackConfigs = require("path/to/webpack.config.js");

// Setup your server and configure Kevin
const app = express();

const kevin = new Kevin(webpackConfigs, {
    kevinPublicPath: "http://localhost:3000"
});
app.use(kevin.getMiddleware());

// Serve static files as needed. This is required if you generate async chunks;
// Kevin only knows about the entrypoints in your configs, so it has to assume
// that everything else is handled by a different middleware.
app.use("/ac/webpack/js", express.static(webpackConfigs[0].output.path));

// Let 'er rip
app.listen(9275);

We’ve also made a bunch of Kevin’s internals accessible through Webpack’s own tapable plugin system. At Etsy, we use these hooks to integrate with our monitoring system, and to gracefully restart active compilers that have pending updates to their configurations. In this way, we can keep our development server up to date while keeping developer interruptions to a minimum.

Sometimes, a little custom code goes a long way.

In the end, we were able to greatly improve the development experience. Rebuilding our seller tools, which previously took almost a minute on every request, now takes under 30 seconds when we’re starting a fresh compiler, and subsequent requests take only a second or two. Navigating around Etsy.com in development still takes very little interaction with the build system from our engineers. Plus, we can now support all the other things that Webpack enables for us, like ES6, better asset analysis, and even TypeScript.

This is the part where I should mention that Kevin is officially open-source software. Check out the source on GitHub, and install it from npm as kevin-middleware. If you have any feedback about it, we would welcome an issue on GitHub. I really hope you get as much use out of it as we did.


This post is the second in a two-part series on our migration to a modern JavaScript build system. The first part can be found here.


4 Comments

Very creative solution, and an awesome read. Thanks for sharing Kevin with the world!

Can it be used in a monorepo for front-end apps and libraries using TypeScript?
If not, would you have any suggestions?

    It can be used with essentially any JavaScript loader (although there are currently limitations if you use Webpack to build your HTML). We lean on Babel, but you could just as easily use the TypeScript loader instead and get the same results.