Is It A Bird? Is It A Plane? No, It’s Supergrep!

Posted by on June 28, 2012

Etsy parties have earned quite the boisterous reputation: there may or may not have been kegs dragged up to the roof on a particularly balmy occasion, not to mention cadaver eating contests and mariachi bands. Thus, one of the greatest survival skills I have honed here at HQ is the art of selective hearing. Being able to focus on the subtle wit of your coworker in the midst of a rowdy conga line is the only way you will leave at the end of the evening with your sanity (and self-respect) intact. Luckily for us, eliminating the noise in our data using technology is a little bit easier.

After a long week two years ago with a number of site incidents, our then-CTO Chad Dickerson sent an email to our all-engineering mailing list about a new tool:

For a few of the incidents we’ve had in the past week, we’ve found
obvious trails in the error logs leading directly to the source of the
problem. I talked to a bunch of folks and what I generally heard was
that the logs were noisy, which is technically true, but. . . . .

Through the magic of grep, you can get past the noise. I did some
digging through the error logs today to better understand what was in
there and wrote a simple ugly grep that kills all the stuff that
repeats a lot (not that we don’t need to fix it).

And thus supergrep was born. The original was a simple bash script built around grep -v to make our logs easier to parse, but the tool has evolved into a communication mechanism that lets the entire team view our logs in real time in a web browser. Since then, we have cleaned up many of the noise problems that obscured our logs, while iterating on the tool until it became easily accessible to everyone on the team. It's a simple node app that anyone can open while pushing code, to see if new errors appear. In addition, we've linked each error to the exact line of code where it occurs, for easier analysis. It's yet another change-awareness tool in our kit that allows us to push code continuously and safely to production.
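To make the grep -v idea concrete, here is a minimal sketch in Python of the same kind of inverted filter: drop every line that matches a known noisy pattern and keep the rest. This is not the code supergrep uses; the patterns, the sample log lines, and the `drop_noise` helper are all hypothetical and only illustrate the technique.

```python
import re

# Hypothetical noise patterns, made up for illustration.
# In practice these would be whatever repeats a lot in your own logs.
NOISE_PATTERNS = [
    r"Undefined index",
    r"favicon\.ico",
]

# One combined regex: a line is noise if it matches any pattern.
NOISE_RE = re.compile("|".join(NOISE_PATTERNS))


def drop_noise(lines):
    """Yield only the lines that match none of the noise patterns.

    This mirrors what `grep -v` does: invert the match and keep the rest.
    """
    for line in lines:
        if not NOISE_RE.search(line):
            yield line


# Hypothetical log lines, not real output.
sample_log = [
    "PHP Notice: Undefined index: foo in /var/www/a.php on line 10",
    "File does not exist: /var/www/favicon.ico",
    "PHP Fatal error: Call to undefined function bar() in /var/www/b.php on line 42",
]

for line in drop_noise(sample_log):
    print(line)
```

With the sample above, only the fatal error line survives the filter, which is the point: the repetitive entries still need fixing, but they no longer hide the line that matters.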

We’ve open sourced the code on GitHub, where you can find it here: https://github.com/etsy/supergrep

Category: engineering

13 Comments

Nice! At first glance this seems to be a bit friendlier to use than mine 🙂 https://github.com/ckdake/livetail

Y’all ever heard of ack?
http://betterthangrep.com/

    Of course, this isn’t a command line tool; it’s a node.js app for streaming logs in a web browser.

Or have you heard of Splunk (http://www.splunk.com/)? It’s pretty powerful, but it might be a little too much for this case.

    Hi, yes, we also use Splunk. It’s a slightly different use case. Splunk is great for historical analysis, log search, and filtering. Supergrep is more about what’s happening this very second. There is actually some code buried in here that links out to Splunk.

Supergrep looks like it could be adapted well to handle the problem of real-time log analysis of ADB logcat output. So far I have yet to see anything that does that really well.

I like it. It seems quite adaptable for the otherwise unreadable real-time garbage from Android’s logcat output. Great app!

I’ve set up an instance of supergrep -> I like it 🙂 However, it seems that I have some parsing issues (e.g. PHP warnings not flagged as warnings …), probably because we’re not using the same log format?

Could you please share yours? Or do you have a better idea?

In any case, thanks a lot for sharing this tool!

    Yes, we use a custom format, and you’ll have to adjust the regexes in static/js/supergrep.js to suit yours. I can try to dig ours out; please submit a GitHub issue for that.

Hey John, glad to see this finally made it out there!

The original blog post explained a lot more and would have answered some of the questions people asked but… =)

BTW – there was also a log generator (simulate.js) included that would allow people to quickly demo supergrep without having to touch the config or make live server logs available.

Hey Guys

Saw it at Scaleconf and decided to get supergrep. I installed node.js and npm and got it running.

Unfortunately, nothing that is being logged shows up. Do I have to specify log paths? How/where can I learn/read more about the configuration?

I am not a Linux/server expert and just know enough to make my apps work.

thanks!

    Yes, you do have to specify them. Check out localConfig.js for an example.