Introducing statsd-jvm-profiler: A JVM Profiler for Hadoop

Posted by Andrew Johnson on January 14, 2015

At Etsy we run thousands of Hadoop jobs over hundreds of terabytes of data every day.  When operating at this scale, optimizing jobs is vital: we need to make sure that users get the results they need quickly, while also ensuring we use our cluster’s resources efficiently.  Actually doing that optimization is the hard part, however.  To make accurate decisions you need measurements, and so we have created statsd-jvm-profiler: a JVM profiler that sends the profiling data to StatsD.

Why Create a New Profiler?

There are already many profilers for the JVM, including VisualVM, YourKit, and hprof.  Why do we need another one?  Those profilers are all excellent tools, and statsd-jvm-profiler is not intended to entirely supplant them.  Instead, statsd-jvm-profiler, inspired by riemann-jvm-profiler, is designed for a specific use-case: quickly and easily profiling Hadoop jobs.

Profiling Hadoop jobs is a complex process.  Each map and reduce task gets a separate JVM, so one job could have hundreds or even thousands of distinct JVMs running across the many nodes of the Hadoop cluster.  Using frameworks like Scalding complicates things further: one Scalding job will run multiple Hadoop jobs, each with many distinct JVMs.  As such it is not trivial to determine exactly where the code you want to profile is running.  Moreover, storing and transferring the snapshot files produced by some profilers has been problematic for us due to the large size of the snapshots.  Finally, at Etsy we want our big data stack to be accessible to as many people as possible, and this includes tools for optimizing jobs.  StatsD and Graphite are used extensively throughout Etsy, so by sending data to StatsD, statsd-jvm-profiler lets users explore the profiling data with tools they are already familiar with.

Writing the Profiler

For simplicity, we chose to write statsd-jvm-profiler as a Java agent, which means it runs in the same JVM as the process being instrumented.  The agent code runs before the main method of that process.  Implementing an agent is straightforward: define a class that has a premain method with this signature:

package com.etsy.agent;

import java.lang.instrument.Instrumentation;

public class ExampleAgent {
    public static void premain(String args, Instrumentation instrumentation) {
        // Agent code here
    }
}

The agent class should be packaged in a JAR whose manifest specifies the Premain-Class attribute:

Premain-Class: com.etsy.agent.ExampleAgent

We are using Maven to build statsd-jvm-profiler, so we use the maven-shade-plugin’s ManifestResourceTransformer to set this property, but other build tools have similar facilities.
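
As a rough sketch of what that looks like, here is a generic maven-shade-plugin configuration rather than statsd-jvm-profiler’s actual pom.xml; the Premain-Class value matches the example agent above:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Writes the Premain-Class attribute into the shaded JAR's manifest -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <manifestEntries>
              <Premain-Class>com.etsy.agent.ExampleAgent</Premain-Class>
            </manifestEntries>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>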

Finally, we used the JVM’s management interface to actually obtain the profiling data.  java.lang.management.ManagementFactory provides a number of MXBeans that expose information about various components of the JVM, including memory usage, the garbage collector, and running threads.  By pushing this data to StatsD, statsd-jvm-profiler removes the need to worry about where the code is running – all the metrics are available in a central location.
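
To illustrate how little code that takes, here is a minimal sketch (not the profiler’s actual implementation) that reads the same MXBeans and prints a few values; a real agent would format these as metrics and push them to StatsD on a schedule:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class MXBeanSample {
    public static void main(String[] args) {
        // Heap and non-heap memory usage from the MemoryMXBean
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        System.out.println("heap.used=" + heap.getUsed() + " nonheap.used=" + nonHeap.getUsed());

        // Cumulative collection counts and times from each GarbageCollectorMXBean
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("gc." + gc.getName() + ".count=" + gc.getCollectionCount()
                    + " gc." + gc.getName() + ".time=" + gc.getCollectionTime());
        }

        // Stack traces for all live threads from the ThreadMXBean;
        // sampling these repeatedly is how time spent per function is estimated
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds(), Integer.MAX_VALUE)) {
            if (info != null && info.getStackTrace().length > 0) {
                System.out.println(info.getThreadName() + " -> " + info.getStackTrace()[0]);
            }
        }
    }
}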

Issues

There were some issues that came up as we developed statsd-jvm-profiler.  First, statsd-jvm-profiler uses a ScheduledExecutorService to periodically run the threads that actually perform the profiling.  However, the default ScheduledExecutorService uses non-daemon threads, which means it will keep the JVM alive even after the main thread has exited.  This is not ideal for a profiler: the JVM would stay up and continue to report profiling data even though nothing is running except the profiler itself.  Guava has functionality to create a ScheduledExecutorService that will exit when the application is complete, which statsd-jvm-profiler uses to work around this issue.
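
Here is a minimal sketch of that workaround using Guava’s MoreExecutors; it is illustrative only, not the profiler’s exact code:

import com.google.common.util.concurrent.MoreExecutors;

import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExitingSchedulerSample {
    public static void main(String[] args) {
        // Wraps a standard executor so its threads are daemon threads and it
        // shuts down (after a short grace period) when the application exits
        ScheduledExecutorService scheduler = MoreExecutors.getExitingScheduledExecutorService(
                new ScheduledThreadPoolExecutor(1), 2, TimeUnit.SECONDS);

        // Collect a profiling sample on a fixed interval
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("sample thread state and memory here"),
                0, 10, TimeUnit.MILLISECONDS);
    }
}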

Safepoints are another interesting aspect of profiling the JVM.  A thread is at a safepoint when it is in a known state: all roots for garbage collection are known and all heap contents are consistent.  At a safepoint, a thread’s state can be safely observed or manipulated by other threads.  Garbage collection must occur at a safepoint, but a safepoint is also required to sample thread state the way statsd-jvm-profiler does.  However, the JVM can optimize safepoints out of hot methods.  As such, statsd-jvm-profiler’s sampling can be biased towards cold methods.  This problem is not unique to statsd-jvm-profiler: any profiler that samples thread state in this way would have the same bias.  In practice this bias may not be that meaningful.  It is important to be aware of, but an incomplete view of application performance that still enables you to make improvements is better than no information.

How to Use statsd-jvm-profiler

statsd-jvm-profiler will profile heap and non-heap memory usage, garbage collection, and the aggregate time spent executing each function.  You will need the statsd-jvm-profiler jar on the host where the JVM you want to profile will run.  Since statsd-jvm-profiler is a Java agent, it is enabled with the -javaagent argument to the JVM.  You must provide the hostname and port number of the StatsD instance to which statsd-jvm-profiler should send metrics.  You can optionally specify a prefix for the metrics emitted by statsd-jvm-profiler as well as filters for the functions to profile.  For example:

-javaagent:/path/to/statsd-jvm-profiler/statsd-jvm-profiler.jar=server=statsd,port=8125
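
For Hadoop jobs, the agent argument has to reach each task JVM rather than the client JVM that launches the job.  One way to do that is to add it to the task JVM options; this is a hypothetical invocation assuming Hadoop 2’s mapreduce.map.java.opts and mapreduce.reduce.java.opts properties (see also the Scalding example mentioned below):

-Dmapreduce.map.java.opts="-javaagent:/path/to/statsd-jvm-profiler/statsd-jvm-profiler.jar=server=statsd,port=8125"
-Dmapreduce.reduce.java.opts="-javaagent:/path/to/statsd-jvm-profiler/statsd-jvm-profiler.jar=server=statsd,port=8125"

Note that setting these properties replaces any task JVM options the job already specifies (such as -Xmx), so in practice you would append the -javaagent flag to the options you already use.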

An example of using statsd-jvm-profiler to profile Scalding jobs is provided with the code.

statsd-jvm-profiler will output metrics under the “statsd-jvm-profiler” prefix by default, or you can specify a custom prefix.  Once the application being profiled has finished, all of the data statsd-jvm-profiler produced will be available in whatever backend you are using with StatsD.  What do you do with all that data? Graph it!  We have found flame graphs to be a useful method of visualizing the CPU profiling data, and a script to output data from Graphite into a format suitable for generating a flame graph is included with statsd-jvm-profiler:

Example Flame Graph

The memory usage and garbage collection metrics can be visualized directly:

Example Metrics

Using the Profiler’s Results

We’ve already used the data from statsd-jvm-profiler to determine how best to optimize jobs.  For example, we wanted to profile a job after some changes that had made it slower.  The flame graph made it obvious where the job was spending its time.  The wide bars on the left and right of this image are from data serialization/deserialization.  As such we knew that speeding up the job would come from improving the serialization or reducing the amount of data being moved around, not from optimizing the logic of the job itself.

Flame Graph with Serialization

We also made a serendipitous discovery while profiling that job: it had been given 3 GB of heap, but it was not using anywhere near that much, so we could reduce its heap size.  Such chance findings are a great advantage of making profiling simple: you are more likely to make these discoveries if you profile often and your profiling data is easy to analyze.  statsd-jvm-profiler and Graphite solve this problem for us.

Get statsd-jvm-profiler

Want to try it out yourself?  statsd-jvm-profiler is available on GitHub now!

Category: engineering, infrastructure


13 Comments

Very useful Java agent for profiling the JVM and analyzing it using flame graphs. However, I am having a problem running the “graphite_dump.py” script to capture CPU trace metrics. I am running the script on the same host as the Graphite carbon server. However, the carbon server does not run on the default port (2003). Please provide some examples of how to specify the hostname, port number, metrics, and start/end date.

Thanks,
Amer.

    Hey Amer,

    I did update the README for graphite_dump.py to give more detail on how to run it. It doesn’t currently support a custom port, but I did file an issue on the project and will get that added.

      Thanks for the reply. It would be useful if CPU profiling could be enabled/disabled at will. There is overhead to keeping CPU profiling enabled all the time in production.

      A similar task can be performed using the Linux “perf” utility, which can capture both JVM and system stacks. This way you get visibility across the whole stack. To capture the full stack using perf, you need “perf-map-agent” and OpenJDK (with the frame pointer fix).

      – “perf-map-agent” dumps a map file of JIT symbols and provides symbol resolution for perf_events.
      – Brendan Gregg, author of Flame Graph, recently released a fix for the OpenJDK frame pointer issue that was resulting in incomplete stacks when collected using perf_events. Now with the OpenJDK fix, one can capture full JVM and system-level stacks for full stack visibility. The patched OpenJDK code is found at: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2014-December/016477.html

      Enabling or disabling the CPU profiling is also a planned feature: https://github.com/etsy/statsd-jvm-profiler/issues/2

      perf is also a great tool!

Andrew – see my comment in https://github.com/etsy/statsd-jvm-profiler/issues/2 – I think Amer was suggesting being able to turn profiling on/off on-demand, at run-time.

You can instrument and profile an entire Hadoop cluster without giving up as much as you do with call stack sampling, by using an instrumentation agent (one that actually instruments code) and replicating (mirroring) the measured behavior into a single JVM runtime acting as the world.

YouTube: Satoris + Simz: Hadoop Profiling at Scale

I want to say thank you very much to Andrew Johnson for all his help during implementation of this profiling method.

Here http://ihorbobak.com/index.php/2015/08/05/cluster-profiling/ I posted complete instructions on how to do performance profiling using my mod of the StatsD JVM Profiler + InfluxDB. Feel free to use it.

I find your article very interesting. I actually grabbed the code and it works great. However, I have a basic question, probably due to my ignorance of instrumentation: what is the advantage of using a javaagent to tap JVM stats, over just collecting them within the application and then reporting to StatsD?
Thank you in advance.

    You can definitely instrument specific parts of your application with StatsD. A javaagent profiler like this, however, will let you instrument the whole application without code changes. You could also instrument an application you don’t have the source code for.

    Hopefully that helps and that you’re finding this useful!

Hi,
You have mentioned that we need to supply the server name and port number to the JVM. In a Hadoop cluster, the child JVMs are created by the Application Master at runtime. There can be multiple YarnChild processes running on a single physical server. By supplying a single port number, wouldn’t there be a conflict among the multiple YarnChild processes running on the same server?

    Sorry for the confusion – the server and port you specify as arguments to the profiler are for the StatsD (or now InfluxDB) instance to which the profiler should send data. There should be no issue with profiling multiple JVMs on the same machine.

Hi Andrew,

thanks for the detailed article. I have a question. Do you see any performance impact on the application itself from using this agent? We have tried using some tools to get these stats, but they directly impact the CPU of the machine.
Appreciate if you share your experience.

Thanks
N

    We haven’t seen substantial degradation in performance from profiling in this way. There’s always some overhead from profiling, but this is pretty lightweight.