Juggling Multiple Elasticsearch Instances on a Single Host

Posted by Shikhar on December 4, 2014

Elasticsearch is a distributed search engine built on top of Apache Lucene. At Etsy we use Elasticsearch in a number of different configurations: for Logstash, powering user-facing search on some large indexes, some analytics usage, and many internal applications.

Typically, it is assumed that there is a 1:1 relationship between ES instances and machines. This is straightforward and makes sense if your instance requirements line up well with the host – whether physical, virtualized or containerized. We run our clusters on bare metal, and for some of them we have more ES instances than physical hosts. We have good reasons for doing this, and here I’ll share some of the rationale, and the configuration options that we’ve found to be worth tuning.

Why?

Managing JVMs with large heaps is scary business due to garbage collection pause times. 31GB is the magic threshold above which the JVM loses the ability to use CompressedOops. In our experience, it is better to have even smaller heaps: not only do GC pause times stay low, but it’s easier to capture and analyze heap dumps!

To get optimal Lucene performance, it is also important to have sufficient RAM available for OS file caching of the index files.

At the same time, we are running on server-class hardware with plenty of CPU cores and RAM. Our newest search machines are Ivy Bridge with 20 physical (40 virtual) cores, 128GB of RAM, and of course SSDs. If we ran a single node with a small heap on this hardware, we would be wasting both CPU and RAM, since an instance with a small heap can only support correspondingly smaller shards.

We currently run 4 ES JVMs per machine with 8GB of heap each. This works out great for us: GC has not been a concern and we are utilizing our hardware effectively.

The settings

os.processors

Elasticsearch uses this setting to configure thread pool and queue sizing. It defaults to Runtime.getRuntime().availableProcessors(). With multiple instances, it is better to spread the CPU resources across them.

We set this to ($(nproc) / $nodes_per_host). So if we are running 4 nodes per host on 40-core machines, each of them will configure thread pools and queues as if there were 10 cores.
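As a sketch, the arithmetic in our startup script looks something like this (NODES_PER_HOST and TOTAL_CORES are illustrative names of ours, not ES settings, and the property name mirrors the setting above):

```shell
# Divide the host's cores evenly across the ES instances running on it.
NODES_PER_HOST=4
TOTAL_CORES=40                   # in the real script this comes from $(nproc)
PROCESSORS=$((TOTAL_CORES / NODES_PER_HOST))
echo "$PROCESSORS"
# Each JVM is then started with something like:
# ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.os.processors=${PROCESSORS}"
```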

node.name

The default for this setting is to pick a random Marvel comic character at startup. In production, we want something that lets us find the node we want with as little thought and effort as possible. We set this to $hostname-$nodeId (which results in names like “search64-es0” – less whimsical, but far more practical when you’re trying to get to the bottom of an issue).
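A sketch of how such a name can be derived in a startup script (NODE_IDX is an assumed per-instance index of our own, not an ES setting):

```shell
# Build a deterministic node name from the hostname and the instance index.
NODE_IDX=0
NODE_NAME="$(hostname)-es${NODE_IDX}"
echo "$NODE_NAME"
# Passed to the JVM via a system property:
# ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.node.name=${NODE_NAME}"
```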

http.port, transport.port

If these ports are not specified, ES tries to pick the next available port at startup, starting at a base of 9200 for HTTP and 9300 for its internal transport.

We prefer to be explicit and assign ports as $basePort+$nodeIdx from the startup script. This prevents surprises such as an instance you expected to be down still being bound to its port, causing the ‘next available’ one to be higher than expected.
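A minimal sketch of that assignment, using the default base ports (NODE_IDX is an assumed per-instance index):

```shell
# Derive explicit ports from the instance index; the bases match the ES defaults.
NODE_IDX=2                       # 0-based index of this instance on the host
HTTP_PORT=$((9200 + NODE_IDX))
TRANSPORT_PORT=$((9300 + NODE_IDX))
echo "http=$HTTP_PORT transport=$TRANSPORT_PORT"
# ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.http.port=${HTTP_PORT} -Des.transport.port=${TRANSPORT_PORT}"
```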

cluster.routing.allocation.awareness.attributes

A key way to achieve failure tolerance with ES is to use replicas, so that if one host goes down, the affected shards stay available. If you’re running multiple instances on each physical host, it’s entirely possible for ES to allocate all copies of a shard to the same physical host, which isn’t going to help you! Thankfully this is avoidable with shard allocation awareness: set the hostname as a node attribute on each instance and use that attribute as a factor in shard assignments.

ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.node.host=${HOSTNAME}"
ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.cluster.routing.allocation.awareness.attributes=host"
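The same thing can be expressed statically in each instance’s elasticsearch.yml (a sketch; ‘host’ is an arbitrary attribute name that just has to match in both places):

```yaml
# Per-instance equivalent of the JVM system properties above.
node.host: search64    # this physical host's name
cluster.routing.allocation.awareness.attributes: host
```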

path.logs

Without having a dedicated log directory for each instance, you would end up with multiple JVMs trying to write to the same log files.

An alternative, which we rely on, is to prefix the filenames in logging.yml with the property ${node.name} so that each node’s logs are labelled by host and node ID. Another reason to be explicit about node naming!
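For example, the file appender entry in logging.yml can look something like this (a fragment against the log4j-style config that 1.x ships with; layout settings elided):

```yaml
appender:
  file:
    type: dailyRollingFile
    file: ${path.logs}/${node.name}.log
    datePattern: "'.'yyyy-MM-dd"
```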

Minimal Viable Configuration

Elasticsearch has lots of knobs, and it’s worth trying to minimize configuration. That said, a lot of ES is optimized for cloud environments so we occasionally find things worth adjusting, like allocation and recovery throttling. What do you end up tweaking?
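For a taste of what we mean by allocation and recovery throttling, settings along these lines are the kind we adjust (names from ES 1.x; the values here are illustrative, not recommendations):

```yaml
# Illustrative overrides in elasticsearch.yml; the defaults are tuned
# conservatively for cloud environments, not SSD-backed bare metal.
indices.recovery.max_bytes_per_sec: 200mb
cluster.routing.allocation.node_concurrent_recoveries: 4
```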

You can follow Shikhar on Twitter at @shikhrr

Category: infrastructure, search

5 Comments

Hi Shikhar,
Great post, thanks for sharing it!
I am planning to run multiple nodes on a single host for the same reasons. I am wondering, how do you then start each node as a service? Does each service end up with a different name?

for instance
service elasticsearch start

    Exactly! In our environment we do it this way:
    1- service elasticsearch start (master node only)
    2- service elasticsearch.data1 start (data node only)
    3- service elasticsearch.data2 start (data node only)

Hi Yasir,

We are using a custom script that calls out to the Elasticsearch start script (‘bin/elasticsearch’) for managing the lifecycle of the service (start/stop/restart/etc.). I believe the distro packages for Elasticsearch ship with such init scripts already.

The bin script supports providing some configuration overrides using ‘--node.name=mynode’ style arguments.

We instead supply the relevant service-specific properties in the ES_JAVA_OPTS environment variable which can be used to the same effect.

You might also find it useful to copy the ‘bin/elasticsearch.in.sh’ script that gets sourced and customize it for your needs, and you can then point at your version using ES_INCLUDE.

if you run rpm packages from redhat, you can create one file elasticsearch-<node>.yml for each node, one file /etc/sysconfig/elasticsearch-<node>, copy the init.d script to elasticsearch-<node> and change the progname from elasticsearch to elasticsearch-<node>, create a separate workdir+datadir e.g. data-<node>, work-<node>, configure these in the elasticsearch-<node>.yml and you are set.

I created a separate wrapper init scripts /etc/init.d/elasticsearch to loop over each /etc/init.d/elasticsearch-* script to simplify manual restarts etc.
have fun,
Uwe

I’ve started using the Elastic-provided elasticsearch puppet forge module and it’s working well so far. It’s handling the basics of multiple instances: https://forge.puppetlabs.com/elasticsearch/elasticsearch

Thanks for this article that still points out what additional steps are needed in the configuration to ensure the instances don’t overrun each other.