
Sizing Your Business Cluster

If your index or traffic changes periodically, you can change plans whenever you wish.
Last updated
June 17, 2023

The new Bonsai Business Plans come with more options than we have ever provided before. It may seem intimidating to choose, but there are a few simple guidelines that make it easy. Bonsai Business Plans also don’t require annual contracts. If your index or traffic changes periodically, you can change plans whenever you wish.

Let’s start by looking at the two plan types.

Choosing a plan type

Business Plans come in two main types: Compute and Capacity. The difference is inherent in their names. If your use case requires a lot of data written to disk but sees little traffic (perhaps only a few requests per hour), Capacity gets you more bang for your buck in raw disk. In contrast, Compute is designed for those who need a setup that can withstand high traffic load or high query complexity.

Planning for disk capacity

When you deploy an HA Elasticsearch cluster, you must provision enough disk for three things:

1. Your primary data
2. Your replica data
3. The normal maintenance routines performed by Lucene, the underlying search engine behind Elasticsearch

Nobody likes using a search engine that doesn’t work. Failing to account for any of these factors will result in performance degradation, a.k.a. the infamous yellow or red cluster. 😫

How much primary data can you load into Elasticsearch while still maintaining High Availability? This simple formula will help you calculate it:

((number of nodes - 1) * the capacity of a single node) * 0.8 = the amount of data that can be loaded in your cluster

Let’s put this in a concrete example. A Business Capacity Large plan has a raw capacity of 150GB, with each of the three nodes contributing 50GB of disk. So the concrete numbers would be:

number of nodes = 3
per node capacity = 50GB
total raw capacity = 3 * 50GB = 150GB
usable data = ((3 - 1) * 50GB) * 0.8 = 80GB

This means that if you have a total raw capacity of 150GB, you should plan to use only 80GB of it for your search data. At first this seems like a huge gap between resources available and resources usable (only about 53% of raw capacity is usable, a 47% drop!), but it’s a necessary margin to avoid getting paged at 3AM with red-status clusters, poorly performing queries, and/or data loss.
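The formula above can be sketched as a small Python helper. This is only an illustration: the function name `usable_capacity_gb` and the `headroom` parameter are hypothetical names chosen for this example and are not part of any Bonsai tooling.

```python
def usable_capacity_gb(node_count: int, node_capacity_gb: float,
                       headroom: float = 0.8) -> float:
    """Estimate how much data fits while tolerating the loss of one node.

    Implements ((number of nodes - 1) * capacity of a single node) * 0.8.
    The names here are illustrative assumptions, not a Bonsai API.
    """
    if node_count < 2:
        raise ValueError("high availability needs at least two nodes")
    # Set one node aside for failover, then keep 20% free for Lucene maintenance.
    return (node_count - 1) * node_capacity_gb * headroom


# The Business Capacity Large example: three nodes of 50GB each.
raw_gb = 3 * 50
usable_gb = usable_capacity_gb(node_count=3, node_capacity_gb=50)
print(f"raw: {raw_gb}GB, usable: {usable_gb:.0f}GB ({usable_gb / raw_gb:.0%} of raw)")
```

Running the same helper with the node count and per-node disk of any other plan gives a quick estimate of how much data that plan can safely hold.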

This formula, explained

Planning is key with distributed systems like Elasticsearch. When nodes inevitably go offline, it’s important to have replication in place for backup. The formula removes one node from your calculation (number of nodes - 1) so that your cluster will not lose any data. The additional failover nodes will maintain a green index status, even when a node goes offline for maintenance reasons. This ensures enough capacity for your primary data and a replica. Multiplying the total by 0.8 buffers your capacity by 20%, which accounts for Lucene’s maintenance routines.

Planning for computational requests and traffic

Now that we’ve covered raw capacity planning, let’s talk about handling traffic. The traffic volume and query complexity a cluster can handle map to its size: larger clusters can handle a higher number of requests. You should consider three different numbers here:

1. How many search requests will you be doing in any given minute?
2. How many aggregations will you be doing in any given minute?
3. Lastly, how many bulk updates will you perform each minute?

To ensure optimal performance, all three numbers should fit under the values in the table below.

<table>
<thead>
<tr><th>Aggregation Rate</th><th>Bulk Insert Rate</th><th>Search Rate</th><th>Ideal Plan/Size</th></tr>
</thead>
<tbody>
<tr><td>< 25 / minute</td><td>< 250 / minute</td><td>< 500 / minute</td><td>Large</td></tr>
<tr><td>< 50 / minute</td><td>< 500 / minute</td><td>< 1000 / minute</td><td>XLarge</td></tr>
<tr><td>< 100 / minute</td><td>< 1000 / minute</td><td>< 2000 / minute</td><td>2XLarge</td></tr>
</tbody>
</table>
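As a rough sketch, the table lookup can be expressed in Python. The thresholds are copied from the table above; the names `PLAN_LIMITS` and `smallest_fitting_plan` are hypothetical and used only for illustration.

```python
from typing import Optional

# Per-minute ceilings from the table: (plan, aggregations, bulk inserts, searches).
PLAN_LIMITS = [
    ("Large", 25, 250, 500),
    ("XLarge", 50, 500, 1000),
    ("2XLarge", 100, 1000, 2000),
]


def smallest_fitting_plan(aggs_per_min: float, bulk_per_min: float,
                          searches_per_min: float) -> Optional[str]:
    """Return the smallest plan whose three limits all exceed the given rates.

    Returns None when the workload is beyond the largest row in the table.
    """
    for name, max_aggs, max_bulk, max_searches in PLAN_LIMITS:
        if (aggs_per_min < max_aggs
                and bulk_per_min < max_bulk
                and searches_per_min < max_searches):
            return name
    return None


# A workload with few searches but many aggregations is sized by the aggregations.
print(smallest_fitting_plan(aggs_per_min=30, bulk_per_min=100, searches_per_min=400))
```

Because all three rates must fit under the same row, the most demanding of the three numbers decides the plan size.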


As a note: these numbers are conservative by design. Once you are up and running, you’ll be able to use the metrics panel in the Bonsai application to see how your searches are performing in real time and make sizing decisions, up or down, based on real data.

For those of you who want to dig even deeper, you can also read our more thorough guide to capacity planning.

Questions?

Our team has provisioned search engines that handle billions of requests each month. If you are still unsure about which plan is right for you, please contact us for a personalized consultation.
