A shard is the basic unit of work in Elasticsearch. If you haven’t read the Shards section in Elasticsearch Core Concepts, that would be a good place to start. Bonsai meters on the total number of shards in a cluster. That means both primary and replica shards count towards the limit.
The relationship between your sharding scheme and your cluster's shard usage is not always readily apparent. For example, if you have an index with a 3x2 sharding scheme (3 primaries, 2 replicas), that's not 5 or 6 shards, it's 9: the total is primaries × (1 + replicas), or 3 × 3. If this is confusing, read our Shard Primer documentation for some nice illustrations.
Managing shards is a basic skill for any Elasticsearch user. Shards carry system overhead and potentially stale data. Keeping your cluster clean by pruning old shards can both improve performance and reduce your server costs.
In this guide we cover a few different ways of reducing shard usage:
- What is a Shard Overage?
- Delete Unneeded Indices
- Use a Different Sharding Scheme
- Reduce Replication
- Data Collocation
- Upgrade the Subscription
This guide will make frequent references to the Elasticsearch API using the command line tool curl. Interacting with your own cluster can also be done via curl, or via a web browser. You can also use the interactive Bonsai Console in your cluster dashboard.
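For example, a quick way to check connectivity from the command line is a curl request against the cluster health endpoint. The URL below is a placeholder; substitute the full URL (including credentials) shown in your Bonsai cluster dashboard:

# Placeholder URL -- copy the real one, credentials included, from your cluster dashboard
curl -XGET "https://username:password@my-cluster-1234567890.us-east-1.bonsaisearch.net:443/_cluster/health?pretty"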
What is a Shard Overage?
A shard overage occurs when your cluster has more shards than the subscription allows. In practice, this usually means one of three things:
- Extraneous indices
- A suboptimal sharding scheme
- Replication set higher than necessary
It is also possible that the cluster and its data are already configured according to best practices. In that case, you may need to get creative with aliases and data collocation in order to remain on your current subscription.
Delete Unneeded Indices
There are some cases where one or more indices are created on a cluster for testing purposes, and are not actually being used for anything. These will count towards the shard limits; if you’re getting overage notifications, then you should delete these indices.
There are also some clients that will use aliases to roll out changes to mappings. This is a really nice feature that allows for zero-downtime reindexing and immediate rollbacks if there’s a problem, but it can also result in lots of stale indices and data accumulating in the cluster.
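If your client does use aliases, the _cat/aliases endpoint is a quick way to see which concrete index each alias currently points to; an index that is no longer behind a live alias is a likely candidate for cleanup. The alias and index names below are hypothetical:

GET /_cat/aliases?v

alias      index                filter routing.index routing.search
production production_20230101  -      -             -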
To determine if you have extraneous indices, use the /_cat/indices endpoint to get a list of all the indices you have in your cluster:
GET /_cat/indices

green open test1 1 1 0 0 318b 159b
green open test2 1 1 0 0 318b 159b
green open test3 1 1 0 0 318b 159b
green open test4 1 1 0 0 318b 159b
green open test5 1 1 0 0 318b 159b
green open test6 1 1 0 0 318b 159b
If you see indices that you don’t need, you can simply delete them:
# Delete a single index:
DELETE /test1

# Delete a group of indices:
DELETE /test2,test3,test4,test5
Use a Different Sharding Scheme
It’s possible that for some reason one or more indices were created with far more shards than necessary. For example, a check of /_cat/indices shows something like this:
GET /_cat/indices

green open test1 5 2 0 0 318b 159b
green open test2 5 2 0 0 318b 159b
That 5x2 shard scheme results in 15 shards per index, or 30 total for this cluster. This is probably really overprovisioned for these indices. Choosing a more conservative shard scheme of 1x1 would reduce this cluster’s usage from 30 shards down to 4.
Unfortunately, the number of primary shards cannot be changed once an index has been created. To fix this, you will need to manually create a new index with the desired shard scheme and reindex the data. If you have not read The Ideal Elasticsearch Index, it has some really nice information on capacity planning and sizing. Check out the sections on Intelligent Sharding and Benchmarking for some tips on what scheme would make more sense for your particular use case.
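As a rough sketch, the process might look something like the following, assuming your cluster's Elasticsearch version supports the _reindex API and that a 1x1 scheme fits your data (index names here are illustrative):

# Create a new index with the desired 1x1 shard scheme:
PUT /test1_v2 -d '{"settings":{"index":{"number_of_shards":1,"number_of_replicas":1}}}'

# Copy the documents over:
POST /_reindex -d '{"source":{"index":"test1"},"dest":{"index":"test1_v2"}}'

# After verifying the new index, delete the old one to free its shards:
DELETE /test1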
Reduce Replication
Multitenant subscription plans (Hobby and Standard tiers) are required to have at least 1 replica shard in use per index. Indices on these plans cannot have a replica count of 0.
For most use-cases, a single replica is perfectly sufficient for redundancy and load capacity. If any of your indices have been created with more than one replica, you can reduce it to free up shards. An index with more than one replica might look like this:
GET /_cat/indices?v

health status index pri rep docs.count docs.deleted store.size pri.store.size
green  open   test1 5   2   0          0            318b       159b
green  open   test2 5   2   0          0            318b       159b
Note that the rep column shows a 2. That means there are actually three copies of the data: each primary shard, plus its two replicas. Replicas are a multiplier against primary shards, so if an index has a 5x2 configuration (5 primary shards with 2 replicas), reducing replication to 1 will free up five shards, not just one. See the Shard Primer for more details.
Fortunately, reducing the replica count is as simple as sending a small JSON body to the _settings endpoint:
PUT /test1,test2/_settings -d '{"index":{"number_of_replicas":1}}'

{"acknowledged":true}

GET /_cat/indices

green open test1 5 1 0 0 318b 159b
green open test2 5 1 0 0 318b 159b
That simple request shaved 10 shards off of this cluster’s usage.
Replication == Availability and Redundancy
It might seem like a good money-saving idea to simply set all replicas to 0 so as to fit as many indices into your cluster as possible. However, this is not advisable: without replicas, your primary data has no live backup, and if a node in your cluster goes offline, data loss is basically guaranteed.
Data can be restored from a snapshot, but this is messy and not a great failover plan. The snapshot could be an hour or more old, and any updates to your data since then either need to be reindexed or are lost for good. Additionally, the outage will last much longer.
Having a replication factor of at least 1 mitigates all of these problems.
Data Collocation
Another solution for reducing shard usage involves using aliases and custom routing rules to collocate different data models onto the same group of shards.
What is data collocation? Many Elasticsearch clients use an index per model paradigm as the default for data organization. This is analogous to, say, a Postgres database with a table for each type of data being indexed.
Sharding this way makes sense most of the time, but in some rare cases users may benefit from putting all of the data into a single namespace. In the Postgres analogy, this would be like putting all of the data into a single table instead of a table for each model. An attribute (i.e., table column) is then used to filter searches by class.
For example, you might have a cluster that has three indices: videos, images and notes. If each of these has a conservative 1x1 sharding scheme, it would require 6 shards. But this data could potentially be compacted down into a single index, production, where the mapping has a type field of some kind to indicate whether the document belongs to the video, image or note class.
The latter configuration with the same 1x1 scheme would only require two shards (one primary, one replica) instead of six.
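One way to sketch this out is with filtered aliases, so that searches against the old index names still return only their own class of documents. All names and the type field below are hypothetical, and the exact approach (filtered aliases, custom routing, or both) depends on your application:

# Create the single combined index with a conservative 1x1 scheme:
PUT /production -d '{"settings":{"index":{"number_of_shards":1,"number_of_replicas":1}}}'

# Point filtered aliases at it, one per former index:
POST /_aliases -d '{
  "actions": [
    {"add": {"index": "production", "alias": "videos", "filter": {"term": {"type": "video"}}}},
    {"add": {"index": "production", "alias": "images", "filter": {"term": {"type": "image"}}}},
    {"add": {"index": "production", "alias": "notes",  "filter": {"term": {"type": "note"}}}}
  ]
}'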
There are several major downsides to this approach. One is that field name collisions become an issue. For example, if there is a field called published in two models, but one is defined as a boolean and the other is a datetime, it will create a conflict in the mapping. One will need to be renamed.
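As a hypothetical illustration, the collision could be resolved by giving each model its own field name in the combined mapping (exact mapping syntax varies by Elasticsearch version; this sketch assumes a 7.x-style typeless mapping, and all field names are illustrative):

PUT /production/_mapping -d '{
  "properties": {
    "type":               {"type": "keyword"},
    "note_published":     {"type": "boolean"},
    "video_published_at": {"type": "date"}
  }
}'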
Another downside is that it is a pretty large refactor for most users, and may be more trouble than simply upgrading the plan. Overriding the default behavior in the application’s Elasticsearch client may require forking the code and relying on other hacks/workarounds.
There are other downsides as well. Data collocation is mentioned here as a possibility, and one that only works for certain users; it is by no means a general recommendation.
Upgrade the Subscription
If you find that you’re unable to reduce shards through the options discussed here, then you will need to upgrade to the next plan.