Upgrading Major Versions
Bonsai makes upgrading major versions of Elasticsearch as painless as possible. There is no need to manage the operational details of deploying software upgrades to nodes or spinning up new servers. With Bonsai, version upgrades are instant and can be performed with zero downtime.
In this document, we’ll cover some general best practices and offer some Bonsai-specific guidelines for migrating your app to a new version of Elasticsearch.
Protip: Use one Elasticsearch cluster per environment
This means having one cluster for production, another for staging, another for development, and so on. Some users like to put staging indices alongside production indices on the same cluster in order to ensure identical behaviors between staging and production applications. First, this is a terrible idea in general; you should never run staging/dev applications on the same resources as production, because a mistake or a heavy load in a non-production environment can then affect production search directly. This is a recipe for disaster.
Second, separating out your environments allows for upgrading them one at a time. While this may sound tedious, it is the most prudent approach and allows you to discover potential problems before they impact production.
Step 1: Read the Release Notes Carefully
Make sure to perform your due diligence by reading the release notes and breaking changes that accompany the version you’re targeting for the upgrade.
Another thing to investigate is whether your application’s Elasticsearch client supports the upgrade candidate. There have been cases where a popular client or framework was several versions behind the official Elasticsearch release. Some of these have resulted in hours of downtime for users who upgraded the production Elasticsearch cluster beyond the version supported by their Elasticsearch client.
This is one of many reasons we recommend upgrading non-production environments first.
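A client compatibility check of this kind can be scripted before any upgrade. The sketch below is a minimal illustration, not an official rule: it assumes version strings follow the usual major.minor.patch form and treats a client as safe only when its major version matches the target cluster's. The function names and that matching rule are assumptions made for the example; your client's own documentation is the authority on which versions it supports.

```python
def major_version(version: str) -> int:
    """Return the leading major component of a version string like '7.10.2'."""
    return int(version.split(".")[0])


def client_supports_target(client_version: str, target_version: str) -> bool:
    """Assumed rule for this sketch: the client is considered safe only
    when its major version equals that of the upgrade candidate."""
    return major_version(client_version) == major_version(target_version)


if __name__ == "__main__":
    # Hypothetical versions, for illustration only.
    print(client_supports_target("6.8.2", "7.10.2"))  # False
    print(client_supports_target("7.9.1", "7.10.2"))  # True
```

Running a check like this in a non-production environment first gives you a chance to upgrade the client library before the cluster.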
Step 2: Validate In Development
Upgrading across major versions sometimes comes with breaking changes, new dependencies, and tweaks in behavior. It is important to validate that the upgrade is safe before pushing it out to production. We advise upgrading the least critical environment first. A variation of the blue-green deployment strategy is useful here.
The process looks like this:
Upgrade the cluster for the least critical environment, ensure the application works as expected against it, and make any changes as needed. Deploy those changes to the next least critical environment and then upgrade the cluster for that environment. Continue in this fashion until reaching the production environment. By that point, you should be fairly confident that the application and search will work as expected.
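The environment-by-environment rollout described above can be sketched as a simple loop. This is an illustration under stated assumptions: the upgrade and validate callables are hypothetical stand-ins for whatever your team actually does (triggering the version upgrade, running a test suite), and the environment names are examples only.

```python
from typing import Callable, List, Optional


def upgrade_in_order(
    environments: List[str],
    upgrade: Callable[[str], None],
    validate: Callable[[str], bool],
) -> Optional[str]:
    """Upgrade environments one at a time, least critical first.

    Returns the name of the first environment that fails validation,
    or None when every environment was upgraded and validated.
    """
    for env in environments:
        upgrade(env)
        if not validate(env):
            # Stop here so that later (more critical) environments stay untouched.
            return env
    return None


if __name__ == "__main__":
    # Hypothetical run: staging fails validation, so production is never upgraded.
    upgraded: List[str] = []
    failed = upgrade_in_order(
        ["development", "staging", "production"],
        upgrade=upgraded.append,
        validate=lambda env: env != "staging",
    )
    print(failed, upgraded)
```

The key design choice is stopping at the first failure, which keeps a problem found in a less critical environment from ever reaching production.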
Make sure to validate that searches will work as expected in terms of relevancy. Also make sure to test full deletion and reindexing in the least critical environments before upgrading production. Reindexing is something you should be familiar with anyway, since it is part of normal Elasticsearch usage: it is needed when changing analyzer settings, backfilling a new field, or, in this case, upgrading to a new major version.
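One rough way to put a number on relevancy drift is to run the same query against the old and new clusters and compare the top results. The sketch below assumes you have already collected the ranked document IDs from each cluster for a given query; the function name and the choice of top-k overlap as the metric are illustrative assumptions, and other measures may suit your application better.

```python
from typing import Sequence


def top_k_overlap(old_ids: Sequence[str], new_ids: Sequence[str], k: int = 10) -> float:
    """Fraction of the old cluster's top-k hits that also appear
    in the new cluster's top-k hits for the same query."""
    old_top = list(old_ids[:k])
    if not old_top:
        return 1.0  # nothing to compare against, so report no drift
    new_top = set(new_ids[:k])
    shared = sum(1 for doc_id in old_top if doc_id in new_top)
    return shared / len(old_top)


if __name__ == "__main__":
    # Hypothetical ranked IDs returned for one query by each cluster.
    old = ["a", "b", "c", "d"]
    new = ["a", "c", "x", "y"]
    print(top_k_overlap(old, new, k=4))  # 0.5
```

A low overlap on important queries is a signal to look at scoring or analysis changes in the release notes before moving on to the next environment.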
Step 3: Upgrade the Production Cluster
Once you are satisfied that the candidate version will work in production as expected, the final step is to take it live. This last step is usually complicated by the constraint that search must not go down at all, and data loss is unacceptable. Because of this constraint, planning, and possibly additional infrastructure (like message queues), is required to ensure a zero-downtime switch and a fallback path in case something breaks.
Of course, if you're fortunate enough to have a use case where the production app can be put into maintenance mode while the new cluster is repopulated, then you can simply use the same process as outlined in the previous step.
For everyone else, the basic process is to have the old application and cluster serve traffic while the new application is populating the new cluster from the source database. Once the new cluster is populated with the same data as the old cluster, the new application is promoted to a production role and begins serving traffic. This strategy allows developers to quickly roll back to a known working state in the event that there is a serious issue with the new system.
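Before promoting the new application, it helps to confirm that the new cluster really holds the same data as the old one. The sketch below compares per-index document counts, assuming you have gathered them from each cluster into plain dictionaries; the function names and index names are hypothetical. Matching counts are a necessary check rather than a sufficient one, since they do not prove the documents themselves are identical.

```python
from typing import Dict, List


def indices_out_of_sync(old_counts: Dict[str, int], new_counts: Dict[str, int]) -> List[str]:
    """Return the sorted names of indices whose document count on the new
    cluster differs from the old cluster, or which are missing entirely."""
    return sorted(
        name for name, count in old_counts.items()
        if new_counts.get(name) != count
    )


def ready_to_promote(old_counts: Dict[str, int], new_counts: Dict[str, int]) -> bool:
    """True only when every index on the old cluster has a matching count on the new one."""
    return not indices_out_of_sync(old_counts, new_counts)


if __name__ == "__main__":
    # Hypothetical counts collected from each cluster.
    old = {"products": 1200, "users": 300}
    new = {"products": 1200, "users": 295}
    print(indices_out_of_sync(old, new))  # ['users']
    print(ready_to_promote(old, new))     # False
```

If the check fails, the old application and cluster simply keep serving traffic while the new cluster catches up, which preserves the rollback path.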
The exact steps will vary considerably by application and use case. A typical strategy for this is outlined in the infographic below:
You will need to adapt this to your specific use case when planning out your blue-green strategy.
Upgrading major versions of Elasticsearch while running a production application can be tricky. If you’re unsure of what to do, are concerned about an edge case or special circumstance, or simply want to sanity check a plan, please do not hesitate to reach out to firstname.lastname@example.org. We’re here to help ensure the smoothest upgrade possible.