Upgrades AWS Elasticsearch from 6.7 to 6.8.

Summary of changes
------------------

There are [no breaking changes between 6.7 and 6.8](https://www.elastic.co/guide/en/elasticsearch/reference/6.8/breaking-changes-6.8.html).

There are [four "release highlights"](https://www.elastic.co/guide/en/elasticsearch/reference/6.8/release-highlights.html) in this upgrade:

- two licensing changes (in 6.8.0)
- an OOM error fix for cross-cluster replication (which we don't use)
- a fix for retries in cross-cluster replication (which we don't use)

I haven't seen any changes affecting relevance in the [full release notes](https://www.elastic.co/guide/en/elasticsearch/reference/6.8/es-release-notes.html).

There are several [security fixes including one for log4j](https://www.elastic.co/guide/en/elasticsearch/reference/6.8/release-notes-6.8.21.html).

Internal Process
----------------

There are some docs on what happens during minor version upgrades:

- https://aws.amazon.com/blogs/database/in-place-version-upgrades-for-amazon-elasticsearch-service/
- https://aws.amazon.com/opensearch-service/faqs/#topic-8
- https://aws.amazon.com/blogs/big-data/an-automated-approach-to-perform-an-in-place-engine-upgrade-in-amazon-opensearch-service/

Older docs say that minor version upgrades happen in place: each node is removed from the cluster, upgraded and then re-added. Newer docs (which might be less relevant given the venerable age of our version) indicate that it might be a blue/green deployment. In either case, downtime for queries is not expected.

Our Process
-----------

1. Run the [preflight checks](https://docs.publishing.service.gov.uk/manual/reindex-elasticsearch.html#1-preflight-checks) to ensure that we don't have duplicate indices hanging around.
2. [Take a manual snapshot](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshot-create.html) of the index.
3. Deploy this TF change. On production especially, this should be done at a quiet time.
4. Consider [replaying content changes](https://docs.publishing.service.gov.uk/manual/fix-out-of-date-search-indices.html) that may have happened during the upgrade.

Rollback
--------

Rollback involves creating a new Elasticsearch cluster and restoring the snapshot.

Environment Sync
----------------

I don't expect this to affect the environment sync that occurs overnight.
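
Since the rollback described above depends on the manual snapshot, here is a minimal sketch of the two requests involved, using the standard Elasticsearch snapshot and restore endpoints. The function names, repository name and snapshot name are hypothetical, and the sketch only builds the requests: on AWS they would still need to be signed and sent to the domain endpoint, and the snapshot repository must already be registered.

```python
import json


def manual_snapshot_request(repository, snapshot, indices="*"):
    """Build the (method, path, body) for creating a manual snapshot.

    `repository` and `snapshot` are placeholders chosen by the operator;
    nothing here is specific to our cluster.
    """
    path = f"/_snapshot/{repository}/{snapshot}"
    body = json.dumps({
        "indices": indices,
        # Cluster-wide state is left out so a restore only touches index data.
        "include_global_state": False,
    })
    return "PUT", path, body


def restore_request(repository, snapshot, indices="*"):
    """Build the (method, path, body) for restoring that snapshot."""
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = json.dumps({
        "indices": indices,
        "include_global_state": False,
    })
    return "POST", path, body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for request in (
        manual_snapshot_request("manual-snapshots", "pre-6-8-upgrade"),
        restore_request("manual-snapshots", "pre-6-8-upgrade"),
    ):
        print(*request)
```

When restoring into a fresh cluster it may be necessary to narrow `indices` so that the restore does not collide with indices the new cluster has already created.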
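Our Process
-----------

1. Run the [preflight checks](https://docs.publishing.service.gov.uk/manual/reindex-elasticsearch.html#1-preflight-checks) to ensure that we don't have duplicate indices hanging around.
2. [Take a manual snapshot](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshot-create.html) of the index. We will remove any old snapshots.
3. Deploy this TF change to create a new cluster (blue/green) with the new version.
4. Monitor the new cluster and switch over.
5. Consider [replaying content changes](https://docs.publishing.service.gov.uk/manual/fix-out-of-date-search-indices.html) that may have happened during the upgrade.
6. Remove the old cluster if all is OK.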
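
The "monitor the new cluster and switch over" step could be backed by a small check like the sketch below. It takes the parsed JSON from `GET /` and `GET /_cluster/health` and lists reasons not to switch yet. The function name and the criteria (version 6.8.x, green status, no shards moving or unassigned) are assumptions for illustration, not an agreed checklist.

```python
def switchover_blockers(root_info, health, expected_prefix="6.8."):
    """Return a list of reasons the new cluster is not ready (empty if ready).

    root_info: parsed response of `GET /`
    health:    parsed response of `GET /_cluster/health`
    """
    blockers = []

    version = root_info.get("version", {}).get("number", "")
    if not version.startswith(expected_prefix):
        blockers.append(f"unexpected version: {version!r}")

    status = health.get("status")
    if status != "green":
        blockers.append(f"cluster status is {status!r}")

    # Shards still moving or unallocated suggest the cluster has not settled.
    for key in ("relocating_shards", "initializing_shards", "unassigned_shards"):
        count = health.get(key, 0)
        if count != 0:
            blockers.append(f"{key} = {count}")

    return blockers


if __name__ == "__main__":
    # Hand-written sample responses, not output from a real cluster.
    sample_root = {"version": {"number": "6.8.0"}}
    sample_health = {
        "status": "yellow",
        "relocating_shards": 0,
        "initializing_shards": 0,
        "unassigned_shards": 2,
    }
    for reason in switchover_blockers(sample_root, sample_health):
        print("not ready:", reason)
```

Returning the reasons rather than a bare boolean keeps the output useful when pasted into the PR or an incident channel.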

Rollback
--------

Rollback involves switching back to the old cluster and replaying missing data.

Environment Sync
----------------

I don't expect this to affect the environment sync that occurs overnight.