v3.1.0

@thegridman released this 28 Sep 09:57
Coherence Operator v3.1.0

🚫 THIS RELEASE IS DEPRECATED - DO NOT USE 🚫

⚠️ It appears that the upgrade to Operator SDK 1.0.0, which now uses Kubebuilder to generate the CRDs, has caused the name of the CRD to change slightly from coherence.coherence.oracle.com to coherences.coherence.oracle.com (the first segment is now plural). The only work-around was to delete the existing CRD, but that would also delete every Coherence cluster deployed with the previous CRD. This is obviously totally impractical.

This version of the Coherence Operator is compatible with previous 3.0.* versions; there should be no breaking changes, and Coherence yaml used with 3.0.* versions should work with 3.1.0.

Changes in Operator 3.1.0

Project Restructure

The biggest change from our perspective was the move to the final 1.0.0 release of the Operator SDK. Just before that release the Operator SDK team made big changes to their project, removing a lot of things and switching to Kubebuilder for much of the code generation and configuration. This meant that we had to reorganize the code and project layout. The Operator SDK also removed its test framework, which we had made extensive use of in our suite of end-to-end integration tests. Some things became simpler with Kubebuilder, but we still had to do work to refactor our tests. All of this is of course transparent to Coherence Operator users, but it was a sizeable piece of work for us.

Deployment

The change to using Kubebuilder, and using the features it provides, has meant that we have changed the default deployment options of the Coherence Operator. The recommended way to deploy the Coherence Operator with 3.1 is to deploy a single instance of the operator into a Kubernetes cluster and that instance monitors and manages Coherence resources in all namespaces. This is a change from previous versions where an instance of the operator was deployed into a namespace and only monitored that single namespace, meaning multiple instances of the operator could be deployed into a Kubernetes cluster.
There are various reasons why the new model is a better approach. The first is that the Coherence CRDs are deployed (or updated) by the Operator when it starts. In Kubernetes a CRD is a cluster scoped resource, so there can only be a single instance of any version of a CRD. We do not update the version of our CRD with every Operator release - we are currently at v1. This means that if two different versions of the Coherence Operator had been deployed into a Kubernetes cluster, the version of the CRD deployed would only match one of the operators (typically the last one deployed), and this could lead to subtle bugs or issues due to version mismatches. The second reason is that version 3.1 of the operator introduces admission web-hooks (more on that below). Like CRDs, admission web-hooks are effectively cluster scoped, so having multiple web-hooks deployed for a single CRD may cause issues.
It is possible to deploy the Coherence Operator with a list of namespaces to monitor instead of monitoring all namespaces, and hence to run multiple operators monitoring different namespaces, but we would not advise this.
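If you do need to restrict the operator, the sketch below shows one way this could look. It assumes the conventional Operator SDK WATCH_NAMESPACE environment variable and uses illustrative namespace and image names; check the documentation for the exact mechanism used by your deployment method.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coherence-operator
  namespace: operator-ns                   # illustrative namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: coherence-operator
  template:
    metadata:
      labels:
        app: coherence-operator
    spec:
      containers:
        - name: operator
          image: oracle/coherence-operator:3.1.0
          env:
            - name: WATCH_NAMESPACE        # assumed mechanism; comma-separated list
              value: "payments,orders"     # only these namespaces are monitored
```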

Admission Web-Hooks

Version 3.1 of the operator introduces the use of admission web-hooks. In Kubernetes an admission web-hook can be used for mutating a resource (typically applying defaults) and for validating a resource. The Coherence Operator uses both: we apply default values to some fields, and we validate fields when a Coherence resource is created or updated. In previous versions of the operator it was possible to see issues caused by creating a Coherence resource with invalid values in some fields, for example altering a persistent volume when updating, setting invalid NodePort values, etc. These errors were not detected until after the Coherence resource had been accepted by Kubernetes and a StatefulSet or Service was created and subsequently rejected by Kubernetes, causing errors in the operator's reconcile loop. With a validation web-hook, a Coherence resource with invalid values will not even make it into Kubernetes.
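As a rough illustration, a validating web-hook would reject a resource like the sketch below before it is ever stored, because the node port falls outside the range Kubernetes allows (30000-32767). The names and values here are purely illustrative, not taken from the documentation.

```yaml
apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: bad-cluster            # illustrative name
spec:
  replicas: 3
  ports:
    - name: extend
      port: 20000
      nodePort: 99999          # invalid: node ports must be within 30000-32767
```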

Kubernetes Autoscaler

Back in version 3.0 of the operator we supported the scale sub-resource, which allowed scaling of a Coherence deployment using built-in Kubernetes scale commands, such as kubectl scale. In version 3.1 we have taken this further with a full end-to-end example of integrating a Coherence cluster into the Kubernetes Horizontal Pod Autoscaler, showing how to scale a cluster based on metrics produced by Coherence. This allows a Coherence cluster to grow as its resource requirements increase, for example as heap use increases. This is by no means an excuse not to do any capacity planning for your applications, but it does offer a useful way to use your Kubernetes resources on demand.
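The sketch below is not the full example from the documentation, but it illustrates the shape of the integration: because the Coherence resource exposes the scale sub-resource, an HPA can target it directly. The metric name is a placeholder for whichever Coherence metric you actually expose.

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-cluster-hpa                     # illustrative name
spec:
  scaleTargetRef:
    apiVersion: coherence.oracle.com/v1
    kind: Coherence                        # works because Coherence exposes the scale sub-resource
    name: my-cluster
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: heap_after_gc_percentage   # placeholder metric name
        target:
          type: AverageValue
          averageValue: "80"
```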

Graceful Cluster Shutdown

As a resilient data store, Coherence handles Pods leaving the cluster by recovering the lost data from backup and re-balancing the cluster. This is exactly what we need while the cluster is running, but not when we just want to stop the whole cluster at once: Pods will not all die together, and those left will be working hard to recover data as other Pods leave the cluster. If a Coherence resource is deleted from Kubernetes (or scaled down to a replica count of zero) the Coherence Operator will now suspend all storage enabled cache services in that deployment before Pods are stopped. This allows for a more controlled cluster shut-down and subsequent recovery when the cluster is brought back up.
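For example, a minimal sketch, assuming an existing Coherence resource named my-cluster: applying the yaml below scales the deployment to zero, so the operator suspends the storage enabled cache services before the Pods stop.

```yaml
apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: my-cluster   # illustrative name of an existing deployment
spec:
  replicas: 0        # scaling to zero triggers a controlled suspend-then-stop
```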

Spring Boot Image Support

Spring Boot is a popular framework that we have big plans for in upcoming Coherence CE releases. One feature of Spring Boot is the way it packages an application into a jar, and then how Spring Boot builds images from the application jar. This could lead to problems trying to deploy those types of application using the Coherence Operator. The simplest way to package a Spring Boot application into an image for use by the Coherence Operator is to use JIB. The JIB Gradle or Maven plugins will properly package a Spring Boot application into an image that just works out of the box with the Coherence Operator.
Spring Boot images built using the latest Spring Boot Gradle or Maven plugins use Cloud Native Buildpacks to produce an image. The structure of these images, and how they are run, is quite different from that of a simple Java application. There are pros and cons to this, but given the popularity of the framework and its tooling, it is important that the Coherence Operator can manage Coherence applications built and packaged this way. With version 3.1 of the operator these images can be managed with the addition of one or two extra fields in the Coherence resource yaml, as sketched below.
Finally, if you really wish to put your Spring Boot fat-jar into an image (and there are reasons why this is not recommended), the Coherence resource has configuration options that will allow this to work too.
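As a rough illustration of those extra fields, here is a minimal sketch of a Coherence resource for a Spring Boot application. The application.type value and the image name are assumptions based on our reading of the 3.1 documentation, not a definitive recipe.

```yaml
apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: spring-app               # illustrative name
spec:
  image: my-spring-app:1.0.0     # hypothetical Spring Boot application image
  application:
    type: spring                 # assumed field telling the operator how to run the image
```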

Tested on Kubernetes 1.19

With the recent release of Kubernetes 1.19 we have added this to our certification test suite. We now test the Coherence Operator on all Kubernetes versions from 1.12 to 1.19 inclusive.