Conversation
Signed-off-by: Reshma Abdul Rahim <reshmarahim.abdul@microsoft.com>
> #### Option 2: Adding it as part of extensions
> `extensions` have been the way to provide punch-through capabilities which do not change the behaviors of the resource. The following is an example of adding autoscaling as part of the extensions.

Minor correction: `runtimes` is the punch-through mechanism, not `extensions`.
> Cons: not straightforward or intuitive, as the user doesn't know which autoscale config is required for the platform; needs a platform-specific discriminator.
> Maybe this is not a developer problem and should be handled by the platform operator.
> #### Option 3: Adding it as a top-level common property

I think Option 3 makes the most sense IF HPA and/or HTTP autoscaling and their corresponding properties are common across container platforms.
> Cons: same as above, not straightforward or intuitive, as the user doesn't know which autoscale config is required for the platform. Maybe this is not a developer problem and should be handled by the platform operator.
> ### Configuring Autoscaling in Radius Environment

Would autoscaling policies specified in the container override the autoscaling policies set for the environment?

> 1. The platform engineer can configure the scaling policies once, and all the applications deployed in the environment will inherit the scaling policies.
> 1. The developer can focus on the application modelling and not worry about the scaling policies. If needed, the developer can override the scaling policies in the application resource definition; for example, if the developer wants to scale based on events from Kafka or RabbitMQ modelled in the application definition.
> ### Configuring Autoscaling as a core resource type

I don't follow this completely. I see Radius resources as objects it needs to provision or create at deploy time, but autoscaling is not an object; it is a configuration. I think this kind of pattern might be better handled through Bicep params instead?
> ## Overview
>
> Autoscaling is a critical capability for modern cloud applications, enabling them to dynamically adjust resources based on demand. Platform engineers and developers need the ability to optimize resource utilization when deploying applications using Radius across different runtime environments. This document outlines the design and the user experience for configuring autoscaling policies in Radius applications.
Suggested change:

> Autoscaling is a critical capability required for cloud-native applications to provide a performant experience for their users. Platform engineers and developers must collaborate to ensure the application behaves well as it scales up and scales down, and that the platform maximizes utilization of the available computing resources. Neither persona can successfully configure autoscaling without the other. This document outlines the design and the user experience for configuring autoscaling policies in Radius applications.
> The following are the most common autoscaling mechanisms available in the cloud-native ecosystem.
>
> **Kubernetes**
> 1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes-native autoscaling mechanism that scales the number of pods in a deployment based on observed CPU utilization, memory, and other custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.
Suggested change:

> 1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes-native autoscaling mechanism that scales the number of pods in a deployment based on resource metrics (CPU and memory utilization) or custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.
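For concreteness, a minimal HPA manifest of the kind described above might look like the following sketch (the `demo` names are illustrative, not from the document):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-hpa                # illustrative name
spec:
  scaleTargetRef:               # the workload being scaled
    apiVersion: apps/v1
    kind: Deployment
    name: demo
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```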
> 2. **Vertical Pod Autoscaler (VPA)** - Kubernetes-native autoscaling mechanism that automatically adjusts the CPU and memory requests of the pods. This is the least common autoscaling mechanism used in the Kubernetes ecosystem, as it requires restarting the pods to apply the new resource requests.
Suggested change:

> 2. **Vertical Pod Autoscaler (VPA)** - Kubernetes-native autoscaling mechanism that automatically adjusts the CPU and memory requests (the minimum) of the pods up to the limit (the maximum). This is the least common autoscaling mechanism used in the Kubernetes ecosystem, as it requires restarting the pods to apply the new resource requests.
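A sketch of a VPA object illustrating the request bounds mentioned in the suggestion (names and values are illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: demo-vpa                # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo
  updatePolicy:
    updateMode: "Auto"          # pods are evicted and recreated to apply new requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:             # floor for the adjusted requests
          cpu: 100m
          memory: 128Mi
        maxAllowed:             # ceiling for the adjusted requests
          cpu: "1"
          memory: 1Gi
```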
> 3. **KEDA** - Kubernetes Event-driven Autoscaling (KEDA) is an open-source component that enables autoscaling of Kubernetes workloads based on external metrics. KEDA operates on top of the HPA and triggers scaling based on metrics from various sources, such as message queues, databases, or observability platforms.
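As an illustration of the event-driven model, a KEDA `ScaledObject` scaling on RabbitMQ queue length might look like this (the queue and secret names are hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: demo-scaler             # illustrative name
spec:
  scaleTargetRef:
    name: demo                  # Deployment to scale
  minReplicaCount: 0            # KEDA can scale to zero
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders       # hypothetical queue
        mode: QueueLength
        value: "20"             # target messages per replica
      authenticationRef:
        name: rabbitmq-auth     # TriggerAuthentication holding the connection string
```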
> **Serverless Container platforms**
> 1. **Azure Container Instances and Apps** - Azure Container Instances doesn't provide an inbuilt solution to scale automatically. Container Apps provides scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.
Suggested change:

> 1. **Azure Container Instances** - Azure Container Instances today only supports manual scaling, using the `desiredCount` property on the NGroup. When autoscaling is available for NGroups, this section will be updated.
> 2. **Azure Container Apps** - Azure Container Apps provides scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.
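For reference, the HTTP-based scaling described for Container Apps lives in the app's template, e.g. in the YAML accepted by `az containerapp create --yaml` (the values shown are illustrative):

```yaml
properties:
  template:
    scale:
      minReplicas: 1
      maxReplicas: 10
      rules:
        - name: http-rule
          http:
            metadata:
              concurrentRequests: "100"   # scale out above 100 concurrent requests per replica
```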
> 2. **AWS Fargate and App Runner** - AWS Fargate provides autoscaling based on CPU, memory, and CloudWatch metrics. App Runner provides autoscaling based on HTTP traffic.
This should be ECS, not Fargate. And ignore App Runner; usage is almost nil.
> A developer specifies a simple HPA autoscaling configuration in the container resource definition. Below are the various options to configure autoscaling in the container resource type.
> #### Option 1: Adding autoscaling in the compute configuration
For consistency with the manualScaling extension, let's assume option 2. I don't see any reason we would do option 1 or 3. I also think having these options here takes away from your broader options. The options we should be debating aren't these; they're:
- Option 1 - Autoscaling specified as part of the environment configuration, with applications able to opt in via properties in the container resource definition
- Option 2 - Autoscaler as a resource type analogous to a gateway, which enables platform engineers to delegate to developers via RBAC
> 1. More configuration to manage across different runtimes.
> 2. The developer needs to know the underlying platform and the autoscaling policies required for the platform.

> #### Option 2: Adding it as part of extensions
I don't see this as a viable approach. If the maxReplicas is hard-coded in the application definition, there would be no difference based on the environment. That doesn't meet the requirements. If that's the case, we're wasting our time considering this as an option. Is there some way to make this environment-responsive? I don't see how.

> Pros:
> 1. Autoscaling is an infrastructure problem; the platform engineer has the ability to configure the autoscaling policies in the environment configuration, thus separating the concerns of the platform engineer and the developer.
> 1. Dev and test environments wouldn't require autoscaling policies, and the platform engineer has more flexibility to configure the scaling policies based on the environment.
> 1. The platform engineer can configure the scaling policies once, and all the applications deployed in the environment will inherit the scaling policies.
This is a bad thing. There must be a way for developers to opt their container into autoscaling, and there must be a method for operations engineers to tune autoscaling on a per-container basis.

> 1. The developer can focus on the application modelling and not worry about the scaling policies. If needed, the developer can override the scaling policies in the application resource definition; for example, if the developer wants to scale based on events from Kafka or RabbitMQ modelled in the application definition.
> ### Configuring Autoscaling as a core resource type
The biggest advantage here is the ability to delegate via RBAC; that's missing from your pros. But I still don't quite see how to make this environment-specific.
> # Application Autoscaling
>
> ## Overview
It may be better to list user stories. Here's a first cut:
- As a platform engineer, I need to enable autoscaling in an environment. I want my developers to be able to opt one of their container services into autoscaling and specify the container's scaling metric.
- As a developer, I need to configure autoscaling for my container. Some of my containers use resource metrics and others use a custom metric provided by my application.
- As an operations engineer, I need to tune each container to maximize application performance and resource utilization.
|
This pull request has been automatically marked as stale because it has been inactive for 90 days. Remove stale label or comment or this will be closed in 7 days. |
|
This pull request has been closed due to inactivity. Feel free to reopen if you are still working on it. |