Autoscaling Feature spec #86

Closed
Reshrahim wants to merge 2 commits intoradius-project:mainfrom
Reshrahim:autoscale

Conversation

@Reshrahim

No description provided.

Signed-off-by: Reshma Abdul Rahim <reshmarahim.abdul@microsoft.com>
Signed-off-by: Reshma Abdul Rahim <reshmarahim.abdul@microsoft.com>

#### Option 2: Adding it as part of extensions

`extensions` have been the way to provide punch-through capabilities that do not change the behaviors of the resource. The following is an example of adding autoscaling as part of the extensions.

minor correction: `runtimes` is the punch-through

Cons: not straightforward or intuitive, as the user doesn't know which autoscale config is required for the platform; needs a platform-specific discriminator.
Maybe this is not a developer problem and should be handled by the platform operator.

#### Option 3: Adding it as top level common property

I think Option 3 makes the most sense IF hpa and/or http auto-scaling and their corresponding properties are common across container platforms.


Cons: same as above; not straightforward or intuitive, as the user doesn't know which autoscale config is required for the platform. Maybe this is not a developer problem and should be handled by the platform operator.

### Configuring Autoscaling in Radius Environment

Would autoscaling policies specified in the container override the autoscaling policies set for the environment?

1. The platform engineer can configure the scaling policies once and all the applications deployed in the environment will inherit the scaling policies.
1. Developers can focus on application modelling and not worry about scaling policies. If needed, the developer can override the scaling policies in the application resource definition, for example to scale based on events from Kafka or RabbitMQ modelled in the application definition.

### Configuring Autoscaling as a core resource type

I don't follow this completely - I see Radius resources as objects it needs to provision or create at deploy time, but autoscaling is not an object, but rather a configuration. I think this kind of pattern might be better handled through bicep params instead?


## Overview

Autoscaling is a critical capability for modern cloud applications, enabling them to dynamically adjust resources based on demand. Platform engineers and developers need the ability to optimize the resource utilization when deploying applications using Radius across different runtime environments. This document outlines the design and the user experience for configuring autoscaling policies in Radius applications.

Suggested change
Autoscaling is a critical capability for modern cloud applications, enabling them to dynamically adjust resources based on demand. Platform engineers and developers need the ability to optimize the resource utilization when deploying applications using Radius across different runtime environments. This document outlines the design and the user experience for configuring autoscaling policies in Radius applications.
Autoscaling is a critical capability required for cloud-native applications to provide a performant experience for their users. Platform engineers and developers must collaborate to ensure the application behaves well as it scales up and scales down, and that the platform maximizes utilization of the available computing resources. It is not possible for only one persona to successfully configure autoscaling without the other. This document outlines the design and the user experience for configuring autoscaling policies in Radius applications.

The following are the most common autoscaling mechanisms available in the cloud-native ecosystem.

**Kubernetes**
1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes native autoscaling mechanism that scales the number of pods in a deployment based on observed CPU utilization, Memory and other custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.

Suggested change
1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes native autoscaling mechanism that scales the number of pods in a deployment based on observed CPU utilization, Memory and other custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.
1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes native autoscaling mechanism that scales the number of pods in a deployment based on resource metrics (CPU and memory utilization) or custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.


**Kubernetes**
1. **Horizontal Pod Autoscaler (HPA)** - Kubernetes native autoscaling mechanism that scales the number of pods in a deployment based on observed CPU utilization, Memory and other custom metrics. This is the most common autoscaling mechanism used in the Kubernetes ecosystem.
2. **Vertical Pod Autoscaler (VPA)** - Kubernetes native autoscaling mechanism that automatically adjusts the CPU and memory requests of the pods. This is the least common autoscaling mechanism used in the Kubernetes ecosystem as it requires restarting the pods to apply the new resource requests.

Suggested change
2. **Vertical Pod Autoscaler (VPA)** - Kubernetes native autoscaling mechanism that automatically adjusts the CPU and memory requests of the pods. This is the least common autoscaling mechanism used in the Kubernetes ecosystem as it requires restarting the pods to apply the new resource requests.
2. **Vertical Pod Autoscaler (VPA)** - Kubernetes native autoscaling mechanism that automatically adjusts the CPU and memory requests (the minimum) of the pods up to the limit (the maximum). This is the least common autoscaling mechanism used in the Kubernetes ecosystem as it requires restarting the pods to apply the new resource requests.

3. **KEDA** - Kubernetes Event-driven Autoscaling (KEDA) is an open-source component that enables autoscaling of Kubernetes workloads based on external metrics. KEDA operates on top of the HPA and triggers scaling based on metrics from various sources, such as message queues, databases, or observability platforms.
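For reference, the core scaling decision behind HPA (which KEDA also drives under the hood) can be sketched in a few lines of Python. This is a simplified model of the documented algorithm, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`; real HPA additionally applies tolerances, stabilization windows, and per-pod metric averaging:

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 10) -> int:
    """Simplified model of the HPA scaling rule:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric),
    clamped to the [minReplicas, maxReplicas] range."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# e.g. 4 pods averaging 90% CPU against a 60% target -> scale out to 6
print(hpa_desired_replicas(4, 90, 60))  # 6
```

Whichever option Radius chooses, the user-facing configuration ultimately reduces to supplying the target metric and the replica bounds in this formula.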

**Serverless Container platforms**
1. **Azure Container Instances and Apps** - Azure container instances doesn't provide an inbuilt solution to automatically scale. Container apps provide scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.

Suggested change
1. **Azure Container Instances and Apps** - Azure container instances doesn't provide an inbuilt solution to automatically scale. Container apps provide scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.
1. **Azure Container Instances** - Azure Container Instances today only supports manual scaling, using the `desiredCount` property on the NGroup. When autoscaling is available for NGroups, this section will be updated.
2. **Azure Container Apps** – Azure Container Apps provide scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.


**Serverless Container platforms**
1. **Azure Container Instances and Apps** - Azure container instances doesn't provide an inbuilt solution to automatically scale. Container apps provide scaling based on HTTP traffic and other event-driven triggers (KEDA). For web apps, the preferred scaling mechanism is based on HTTP traffic. For event-driven workloads, the preferred scaling mechanism is based on KEDA.
2. **AWS Fargate and App Runner** - AWS Fargate provides autoscaling based on CPU, memory and CloudWatch metrics. App Runner provides autoscaling based on HTTP traffic.

This should be ECS not Fargate. And ignore AppRunner. Usage is almost nil.


A developer specifies a simple HPA autoscaling configuration in the container resource definition. Below are the various options to configure autoscaling in the container resource type.

#### Option 1: Adding autoscaling in the compute configuration

For consistency with the manualScaling extension, let's assume option 2. I don't see any reason we would do option 1 or 3. I also think having these options here takes away from your broader options. The options we should be debating aren't these; they're:

  • Option 1 – Autoscaling specified as part of the environment configuration with applications able to opt in via properties in the container resource definition
  • Option 2 – Autoscaler as a resource type analogous to a gateway, which enables platform engineers to delegate to developers via RBAC

1. More configuration to manage across different runtimes.
2. The developer needs to know the underlying platform and the autoscaling policies required for the platform.

#### Option 2: Adding it as part extensions

I don't see this as a viable approach. If the maxReplicas is hard coded in the application definition, there would be no difference based on the environment. That doesn't meet the requirements. If that's the case, we're wasting our time considering this as an option. Is there some way to make this environment-responsive? I don't see how.

Pros:
1. Autoscaling is an infrastructure problem; the platform engineer has the ability to configure the autoscaling policies in the environment configuration, thus separating the concerns of the platform engineer and the developer.
1. Dev, test environments wouldn't require autoscaling policies and the platform engineer has more flexibility to configure the scaling policies based on the environment.
1. The platform engineer can configure the scaling policies once and all the applications deployed in the environment will inherit the scaling policies.

This is a bad thing. There must be a way for developers to opt their container into autoscaling, and there must be a method for operations engineers to tune autoscaling on a per container basis.

1. The platform engineer can configure the scaling policies once and all the applications deployed in the environment will inherit the scaling policies.
1. Developers can focus on application modelling and not worry about scaling policies. If needed, the developer can override the scaling policies in the application resource definition, for example to scale based on events from Kafka or RabbitMQ modelled in the application definition.
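The inheritance-with-override behavior described above can be reasoned about as a simple merge: the environment supplies default scaling policies, and any setting the developer puts on the container wins. A minimal sketch, using hypothetical policy dictionaries (none of these keys are part of the Radius API today):

```python
from typing import Optional

def effective_scaling_policy(environment_policy: dict,
                             container_policy: Optional[dict]) -> dict:
    """Container-level settings override environment defaults key by key;
    keys the developer does not set fall back to the environment values."""
    merged = dict(environment_policy)
    merged.update(container_policy or {})
    return merged

env = {"minReplicas": 2, "maxReplicas": 10, "metric": "cpu", "target": 70}
# Developer opts this container into event-driven scaling (e.g. a Kafka trigger)
app = {"metric": "kafka-lag", "target": 100}
print(effective_scaling_policy(env, app))
# {'minReplicas': 2, 'maxReplicas': 10, 'metric': 'kafka-lag', 'target': 100}
```

A container with no policy of its own simply inherits the environment defaults, which matches the "configure once, all applications inherit" pro listed above.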

### Configuring Autoscaling as a core resource type

The biggest advantage here is the ability to delegate via RBAC, that's missing from your pros.

But I still don't quite see how to make this environment-specific.

@@ -0,0 +1,239 @@
# Application Autoscaling

## Overview

It may be better to list user stories. Here's a first cut:

  1. As a platform engineer, I need to enable autoscaling in an environment. I want my developers to be able to opt one of their container services into autoscaling and specify the container's scaling metric.
  2. As a developer, I need to configure autoscaling for my container. Some of my containers use resource metrics and others use a custom metric provided by my application.
  3. As an operations engineer, I need to tune each container to maximize application performance and resource utilization.

@github-actions

github-actions bot commented Jun 5, 2025

This pull request has been automatically marked as stale because it has been inactive for 90 days. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 5, 2025
@github-actions

This pull request has been closed due to inactivity. Feel free to reopen if you are still working on it.

@github-actions github-actions bot closed this Jun 13, 2025
