Confidential Container support for ACI #92
Signed-off-by: sk593 <[email protected]>
**Azure CG Profile handler** | ||
To enable confidential container support, the Azure Container Group Profile handler requires updates. Following the official guidance for confidential containers [here](https://learn.microsoft.com/en-us/azure/container-instances/container-instances-tutorial-deploy-confidential-containers-cce-arm), Radius will automate the setup process through these implementation steps: | ||
|
||
1. Prior to deployment, the Azure CLI and `confcom` extension should have been downloaded. This would have been set up during `rad init` by the user when they elected to support confidential containers on ACI. |
this won't be supported with GitOps workflow
this would also affect the air-gapped environments
|
||
#suppress "@azure-tools/typespec-azure-core/bad-record-type" | ||
@doc("A merge will be applied to the ContainerGroupProfile object when this container is being deployed.") | ||
model ContainerGroupProfile is Record<unknown>; |
we could populate this with some reasonable fields for the build demo
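The doc comment on `ContainerGroupProfile` says a merge is applied at deployment time. As a rough sketch of what such a merge could look like, the snippet below recursively overlays user-supplied punch-through properties on a set of defaults. The function name `merge_profile`, the `DEFAULT_PROFILE` values, and the rule that scalars and lists replace defaults outright are all assumptions for illustration, not the behavior Radius implements.

```python
from copy import deepcopy
from typing import Any


def merge_profile(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with `overrides` merged recursively onto `defaults`."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them key by key.
            merged[key] = merge_profile(merged[key], value)
        else:
            # Scalars and lists from the user replace the default outright (assumed rule).
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults a renderer might start from.
DEFAULT_PROFILE = {
    "properties": {
        "sku": "Standard",
        "osType": "Linux",
        "restartPolicy": "Always",
    }
}

# User-supplied punch-through that only overrides the SKU.
user_profile = {"properties": {"sku": "Confidential"}}

effective = merge_profile(DEFAULT_PROFILE, user_profile)
```

Because the merge works on copies, the defaults stay untouched and can be reused for every container that is rendered.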
1. Prior to deployment, the Azure CLI and `confcom` extension should have been downloaded. This would have been set up during `rad init` by the user when they elected to support confidential containers on ACI. | ||
2. In the ACI renderer, populate a container group profile object using the default values or the user-provided inputs. See the previous section on the ACI renderer for details. | ||
3. In the container group profile handler, generate a temporary ARM template that contains the same data as the container group profile object. The container group profile object is passed as an input to the handler. | ||
4. Run the following command on the ARM template to generate a CCE policy: `az confcom acipolicygen -a .\template.json`. This command modifies the existing template to include the base64 encoded CCE policy. |
Are there any changes here based on what the team was saying about policy fragments that are published with the containers to the registry?
I don't think so. From what I understand, policy fragments are added onto the existing CCE policy so we'd still need to generate an initial one
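Steps 3 and 4 of the handler flow quoted above can be sketched as follows. This is a minimal illustration, not the handler itself: the helper names, the `eastus` location, the API version, and the assumption that `az confcom acipolicygen` accepts a `Microsoft.ContainerInstance/containerGroupProfiles` resource are all hypothetical. The sketch only builds the command; it does not run it.

```python
import json
import tempfile
from pathlib import Path

# Assumed values for illustration; the design does not pin these down.
PROFILE_RESOURCE_TYPE = "Microsoft.ContainerInstance/containerGroupProfiles"
PROFILE_API_VERSION = "2024-05-01-preview"


def build_arm_template(name: str, location: str, properties: dict) -> dict:
    """Wrap container group profile data in a minimal ARM template."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": PROFILE_RESOURCE_TYPE,
                "apiVersion": PROFILE_API_VERSION,
                "name": name,
                "location": location,
                "properties": properties,
            }
        ],
    }


def write_template(template: dict, directory: Path) -> Path:
    """Write the template as template.json inside `directory`."""
    path = directory / "template.json"
    path.write_text(json.dumps(template, indent=2))
    return path


def policygen_command(template_path: Path) -> list[str]:
    """Build the command from step 4, which rewrites the template in place."""
    return ["az", "confcom", "acipolicygen", "-a", str(template_path)]


with tempfile.TemporaryDirectory() as tmp:
    template = build_arm_template(
        "mycgname", "eastus", {"sku": "Confidential", "containers": []}
    )
    template_path = write_template(template, Path(tmp))
    command = policygen_command(template_path)
    written = json.loads(template_path.read_text())
    # A real handler would now run the command, e.g. subprocess.run(command, check=True).
```

Using a temporary directory keeps the generated template out of the user's workspace and ensures it is cleaned up even if policy generation fails.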
Enter an environment name | ||
> default | ||
|
||
Select the compute for the environment |
Suggested change: replace `Select the compute for the environment` with `Select the container platform for the environment`.
Can the flow be modified to:
Select the cloud platform for the environment [Kubernetes-only, AWS, Azure]
- If the user selects Azure,
Select the container platform for the environment [Kubernetes, Azure Container Instances]
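The two-step prompt proposed in this comment can be expressed as a small lookup. The sketch below is hypothetical: the function name is invented, and returning an empty list (no second prompt) for the non-Azure choices is an assumption, since the comment only describes the Azure branch.

```python
CLOUD_PLATFORMS = ["Kubernetes-only", "AWS", "Azure"]
AZURE_CONTAINER_PLATFORMS = ["Kubernetes", "Azure Container Instances"]


def container_platform_choices(cloud: str) -> list[str]:
    """Return the options for the second prompt, or [] if no second prompt is needed."""
    if cloud not in CLOUD_PLATFORMS:
        raise ValueError(f"unknown cloud platform: {cloud!r}")
    if cloud == "Azure":
        return list(AZURE_CONTAINER_PLATFORMS)
    # Assumption: other platforms skip the container platform prompt.
    return []


azure_choices = container_platform_choices("Azure")
aws_choices = container_platform_choices("AWS")
```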
|
||
Select the compute for the environment | ||
> 1. kubernetes | ||
> 2. aci |
Suggested change: replace `> 2. aci` with `> 2. Azure Container Instances`.
> default | ||
|
||
Select the compute for the environment | ||
> 1. kubernetes |
Suggested change: replace `> 1. kubernetes` with `> 1. Kubernetes`.
> 1. kubernetes | ||
> 2. aci | ||
|
||
Set up support for confidential containers? Selecting 'Yes' will install the Azure CLI and the `confcom` extension that are required to support confidential containers on ACI. |
Suggested change: in the prompt text, replace "that are required" with "which are required".
] | ||
runtimes: { | ||
+ aci: { | ||
+ containergroupprofile: { |
Why are we asking the developer for the container group profile name? We don't ask them for a pod name.
This is for punchthroughs specifically. We'll have default values set for the container group profile but will allow users to override properties if they choose to
2. Create an NGroups object | ||
3. Create an ACI instance | ||
|
||
In addition, we will need to support an `aci` property and `containergroupprofile` property under `runtimes`. |
Can you explain why this is a requirement?
We need a way to pass punchthroughs from Bicep to Radius. Since we already have a `kubernetes` property that allows for podspec, we'll add an analogous `aci` property for container group profiles. The user doesn't have to set these properties if they don't want to.
|
||
Ted is a developer trying to run his applications on a serverless compute platform. His application needs to run on confidential containers. His E2E experience defining a Bicep template using ACI compute looks like this: | ||
|
||
**Setup** |
What would happen if Ted creates an environment via `rad deploy env.bicep` instead of `rad init --full`? Would he have to do anything different to enable confidential computing?
Updated to remove changes to `rad init`. We'll continue to specify compute through the environment Bicep definition.
+ name: 'mycgname' | ||
+ // Add ACI-specific properties here to punch-through the Radius abstraction, e.g. sku, osType, etc. These should be in the same format as the Bicep definition for a container group profile | ||
+ properties: { | ||
+ sku: 'Confidential' // 'Standard', 'Dedicated', etc. |
Would this schema be exactly the same as the current schema for container group profiles?
Will Radius validate the user input against the cgp schema?
Yes, this is the plan. We'd keep it as close as possible to the Bicep schema (ideally the exact same) so there's less load in defining punchthrough properties
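To make the validation question concrete, here is a minimal sketch of checking punch-through properties before they are merged. It only checks `sku`; the allow-list is an assumption based on the three values named in the diff comment (which ends in "etc."), and the function name and error format are hypothetical. A real implementation would presumably validate against the full container group profile schema.

```python
# Assumed allow-list; the design only names these three values and ends with "etc."
ALLOWED_SKUS = {"Standard", "Dedicated", "Confidential"}


def validate_profile_properties(properties: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(properties, dict):
        return ["properties: expected an object"]
    errors: list[str] = []
    sku = properties.get("sku")
    if sku is not None and sku not in ALLOWED_SKUS:
        allowed = ", ".join(sorted(ALLOWED_SKUS))
        errors.append(f"sku: unsupported value {sku!r}; expected one of: {allowed}")
    return errors


ok = validate_profile_properties({"sku": "Confidential"})
bad = validate_profile_properties({"sku": "confidential"})  # wrong case is rejected
```

Returning a list of problems rather than raising on the first one lets the caller report every invalid property in a single deployment error.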
|
||
## Overview | ||
|
||
The integration of Azure Container Instances (ACI) with Radius aims to enable deployments of applications and containers using the Radius control plane in Kubernetes to target ACI environments. One of the key features we want to implement is the use of confidential containers to enhance data security and privacy. Confidential containers on Azure provide a set of features and capabilities to further secure your standard container workloads. They run in a hardware-backed Trusted Execution Environment (TEE) that provides intrinsic capabilities like data integrity, data confidentiality, and code integrity. |
Are we just going to deploy applications and containers? Is there anything else we could deploy with ACI that we would defer to future iterations?
ACI is for serverless containers so it'll just be containers for now
|
||
Confidential containers require a CCE policy to be set on the container group profile associated with ACI instances. CCE policies can only be generated using an extension on the Azure CLI. As a result, Radius will need access to the Azure CLI and the `confcom` extension to support this feature. More information on confidential containers with ACI can be found [here](https://learn.microsoft.com/en-us/azure/container-instances/container-instances-tutorial-deploy-confidential-containers-cce-arm) |
Could this be a deal breaker for some potential users because they may think that it is a vendor lock-in? Can users opt out of this option, or is it going to come out of the box when you run `rad init`, for example?
I don't think this would be a vendor lock-in. Users would opt-in to use ACI by specifying a different compute than Kubernetes.
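Since the CCE policy ends up base64-encoded inside the generated template, the handler needs some way to read it back out. The sketch below assumes the policy lives under `properties.confidentialComputeProperties.ccePolicy` on the resource, which is how ARM exposes it for container groups; whether container group profiles use the same path is an assumption, and the placeholder policy text is not a real CCE policy.

```python
import base64

PLACEHOLDER_POLICY = "package policy\n"  # stand-in text, not a real CCE policy


def extract_cce_policy(template: dict) -> str:
    """Return the decoded policy from the first resource that carries one."""
    for resource in template.get("resources", []):
        confidential = resource.get("properties", {}).get(
            "confidentialComputeProperties", {}
        )
        encoded = confidential.get("ccePolicy")
        if encoded:
            return base64.b64decode(encoded).decode("utf-8")
    raise KeyError("template contains no ccePolicy")


# A hand-built template standing in for the output of policy generation.
generated_template = {
    "resources": [
        {
            "properties": {
                "confidentialComputeProperties": {
                    "ccePolicy": base64.b64encode(
                        PLACEHOLDER_POLICY.encode("utf-8")
                    ).decode("ascii")
                }
            }
        }
    ]
}

decoded_policy = extract_cce_policy(generated_template)
```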
|
||
- A serverless extensibility design. This document focuses only on the components of Radius relevant to confidential container deployment. The compute extensibility design can be found [here](https://github.com/radius-project/design-notes/pull/91). | ||
- Multi-compute support per environment will not be available initially. | ||
- Full Dapr parity will not be available initially. |
What is this for?
I mean Dapr in this context.
This is for using Dapr with ACI containers, but it's out of scope for this design
--> | ||
|
||
#### Bicep changes | ||
The Bicep schema will be updated to include a field for the `containergroupprofile` data and properties. These changes will be automatically generated through the TypeSpec schema changes. |
Is `containergroupprofile` supposed to be all lower case?
|
||
Option 1: | ||
|
||
We could require the user to create a container group profile first and pass the resource to Radius. This would eliminate the need for Radius to install the `confcom` extension, generate ARM templates, and extract the CCE policy. |
I think this could also be a good option for users to use. They may go with this option or let Radius create the container group profile. i.e. I think we should offer both.
} | ||
``` | ||
|
||
**Azure CG Profile handler** |
Can any of these operations be done on the client side?
We discussed requiring the user to define a container group profile and passing that to Radius, but ultimately decided against it to ensure that Radius handles the full setup for ACI
We have a proposed change to the data model for environments that looks like this. Given the more long-term proposal on the table for extensibility, let's discuss whether this still makes sense, as I think it would be removed in the long term.

```diff
 param aciscope string = '/subscriptions/<subscription id>/resourceGroups/<resource group name>'
 resource env 'Applications.Core/environments@2023-10-01-preview' = {
   name: 'aci-env'
   properties: {
     compute: {
-      kind: 'kubernetes'
-      namespace: 'default'
-      identity: {
-        kind: 'azure.com.workload'
-        oidcIssuer: oidcIssuer
-      }
       // Either aci or kubernetes can be specified, but not both.
+      aci: {
+        resourceGroup:
+        identity: {
+          kind: 'managed-identity'
+          applicationResourceId: ''
+        }
+      }
+      kubernetes: {
+        namespace: 'default'
+        identity: {
+          kind: 'azure.com.workload'
+          oidcIssuer: oidcIssuer
+        }
+      }
     }
     providers: {
       azure: {
         scope: aciscope
       }
     }
   }
 }
```
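The "either `aci` or `kubernetes`, but not both" rule in the proposed data model implies a validation step somewhere. A minimal sketch of that check follows; the function name is hypothetical, and treating a `compute` block with neither key as an error is an assumption.

```python
COMPUTE_KINDS = ("aci", "kubernetes")


def validate_compute(compute: dict) -> str:
    """Return the single configured compute kind, or raise if zero or several are set."""
    present = [kind for kind in COMPUTE_KINDS if kind in compute]
    if len(present) != 1:
        found = ", ".join(present) if present else "none"
        raise ValueError(
            f"exactly one of {', '.join(COMPUTE_KINDS)} must be specified; found: {found}"
        )
    return present[0]


selected = validate_compute({"aci": {"identity": {"kind": "managed-identity"}}})
```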
|
||
`rad init` would add 2 steps: one step to determine which compute platform to use when the user chooses to set up an environment and one step to choose confidential containers. See the user experience section for details. | ||
|
||
### Error Handling |
How/where do we handle issues with installation of the extension or a missing az CLI?
The issue will be surfaced in the container group profile handler since that's where we'd use the extension. We'd require the extension as a prerequisite so the failure will let the user know they need to install it. I'll update the doc with that info
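A prerequisite check of the kind described in this reply could look roughly like the sketch below. The exception type, function name, and message wording are hypothetical. The `which` and `run` parameters are injectable so the check can be exercised without the Azure CLI installed; by default they call `shutil.which` and `subprocess.run`, and the extension lookup uses `az extension show --name confcom`.

```python
import shutil
import subprocess
from typing import Callable, Optional


class PrerequisiteError(RuntimeError):
    """Raised when a tool required for confidential containers is missing."""


def check_confcom_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> None:
    """Raise PrerequisiteError if the Azure CLI or the confcom extension is unavailable."""
    if which("az") is None:
        raise PrerequisiteError(
            "Azure CLI (az) not found on PATH; install it to use confidential containers on ACI."
        )
    result = run(
        ["az", "extension", "show", "--name", "confcom"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise PrerequisiteError(
            "Azure CLI extension confcom is not installed; run: az extension add --name confcom"
        )


# Demonstration with a stand-in for `which`, so the sketch runs without the Azure CLI.
try:
    check_confcom_prerequisites(which=lambda _name: None)
except PrerequisiteError as err:
    message = str(err)
```

Running the check at the start of the handler lets the failure name the missing prerequisite directly instead of surfacing as an opaque command error later in the deployment.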
|
||
TBD | ||
|
||
## Security |
Do we require users to set up Azure credentials on the client side with this change? Is it limited to platform engineers setting up the environment, or does it also apply to the dev deploying the app?
We would still require Azure credentials since we're deploying Azure resources, but there shouldn't be an additional credential setup needed from this change
|
||
The scenario document for serverless is in progress [here](https://github.com/radius-project/design-notes/blob/9bcaa014c013b254593426280add5ef5c69b265e/architecture/2025-01-serverless-feature-spec.md#scenario-3-punch-through-to-platform-specific-features-and-incremental-adoption-of-radius-into-existing-serverless-applications). Please refer to the scenarios defined in that doc, specifically scenario 3 as it relates to `runtimes` punchthroughs. | ||
|
||
## User Experience (if applicable) |
Alternative option: require the user to have the Azure CLI + `confcom` extension downloaded and accessible to Radius before using CC on ACI
Updated to use this option
Closing for now; will reopen this when the confidential container implementation is picked up again.
No description provided.