Confidential Container support for ACI #92

Closed
wants to merge 3 commits into from

@sk593 sk593 commented Apr 16, 2025

No description provided.

@sk593 sk593 requested review from a team as code owners April 16, 2025 16:40
**Azure CG Profile handler**
To enable confidential container support, the Azure Container Group Profile handler requires updates. Following the official guidance for confidential containers [here](https://learn.microsoft.com/en-us/azure/container-instances/container-instances-tutorial-deploy-confidential-containers-cce-arm), Radius will automate the setup process through these implementation steps:

1. Prior to deployment, the Azure CLI and the `confcom` extension should already be installed. This would have been set up during `rad init` when the user elected to support confidential containers on ACI.
Contributor

this won't be supported with GitOps workflow

Contributor @ytimocin, Apr 16, 2025

this would also affect the air-gapped environments


#suppress "@azure-tools/typespec-azure-core/bad-record-type"
@doc("A merge will be applied to the ContainerGroupProfile object when this container is being deployed.")
model ContainerGroupProfile is Record<unknown>;
Contributor

we could populate this with some reasonable fields for the build demo

1. Prior to deployment, the Azure CLI and the `confcom` extension should already be installed. This would have been set up during `rad init` when the user elected to support confidential containers on ACI.
2. In the ACI renderer, populate a container group profile object using the default values or the user-provided inputs. See the previous section on the ACI renderer for details.
3. In the container group profile handler, generate a temporary ARM template that contains the same data as the container group profile object. The container group profile object is passed as an input to the handler.
4. Run the following command on the ARM template to generate a CCE policy: `az confcom acipolicygen -a .\template.json`. This command modifies the existing template to include the base64 encoded CCE policy.
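
Steps 3 and 4 could look roughly like the sketch below, written in Python for brevity (the Radius handler itself need not be Python). It wraps a container group profile in a temporary ARM template and builds the `az confcom acipolicygen -a` command from step 4 without running it. The helper names, the `apiVersion` value, and the template layout are assumptions for illustration, not the actual handler.

```python
import json
from pathlib import Path

# Assumed values for illustration; the real handler may use different ones.
ARM_SCHEMA = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
ASSUMED_API_VERSION = "2024-05-01-preview"


def write_arm_template(profile: dict, directory: Path) -> Path:
    """Wrap a container group profile in a minimal ARM template file (step 3)."""
    template = {
        "$schema": ARM_SCHEMA,
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.ContainerInstance/containerGroupProfiles",
                "apiVersion": ASSUMED_API_VERSION,
                "name": profile["name"],
                "properties": profile["properties"],
            }
        ],
    }
    path = directory / "template.json"
    path.write_text(json.dumps(template, indent=2))
    return path


def policygen_command(template_path: Path) -> list[str]:
    """Build the CLI invocation from step 4; the caller decides when to run it."""
    return ["az", "confcom", "acipolicygen", "-a", str(template_path)]
```

A caller would pass the returned list to something like `subprocess.run(cmd, check=True)` and then read the updated template back to pick up the base64-encoded CCE policy.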

Are there any changes here based on what the team was saying about policy fragments that are published with the containers to the registry?

Contributor Author

I don't think so. From what I understand, policy fragments are added onto the existing CCE policy so we'd still need to generate an initial one

Enter an environment name
> default

Select the compute for the environment
Contributor

Suggested change:
- Select the compute for the environment
+ Select the container platform for the environment

Contributor

Can the flow be modified to:

  1. Select the cloud platform for the environment [Kubernetes-only, AWS, Azure]
  2. If the user selects Azure,
    Select the container platform for the environment [Kubernetes, Azure Container Instances]


Select the compute for the environment
> 1. kubernetes
> 2. aci
Contributor

Suggested change:
- > 2. aci
+ > 2. Azure Container Instances

> default

Select the compute for the environment
> 1. kubernetes
Contributor

Suggested change:
- > 1. kubernetes
+ > 1. Kubernetes

> 1. kubernetes
> 2. aci

Setup support for confidential containers? Selecting 'Yes' will install the Azure CLI and the `confcom` extension that are required to support confidential containers on ACI.
Contributor

Suggested change:
- Setup support for confidential containers? Selecting 'Yes' will install the Azure CLI and the `confcom` extension that are required to support confidential containers on ACI.
+ Setup support for confidential containers? Selecting 'Yes' will install the Azure CLI and the `confcom` extension which are required to support confidential containers on ACI.

]
runtimes: {
+ aci: {
+ containergroupprofile: {
Contributor

Why are we asking the developer for the container group profile name? We don't ask them for a pod name.

Contributor Author

This is for punchthroughs specifically. We'll have default values set for the container group profile but will allow users to override properties if they choose to
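
A rough sketch of how defaults and punch-through overrides might be combined, assuming a recursive merge where user-provided values win. It is written in Python for illustration only; the function name and the default values shown are hypothetical, not the Radius implementation.

```python
from copy import deepcopy


def merge_profile(defaults: dict, overrides: dict) -> dict:
    """Return defaults with overrides applied; nested dicts are merged key by key."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge recursively instead of replacing.
            merged[key] = merge_profile(merged[key], value)
        else:
            # Scalars, lists, and new keys: the user's value replaces the default.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults; a user overrides only the sku.
defaults = {"sku": "Standard", "osType": "Linux"}
profile = merge_profile(defaults, {"sku": "Confidential"})
```

With this approach, a user who sets nothing gets the defaults unchanged, and a user who sets only `sku` keeps every other default.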

2. Create an NGroups object
3. Create an ACI instance

In addition, we will need to support an `aci` property and `containergroupprofile` property under `runtimes`.
Contributor

Can you explain why this is a requirement?

Contributor Author

We need a way to pass punchthroughs from Bicep to Radius. Since we already have a kubernetes property that allows for podspec, we'll add an analogous aci property for container group profiles. The user doesn't have to set these properties if they don't want to.


Ted is a developer trying to run his applications on a serverless compute platform. His application needs to run on confidential containers. His E2E experience defining a Bicep template using ACI compute looks like this:

**Setup**
Contributor

What would happen if Ted creates an environment via a rad deploy env.bicep instead of rad init --full? Would he have to do anything different to enable confidential computing?

Contributor Author

Updated to remove changes to rad init. We'll continue to specify compute through the environment Bicep definition

Comment on lines +121 to +124
+ name: 'mycgname'
+ // Add ACI-specific properties here to punch-through the Radius abstraction, e.g. sku, osType, etc. These should be in the same format as the Bicep definition for a container group profile
+ properties: {
+ sku: 'Confidential' // 'Standard', 'Dedicated', etc.
Contributor

Would this schema be exactly the same as the current schema for container group profiles?

Will Radius validate the user input against the cgp schema?

Contributor Author

Yes, this is the plan. We'd keep it as close as possible to the Bicep schema (ideally the exact same) so there's less load in defining punchthrough properties


## Overview

The integration of Azure Container Instances (ACI) with Radius aims to enable deployments of applications and containers using the Radius control plane in Kubernetes to target ACI environments. One of the key features we want to implement is the use of confidential containers to enhance data security and privacy. Confidential containers on Azure provide a set of features and capabilities to further secure your standard container workloads. They run in a hardware-backed Trusted Execution Environment (TEE) that provides intrinsic capabilities like data integrity, data confidentiality, and code integrity.
Contributor

Are we just going to deploy applications and containers? Is there anything else we could deploy with ACI that we would defer to future iterations?

Contributor Author

ACI is for serverless containers so it'll just be containers for now


Confidential containers require a CCE policy to be set on the container group profile associated with ACI instances. CCE policies can only be generated using an extension on the Azure CLI. As a result, Radius will need access to the Azure CLI and the `confcom` extension to support this feature. More information on confidential containers with ACI can be found [here](https://learn.microsoft.com/en-us/azure/container-instances/container-instances-tutorial-deploy-confidential-containers-cce-arm)
Contributor

Could this be a deal breaker for some potential users because they may think that it is a vendor lock-in? Can users opt-out of this option or is it going to come out of the box when you run rad init for example?

Contributor Author

I don't think this would be a vendor lock-in. Users would opt-in to use ACI by specifying a different compute than Kubernetes.


- A serverless extensibility design. This document will only focus on components of Radius relevant to confidential container deployment. Compute extensibility design can be found [here](https://github.com/radius-project/design-notes/pull/91).
- Multi-compute support per environment will not be available initially.
- Full Dapr parity will not be available initially.
Contributor

What is this for?

Contributor

I mean Dapr in this context.

Contributor Author

This is for using Dapr with ACI containers, but it's out of scope for this design

-->

#### Bicep changes
The Bicep schema will be updated to include a field for the `containergroupprofile` data and properties. These changes will be automatically generated through the TypeSpec schema changes.
Contributor

Is containergroupprofile supposed to be all lower case?


Option 1:

We could require the user to create a container group profile first and pass the resource to Radius. This would eliminate the need for Radius to install the `confcom` extension, generate ARM templates, and extract the CCE policy.
Contributor

I think this could also be a good option for users to use. They may go with this option or let Radius create the container group profile. i.e. I think we should offer both.

}
```

**Azure CG Profile handler**
Contributor

Can any of these operations be done on the client side?

Contributor Author

We discussed requiring the user to define a container group profile and passing that to Radius, but ultimately decided against it to ensure that Radius handles the full setup for ACI

Member @brooke-hamilton left a comment

We have a proposed change to the data model for environments that looks like this. Given the more long-term proposal on the table for extensibility, let's discuss whether this still makes sense, as I think it would be removed in the long-term.

param aciscope string = '/subscriptions/<subscription id>/resourceGroups/<resource group name>'
resource env 'Applications.Core/environments@2023-10-01-preview' = {
  name: 'aci-env'
  properties: {
    compute: {  
-        kind: 'kubernetes'
-        namespace: 'default'
-        identity: {
-            kind: 'azure.com.workload'
-            oidcIssuer: oidcIssuer
-        }
       // Either aci or kubernetes can be specified, but not both.
+      aci: {  
+          resourceGroup:
+          identity: {
+              kind: 'managed-identity'
+              applicationResourceId: ''
+           }
+       }
+       kubernetes: {
+           namespace: 'default'
+           identity: {
+               kind: 'azure.com.workload'
+               oidcIssuer: oidcIssuer
+           }
+       }
    }  
    providers: {
      azure: {
        scope: aciscope
      }
    }
  }
}
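
The "either aci or kubernetes, but not both" rule in the proposal above could be enforced with a check along these lines. This is a minimal Python sketch of the rule only; the function name and error wording are hypothetical and do not come from the Radius codebase.

```python
def validate_compute(compute: dict) -> str:
    """Return the selected compute kind, or raise if zero or both are set."""
    selected = [kind for kind in ("aci", "kubernetes") if kind in compute]
    if len(selected) != 1:
        raise ValueError(
            "exactly one of 'aci' or 'kubernetes' must be specified under compute, "
            f"found: {selected or 'none'}"
        )
    return selected[0]


# An ACI-only environment passes validation.
kind = validate_compute({"aci": {"resourceGroup": "my-rg"}})
```

Rejecting the ambiguous case up front keeps the renderer from having to guess which platform an environment targets.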


`rad init` would add 2 steps: one step to determine which compute platform to use when the user chooses to set up an environment and one step to choose confidential containers. See the user experience section for details.

### Error Handling
Contributor

How/where do we handle issues with installation of the extension or a missing az cli?

Contributor Author

The issue will be surfaced in the container group profile handler since that's where we'd use the extension. We'd require the extension as a prerequisite so the failure will let the user know they need to install it. I'll update the doc with that info
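
One way the prerequisite failure could be surfaced is a check that runs before policy generation. The Python sketch below is illustrative only: the function name and messages are hypothetical, and it assumes `az extension show --name confcom` exits non-zero when the extension is absent. The `which` and `run` parameters exist so the check can be exercised without a real Azure CLI.

```python
import shutil
import subprocess


def check_confcom_prerequisites(which=shutil.which, run=subprocess.run) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if which("az") is None:
        problems.append("Azure CLI (`az`) was not found on PATH.")
        return problems  # cannot query extensions without the CLI
    result = run(
        ["az", "extension", "show", "--name", "confcom"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        problems.append(
            "The `confcom` extension is not installed; "
            "install it with `az extension add --name confcom`."
        )
    return problems
```

The handler could fail the deployment with these messages so the user knows which prerequisite to install.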


TBD

## Security
Contributor

Do we require users to set up Azure credentials on the client side with this change? Is it limited to platform engineers setting up the environment, or does it also apply to the dev deploying the app?

Contributor Author

We would still require Azure credentials since we're deploying Azure resources, but there shouldn't be an additional credential setup needed from this change


The scenario document for serverless is in progress [here](https://github.com/radius-project/design-notes/blob/9bcaa014c013b254593426280add5ef5c69b265e/architecture/2025-01-serverless-feature-spec.md#scenario-3-punch-through-to-platform-specific-features-and-incremental-adoption-of-radius-into-existing-serverless-applications). Please refer to the scenarios defined in that doc, specifically scenario 3 as it relates to `runtimes` punchthroughs.

## User Experience (if applicable)
Contributor Author

Alternative option: require the user to have the Azure CLI and the `confcom` extension downloaded and accessible to Radius before using confidential containers on ACI

Contributor Author

Updated to use this option

sk593 added 2 commits April 16, 2025 15:27
Signed-off-by: sk593
Signed-off-by: sk593
Contributor Author

sk593 commented May 21, 2025

Closing for now, will reopen this when confidential container implementation is picked up again.

@sk593 sk593 closed this May 21, 2025
8 participants