Contributor

Copilot AI commented Oct 3, 2025

  • Remove outbound NSG rules allowing unrestricted access to AzureMachineLearning service tag that enables data exfiltration to external ML workspaces
  • Disable shared access keys on Azure ML storage account to enforce RBAC-based access
  • Add workspace_owners_group_id and workspace_researchers_group_id as input variables to Azure ML workspace service
  • Replace script-based role assignment with direct RBAC assignments using AAD group IDs
  • Update CHANGELOG.md with security fix details
  • Verify systemDatastoresAuthMode = "identity" format against Azure documentation
  • Bump version to 1.0.0 (major version) due to breaking security changes
  • Remove external provider from Terraform configuration
Original prompt

This section details the original issue you should resolve

<issue_title>Data exfiltration is possible via Azure ML</issue_title>
<issue_description>This is a bit of a niche bug, but has come up in our production instance.

We have users who have access to the SDE to carry out research, including the use of AzureML.

Some of these users also have access to a resource group in a different subscription in the same tenancy, where they can deploy resources, including AzureML.

When they visit ml.azure.com from inside the SDE, they can see their external ML workspaces. They can upload data from the SDE to Jupyter notebooks in that external workspace. They can then go outside the SDE, connect to the same notebook, and download the data, bypassing the airlock completely.

Steps to reproduce

  1. Create a user who has SDE Workspace Admin role for a workspace.
  2. Create a resource group for them outside the SDE, grant them Contributor access to it.

Then, as that user:
3. Create an AzureML workspace and a Jupyter notebook in that external resource group.
4. Create a VM in the SDE workspace.
5. From the VM, launch a browser and go to ml.azure.com.
6. Navigate to the AzureML instance you created in step 3, create a Jupyter notebook.
7. Upload a file from the SDE VM to the notebook storage. This is the point where the breach occurs.
8. From outside the SDE, open a browser, go to ml.azure.com.
9. Find the Jupyter notebook, download the file.

Azure TRE release version (e.g. v0.14.0 or main):
v0.21.0

Deployed Azure TRE components - click the (i) in the UI:
UI Version: 0.7.0
API Version: 0.21.1

One obvious fix is to have the SDE in a unique tenancy, not used for other purposes, but that's not necessarily convenient (it isn't for us).

Another possibility would seem to be the use of Conditional Access Policies, barring AzureML traffic from crossing the SDE boundary. I'm no expert here, but it seems from googling that CAPs can use the SDE firewall public IP address to do that?

Any other options?</issue_description>

Comments on the Issue (you are @copilot in this section)

<comment_new>@marrobi
Hmm. That shouldn't be the case. The firewall should prevent access to public endpoints that are available via private endpoints.

If you can get some firewall logs of when the second workspace is being accessed, that would be very useful. Then we can try removing those and see if the primary workspace still works.

I might have time back end of next week to have a look and if needed speak with the product group.

Thanks for reporting.</comment_new>
<comment_new>@marrobi
Ok, so the rule AzureML_Client has the AzureMachineLearning service tag. This allows this traffic. Currently, trying to see if removing this service tag stops this access and if the primary TRE workspace continues to function.</comment_new>
<comment_new>@marrobi
Removing this rule seems to block access to the terminal/notebooks in other workspaces but still allow access to the tre workspace. @TonyWildish-BH looks like this was a configuration issue.

Couple of other improvements that should be made:

We also need to ensure that none of the storage accounts use account keys or have credentials saved.

Modify the workspace service to use the workspace_owners_group_id and workspace_researchers_group_id values output by the workspace template to assign Reader, Storage Blob Data Contributor, and Storage File Data Privileged Contributor on the service storage accounts. This reduces the need for a script that can result in transient errors. It relies on auto AD groups being used for the workspace.

Also see:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-enable-studio-virtual-network?view=azureml-api-2#datastore-azure-storage-account

About roles that need to be assigned to the workspace identity.</comment_new>

Fixes #4660


Member

marrobi commented Oct 3, 2025

@copilot please read
#4660 (comment)
#4660 (comment)

and continue.

…ce tag rules and enforcing RBAC

Co-authored-by: marrobi <[email protected]>
Contributor Author

Copilot AI commented Oct 3, 2025


I've implemented the security fix based on the issue comments. Changes made in commit 0de3b26:

Security fixes:

  1. Removed NSG rules allowing outbound traffic to AzureMachineLearning service tag (ports 443, 5831, 8787, 18881) that enabled access to external ML workspaces
  2. Removed the "AzureML_Client" firewall rule that allowed workspace VMs to reach external AzureMachineLearning endpoints
  3. Disabled shared access keys on Azure ML storage account (shared_access_key_enabled = false)

RBAC improvements:
4. Added workspace_owners_group_id and workspace_researchers_group_id parameters to enable direct AAD group-based role assignments
5. Updated role assignments to use workspace AAD groups (Reader, AzureML Data Scientist, Storage Blob Data Contributor, Storage File Data Privileged Contributor)
6. Maintained backward compatibility with script-based role assignment for workspaces not using AAD groups

Version updates:
7. Bumped Azure ML workspace service to version 0.11.0
8. Updated CHANGELOG.md with security fix details

The TRE workspace will continue to function normally via private endpoints while blocking access to external ML workspaces.
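
For illustration, items 3 to 5 above could be sketched in Terraform roughly as follows. This is a minimal sketch and not the code from commit 0de3b26: the resource names, the storage account name, the `resource_group_name` and `location` variables, and the `locals` layout are all assumptions. Only `shared_access_key_enabled = false`, the two group ID parameters, and the role names come from the discussion above. The AzureML Data Scientist role is omitted because it would be scoped to the ML workspace rather than the storage account.

```hcl
# Hypothetical inputs; the two group ID names mirror the parameters listed above.
variable "workspace_owners_group_id" {
  type        = string
  description = "Object ID of the workspace owners AAD group."
}

variable "workspace_researchers_group_id" {
  type        = string
  description = "Object ID of the workspace researchers AAD group."
}

variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

# Storage account with account-key access disabled, so callers must use RBAC.
resource "azurerm_storage_account" "aml" {
  name                      = "stamlexample001" # placeholder name
  resource_group_name       = var.resource_group_name
  location                  = var.location
  account_tier              = "Standard"
  account_replication_type  = "LRS"
  shared_access_key_enabled = false
}

locals {
  # Storage-scoped roles named in the issue discussion.
  storage_roles = [
    "Reader",
    "Storage Blob Data Contributor",
    "Storage File Data Privileged Contributor",
  ]

  group_ids = {
    owners      = var.workspace_owners_group_id
    researchers = var.workspace_researchers_group_id
  }

  # One entry per (group, role) pair. Keys are built from static strings,
  # so they are known at plan time even if the group IDs are not.
  storage_assignments = {
    for pair in setproduct(keys(local.group_ids), local.storage_roles) :
    "${pair[0]}-${pair[1]}" => {
      principal_id = local.group_ids[pair[0]]
      role         = pair[1]
    }
  }
}

# Direct group-based role assignments, replacing a script-driven approach.
resource "azurerm_role_assignment" "aml_storage" {
  for_each             = local.storage_assignments
  scope                = azurerm_storage_account.aml.id
  role_definition_name = each.value.role
  principal_id         = each.value.principal_id
}
```

One thing to watch: with shared keys disabled, the azurerm provider itself generally has to authenticate to storage via Azure AD (the `storage_use_azuread` provider setting), otherwise data-plane operations on the account can fail during apply.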

Copilot AI changed the title from "[WIP] Data exfiltration is possible via Azure ML" to "Fix Azure ML data exfiltration vulnerability by removing AzureMachineLearning service tag access and enforcing RBAC" on Oct 3, 2025
Copilot finished work on behalf of marrobi October 3, 2025 12:22
Member

marrobi commented Oct 3, 2025

@copilot can you verify the correct format of systemDatastoresAuthMode = "identity" against the azure docs.

Also make this a major version increase.

Contributor Author

Copilot AI commented Oct 3, 2025


Verified and confirmed in commit 3a5b87d:

  1. systemDatastoresAuthMode format: The current implementation uses systemDatastoresAuthMode = "identity" which is correct per Azure ML documentation. This setting enables identity-based authentication for system datastores instead of credential-based access.

  2. Version bump: Updated from 0.11.0 to 1.0.0 (major version) in porter.yaml and CHANGELOG.md. This is appropriate given the breaking security changes that remove network rules and disable shared access keys.


github-actions bot commented Nov 20, 2025

Unit Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 343d31f.

♻️ This comment has been updated with latest results.

marrobi marked this pull request as ready for review November 20, 2025 23:40
marrobi requested a review from a team as a code owner November 20, 2025 23:40
Member

marrobi commented Nov 20, 2025

@copilot external provider is not needed, please remove it.

Contributor Author

Copilot AI commented Nov 20, 2025


Removed external provider in commit 2343c4c. Cleaned up both the providers.tf file and the .terraform.lock.hcl file to completely remove the hashicorp/external provider that was previously used for the legacy script-based role assignment approach.
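
As a rough illustration of the resulting shape, a providers.tf without the external provider might look like the sketch below. The version constraint and the empty `features` block are assumptions for the example, not values taken from commit 2343c4c, and the real file may declare additional providers.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0" # illustrative constraint only
    }
    # No hashicorp/external entry: nothing shells out to a script for
    # role assignment any more, so the provider is not required.
  }
}

provider "azurerm" {
  features {}
}
```

The matching `provider "registry.terraform.io/hashicorp/external"` block in .terraform.lock.hcl goes away along with it, which keeps the lock file consistent with the declared providers.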

Copilot finished work on behalf of marrobi November 20, 2025 23:53