Use custom service account for search dataform #3800

Open
emmalowe wants to merge 4 commits into main from SCH-1778-Create-custom-dataform-service-account

@emmalowe emmalowe commented Mar 4, 2026

GCP is enforcing a new access control model for Dataform called "strict act-as mode". As part of this, GCP is removing the ability to run Dataform workflows as the default Dataform service account, so all Dataform workflows must switch to a custom service account that the Dataform Service Agent is given permissions on. Principals that configure Dataform workflows must hold the iam.serviceAccounts.actAs permission on the custom service account.

Steps covered here:

  • Move creation of custom service account "dataform-sa"
  • Change BigQuery internal project permissions from default service account to custom service account
  • Give default service account permissions to impersonate custom service account
  • Add custom service account to repository setup (to use this as the default account for running workflows)
  • Give custom service account secrets permissions to connect to our Dataform GitHub repo
  • Give custom service account permission to read from the GA4 Analytics project
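
For illustration, the creation and impersonation steps above might look roughly like the following in Terraform. This is a minimal sketch, not the code in this PR: the variable names and resource labels are hypothetical, and the two roles granted to the default Dataform service agent are an assumption based on a reading of the GCP strict act-as documentation, so they should be checked against it. Only the account ID "dataform-sa" comes from the description above.

```hcl
# Hypothetical input variables; the real module may source these differently.
variable "gcp_project_id" { type = string }
variable "gcp_project_number" { type = string }

# The custom service account that Dataform workflows will run as.
resource "google_service_account" "dataform_sa" {
  project      = var.gcp_project_id
  account_id   = "dataform-sa"
  display_name = "Custom service account for Dataform workflows"
}

locals {
  # The default Dataform service agent follows this naming pattern.
  dataform_service_agent = "serviceAccount:service-${var.gcp_project_number}@gcp-sa-dataform.iam.gserviceaccount.com"
}

# Assumed roles: let the default service agent mint tokens for, and act as,
# the custom service account.
resource "google_service_account_iam_member" "dataform_agent_token_creator" {
  service_account_id = google_service_account.dataform_sa.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = local.dataform_service_agent
}

resource "google_service_account_iam_member" "dataform_agent_user" {
  service_account_id = google_service_account.dataform_sa.name
  role               = "roles/iam.serviceAccountUser"
  member             = local.dataform_service_agent
}
```

The grants are attached to the custom service account itself rather than to the project, so the service agent can only impersonate this one account.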

See:

@emmalowe emmalowe force-pushed the SCH-1778-Create-custom-dataform-service-account branch from ebcf0cb to e237f69 Compare March 4, 2026 17:11
emmalowe commented Mar 4, 2026

Since this is Terraform, I assume the approach to testing is to get this merged and then test it out on integration before applying the changes in production.

@emmalowe emmalowe marked this pull request as ready for review March 4, 2026 17:22
@emmalowe emmalowe requested a review from a team as a code owner March 4, 2026 17:22
@aaronfowles aaronfowles left a comment

The terraform plan looks like it failed with what could be a transient error. Might be worth triggering a retry from the UI to see if that resolves...

emmalowe commented Mar 5, 2026

> The terraform plan looks like it failed with what could be a transient error. Might be worth triggering a retry from the UI to see if that resolves...

Thanks @aaronfowles! I reran the plan and it looks like it's working now.

@hannako hannako left a comment

This makes sense to me based on the Confluence documentation and my reading of the GCP docs. But as discussed with @emmalowe in person, our rollout approach will be to merge, apply the change in integration and confirm all looks as expected before applying the change in production.

@emmalowe emmalowe force-pushed the SCH-1778-Create-custom-dataform-service-account branch 2 times, most recently from e9c171e to 06e96b1 Compare March 12, 2026 15:37
GCP is enforcing a new access control model for Dataform called
"strict act-as mode" [1]. As part of this, GCP is disabling the
ability for Dataform instances to be run using the default Dataform
service account. All Dataform workflows must switch to a custom
service account, which the Dataform Service Agent is given
permissions on [2].

Steps covered:
- Change BigQuery internal project permissions from default service
  account to custom service account
- Give default service account permissions to impersonate custom
  service account
- Add custom service account to repo setup (to use this as the
  default account for running workflows)
- Give custom service account secrets permissions to connect to
  our Dataform GitHub repo

[1] https://docs.cloud.google.com/dataform/docs/strict-act-as-mode
[2] https://docs.cloud.google.com/dataform/docs/access-control#grant-roles-auto-workflows
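
As a rough sketch of the last two steps in this commit message, the repository setup and secret access could be expressed as below. All variable names, the resource labels and the repository name are hypothetical placeholders, not values from this PR. The use of the google-beta provider is an assumption about where `google_dataform_repository` is currently available.

```hcl
# Hypothetical inputs; none of these names come from the PR itself.
variable "gcp_project_id" { type = string }
variable "gcp_region" { type = string }
variable "dataform_sa_email" { type = string }
variable "dataform_git_url" { type = string }
variable "github_token_secret_id" { type = string }
# Expected form: projects/<project>/secrets/<secret>/versions/<version>
variable "github_token_secret_version" { type = string }

resource "google_dataform_repository" "search" {
  provider = google-beta # assumption: resource is served by the beta provider
  project  = var.gcp_project_id
  region   = var.gcp_region
  name     = "example-search-dataform" # placeholder name

  # Run workflow invocations as the custom service account by default.
  service_account = var.dataform_sa_email

  git_remote_settings {
    url                                 = var.dataform_git_url
    default_branch                      = "main"
    authentication_token_secret_version = var.github_token_secret_version
  }
}

# Let the custom service account read the GitHub token held in Secret Manager.
resource "google_secret_manager_secret_iam_member" "dataform_sa_github_token" {
  project   = var.gcp_project_id
  secret_id = var.github_token_secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.dataform_sa_email}"
}
```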
@emmalowe emmalowe force-pushed the SCH-1778-Create-custom-dataform-service-account branch from 06e96b1 to 1756975 Compare March 12, 2026 15:42
@emmalowe emmalowe marked this pull request as draft March 18, 2026 10:39
The permissions listed are not cross-project, so this comment
is misleading. The comment is a hang-over from when the
BigQuery permissions were assigned across all search
environments at once.

See #2109
@emmalowe emmalowe force-pushed the SCH-1778-Create-custom-dataform-service-account branch 2 times, most recently from 8bc3962 to 0d46886 Compare March 18, 2026 17:49
The Search Team's Dataform pipelines [1] read from specific
datasets in the GA4 Analytics project. Here we add
those specific permissions, in line with the principle of least
privilege.

Because some of the pipelines include a table wildcard [2], we
need to add a new custom role that includes the list permission.

It seems that previously these permissions were added in the
GCP UI via click-ops.

[1] https://github.com/alphagov/search-api-v2-dataform
[2] e.g. https://github.com/alphagov/search-api-v2-dataform/blob/main/definitions/search-intraday.sqlx#L66
@emmalowe emmalowe force-pushed the SCH-1778-Create-custom-dataform-service-account branch from 0d46886 to e6323a6 Compare March 18, 2026 18:09
@emmalowe

Hello reviewers 👋🏻 Apologies for sitting on this a while - I realised I had to do some more thinking on cross-project permissions before I could merge this. The short version is, our Dataform pipelines need permission to read from the GA4 Analytics Project, so I've added those in a new commit.

@AP-Hunt it might be most efficient if you could review this first, please? Most of my questions (that I'll leave inline) are about Terraform best practice, so you might be best to answer those. @hannako feel free to have a look too, since we've been chatting about this already.

Thanks 🌟

]
}

locals {

@emmalowe emmalowe commented

Since we only need to read from 2 of the 17 datasets in the GA4 Analytics project, I've added these as dataset permissions instead of project permissions (which would give access to all the datasets). This might be overkill.

I've chosen to use iam_member instead of iam_binding for two reasons:

  1. I'm not confident how dataset permissions interact with project permissions, and I didn't want to wipe out any permissions for anyone else. From what I've read, dataset bindings and project bindings together are additive, so that shouldn't be the case, but I'm not sure.
  2. It seems like some of the GA4 Analytics permissions have been set up via ClickOps and, again, I don't want to delete permissions for anyone else. But maybe I need to do more work to check what is there and rectify any issues, because we probably should use binding if we can (see point 1).

If the permission setup looks okay, I'm not clear on how this new code should be organised. I've seen locals.tf files elsewhere. Maybe we should also have a separate file for dataset permissions 🤷🏻‍♀️
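
To make the iam_member approach concrete, here is a minimal sketch of per-dataset grants driven by a local. The dataset IDs, variable names and resource label are hypothetical, and `roles/bigquery.dataViewer` stands in for whichever role the PR actually assigns.

```hcl
# Hypothetical inputs for illustration only.
variable "ga4_project_id" { type = string }
variable "dataform_sa_email" { type = string }

locals {
  # Placeholder dataset IDs; the real list is the two datasets the pipelines read.
  ga4_datasets_read_by_dataform = toset([
    "example_dataset_a",
    "example_dataset_b",
  ])
}

# One non-authoritative grant per dataset. Existing members of the same role,
# including any added by hand in the console, are left untouched.
resource "google_bigquery_dataset_iam_member" "dataform_sa_ga4_read" {
  for_each = local.ga4_datasets_read_by_dataform

  project    = var.ga4_project_id
  dataset_id = each.value
  role       = "roles/bigquery.dataViewer" # stand-in role, not the PR's custom role
  member     = "serviceAccount:${var.dataform_sa_email}"
}
```

For comparison, `google_bigquery_dataset_iam_binding` is authoritative for one role on one dataset: it would remove other members of that role on that dataset, but would not touch project-level bindings or other roles.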

title = "GDS BQ read access"
}

resource "google_project_iam_custom_role" "gds_bigquery_read_and_list_access" {

@emmalowe emmalowe commented

I'm adding this custom role, because for some of our pipelines that use a table wildcard, we need the bigquery.tables.list permission, which isn't in gds_bigquery_read_access. This also might be overkill, since I could just add bigquery.tables.list to gds_bigquery_read_access.

@emmalowe emmalowe Mar 20, 2026

I think because I'm only adding a list permission (rather than create or delete), it's not worth making a separate role, but I'm curious to know what other people think.
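
For reference, a separate read-and-list role along the lines discussed here might be sketched as follows. The resource name matches the one in the diff, but the project variable, role_id, title, description and the two read permissions are assumptions for illustration. Only `bigquery.tables.list` is confirmed by the discussion above.

```hcl
# Hypothetical variable; the real code may reference the project differently.
variable "ga4_project_id" { type = string }

resource "google_project_iam_custom_role" "gds_bigquery_read_and_list_access" {
  project     = var.ga4_project_id
  role_id     = "gds_bigquery_read_and_list_access" # assumed ID
  title       = "GDS BQ read and list access"       # assumed title
  description = "Read table data and list tables, for wildcard table queries"

  permissions = [
    # Assumed read permissions; the existing read-only role may differ.
    "bigquery.tables.get",
    "bigquery.tables.getData",
    # Needed by pipelines that query with a table wildcard.
    "bigquery.tables.list",
  ]
}
```

The alternative raised in this thread, adding `bigquery.tables.list` to the existing gds_bigquery_read_access role, would avoid a second role but would widen that role for every principal already holding it.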

@emmalowe emmalowe marked this pull request as ready for review March 18, 2026 18:32