fix: data race in checks.State.ObjectFailureMessages #38684

Open
sakiphan wants to merge 2 commits into hashicorp:main from sakiphan:fix/checks-objectfailuremessages-data-race


@sakiphan sakiphan commented Jun 4, 2026

State.ObjectFailureMessages reads the shared statuses and failureMsgs
maps without taking c.mu, but the Report* methods write to those same maps
under the lock while Terraform Core walks the graph concurrently. That's a data
race: -race flags it, and in the worst case it surfaces as a
"concurrent map read and map write" panic or as wrong or missing failure messages.

The fix just holds c.mu for the duration of the read, the same way every other
reader on *State already does (e.g. ObjectCheckStatus). The method doesn't
call back into any other *State method, so there's no risk of deadlocking on
the non-reentrant mutex.
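
For anyone skimming, the locking shape described above can be sketched roughly as follows. This is not the Terraform code: `tracker`, `reportFailure`, and `objectFailureMessages` are hypothetical stand-ins for checks.State and its methods, and the plain string-keyed map is an assumption made to keep the example self-contained. Returning a copy of the slice is a choice made for this sketch, not something the PR states.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a hypothetical, simplified stand-in for checks.State.
// Only the locking shape is meant to carry over, not the data model.
type tracker struct {
	mu          sync.Mutex
	failureMsgs map[string][]string // object address -> failure messages
}

func newTracker() *tracker {
	return &tracker{failureMsgs: make(map[string][]string)}
}

// reportFailure is the writer side: it mutates the map while holding mu.
func (t *tracker) reportFailure(obj, msg string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failureMsgs[obj] = append(t.failureMsgs[obj], msg)
}

// objectFailureMessages is the reader side. It holds the same mutex for the
// whole read and calls no other method on tracker, so it cannot re-enter
// the lock. It returns a copy so callers never share the backing array.
func (t *tracker) objectFailureMessages(obj string) []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	msgs := t.failureMsgs[obj]
	out := make([]string, len(msgs))
	copy(out, msgs)
	return out
}

func main() {
	t := newTracker()
	t.reportFailure("aws_instance.a", "precondition failed")
	fmt.Println(t.objectFailureMessages("aws_instance.a")) // prints [precondition failed]
}
```

Because sync.Mutex is not reentrant, the "no calls back into other locked methods" property is what keeps this pattern safe; a reader that called another locking method while holding mu would block forever.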

I also added a regression test that runs concurrent writers and readers against a
single State. It reports a race under go test -race ./internal/checks/...
without the lock and passes with it.
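
The general shape of such a test, as a minimal sketch rather than the test added in this PR: `sketchState`, its methods, and `hammer` are hypothetical names, and the plain maps are an assumption. Written as a standalone program, it can be run with `go run -race`; with the lock in place the race detector should stay quiet, and dropping the lock from the reader would be expected to trigger a report.

```go
package main

import (
	"fmt"
	"sync"
)

// sketchState is a hypothetical stand-in for checks.State, reduced to two
// plain maps guarded by one mutex.
type sketchState struct {
	mu          sync.Mutex
	statuses    map[string]string
	failureMsgs map[string][]string
}

func newSketchState() *sketchState {
	return &sketchState{
		statuses:    make(map[string]string),
		failureMsgs: make(map[string][]string),
	}
}

// reportFailure plays the role of a Report* writer.
func (s *sketchState) reportFailure(obj, msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.statuses[obj] = "fail"
	s.failureMsgs[obj] = append(s.failureMsgs[obj], msg)
}

// objectFailureMessages plays the role of the fixed reader: both maps are
// read under the same lock the writer uses.
func (s *sketchState) objectFailureMessages(obj string) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.statuses[obj] != "fail" {
		return nil
	}
	return append([]string(nil), s.failureMsgs[obj]...)
}

// hammer starts one writer and one reader goroutine per worker, all
// sharing a single sketchState, and waits for every goroutine to finish.
func hammer(s *sketchState, workers, iters int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(2)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				s.reportFailure("obj", fmt.Sprintf("writer %d, message %d", id, i))
			}
		}(w)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				_ = s.objectFailureMessages("obj")
			}
		}()
	}
	wg.Wait()
}

func main() {
	s := newSketchState()
	hammer(s, 8, 1000)
	fmt.Println(len(s.objectFailureMessages("obj"))) // prints 8000
}
```

In a real test file the body of main would sit inside a TestXxx function; the final count check also guards against lost writes, which the race detector alone would not report.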

Fixes #38578

Target Release

1.16.x

This has been latent since #31268, so it's also a reasonable backport candidate
if the team wants it in a patch release.

Rollback Plan

  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

No. This only adds a mutex lock around an existing read. There are no changes to
access controls, encryption, or logging.

CHANGELOG entry

  • This change is user-facing and I added a changelog entry.
  • This change is not user-facing.

ObjectFailureMessages read the shared statuses and failureMsgs maps
without holding c.mu, while the Report* methods write to them under the
lock from concurrent graph-walk goroutines. This can cause panics
("concurrent map read and map write") or incorrect results.

Acquire c.mu for the duration of the read, matching every other public
method on *State (e.g. ObjectCheckStatus). ObjectFailureMessages does not
call back into any *State method, so there is no deadlock risk.

Adds a regression test that fails under `go test -race` without the fix.

Fixes hashicorp#38578
@sakiphan sakiphan requested a review from a team as a code owner June 4, 2026 18:33

hashicorp-cla-app Bot commented Jun 4, 2026

CLA assistant check
All committers have signed the CLA.


Development

Successfully merging this pull request may close these issues.

Data race in checks.State.ObjectFailureMessages: missing mutex lock
