Skip to content

feat: Add API model and CRUD endpoints for ExpectedRack#444

Open
chet wants to merge 1 commit intoNVIDIA:mainfrom
chet:expected-rack-apis
Open

feat: Add API model and CRUD endpoints for ExpectedRack#444
chet wants to merge 1 commit intoNVIDIA:mainfrom
chet:expected-rack-apis

Conversation

@chet
Copy link
Copy Markdown
Contributor

@chet chet commented Apr 30, 2026

Description

This adds ExpectedRack REST APIs to create, update, and get ExpectedRack instances, which is similar to the work for adding ExpectedSwitch and ExpectedPowerShelf API calls as done in #220 (with similar dual-write approach as implemented in #303. Tried to pattern match as much as possible again.

This also folds in the same fixes applied in #448, as well as the ToProto pattern from #467 and #456.

Tests included!

Signed-off-by: Chet Nichols III chetn@nvidia.com

Type of Change

  • Feature - New feature or functionality (feat:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Agent - Site Agent updated

Related Issues (Optional)

None

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated

Additional Notes

None

@chet chet requested a review from a team as a code owner April 30, 2026 06:08
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds full Expected Rack functionality: API handlers and models, DB model/DAO and migration, OpenAPI and generated SDK, Temporal workflows/activities, site-agent manager wiring, gRPC test helpers, and extensive unit/integration tests across layers.

Changes

Cohort / File(s) Summary
API Handlers & Models
api/pkg/api/handler/expectedrack.go, api/pkg/api/model/expectedrack.go, api/pkg/api/model/expectedrack_test.go, api/pkg/api/model/util/rack_labels.go, api/pkg/api/routes.go
New REST handlers (create, list, get, update, delete, replace-all, delete-all) with validation, auth/site checks, transactional DB mutations, synchronous Temporal workflow invocation; API request/response models, label constants, routes, and handler/model tests.
Database DAO & Migration
db/pkg/db/model/expectedrack.go, db/pkg/db/model/expectedrack_test.go, db/pkg/migrations/20260429100000_expected_rack.go
Adds Bun expected_rack model, DAO interface + SQL implementation (CRUD, bulk, replace-all, delete-all), paginated GetAll with filters/search/order, migration creating table/indexes, and comprehensive DAO tests.
OpenAPI & Generated SDK
openapi/spec.yaml, sdk/standard/api_expected_rack.go, sdk/standard/client.go, sdk/standard/model_expected_rack*.go, sdk/standard/model_expected_rack_list.go
OpenAPI paths/schemas for Expected Rack and generated client/models (create/list/update/replace/delete), nullable wrappers, JSON validation, and APIClient wiring.
Site-Agent Manager Integration
site-agent/pkg/components/managers/expectedrack/*, site-agent/pkg/components/managers/manager.go, site-agent/pkg/components/managers/manageraccess.go, site-agent/pkg/components/managers/managerapi/*.go, site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go, site-agent/pkg/components/managers/workflow/orchestrator.go
Manager API, package-level access, Init/GetState, subscriber registration, manager accessor and managerapi interface; workflow state extended and subscriber registered in orchestrator.
Temporal Workflows & Activities
site-workflow/pkg/activity/expectedrack.go, site-workflow/pkg/activity/expectedrack_test.go, site-workflow/pkg/activity/rack_labels.go, site-workflow/pkg/workflow/expectedrack.go, site-workflow/pkg/workflow/expectedrack_test.go
ManageExpectedRack activities calling Carbide and RLA RPCs, label-to-RLA mapping helpers, best-effort RLA sync, activity timeouts/retries, five workflows (create/update/delete/replace-all/delete-all) and tests.
gRPC Mocks & Test Server
site-workflow/pkg/grpc/client/testing.go, site-workflow/pkg/grpc/server/forge_test_server.go
Mock Forge client methods and in-memory test server supporting expected-rack CRUD and bulk operations to facilitate activity/workflow tests.
API & Integration Tests
api/pkg/api/handler/expectedrack_test.go, api/pkg/api/routes_test.go
Extensive handler integration tests (validation, auth/scoping, pagination, includeRelation, conflict/mismatch cases) with per-site mocked Temporal clients; route-count updated to include expected-rack routes.

Sequence Diagram

sequenceDiagram
    participant Client
    participant APIHandler
    participant DB as DB (Tx)
    participant Temporal as Temporal (Workflow)
    participant Carbide as Carbide gRPC
    participant RLA as RLA gRPC

    Client->>APIHandler: POST /.../expected-rack (CreateRequest)
    APIHandler->>APIHandler: Validate request, authorize, verify Site state
    APIHandler->>DB: BEGIN TX / INSERT ExpectedRack
    DB-->>APIHandler: Inserted row
    APIHandler->>Temporal: Start CreateExpectedRack workflow (site task queue)
    Temporal->>Carbide: Activity CreateExpectedRackOnSite (AddExpectedRack RPC)
    Carbide-->>Temporal: Success / Error
    Temporal->>RLA: Activity CreateExpectedRackOnRLA (best-effort)
    RLA-->>Temporal: Success or Warning
    Temporal-->>APIHandler: Workflow completes (success or logged RLA warning)
    APIHandler->>DB: COMMIT TX
    DB-->>APIHandler: Committed
    APIHandler-->>Client: 201 Created (ExpectedRack JSON)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 48.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The PR title accurately and specifically describes the primary change: adding API model and CRUD endpoints for ExpectedRack, which is precisely the scope of this substantial feature implementation across multiple layers.
Description check ✅ Passed The pull request description clearly relates to the changeset, explaining the addition of ExpectedRack REST APIs with reference to similar prior implementations and the dual-write pattern employed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-04-30 06:09:47 UTC | Commit: 68157bf

@chet chet changed the title feat: add ExpectedRack APIs (Create, Update, Get) feat: add ExpectedRack APIs Apr 30, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (3)
site-agent/pkg/components/managers/expectedrack/init.go (1)

26-27: ⚡ Quick win

Make the deferred validation a searchable TODO.

This is clearly intentional deferred work, but the repo guidance asks for explicit TODOs when something is being left for later.

Suggested tweak
-	// Validate the expectedrack config later
+	// TODO: validate the expectedrack config.

As per coding guidelines, "Add TODO comments for features or nuances not important to implement right away".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-agent/pkg/components/managers/expectedrack/init.go` around lines 26 -
27, Replace the informal comment "// Validate the expectedrack config later"
with a searchable TODO comment such as "// TODO: validate expectedrack config"
so the deferred work is explicitly discoverable; update the same section where
ManagerAccess.Data.EB.Log.Info().Msg("ExpectedRack: Initializing ExpectedRack
API") is called to keep the TODO adjacent to that initialization call and, if
possible, include a short owner or ticket reference in the TODO (e.g., TODO:
validate expectedrack config `#ticket/owner`).
site-workflow/pkg/workflow/expectedrack_test.go (1)

46-111: ⚡ Quick win

Add nil-request regression tests for workflow entrypoints.

Please add explicit nil-input cases (at least for CreateExpectedRack, UpdateExpectedRack, DeleteExpectedRack, and ReplaceAllExpectedRacks) so panics from pre-validation field access are caught by tests.

Also applies to: 132-168, 189-223, 244-288

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/workflow/expectedrack_test.go` around lines 46 - 111, Add
nil-request regression tests to the CreateExpectedRackTestSuite to ensure the
workflow entrypoints handle nil inputs without panicking: create new tests
(e.g., Test_CreateExpectedRack_NilRequest, and analogous tests for
UpdateExpectedRack, DeleteExpectedRack, ReplaceAllExpectedRacks) that call
certs.env.ExecuteWorkflow(CreateExpectedRack, nil) and assert the workflow
completes and returns an error (or the expected validation error) instead of
panicking; register any activities used by the workflows (or mock them) so the
test runs the workflow path that performs the pre-validation checks, and mirror
this pattern for UpdateExpectedRack, DeleteExpectedRack, and
ReplaceAllExpectedRacks to cover the ranges noted in the comment.
site-workflow/pkg/workflow/expectedrack.go (1)

23-25: ⚡ Quick win

Prefer workflow.GetLogger(ctx) inside workflows for replay-aware logging.

Using the package logger in workflow code can produce duplicate/non-contextual logs during replay. Switching to Temporal’s workflow logger keeps logs aligned with workflow execution semantics.

Also applies to: 52-53, 82-83, 105-106, 127-128, 149-150

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/workflow/expectedrack.go` around lines 23 - 25, Inside the
workflow code in expectedrack.go replace uses of the package logger
(github.com/rs/zerolog/log) with Temporal's replay-aware logger: in each
workflow function (where you currently call log.Info/Debug/Error at the noted
locations) obtain logger := workflow.GetLogger(ctx) at the top of the function
and change log.<Level>(...) calls to logger.<Level>(...) (mapping zerolog
methods to the workflow logger methods as appropriate); also remove the zerolog
import if it becomes unused. Ensure you do this for all occurrences mentioned
(the instances using log at the start of the file and the other locations) so
logs are emitted via workflow.GetLogger(ctx) and are replay-safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 447-466: The handler currently looks up ExpectedRack by bare
rackId (using c.Param("rackId") and erDAO.Get) which can return/modify the wrong
record when rackId is only unique per site; modify the handler to read optional
siteId via c.QueryParam("siteId") and use it when present (pass siteId into the
DAO lookup so erDAO.Get uses rackId+siteId). When siteId is omitted, query the
DAO for all matches for that rackId (e.g., via a new or existing
erDAO.ListByRackID / FindByRackID method) and if more than one match is
returned, respond with cutil.NewAPIErrorResponse(http.StatusBadRequest,
"...ambiguous rackId...") rejecting the request; if exactly one match, use that
record's siteId/rack and proceed to call erDAO.Get as before. Ensure the same
pattern is applied to the other handlers referenced (lines ~552-586 and
~744-754).
- Around line 313-316: The tenant-only branch currently rejects callers when
tenant != nil even if they also have provider identity; update the condition to
enforce the "Site ID required for tenant" check only when there is no provider
identity (e.g., change the guard to "tenant != nil && provider == nil" or
equivalent), so dual-role principals follow the provider path if provider
identity exists; ensure the error still uses cutil.NewAPIErrorResponse and the
same message when the tenant-only condition truly applies.

In `@api/pkg/api/model/expectedrack_test.go`:
- Line 231: The site-mismatch test uses an invalid UUID which causes validation
to fail early; update the test to use a valid UUID string for the otherSiteID
variable (and the other occurrences in the same test block around the later
case) so validation proceeds to the top-level/entry SiteID equality check —
replace the current otherSiteID value with a well-formed UUID (or generate one
via uuid.NewString()) in the site-mismatch test and the corresponding instances
referenced in the same test.

In `@api/pkg/api/model/expectedrack.go`:
- Around line 97-124: The Validate method on APIExpectedRackUpdateRequest
currently allows empty updates; add a guard in
APIExpectedRackUpdateRequest.Validate (after existing field validations and
before returning nil) that ensures at least one mutable field is present (e.g.,
RackProfileID != nil, Metadata != nil, or any other mutable pointer/field on
APIExpectedRackUpdateRequest) and return a validation error like "no fields to
update" if none are set; reference the Validate method on
APIExpectedRackUpdateRequest and the fields RackProfileID and Metadata when
implementing this check.

In `@api/pkg/api/routes.go`:
- Around line 487-490: The route currently reserves the literal "all" by
registering Path "/expected-rack/all" which shadows the single-resource DELETE
"/expected-rack/:rackId" and prevents deleting a rack with id "all"; fix by
either (A) renaming the bulk-delete route to a non-conflicting path (e.g. change
Path used with NewDeleteAllExpectedRacksHandler to use a plural or explicit
suffix like "/expected-racks" or "/expected-rack/bulk-delete"), or (B) add
explicit request validation in the single-rack handler (the handler constructed
by NewDeleteExpectedRackHandler) to reject rackId == "all" and reflect that
reserved value in the OpenAPI spec and request validation schema; update the
route registration for NewDeleteAllExpectedRacksHandler and the OpenAPI docs
accordingly so the behavior is explicit.

In `@db/pkg/db/model/expectedrack.go`:
- Around line 61-64: The ExpectedRack model currently uses RackID as the sole
primary key (see struct ExpectedRack and field RackID) which makes rack IDs
global; change the persistence and DAO surfaces to be site-scoped by making the
primary key composite (SiteID + RackID) and update CRUD signatures (e.g., DAO
Get/Delete methods that currently accept rackID alone) to require both siteID
and rackID; ensure the bun tags on ExpectedRack (RackID and SiteID) reflect a
composite PK, and update any queries and unique constraints referenced around
the other occurences (lines referenced near 124-144) to filter by site_id as
well.
- Around line 388-455: UpdateMultiple currently unions all non-nil fields across
inputs (columnsSet -> columns) and calls Bulk(), which will overwrite missing
fields with zero values; fix by ensuring all inputs share the same field-mask or
by grouping inputs by identical field-mask and issuing one Bulk() per group. In
UpdateMultiple, compute each input's field mask (set of non-nil fields from
ExpectedRackUpdateInput) and either (a) validate that every mask equals the
first mask and return an error if they differ, or (b) partition inputs by mask
and for each partition build its own expectedRacks slice, columns list and call
NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(ctx) so only
the intended columns are updated for each subgroup (use existing symbols:
ExpectedRackUpdateInput, columnsSet/columns, expectedRacks, ids, UpdateMultiple,
Bulk()).

In `@openapi/spec.yaml`:
- Around line 4274-4291: The PATCH operation update-expected-rack is keyed by
{rackId} but the request schema ExpectedRackUpdateRequest exposes an `id`
property, causing a contract mismatch; update the OpenAPI spec so the update
request schema uses `rackId` (or removes `id` entirely if server derives it from
the path) and ensure the path/parameter name `rackId` and any references to
ExpectedRackUpdateRequest (including the other occurrence referenced around
lines 18593-18618) are consistent with that change; also update any
examples/required lists and generated component schemas that mention `id` so
generated SDKs reflect the path key instead of an `id` body field.
- Around line 4185-4221: The OpenAPI paths
'/v2/org/{org}/carbide/expected-rack/all' and
'/v2/org/{org}/carbide/expected-rack/{rackId}' conflict because a rack with id
"all" would collide; fix by constraining the rackId path parameter (the
parameter named rackId in the '{rackId}' operation) with a regex pattern that
excludes the literal "all" (e.g. pattern: '^(?!all$).+' ) or by renaming the
bulk path to a non-identifier segment, and also update the create/POST
validation that currently allows any non-empty string so it enforces the same
rackId pattern; ensure the change is applied consistently for the same pair of
routes elsewhere (the other occurrences noted).

In `@site-workflow/pkg/activity/expectedrack_test.go`:
- Around line 136-149: The test case "test update expected rack success" (and
the similar cases around lines 222-234) must pre-seed the expected rack in the
test setup so the forge test server won’t return NotFound; before calling the
update/delete path, create the record using the same client/store used by the
handler (seed a cwssaws.ExpectedRack with RackId {Id: "test-update-rack-001"}
and RackProfileId {Id: "test-update-rack-profile-001"} via the
CarbideAtomicClient mock or test server helper), then perform the update/delete
assertions against that existing resource to validate the happy path (ensure you
reference the test’s args.request, CarbideAtomicClient, and cwssaws.ExpectedRack
identifiers).

In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1223-1313: The mock methods (AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks, DeleteAllExpectedRacks) must mirror the server test
double: avoid dereferencing in before nil-checking, validate inputs and return
codes.InvalidArgument for bad requests, persist expected racks in an in-memory
store on MockForgeClient (e.g., c.expectedRacks map[string]*wflows.ExpectedRack)
and update that store in Add/Update/Delete/ReplaceAll/DeleteAll, have Get return
codes.NotFound when an ID is missing, and have GetAll reflect the current
in-memory state (and honor ctx "wantError"/"wantCount" semantics only if
needed). Update the mentioned functions to perform nil checks first, use the
same status codes (InvalidArgument/NotFound), and read/write the shared
in-memory map so Get/GetAll return state consistent with prior mutations.

In `@site-workflow/pkg/workflow/expectedrack.go`:
- Line 50: The logger construction in expectedrack.go dereferences request
(e.g., request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
validating the payload which can panic on nil; modify the relevant handlers (the
Create path where logger is built and the similar sites around the other
occurrences at the same file) to first validate that request and its subfields
are non-nil (or return a well-defined error) before building the logger, and if
you still want to include IDs in logs when absent, use safe extraction
(nil-checks or default empty values) for request.GetRackId,
request.GetRackProfileId (and analogous fields at the other occurrences) so no
dereference can cause a panic.

---

Nitpick comments:
In `@site-agent/pkg/components/managers/expectedrack/init.go`:
- Around line 26-27: Replace the informal comment "// Validate the expectedrack
config later" with a searchable TODO comment such as "// TODO: validate
expectedrack config" so the deferred work is explicitly discoverable; update the
same section where ManagerAccess.Data.EB.Log.Info().Msg("ExpectedRack:
Initializing ExpectedRack API") is called to keep the TODO adjacent to that
initialization call and, if possible, include a short owner or ticket reference
in the TODO (e.g., TODO: validate expectedrack config `#ticket/owner`).

In `@site-workflow/pkg/workflow/expectedrack_test.go`:
- Around line 46-111: Add nil-request regression tests to the
CreateExpectedRackTestSuite to ensure the workflow entrypoints handle nil inputs
without panicking: create new tests (e.g., Test_CreateExpectedRack_NilRequest,
and analogous tests for UpdateExpectedRack, DeleteExpectedRack,
ReplaceAllExpectedRacks) that call certs.env.ExecuteWorkflow(CreateExpectedRack,
nil) and assert the workflow completes and returns an error (or the expected
validation error) instead of panicking; register any activities used by the
workflows (or mock them) so the test runs the workflow path that performs the
pre-validation checks, and mirror this pattern for UpdateExpectedRack,
DeleteExpectedRack, and ReplaceAllExpectedRacks to cover the ranges noted in the
comment.

In `@site-workflow/pkg/workflow/expectedrack.go`:
- Around line 23-25: Inside the workflow code in expectedrack.go replace uses of
the package logger (github.com/rs/zerolog/log) with Temporal's replay-aware
logger: in each workflow function (where you currently call log.Info/Debug/Error
at the noted locations) obtain logger := workflow.GetLogger(ctx) at the top of
the function and change log.<Level>(...) calls to logger.<Level>(...) (mapping
zerolog methods to the workflow logger methods as appropriate); also remove the
zerolog import if it becomes unused. Ensure you do this for all occurrences
mentioned (the instances using log at the start of the file and the other
locations) so logs are emitted via workflow.GetLogger(ctx) and are replay-safe.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0a71edc9-15b2-4c32-a1f4-c59b3236c5fd

📥 Commits

Reviewing files that changed from the base of the PR and between 33ddf04 and 68157bf.

⛔ Files ignored due to path filters (4)
  • workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/site-agent/workflows/v1/common_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
  • workflow-schema/site-agent/workflows/v1/forge_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (33)
  • api/pkg/api/handler/expectedrack.go
  • api/pkg/api/model/expectedrack.go
  • api/pkg/api/model/expectedrack_test.go
  • api/pkg/api/model/util/rack_labels.go
  • api/pkg/api/routes.go
  • db/pkg/db/model/expectedrack.go
  • db/pkg/db/model/expectedrack_test.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_expected_rack.go
  • sdk/standard/client.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go

Comment thread api/pkg/api/handler/expectedrack.go Outdated
Comment thread api/pkg/api/handler/expectedrack.go Outdated
Comment thread api/pkg/api/model/expectedrack_test.go Outdated
Comment thread api/pkg/api/model/expectedrack.go
Comment thread api/pkg/api/routes.go Outdated
Comment thread openapi/spec.yaml Outdated
Comment thread openapi/spec.yaml
Comment on lines +4274 to +4291
summary: Update Expected Rack
operationId: update-expected-rack
description: |-
Update an existing Expected Rack identified by its operator-supplied `rackId`.

Org must have an Infrastructure Provider entity. User must have `FORGE_PROVIDER_ADMIN` role.

Infrastructure Provider must own the Expected Rack.

Alternatively, Tenant Admins with `TargetedInstanceCreation` capability can also update Expected Racks if they have an account with the Site's Infrastructure Provider.
parameters:
- schema:
type: string
format: uuid
name: siteId
in: query
description: ID of the Site the Expected Rack belongs to. Required when multiple Sites have an Expected Rack with the same `rackId`.
requestBody:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

PATCH schema should not use id when the API contract uses rackId

The endpoint is keyed by {rackId}, but ExpectedRackUpdateRequest exposes id. This inconsistency leaks into generated SDKs and creates an avoidable contract mismatch.

Suggested schema alignment
ExpectedRackUpdateRequest:
  description: |-
    Request data to update an existing Expected Rack.
-   For single updates (PATCH /expected-rack/{rackId}), the `id` field is optional in the body and will be ignored if provided (the `rackId` from the URL path is used).
+   For single updates (PATCH /expected-rack/{rackId}), the `rackId` in the URL path is authoritative.

  properties:
-   id:
-     type:
-       - string
-       - 'null'
-     description: |-
-       Operator-supplied identifier of the Expected Rack to update (string, not UUID).
-       Optional for single Expected Rack update (must be empty or match the `rackId` from the URL path).

As per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness."

Also applies to: 18593-18618

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 4274 - 4291, The PATCH operation
update-expected-rack is keyed by {rackId} but the request schema
ExpectedRackUpdateRequest exposes an `id` property, causing a contract mismatch;
update the OpenAPI spec so the update request schema uses `rackId` (or removes
`id` entirely if server derives it from the path) and ensure the path/parameter
name `rackId` and any references to ExpectedRackUpdateRequest (including the
other occurrence referenced around lines 18593-18618) are consistent with that
change; also update any examples/required lists and generated component schemas
that mention `id` so generated SDKs reflect the path key instead of an `id` body
field.

Comment on lines +136 to +149
{
name: "test update expected rack success",
fields: fields{
CarbideAtomicClient: carbideAtomicClient,
},
args: args{
ctx: context.Background(),
request: &cwssaws.ExpectedRack{
RackId: &cwssaws.RackId{Id: "test-update-rack-001"},
RackProfileId: &cwssaws.RackProfileId{Id: "test-update-rack-profile-001"},
},
},
wantErr: false,
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Seed the rack before asserting the update/delete happy paths.

These “success” cases do not currently prove the happy path. The backing forge test server returns NotFound for unknown racks on update/delete, so these assertions only validate success if the mock client bypasses the new server semantics. Pre-create the rack in test setup, then run update/delete against that existing record.

Also applies to: 222-234

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/activity/expectedrack_test.go` around lines 136 - 149, The
test case "test update expected rack success" (and the similar cases around
lines 222-234) must pre-seed the expected rack in the test setup so the forge
test server won’t return NotFound; before calling the update/delete path, create
the record using the same client/store used by the handler (seed a
cwssaws.ExpectedRack with RackId {Id: "test-update-rack-001"} and RackProfileId
{Id: "test-update-rack-profile-001"} via the CarbideAtomicClient mock or test
server helper), then perform the update/delete assertions against that existing
resource to validate the happy path (ensure you reference the test’s
args.request, CarbideAtomicClient, and cwssaws.ExpectedRack identifiers).

Comment thread site-workflow/pkg/grpc/client/testing.go
// CreateExpectedRackOnSite activity, then also creates the rack in RLA via
// CreateExpectedRackOnRLA (best-effort).
func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard request before logger field extraction to avoid workflow panics.

These lines dereference request before any validation. A nil payload will panic in workflow code and bypass the activity-layer invalid-request handling.

Proposed fix
 func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+	rackID, rackProfileID := "", ""
+	if request != nil {
+		rackID = request.GetRackId().GetId()
+		rackProfileID = request.GetRackProfileId().GetId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
 func UpdateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+	rackID, rackProfileID := "", ""
+	if request != nil {
+		rackID = request.GetRackId().GetId()
+		rackProfileID = request.GetRackProfileId().GetId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
 func DeleteExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRackRequest) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", request.GetRackId()).Logger()
+	rackID := ""
+	if request != nil {
+		rackID = request.GetRackId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", rackID).Logger()
@@
 func ReplaceAllExpectedRacks(ctx workflow.Context, request *cwssaws.ExpectedRackList) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", len(request.GetExpectedRacks())).Logger()
+	count := 0
+	if request != nil {
+		count = len(request.GetExpectedRacks())
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", count).Logger()

Also applies to: 80-80, 103-103, 125-125

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/workflow/expectedrack.go` at line 50, The logger
construction in expectedrack.go dereferences request (e.g.,
request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
validating the payload which can panic on nil; modify the relevant handlers (the
Create path where logger is built and the similar sites around the other
occurrences at the same file) to first validate that request and its subfields
are non-nil (or return a well-defined error) before building the logger, and if
you still want to include IDs in logs when absent, use safe extraction
(nil-checks or default empty values) for request.GetRackId,
request.GetRackProfileId (and analogous fields at the other occurrences) so no
dereference can cause a panic.

@chet chet force-pushed the expected-rack-apis branch from 68157bf to eda6086 Compare April 30, 2026 06:25
@chet chet changed the title feat: add ExpectedRack APIs feat: Add ExpectedRack APIs for Create, Update, and Get Apr 30, 2026
@chet chet force-pushed the expected-rack-apis branch from eda6086 to 230d6f9 Compare April 30, 2026 06:34
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
api/pkg/api/handler/expectedrack_test.go (1)

1411-1419: 💤 Low value

Minor: Fragile rack ID generation using rune arithmetic.

The expression string(rune('a'+i)) works for small loop counts but becomes non-portable if the loop extends beyond 25 iterations. Consider using fmt.Sprintf for clarity.

♻️ Suggested improvement
 for i := 0; i < 3; i++ {
 	_, err := erDAO.Create(ctx, nil, cdbm.ExpectedRackCreateInput{
-		RackID:        "delete-all-rack-" + string(rune('a'+i)),
+		RackID:        fmt.Sprintf("delete-all-rack-%d", i),
 		SiteID:        site.ID,
 		RackProfileID: "profile-delete-all",
 		CreatedBy:     dbUser.ID,
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack_test.go` around lines 1411 - 1419, The rack
ID generation in the test uses string(rune('a'+i)) which is fragile; update the
loop that calls erDAO.Create with ExpectedRackCreateInput to build RackID using
fmt.Sprintf (e.g., fmt.Sprintf("delete-all-rack-%d" or
fmt.Sprintf("delete-all-rack-%c", 'a'+i)) instead of rune arithmetic) so the IDs
are clearer and safe for larger loop counts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 165-181: The duplicate RackID check is currently global; scope it
to the target site by adding the resolved site ID to the GetAll filter: when
calling erDAOForCheck.GetAll (used on cdbm.NewExpectedRackDAO and passing
cdbm.ExpectedRackFilterInput with RackIDs: []string{apiRequest.RackID}), include
the site identifier (e.g., SiteIDs: []string{site.ID} or the appropriate SiteID
field on ExpectedRackFilterInput) so the preflight lookup only considers racks
within the same site before returning the 409 conflict.

In `@site-workflow/pkg/activity/expectedrack.go`:
- Around line 69-71: The code is dereferencing
mer.CarbideAtomicClient.GetClient() and calling Carbide() without checking for
nil; change each occurrence (e.g., where you call
mer.CarbideAtomicClient.GetClient() then .Carbide()) to first call client :=
mer.CarbideAtomicClient.GetClient(), check if client == nil and if so return a
connection error (not panic) from the enclosing function, and only then call
forgeClient := client.Carbide(); apply the same nil-guard and error return
pattern to all occurrences (the instances around the existing Carbide() calls
mentioned in the review).

---

Nitpick comments:
In `@api/pkg/api/handler/expectedrack_test.go`:
- Around line 1411-1419: The rack ID generation in the test uses
string(rune('a'+i)) which is fragile; update the loop that calls erDAO.Create
with ExpectedRackCreateInput to build RackID using fmt.Sprintf (e.g.,
fmt.Sprintf("delete-all-rack-%d" or fmt.Sprintf("delete-all-rack-%c", 'a'+i))
instead of rune arithmetic) so the IDs are clearer and safe for larger loop
counts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a77584db-6649-48db-88fd-23afc5d7e261

📥 Commits

Reviewing files that changed from the base of the PR and between 68157bf and eda6086.

⛔ Files ignored due to path filters (4)
  • workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/site-agent/workflows/v1/common_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
  • workflow-schema/site-agent/workflows/v1/forge_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
  • api/pkg/api/handler/expectedrack.go
  • api/pkg/api/handler/expectedrack_test.go
  • api/pkg/api/model/expectedrack.go
  • api/pkg/api/model/expectedrack_test.go
  • api/pkg/api/model/util/rack_labels.go
  • api/pkg/api/routes.go
  • db/pkg/db/model/expectedrack.go
  • db/pkg/db/model/expectedrack_test.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_expected_rack.go
  • sdk/standard/client.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (5)
  • api/pkg/api/model/util/rack_labels.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-workflow/pkg/activity/rack_labels.go
  • db/pkg/db/model/expectedrack_test.go
  • site-workflow/pkg/workflow/expectedrack_test.go
🚧 Files skipped from review as they are similar to previous changes (12)
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • api/pkg/api/routes.go
  • sdk/standard/client.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-agent/pkg/components/managers/manager.go
  • site-workflow/pkg/grpc/client/testing.go
  • sdk/standard/model_expected_rack_list.go
  • openapi/spec.yaml

Comment thread api/pkg/api/handler/expectedrack.go Outdated
Comment on lines +69 to +71
// Call Site Controller gRPC endpoint
carbideClient := mer.CarbideAtomicClient.GetClient()
forgeClient := carbideClient.Carbide()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return a connection error instead of panicking on an unavailable Carbide client.

Each Site activity dereferences mer.CarbideAtomicClient and GetClient() without a guard. If the Carbide client is not configured or not connected, this will panic the worker instead of failing the activity cleanly. Please mirror the defensive pattern already used for RLA and return a normal connection error before calling Carbide().

Suggested fix
-	carbideClient := mer.CarbideAtomicClient.GetClient()
-	forgeClient := carbideClient.Carbide()
+	if mer.CarbideAtomicClient == nil {
+		return cclient.ErrClientNotConnected
+	}
+	carbideClient := mer.CarbideAtomicClient.GetClient()
+	if carbideClient == nil {
+		return cclient.ErrClientNotConnected
+	}
+	forgeClient := carbideClient.Carbide()

Also applies to: 106-107, 140-141, 181-182, 202-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/activity/expectedrack.go` around lines 69 - 71, The code is
dereferencing mer.CarbideAtomicClient.GetClient() and calling Carbide() without
checking for nil; change each occurrence (e.g., where you call
mer.CarbideAtomicClient.GetClient() then .Carbide()) to first call client :=
mer.CarbideAtomicClient.GetClient(), check if client == nil and if so return a
connection error (not panic) from the enclosing function, and only then call
forgeClient := client.Carbide(); apply the same nil-guard and error return
pattern to all occurrences (the instances around the existing Carbide() calls
mentioned in the review).

@chet chet force-pushed the expected-rack-apis branch from 230d6f9 to 47d035f Compare April 30, 2026 06:40
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
site-workflow/pkg/activity/expectedrack.go (1)

69-71: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard the Carbide client before dereferencing it.

Each Site activity assumes ManageExpectedRack.CarbideAtomicClient and ManageExpectedRack.CarbideAtomicClient.GetClient() are non-nil. If the client is unconfigured or disconnected, the worker will panic instead of returning a clean activity error. The RLA path in this file already shows the safer pattern.

Suggested fix
-	carbideClient := mer.CarbideAtomicClient.GetClient()
-	forgeClient := carbideClient.Carbide()
+	if mer.CarbideAtomicClient == nil {
+		return cclient.ErrClientNotConnected
+	}
+	carbideClient := mer.CarbideAtomicClient.GetClient()
+	if carbideClient == nil {
+		return cclient.ErrClientNotConnected
+	}
+	forgeClient := carbideClient.Carbide()

Repeat this at each Carbide call site in this file.

Also applies to: 105-107, 139-141, 180-182, 201-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/activity/expectedrack.go` around lines 69 - 71, The code
dereferences ManageExpectedRack.CarbideAtomicClient and its GetClient() result
(e.g., creating carbideClient and calling carbideClient.Carbide()) without nil
checks; update each call site (including the instances around
carbideClient/forgeClient at lines shown and other occurrences) to first verify
ManageExpectedRack.CarbideAtomicClient != nil and that GetClient() returns a
non-nil client, and if either is nil return a proper activity error (same
pattern used in the RLA path) instead of panicking—apply this guard to all
Carbide call sites (GetClient, Carbide()) in this file.
site-workflow/pkg/grpc/client/testing.go (1)

1223-1313: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Mirror the Forge test server contract in this mock.

This block still diverges from site-workflow/pkg/grpc/server/forge_test_server.go:1165-1249: it can panic on nil input, returns codes.Internal instead of codes.InvalidArgument for bad requests, never returns codes.NotFound for missing racks, and GetExpectedRack/GetAllExpectedRacks synthesize data instead of reflecting prior mutations. That makes the workflow tests materially less trustworthy because the mock accepts flows the in-repo server double rejects.

Back these methods with shared in-memory ExpectedRack state and reuse the same validation / InvalidArgument / NotFound semantics as the server test double.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/grpc/client/testing.go` around lines 1223 - 1313, The mock
methods AddExpectedRack, UpdateExpectedRack, DeleteExpectedRack,
GetExpectedRack, GetAllExpectedRacks, ReplaceAllExpectedRacks and
DeleteAllExpectedRacks currently diverge from the server test double: they can
panic on nil inputs, return codes.Internal instead of codes.InvalidArgument,
never return codes.NotFound, and synthesize data instead of reflecting prior
mutations. Fix by adding shared in-memory state on MockForgeClient (e.g., an
expectedRacks map[string]*wflows.ExpectedRack protected by a mutex) and update
each method to: validate inputs nil-safe and return
status.Error(codes.InvalidArgument, ...) for bad requests, behave like the
server by returning status.Error(codes.NotFound, ...) when an ID is missing,
mutate/read the shared map for
AddExpectedRack/UpdateExpectedRack/DeleteExpectedRack/GetExpectedRack/GetAllExpectedRacks/ReplaceAllExpectedRacks/DeleteAllExpectedRacks
so queries reflect prior mutations, and keep existing ctx-based
wantError/wantCount behavior for error simulation.
api/pkg/api/handler/expectedrack.go (1)

331-334: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not force dual-role callers down the tenant-only path.

This branch rejects requests whenever tenant != nil, even when the caller also has provider identity. That turns a valid provider-scoped list into a 400 for mixed-role service accounts.

Suggested fix
-	} else if tenant != nil {
+	} else if tenant != nil && infrastructureProvider == nil {
 		// Tenants must specify a Site ID
 		return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Site ID must be specified in query when retrieving Expected Racks as a Tenant", nil)
 	} else {

Based on learnings: RBAC should not short-circuit on the first matching role; evaluate both authorization paths independently for dual-role service accounts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack.go` around lines 331 - 334, The current
branch unconditionally returns a 400 when tenant != nil, which blocks callers
who also have provider identity; change the check to only reject when tenant is
set and the caller lacks provider rights (e.g., if tenant != nil && provider ==
nil) so provider-scoped requests still proceed; modify the logic around
filterInput.SiteIDs and the cutil.NewAPIErrorResponse call so you only enforce
"Tenants must specify a Site ID" for pure-tenant callers and evaluate tenant and
provider authorization paths independently.
db/pkg/db/model/expectedrack.go (2)

408-455: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject mixed field masks in UpdateMultiple.

columns is built as the union of all non-nil fields across inputs, then applied to every row. If one item only updates RackProfileID and another only updates MetadataName, the missing fields on each row are written as zero values.

Suggested guard
 func (erd ExpectedRackSQLDAO) UpdateMultiple(ctx context.Context, tx *db.Tx, inputs []ExpectedRackUpdateInput) ([]ExpectedRack, error) {
 	// Create a child span and set the attributes for current request
 	ctx, expectedRackDAOSpan := erd.tracerSpan.CreateChildInCurrentContext(ctx, "ExpectedRackDAO.UpdateMultiple")
 	if expectedRackDAOSpan != nil {
 		defer expectedRackDAOSpan.End()
 		erd.tracerSpan.SetAttribute(expectedRackDAOSpan, "batch_size", len(inputs))
 	}
 
 	if len(inputs) == 0 {
 		return []ExpectedRack{}, nil
 	}
 
 	expectedRacks := make([]*ExpectedRack, 0, len(inputs))
 	ids := make([]string, 0, len(inputs))
 	columnsSet := make(map[string]bool)
+	var expectedMask struct {
+		rackProfileID       bool
+		metadataName        bool
+		metadataDescription bool
+		metadataLabels      bool
+	}
 
-	for _, input := range inputs {
+	for i, input := range inputs {
+		mask := struct {
+			rackProfileID       bool
+			metadataName        bool
+			metadataDescription bool
+			metadataLabels      bool
+		}{
+			rackProfileID:       input.RackProfileID != nil,
+			metadataName:        input.MetadataName != nil,
+			metadataDescription: input.MetadataDescription != nil,
+			metadataLabels:      input.MetadataLabels != nil,
+		}
+		if i == 0 {
+			expectedMask = mask
+		} else if mask != expectedMask {
+			return nil, fmt.Errorf("all updates in a batch must modify the same fields")
+		}
+
 		er := &ExpectedRack{
 			RackID: input.RackID,
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@db/pkg/db/model/expectedrack.go` around lines 408 - 455, The bulk update
builds a union of non-nil fields across inputs and applies that same column list
to every ExpectedRack, which will overwrite missing fields with zero values; in
UpdateMultiple validate that every item in inputs has the same set of non-nil
fields before constructing columns (compare per-item field masks derived from
RackProfileID, MetadataName, MetadataDescription, MetadataLabels etc.), and if
any input's mask differs return an error (or fall back to per-row updates)
instead of proceeding with the union; reference the inputs slice,
columnsSet/columns, expectedRacks, and the
NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(...) call to
locate where to add this guard.

61-64: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make ExpectedRack identity site-scoped end-to-end.

RackID is still the sole primary key, and the DAO contract still addresses single records by bare rackID. That keeps rack IDs globally unique and makes single-record CRUD ambiguous, which conflicts with the public contract in this PR where rackId is site-scoped. Please move persistence and DAO lookup/delete signatures to (site_id, rack_id) or explicitly change the API contract before merge.

Also applies to: 124-144

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@db/pkg/db/model/expectedrack.go` around lines 61 - 64, ExpectedRack is still
using RackID as the sole primary key which makes IDs global; change persistence
and DAO lookups to be site-scoped by making the composite primary key (site_id,
rack_id) in the ExpectedRack struct (update struct tags on SiteID and RackID to
include `pk` on both), and update all DAO signatures and implementations that
currently accept a single rackID (e.g., any Get/Delete/Find methods that take
rackID) to take both siteID (uuid.UUID) and rackID (string) and ensure their
queries filter by both site_id and rack_id; also mirror this change in any
related methods referenced around lines 124-144 so all single-record CRUD is
unambiguous and site-scoped.
api/pkg/api/model/expectedrack.go (1)

103-130: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject empty ExpectedRack updates.

Validate() still accepts {} and { "id": "rack-1" }. Downstream, that becomes a write that only bumps updated and still triggers the update workflow. Require at least one mutable field before returning nil.

Suggested guard
 func (erur *APIExpectedRackUpdateRequest) Validate() error {
 	if erur.ID != nil {
 		if *erur.ID == "" {
 			return validation.Errors{
 				"id": errors.New("ID cannot be empty"),
 			}
 		}
 	}
 
 	err := validation.ValidateStruct(erur,
 		validation.Field(&erur.RackProfileID,
 			validation.NilOrNotEmpty.Error("RackProfileID cannot be empty"),
 			validation.When(erur.RackProfileID != nil && *erur.RackProfileID != "",
 				validation.Match(util.NotAllWhitespaceRegexp).Error("RackProfileID consists only of whitespace"))),
 	)
 
 	if err != nil {
 		return err
 	}
 
 	if erur.Metadata != nil {
 		if err := util.ValidateLabels(erur.Metadata.Labels); err != nil {
 			return err
 		}
 	}
+
+	if erur.RackProfileID == nil && erur.Metadata == nil {
+		return validation.Errors{
+			"body": errors.New("at least one field must be provided"),
+		}
+	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/expectedrack.go` around lines 103 - 130,
APIExpectedRackUpdateRequest.Validate currently allows empty update requests
(e.g., {} or only id) which should be rejected; after the existing field
validations (including util.ValidateLabels) add a guard that ensures at least
one mutable field is being updated: check that RackProfileID is non-nil and not
the empty string (and not whitespace) OR Metadata is non-nil with at least one
label (or other mutable metadata content). If neither condition is true, return
a validation error (e.g., validation.Errors with a message like "must specify at
least one updatable field") so empty/no-op updates are rejected; update the
logic in APIExpectedRackUpdateRequest.Validate to perform this presence check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@site-workflow/pkg/grpc/server/forge_test_server.go`:
- Line 81: The test server's ForgeServerImpl.er map is accessed concurrently by
multiple RPC handlers causing race conditions; add a sync.RWMutex field (e.g.,
erMu) to ForgeServerImpl and wrap all accesses to er with appropriate locking:
use erMu.Lock()/erMu.Unlock() for writes (Add/Update/Delete/ReplaceAll) and
erMu.RLock()/erMu.RUnlock() for reads (GetAll and any other readers). Update all
methods that reference ForgeServerImpl.er (including the handlers mentioned
around Add/Update/Delete/GetAll/ReplaceAll) to use these locks to prevent
concurrent map read/write panics.

---

Duplicate comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 331-334: The current branch unconditionally returns a 400 when
tenant != nil, which blocks callers who also have provider identity; change the
check to only reject when tenant is set and the caller lacks provider rights
(e.g., if tenant != nil && provider == nil) so provider-scoped requests still
proceed; modify the logic around filterInput.SiteIDs and the
cutil.NewAPIErrorResponse call so you only enforce "Tenants must specify a Site
ID" for pure-tenant callers and evaluate tenant and provider authorization paths
independently.

In `@api/pkg/api/model/expectedrack.go`:
- Around line 103-130: APIExpectedRackUpdateRequest.Validate currently allows
empty update requests (e.g., {} or only id) which should be rejected; after the
existing field validations (including util.ValidateLabels) add a guard that
ensures at least one mutable field is being updated: check that RackProfileID is
non-nil and not the empty string (and not whitespace) OR Metadata is non-nil
with at least one label (or other mutable metadata content). If neither
condition is true, return a validation error (e.g., validation.Errors with a
message like "must specify at least one updatable field") so empty/no-op updates
are rejected; update the logic in APIExpectedRackUpdateRequest.Validate to
perform this presence check.

In `@db/pkg/db/model/expectedrack.go`:
- Around line 408-455: The bulk update builds a union of non-nil fields across
inputs and applies that same column list to every ExpectedRack, which will
overwrite missing fields with zero values; in UpdateMultiple validate that every
item in inputs has the same set of non-nil fields before constructing columns
(compare per-item field masks derived from RackProfileID, MetadataName,
MetadataDescription, MetadataLabels etc.), and if any input's mask differs
return an error (or fall back to per-row updates) instead of proceeding with the
union; reference the inputs slice, columnsSet/columns, expectedRacks, and the
NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(...) call to
locate where to add this guard.
- Around line 61-64: ExpectedRack is still using RackID as the sole primary key
which makes IDs global; change persistence and DAO lookups to be site-scoped by
making the composite primary key (site_id, rack_id) in the ExpectedRack struct
(update struct tags on SiteID and RackID to include `pk` on both), and update
all DAO signatures and implementations that currently accept a single rackID
(e.g., any Get/Delete/Find methods that take rackID) to take both siteID
(uuid.UUID) and rackID (string) and ensure their queries filter by both site_id
and rack_id; also mirror this change in any related methods referenced around
lines 124-144 so all single-record CRUD is unambiguous and site-scoped.

In `@site-workflow/pkg/activity/expectedrack.go`:
- Around line 69-71: The code dereferences
ManageExpectedRack.CarbideAtomicClient and its GetClient() result (e.g.,
creating carbideClient and calling carbideClient.Carbide()) without nil checks;
update each call site (including the instances around carbideClient/forgeClient
at lines shown and other occurrences) to first verify
ManageExpectedRack.CarbideAtomicClient != nil and that GetClient() returns a
non-nil client, and if either is nil return a proper activity error (same
pattern used in the RLA path) instead of panicking—apply this guard to all
Carbide call sites (GetClient, Carbide()) in this file.

In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1223-1313: The mock methods AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks and DeleteAllExpectedRacks currently diverge from the
server test double: they can panic on nil inputs, return codes.Internal instead
of codes.InvalidArgument, never return codes.NotFound, and synthesize data
instead of reflecting prior mutations. Fix by adding shared in-memory state on
MockForgeClient (e.g., an expectedRacks map[string]*wflows.ExpectedRack
protected by a mutex) and update each method to: validate inputs nil-safe and
return status.Error(codes.InvalidArgument, ...) for bad requests, behave like
the server by returning status.Error(codes.NotFound, ...) when an ID is missing,
mutate/read the shared map for
AddExpectedRack/UpdateExpectedRack/DeleteExpectedRack/GetExpectedRack/GetAllExpectedRacks/ReplaceAllExpectedRacks/DeleteAllExpectedRacks
so queries reflect prior mutations, and keep existing ctx-based
wantError/wantCount behavior for error simulation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 618be74c-159e-4f4b-8344-6818bfab56ba

📥 Commits

Reviewing files that changed from the base of the PR and between eda6086 and 230d6f9.

⛔ Files ignored due to path filters (4)
  • workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/site-agent/workflows/v1/common_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
  • workflow-schema/site-agent/workflows/v1/forge_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
  • api/pkg/api/handler/expectedrack.go
  • api/pkg/api/handler/expectedrack_test.go
  • api/pkg/api/model/expectedrack.go
  • api/pkg/api/model/expectedrack_test.go
  • api/pkg/api/model/util/rack_labels.go
  • api/pkg/api/routes.go
  • db/pkg/db/model/expectedrack.go
  • db/pkg/db/model/expectedrack_test.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_expected_rack.go
  • sdk/standard/client.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (13)
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/components/managers/manageraccess.go
  • api/pkg/api/model/util/rack_labels.go
  • site-agent/pkg/components/managers/manager.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_create_request.go
  • site-workflow/pkg/workflow/expectedrack.go
  • sdk/standard/model_expected_rack_update_request.go
🚧 Files skipped from review as they are similar to previous changes (11)
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • sdk/standard/client.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • db/pkg/db/model/expectedrack_test.go
  • site-workflow/pkg/workflow/expectedrack_test.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • api/pkg/api/handler/expectedrack_test.go
  • openapi/spec.yaml

@@ -78,6 +78,7 @@ type ForgeServerImpl struct {
em map[string]*cwssaws.ExpectedMachine
eps map[string]*cwssaws.ExpectedPowerShelf
es map[string]*cwssaws.ExpectedSwitch
er map[string]*cwssaws.ExpectedRack
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Protect ForgeServerImpl.er with a mutex.

These RPCs run on a gRPC server and all access the same map directly. Parallel Add/Update/Delete/GetAll/ReplaceAll calls can trigger concurrent map writes or concurrent map read and map write, which will make the test server flaky under parallel workflow tests.

Suggested direction
 import (
 	"context"
 	"fmt"
 	"net"
+	"sync"
 	"time"
@@
 type ForgeServerImpl struct {
 	cwssaws.UnimplementedForgeServer
+	erMu sync.RWMutex
 	v   map[string]*cwssaws.Vpc
@@
 	er  map[string]*cwssaws.ExpectedRack
@@
 func (f *ForgeServerImpl) AddExpectedRack(ctx context.Context, req *cwssaws.ExpectedRack) (*emptypb.Empty, error) {
 	if req == nil || req.RackId == nil || req.RackId.Id == "" {
 		return nil, status.Errorf(codes.InvalidArgument, "ID not provided for AddExpectedRack")
 	}
 	if req.RackProfileId == nil || req.RackProfileId.Id == "" {
 		return nil, status.Errorf(codes.InvalidArgument, "Rack Profile ID not provided for AddExpectedRack")
 	}
+	f.erMu.Lock()
+	defer f.erMu.Unlock()
 	f.er[req.RackId.Id] = req
 	return &emptypb.Empty{}, nil
 }

Apply the same Lock/Unlock or RLock/RUnlock pattern to the other ExpectedRack handlers as well.

Also applies to: 1165-1249

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/grpc/server/forge_test_server.go` at line 81, The test
server's ForgeServerImpl.er map is accessed concurrently by multiple RPC
handlers causing race conditions; add a sync.RWMutex field (e.g., erMu) to
ForgeServerImpl and wrap all accesses to er with appropriate locking: use
erMu.Lock()/erMu.Unlock() for writes (Add/Update/Delete/ReplaceAll) and
erMu.RLock()/erMu.RUnlock() for reads (GetAll and any other readers). Update all
methods that reference ForgeServerImpl.er (including the handlers mentioned
around Add/Update/Delete/GetAll/ReplaceAll) to use these locks to prevent
concurrent map read/write panics.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (7)
db/pkg/db/model/expectedrack.go (2)

61-63: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Scope ExpectedRack by (site_id, rack_id), not rack_id alone.

RackID is the sole primary key here, and the DAO surface exposes Get/Delete by rackID only. That makes rack IDs global even though the public API contract describes them as site-scoped, so two sites cannot both contain the same rack ID and single-record operations are not site-safe.

Also applies to: 135-143

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@db/pkg/db/model/expectedrack.go` around lines 61 - 63, The ExpectedRack model
is currently using RackID as the sole primary key which makes racks global;
change the model to use a composite primary key of (SiteID, RackID) by updating
the struct tags on SiteID and RackID in ExpectedRack (refer to fields RackID and
SiteID) to mark both as primary keys, then update any DAO methods that fetch or
delete by RackID alone (e.g., Get/Delete methods referenced later around lines
135-143) to accept both siteID and rackID parameters and scope their bun queries
with a WHERE on site_id AND rack_id so single-record operations are site-scoped.

404-455: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject mixed field masks in UpdateMultiple.

columnsSet is built as the union of all non-nil fields across inputs, then that full column list is applied to every row in the bulk update. If one item only sets RackProfileID and another only sets MetadataName, the unset fields on each row are written as zero values.

Suggested guard
 func (erd ExpectedRackSQLDAO) UpdateMultiple(ctx context.Context, tx *db.Tx, inputs []ExpectedRackUpdateInput) ([]ExpectedRack, error) {
@@
 	expectedRacks := make([]*ExpectedRack, 0, len(inputs))
 	ids := make([]string, 0, len(inputs))
 	columnsSet := make(map[string]bool)
+	type fieldMask struct {
+		rackProfileID       bool
+		metadataName        bool
+		metadataDescription bool
+		metadataLabels      bool
+	}
+	var expectedMask *fieldMask
 
-	for _, input := range inputs {
+	for _, input := range inputs {
+		mask := &fieldMask{
+			rackProfileID:       input.RackProfileID != nil,
+			metadataName:        input.MetadataName != nil,
+			metadataDescription: input.MetadataDescription != nil,
+			metadataLabels:      input.MetadataLabels != nil,
+		}
+		if expectedMask == nil {
+			expectedMask = mask
+		} else if *expectedMask != *mask {
+			return nil, fmt.Errorf("all updates in a batch must modify the same fields")
+		}
+
 		er := &ExpectedRack{
 			RackID: input.RackID,
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@db/pkg/db/model/expectedrack.go` around lines 404 - 455, The bulk update
builds a unioned columnsSet from inputs and applies it to every row, causing
unset fields to be zeroed for rows that didn't include those fields; in
UpdateMultiple, validate that all items in inputs have the same non-nil field
mask before performing the bulk update (compute a per-input set of non-nil field
names from inputs, compare each to the first item's set and return an error if
any differ), and only then build columns from that common set and proceed to
create expectedRacks and call
NewUpdate().Model(&expectedRacks).Column(...).Bulk().Exec so no mixed field
masks are applied; reference inputs, columnsSet, expectedRacks and the bulk
update call to locate the code.
site-workflow/pkg/grpc/server/forge_test_server.go (1)

81-81: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Protect ForgeServerImpl.er with a mutex.

These RPC handlers all access the same map without synchronization. Parallel Add/Update/Delete/GetAll/ReplaceAll calls can panic with concurrent map read and map write, which makes the test server flaky under parallel workflow tests.

Suggested direction
 import (
 	"context"
 	"fmt"
 	"net"
+	"sync"
 	"time"
@@
 type ForgeServerImpl struct {
 	cwssaws.UnimplementedForgeServer
+	erMu sync.RWMutex
 	v   map[string]*cwssaws.Vpc
@@
 	er  map[string]*cwssaws.ExpectedRack
@@
 func (f *ForgeServerImpl) AddExpectedRack(ctx context.Context, req *cwssaws.ExpectedRack) (*emptypb.Empty, error) {
@@
+	f.erMu.Lock()
+	defer f.erMu.Unlock()
 	f.er[req.RackId.Id] = req
 	return &emptypb.Empty{}, nil
 }

Apply Lock/Unlock to writes and RLock/RUnlock to reads across the other ExpectedRack handlers as well.

Also applies to: 1165-1249

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/grpc/server/forge_test_server.go` at line 81, The
ForgeServerImpl.er map is not synchronized; add a mutex field (e.g., erMu
sync.RWMutex) to ForgeServerImpl and protect all accesses: use
erMu.Lock()/erMu.Unlock() for mutating handlers (Add, Update, Delete,
ReplaceAll) and erMu.RLock()/erMu.RUnlock() for read handlers (GetAll, any
getters), ensuring every reference to the er map in methods around ExpectedRack
handling is guarded to prevent concurrent map read/write panics.
api/pkg/api/model/expectedrack.go (1)

104-130: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject empty ExpectedRack updates.

Validate() currently accepts {} and { "id": "rack-1" }. Downstream, the DAO still performs an updated-only write for that payload, which can trigger workflow and dual-write side effects without changing any mutable field.

Suggested guard
 func (erur *APIExpectedRackUpdateRequest) Validate() error {
@@
 	if erur.Metadata != nil {
 		if err := util.ValidateLabels(erur.Metadata.Labels); err != nil {
 			return err
 		}
 	}
+
+	if erur.RackProfileID == nil && erur.Metadata == nil {
+		return validation.Errors{
+			"body": errors.New("at least one field must be provided"),
+		}
+	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/expectedrack.go` around lines 104 - 130, The Validate
method on APIExpectedRackUpdateRequest currently allows empty update payloads;
update APIExpectedRackUpdateRequest.Validate to reject requests that contain no
mutable fields to change by returning a validation error (e.g., "no fields to
update"). After the existing checks for ID, RackProfileID, and Metadata, add a
guard that inspects the updatable fields (at minimum RackProfileID and
Metadata.Labels as referenced in Validate) and if RackProfileID is nil or empty
and Metadata is nil or Metadata.Labels is empty, return a validation.Errors
entry (e.g., "update": errors.New("no fields to update")). Ensure this logic is
implemented in APIExpectedRackUpdateRequest.Validate so empty `{}` or `{ "id":
"rack-1" }` are rejected.
site-workflow/pkg/workflow/expectedrack.go (1)

50-50: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard request field access before logger enrichment to avoid workflow panics.

These lines dereference request fields before any nil-safe extraction. A nil payload will panic in workflow code and bypass activity-layer validation/error handling.

Proposed fix
 func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+	rackID, rackProfileID := "", ""
+	if request != nil {
+		rackID = request.GetRackId().GetId()
+		rackProfileID = request.GetRackProfileId().GetId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
 func UpdateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+	rackID, rackProfileID := "", ""
+	if request != nil {
+		rackID = request.GetRackId().GetId()
+		rackProfileID = request.GetRackProfileId().GetId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
 func DeleteExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRackRequest) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", request.GetRackId()).Logger()
+	rackID := ""
+	if request != nil {
+		rackID = request.GetRackId()
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", rackID).Logger()
@@
 func ReplaceAllExpectedRacks(ctx workflow.Context, request *cwssaws.ExpectedRackList) error {
-	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", len(request.GetExpectedRacks())).Logger()
+	count := 0
+	if request != nil {
+		count = len(request.GetExpectedRacks())
+	}
+	logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", count).Logger()

Also applies to: 80-80, 103-103, 125-125

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/workflow/expectedrack.go` at line 50, The logger enrichment
code is dereferencing request fields (request.GetRackId().GetId(),
request.GetRackProfileId().GetId()) before checking request for nil, which can
panic in the ExpectedRack workflow; update the Create path (and the similar
occurrences at the other spots) to safely extract IDs with nil checks or helper
getters before building logger: first validate request != nil and obtain rackID
and rackProfileID into local strings using guarded logic (e.g., if request !=
nil && request.GetRackId() != nil { rackID = request.GetRackId().GetId() } ),
then call log.With() to create the logger using those local values so no nil
deref occurs and activity-layer validation still runs.
openapi/spec.yaml (2)

4185-4234: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reserve rackId = all to avoid path collision

Line 4185 introduces a literal /all route, but Line 18490 still allows any non-empty rackId. That makes rackId: "all" valid and conflicts with /expected-rack/{rackId} semantics.

Suggested OpenAPI hardening
ExpectedRackCreateRequest:
  properties:
    rackId:
      type: string
      minLength: 1
+     pattern: '^(?!all$).+'
-     description: Operator-supplied identifier for the rack (string, not UUID). Must be non-empty and unique within the Site.
+     description: Operator-supplied identifier for the rack (string, not UUID). Must be non-empty, unique within the Site, and must not be `all` (reserved).

'/v2/org/{org}/carbide/expected-rack/{rackId}':
  parameters:
    - schema:
        type: string
+       pattern: '^(?!all$).+'
      name: rackId

As per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness, check for misspellings and grammatical errors."

Also applies to: 18490-18494

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 4185 - 4234, The path
'/v2/org/{org}/carbide/expected-rack/all' collides with the parameterized path
'/v2/org/{org}/carbide/expected-rack/{rackId}' because rackId currently allows
the literal "all"; update the rackId path parameter definition (used at
'/v2/org/{org}/carbide/expected-rack/{rackId}' and the duplicate at lines
~18490-18494) to explicitly forbid "all" by adding a validation pattern such as
^(?!all$).+ (or an equivalent negative lookahead) so that "all" is not a valid
rackId and the two routes no longer collide. Ensure you update both occurrences
of the rackId parameter.

4274-4296: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align PATCH request schema with path key (rackId)

Line 4274 defines PATCH by {rackId}, but Line 18610 still exposes id in ExpectedRackUpdateRequest. This creates an avoidable contract mismatch for generated clients.

Suggested schema alignment
ExpectedRackUpdateRequest:
  description: |-
    Request data to update an existing Expected Rack.
-   For single updates (PATCH /expected-rack/{rackId}), the `id` field is optional in the body and will be ignored if provided (the `rackId` from the URL path is used).
+   For single updates (PATCH /expected-rack/{rackId}), the `rackId` in the URL path is authoritative.

  properties:
-   id:
-     type:
-       - string
-       - 'null'
-     description: |-
-       Operator-supplied identifier of the Expected Rack to update (string, not UUID).
-       Optional for single Expected Rack update (must be empty or match the `rackId` from the URL path).
    rackProfileId:
      type:
        - string
        - 'null'

As per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness, check for misspellings and grammatical errors."

Also applies to: 18593-18618

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openapi/spec.yaml` around lines 4274 - 4296, The PATCH operation
update-expected-rack uses a path key rackId but the request schema
ExpectedRackUpdateRequest still exposes an id field; update the schema so the
request body does not require or advertise an id mismatch — either remove the id
property entirely from ExpectedRackUpdateRequest (preferred) or rename it to
rackId to match the path parameter, and update any examples and validations
accordingly; alternatively mark id as readOnly so clients don't send it. Ensure
consistency between operationId update-expected-rack, the path parameter rackId,
and the schema/examples to avoid contract mismatch.
🧹 Nitpick comments (1)
api/pkg/api/handler/expectedrack_test.go (1)

160-167: ⚡ Quick win

Assert Temporal mock expectations to verify workflow dispatch paths.

These mocks set ExecuteWorkflow/Get expectations but never assert them, so success-path tests can pass even if workflow invocation regresses. Add per-subtest AssertExpectations(t) (or explicit AssertCalled) for cases that should trigger workflows.

Also applies to: 843-850, 1039-1046, 1169-1175, 1422-1428

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack_test.go` around lines 160 - 167, The test
sets up temporal mocks (mockTemporalClient, mockWorkflowRun) and configures
ExecuteWorkflow and Get but never asserts expectations; update the relevant
subtests that are expected to dispatch workflows to call
mockTemporalClient.AssertExpectations(t) and
mockWorkflowRun.AssertExpectations(t) (or use
mockTemporalClient.AssertCalled/AssertCalled for ExecuteWorkflow and
mockWorkflowRun.AssertCalled for Get) at the end of each such subtest so the
test fails if ExecuteWorkflow/Get are not invoked as expected; apply the same
pattern for the other occurrences referenced (lines around 843-850, 1039-1046,
1169-1175, 1422-1428) to ensure workflow dispatch paths are verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sdk/standard/api_expected_rack.go`:
- Around line 134-154: The 4xx error branches decode into CarbideAPIError but
call formatErrorMessage (which expects Title/Detail), causing the server's
Carbide `message` to be lost; update the generator/template so after decoding
into CarbideAPIError (symbol: CarbideAPIError) you prefer the error's message
field when building the returned error (symbol: newErr.error) — e.g., set
newErr.error = v.Message (or v.message) when present, falling back to
formatErrorMessage(localVarHTTPResponse.Status, &v) only if the message is
empty; ensure this change is applied in the shared formatter or generation
template used to emit the branches that call formatErrorMessage so all endpoints
under sdk/standard/ inherit the fix.

---

Duplicate comments:
In `@api/pkg/api/model/expectedrack.go`:
- Around line 104-130: The Validate method on APIExpectedRackUpdateRequest
currently allows empty update payloads; update
APIExpectedRackUpdateRequest.Validate to reject requests that contain no mutable
fields to change by returning a validation error (e.g., "no fields to update").
After the existing checks for ID, RackProfileID, and Metadata, add a guard that
inspects the updatable fields (at minimum RackProfileID and Metadata.Labels as
referenced in Validate) and if RackProfileID is nil or empty and Metadata is nil
or Metadata.Labels is empty, return a validation.Errors entry (e.g., "update":
errors.New("no fields to update")). Ensure this logic is implemented in
APIExpectedRackUpdateRequest.Validate so empty `{}` or `{ "id": "rack-1" }` are
rejected.

In `@db/pkg/db/model/expectedrack.go`:
- Around line 61-63: The ExpectedRack model is currently using RackID as the
sole primary key which makes racks global; change the model to use a composite
primary key of (SiteID, RackID) by updating the struct tags on SiteID and RackID
in ExpectedRack (refer to fields RackID and SiteID) to mark both as primary
keys, then update any DAO methods that fetch or delete by RackID alone (e.g.,
Get/Delete methods referenced later around lines 135-143) to accept both siteID
and rackID parameters and scope their bun queries with a WHERE on site_id AND
rack_id so single-record operations are site-scoped.
- Around line 404-455: The bulk update builds a unioned columnsSet from inputs
and applies it to every row, causing unset fields to be zeroed for rows that
didn't include those fields; in UpdateMultiple, validate that all items in
inputs have the same non-nil field mask before performing the bulk update
(compute a per-input set of non-nil field names from inputs, compare each to the
first item's set and return an error if any differ), and only then build columns
from that common set and proceed to create expectedRacks and call
NewUpdate().Model(&expectedRacks).Column(...).Bulk().Exec so no mixed field
masks are applied; reference inputs, columnsSet, expectedRacks and the bulk
update call to locate the code.

In `@openapi/spec.yaml`:
- Around line 4185-4234: The path '/v2/org/{org}/carbide/expected-rack/all'
collides with the parameterized path
'/v2/org/{org}/carbide/expected-rack/{rackId}' because rackId currently allows
the literal "all"; update the rackId path parameter definition (used at
'/v2/org/{org}/carbide/expected-rack/{rackId}' and the duplicate at lines
~18490-18494) to explicitly forbid "all" by adding a validation pattern such as
^(?!all$).+ (or an equivalent negative lookahead) so that "all" is not a valid
rackId and the two routes no longer collide. Ensure you update both occurrences
of the rackId parameter.
- Around line 4274-4296: The PATCH operation update-expected-rack uses a path
key rackId but the request schema ExpectedRackUpdateRequest still exposes an id
field; update the schema so the request body does not require or advertise an id
mismatch — either remove the id property entirely from ExpectedRackUpdateRequest
(preferred) or rename it to rackId to match the path parameter, and update any
examples and validations accordingly; alternatively mark id as readOnly so
clients don't send it. Ensure consistency between operationId
update-expected-rack, the path parameter rackId, and the schema/examples to
avoid contract mismatch.

In `@site-workflow/pkg/grpc/server/forge_test_server.go`:
- Line 81: The ForgeServerImpl.er map is not synchronized; add a mutex field
(e.g., erMu sync.RWMutex) to ForgeServerImpl and protect all accesses: use
erMu.Lock()/erMu.Unlock() for mutating handlers (Add, Update, Delete,
ReplaceAll) and erMu.RLock()/erMu.RUnlock() for read handlers (GetAll, any
getters), ensuring every reference to the er map in methods around ExpectedRack
handling is guarded to prevent concurrent map read/write panics.

In `@site-workflow/pkg/workflow/expectedrack.go`:
- Line 50: The logger enrichment code is dereferencing request fields
(request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
checking request for nil, which can panic in the ExpectedRack workflow; update
the Create path (and the similar occurrences at the other spots) to safely
extract IDs with nil checks or helper getters before building logger: first
validate request != nil and obtain rackID and rackProfileID into local strings
using guarded logic (e.g., if request != nil && request.GetRackId() != nil {
rackID = request.GetRackId().GetId() } ), then call log.With() to create the
logger using those local values so no nil deref occurs and activity-layer
validation still runs.

---

Nitpick comments:
In `@api/pkg/api/handler/expectedrack_test.go`:
- Around line 160-167: The test sets up temporal mocks (mockTemporalClient,
mockWorkflowRun) and configures ExecuteWorkflow and Get but never asserts
expectations; update the relevant subtests that are expected to dispatch
workflows to call mockTemporalClient.AssertExpectations(t) and
mockWorkflowRun.AssertExpectations(t) (or use
mockTemporalClient.AssertCalled/AssertCalled for ExecuteWorkflow and
mockWorkflowRun.AssertCalled for Get) at the end of each such subtest so the
test fails if ExecuteWorkflow/Get are not invoked as expected; apply the same
pattern for the other occurrences referenced (lines around 843-850, 1039-1046,
1169-1175, 1422-1428) to ensure workflow dispatch paths are verified.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 797fa65c-5050-41ff-8691-e9a9d39a0b52

📥 Commits

Reviewing files that changed from the base of the PR and between 230d6f9 and 47d035f.

⛔ Files ignored due to path filters (4)
  • workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/site-agent/workflows/v1/common_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
  • workflow-schema/site-agent/workflows/v1/forge_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
  • api/pkg/api/handler/expectedrack.go
  • api/pkg/api/handler/expectedrack_test.go
  • api/pkg/api/model/expectedrack.go
  • api/pkg/api/model/expectedrack_test.go
  • api/pkg/api/model/util/rack_labels.go
  • api/pkg/api/routes.go
  • db/pkg/db/model/expectedrack.go
  • db/pkg/db/model/expectedrack_test.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_expected_rack.go
  • sdk/standard/client.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (8)
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • api/pkg/api/model/util/rack_labels.go
  • site-workflow/pkg/activity/rack_labels.go
  • sdk/standard/model_expected_rack_metadata.go
  • site-workflow/pkg/workflow/expectedrack_test.go
  • sdk/standard/model_expected_rack_create_request.go
  • api/pkg/api/handler/expectedrack.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • sdk/standard/client.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • api/pkg/api/model/expectedrack_test.go

Comment on lines +134 to +154
if localVarHTTPResponse.StatusCode == 400 {
var v CarbideAPIError
err = a.client.decode(&v, localVarBody, localVarHTTPResponse.Header.Get("Content-Type"))
if err != nil {
newErr.error = err.Error()
return localVarReturnValue, localVarHTTPResponse, newErr
}
newErr.error = formatErrorMessage(localVarHTTPResponse.Status, &v)
newErr.model = v
return localVarReturnValue, localVarHTTPResponse, newErr
}
if localVarHTTPResponse.StatusCode == 403 {
var v CarbideAPIError
err = a.client.decode(&v, localVarBody, localVarHTTPResponse.Header.Get("Content-Type"))
if err != nil {
newErr.error = err.Error()
return localVarReturnValue, localVarHTTPResponse, newErr
}
newErr.error = formatErrorMessage(localVarHTTPResponse.Status, &v)
newErr.model = v
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the Carbide message field when building 4xx errors.

All of these branches decode into CarbideAPIError and then call formatErrorMessage, but that formatter expects RFC7807-style Title/Detail fields. The result is that every new Expected Rack SDK method collapses 400/403/404 responses into a generic status string instead of surfacing the server’s message, which makes client-side diagnosis much harder. Please fix this in the shared formatter or generator template so regenerated endpoints inherit the correct behavior. Based on learnings: files under sdk/standard/ are auto-generated output from the SDK generation tooling.

Also applies to: 273-293, 406-437, 591-611, 744-775, 898-918, 1055-1085

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/standard/api_expected_rack.go` around lines 134 - 154, The 4xx error
branches decode into CarbideAPIError but call formatErrorMessage (which expects
Title/Detail), causing the server's Carbide `message` to be lost; update the
generator/template so after decoding into CarbideAPIError (symbol:
CarbideAPIError) you prefer the error's message field when building the returned
error (symbol: newErr.error) — e.g., set newErr.error = v.Message (or v.message)
when present, falling back to formatErrorMessage(localVarHTTPResponse.Status,
&v) only if the message is empty; ensure this change is applied in the shared
formatter or generation template used to emit the branches that call
formatErrorMessage so all endpoints under sdk/standard/ inherit the fix.

@chet chet force-pushed the expected-rack-apis branch from 47d035f to 1f2fb8d Compare April 30, 2026 07:37
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 30, 2026

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-nsm 64 2 20 33 9 0
nico-psm 56 4 29 13 2 8
nico-rest-api 57 4 30 13 2 8
nico-rest-cert-manager 54 4 28 13 1 8
nico-rest-db 55 4 28 13 2 8
nico-rest-site-agent 54 4 28 13 1 8
nico-rest-site-manager 54 4 28 13 1 8
nico-rest-workflow 56 4 29 13 2 8
nico-rla 55 4 28 13 2 8
TOTAL 505 34 248 137 22 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
api/pkg/api/handler/expectedrack.go (3)

167-171: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the duplicate rackId check to the target site.

rackId is site-scoped, but this preflight lookup filters only by RackIDs. A create in Site B will incorrectly return 409 when the same rack ID already exists only in Site A.

Suggested fix
 existingRacks, count, err := erDAOForCheck.GetAll(ctx, nil, cdbm.ExpectedRackFilterInput{
-	RackIDs: []string{apiRequest.RackID},
+	RackIDs: []string{apiRequest.RackID},
+	SiteIDs: []uuid.UUID{site.ID},
 }, paginator.PageInput{
 	Limit: cdb.GetIntPtr(1),
 }, nil)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack.go` around lines 167 - 171, The duplicate
rackId check is missing site scoping: when calling erDAOForCheck.GetAll to
preflight the create, include the target site filter (e.g., apiRequest.SiteID)
in the ExpectedRackFilterInput so the lookup becomes site-scoped rather than
global; update the GetAll call that currently filters only by RackIDs (and
references apiRequest.RackID) to also set SiteID (or SiteIDs) from
apiRequest.SiteID so only racks within the same site are considered for the 409
conflict check.

332-334: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not force dual-role callers down the tenant-only branch.

This branch still rejects callers when tenant != nil even if the caller also has provider identity. The “siteId required” check should apply only to tenant-only requests.

Suggested fix
-	} else if tenant != nil {
+	} else if tenant != nil && infrastructureProvider == nil {
 		// Tenants must specify a Site ID
 		return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Site ID must be specified in query when retrieving Expected Racks as a Tenant", nil)
 	} else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack.go` around lines 332 - 334, The current
branch rejects callers when tenant != nil even if they also have provider
identity; update the conditional that enforces "Site ID must be specified" so it
only runs for tenant-only callers (e.g., change the branch from "else if tenant
!= nil" to "else if tenant != nil && provider == nil" or the equivalent check
used in this file), leaving the existing API error response unchanged so
multi-role (tenant+provider) callers are not forced into the tenant-only path.

483-485: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Resolve and mutate ExpectedRacks with site scope, not bare rackId.

These handlers still look up and delete by rackId alone. If two sites both contain the same rack ID, GET/PATCH can read the wrong row and DELETE can remove the wrong site's record. Read the optional siteId query parameter and make the DAO resolution/deletion site-scoped, or reject ambiguous rackId values when siteId is omitted.

Also applies to: 603-605, 771-773, 809-810

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack.go` around lines 483 - 485, Handlers
currently resolve ExpectedRack by bare rackID (see NewExpectedRackDAO and
erDAO.Get / erDAO.Delete calls), which can cross sites; change the handlers to
read the optional siteId query parameter from the request and use a site-scoped
DAO lookup/delete: pass siteId into the ExpectedRack DAO resolution (or into
Get/Delete calls) so the query is constrained to that site, or if siteId is
omitted detect ambiguity (multiple rows for the same rackID across sites) and
return an error asking the caller to supply siteId. Update all occurrences (the
shown Get call and the other similar calls around lines ~603-605, ~771-773,
~809-810) to use the site-scoped resolution or reject ambiguous rackId when
siteId is missing.
site-workflow/pkg/grpc/client/testing.go (1)

1224-1313: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Mirror the Forge test-server contract in this mock.

These methods still dereference in before nil-checking, return codes.Internal for request validation failures, and make Get/GetAll synthesize data instead of reflecting prior add/update/delete/replace calls. That means the workflow tests can pass against a client mock that accepts and returns states the in-repo server test double would not.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site-workflow/pkg/grpc/client/testing.go` around lines 1224 - 1313, The mock
methods (AddExpectedRack, UpdateExpectedRack, DeleteExpectedRack,
GetExpectedRack, GetAllExpectedRacks, ReplaceAllExpectedRacks,
DeleteAllExpectedRacks) must mirror the test-server contract: perform nil checks
on the request object before dereferencing, return codes.InvalidArgument (not
codes.Internal) for client request validation errors, and maintain an in-memory
store so state changes persist across calls. Update MockForgeClient to hold a
concurrency-safe map (e.g., map[string]*wflows.ExpectedRack plus a mutex) and
change each method to (1) check in != nil before accessing fields, (2) validate
IDs and return status.Error(codes.InvalidArgument, "...") on bad input, and (3)
mutate or read the in-memory store for AddExpectedRack (insert),
UpdateExpectedRack (update existing; error if missing), DeleteExpectedRack
(remove), GetExpectedRack (lookup and return stored value), GetAllExpectedRacks
(return all stored values, honoring ctx "wantError" for simulated errors),
ReplaceAllExpectedRacks (replace store with provided list after validating
entries), and DeleteAllExpectedRacks (clear the store).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 220-224: Workflow IDs are currently built only from
expectedRack.RackID which can collide across sites; update the
tclient.StartWorkflowOptions.ID construction to include the site identifier
(e.g., use "expected-rack-create-"+expectedRack.SiteID+"-"+expectedRack.RackID)
and apply the same pattern for the update and delete workflows (replace the ID
values in the StartWorkflowOptions for create/update/delete to include
expectedRack.SiteID or the local siteID variable so IDs are unique per site).
- Around line 183-246: The current code runs common.ExecuteSyncWorkflow
(triggering the site workflow) while the SQL transaction (tx) is still open,
creating a dual-write/inconsistency risk if tx.Commit() later fails; fix by
moving the workflow invocation (common.ExecuteSyncWorkflow with stc,
"CreateExpectedRack", workflowOptions, createExpectedRackRequest) to after the
DB commit and after setting txCommitted = true so the transaction is
closed/committed before contacting the site, keeping the DB creation flow
(erDAO.Create, buildExpectedRackProto) intact and preserving error handling
around tx.Commit() and the retrieval of the Temporal client
(cerh.scp.GetClientByID) but ensure you still return appropriate API errors from
ExecuteSyncWorkflow when invoked post-commit.

---

Duplicate comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 167-171: The duplicate rackId check is missing site scoping: when
calling erDAOForCheck.GetAll to preflight the create, include the target site
filter (e.g., apiRequest.SiteID) in the ExpectedRackFilterInput so the lookup
becomes site-scoped rather than global; update the GetAll call that currently
filters only by RackIDs (and references apiRequest.RackID) to also set SiteID
(or SiteIDs) from apiRequest.SiteID so only racks within the same site are
considered for the 409 conflict check.
- Around line 332-334: The current branch rejects callers when tenant != nil
even if they also have provider identity; update the conditional that enforces
"Site ID must be specified" so it only runs for tenant-only callers (e.g.,
change the branch from "else if tenant != nil" to "else if tenant != nil &&
provider == nil" or the equivalent check used in this file), leaving the
existing API error response unchanged so multi-role (tenant+provider) callers
are not forced into the tenant-only path.
- Around line 483-485: Handlers currently resolve ExpectedRack by bare rackID
(see NewExpectedRackDAO and erDAO.Get / erDAO.Delete calls), which can cross
sites; change the handlers to read the optional siteId query parameter from the
request and use a site-scoped DAO lookup/delete: pass siteId into the
ExpectedRack DAO resolution (or into Get/Delete calls) so the query is
constrained to that site, or if siteId is omitted detect ambiguity (multiple
rows for the same rackID across sites) and return an error asking the caller to
supply siteId. Update all occurrences (the shown Get call and the other similar
calls around lines ~603-605, ~771-773, ~809-810) to use the site-scoped
resolution or reject ambiguous rackId when siteId is missing.

In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1224-1313: The mock methods (AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks, DeleteAllExpectedRacks) must mirror the test-server
contract: perform nil checks on the request object before dereferencing, return
codes.InvalidArgument (not codes.Internal) for client request validation errors,
and maintain an in-memory store so state changes persist across calls. Update
MockForgeClient to hold a concurrency-safe map (e.g.,
map[string]*wflows.ExpectedRack plus a mutex) and change each method to (1)
check in != nil before accessing fields, (2) validate IDs and return
status.Error(codes.InvalidArgument, "...") on bad input, and (3) mutate or read
the in-memory store for AddExpectedRack (insert), UpdateExpectedRack (update
existing; error if missing), DeleteExpectedRack (remove), GetExpectedRack
(lookup and return stored value), GetAllExpectedRacks (return all stored values,
honoring ctx "wantError" for simulated errors), ReplaceAllExpectedRacks (replace
store with provided list after validating entries), and DeleteAllExpectedRacks
(clear the store).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bd806cf1-36a5-45b8-8e5b-3bf979cd1507

📥 Commits

Reviewing files that changed from the base of the PR and between 47d035f and 1f2fb8d.

⛔ Files ignored due to path filters (4)
  • workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/site-agent/workflows/v1/common_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
  • workflow-schema/site-agent/workflows/v1/forge_carbide.proto is excluded by !workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (35)
  • api/pkg/api/handler/expectedrack.go
  • api/pkg/api/handler/expectedrack_test.go
  • api/pkg/api/model/expectedrack.go
  • api/pkg/api/model/expectedrack_test.go
  • api/pkg/api/model/util/rack_labels.go
  • api/pkg/api/routes.go
  • api/pkg/api/routes_test.go
  • db/pkg/db/model/expectedrack.go
  • db/pkg/db/model/expectedrack_test.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_expected_rack.go
  • sdk/standard/client.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_list.go
  • sdk/standard/model_expected_rack_metadata.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/manager.go
  • site-agent/pkg/components/managers/manageraccess.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • site-agent/pkg/components/managers/managerapi/managerapi.go
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • site-workflow/pkg/grpc/client/testing.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (10)
  • api/pkg/api/routes_test.go
  • api/pkg/api/model/util/rack_labels.go
  • site-agent/pkg/components/managers/managerapi/expectedrack_api.go
  • api/pkg/api/model/expectedrack.go
  • sdk/standard/model_expected_rack.go
  • sdk/standard/model_expected_rack_create_request.go
  • sdk/standard/model_expected_rack_update_request.go
  • site-workflow/pkg/workflow/expectedrack_test.go
  • site-workflow/pkg/activity/rack_labels.go
  • sdk/standard/api_expected_rack.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • site-agent/pkg/components/managers/workflow/orchestrator.go
  • site-workflow/pkg/grpc/server/forge_test_server.go
  • site-agent/pkg/components/managers/expectedrack/init.go
  • db/pkg/migrations/20260429100000_expected_rack.go
  • api/pkg/api/routes.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • api/pkg/api/model/expectedrack_test.go
  • site-agent/pkg/components/managers/expectedrack/access.go
  • api/pkg/api/handler/expectedrack_test.go

Comment thread api/pkg/api/handler/expectedrack.go Outdated
Comment on lines +183 to +246
// Start a db transaction
tx, err := cdb.BeginTx(ctx, cerh.dbSession, &sql.TxOptions{})
if err != nil {
logger.Error().Err(err).Msg("unable to start transaction")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB transaction error", nil)
}
// this variable is used in cleanup actions to indicate if this transaction committed
txCommitted := false
defer common.RollbackTx(ctx, tx, &txCommitted)

// Build create input from request, defaulting metadata to zero values when omitted
createInput := cdbm.ExpectedRackCreateInput{
RackID: apiRequest.RackID,
SiteID: site.ID,
RackProfileID: apiRequest.RackProfileID,
CreatedBy: dbUser.ID,
}
if apiRequest.Metadata != nil {
createInput.MetadataName = apiRequest.Metadata.Name
createInput.MetadataDescription = apiRequest.Metadata.Description
createInput.MetadataLabels = apiRequest.Metadata.Labels
}

// Create the ExpectedRack in DB
erDAO := cdbm.NewExpectedRackDAO(cerh.dbSession)
expectedRack, err := erDAO.Create(ctx, tx, createInput)
if err != nil {
logger.Error().Err(err).Msg("error creating ExpectedRack record in DB")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB error", nil)
}

// Build the create request for workflow
createExpectedRackRequest := buildExpectedRackProto(expectedRack.RackID, expectedRack.RackProfileID, apiRequest.Metadata)

logger.Info().Msg("triggering Expected Rack create workflow on Site")

// Create workflow options
workflowOptions := tclient.StartWorkflowOptions{
ID: "expected-rack-create-" + expectedRack.RackID,
WorkflowExecutionTimeout: cutil.WorkflowExecutionTimeout,
TaskQueue: queue.SiteTaskQueue,
}

// Get the temporal client for the site we are working with
stc, err := cerh.scp.GetClientByID(site.ID)
if err != nil {
logger.Error().Err(err).Msg("failed to retrieve Temporal client for Site")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to retrieve client for Site", nil)
}

// Run workflow
apiErr := common.ExecuteSyncWorkflow(ctx, logger, stc, "CreateExpectedRack", workflowOptions, createExpectedRackRequest)
if apiErr != nil {
return cutil.NewAPIErrorResponse(c, apiErr.Code, apiErr.Message, apiErr.Data)
}

// Commit transaction
err = tx.Commit()
if err != nil {
logger.Error().Err(err).Msg("error committing ExpectedRack transaction to DB")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB transaction error", nil)
}
// Set committed so, deferred cleanup functions will do nothing
txCommitted = true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Do not execute the site workflow inside the open SQL transaction.

Each mutation writes the DB row, runs the synchronous Temporal workflow, and only then commits. If the workflow succeeds and tx.Commit() fails, the site agent has already applied the change while the DB rolls it back. This adds a real dual-write inconsistency path to every ExpectedRack mutation.

Also applies to: 631-703, 799-850, 951-1021, 1114-1156

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/expectedrack.go` around lines 183 - 246, The current code
runs common.ExecuteSyncWorkflow (triggering the site workflow) while the SQL
transaction (tx) is still open, creating a dual-write/inconsistency risk if
tx.Commit() later fails; fix by moving the workflow invocation
(common.ExecuteSyncWorkflow with stc, "CreateExpectedRack", workflowOptions,
createExpectedRackRequest) to after the DB commit and after setting txCommitted
= true so the transaction is closed/committed before contacting the site,
keeping the DB creation flow (erDAO.Create, buildExpectedRackProto) intact and
preserving error handling around tx.Commit() and the retrieval of the Temporal
client (cerh.scp.GetClientByID) but ensure you still return appropriate API
errors from ExecuteSyncWorkflow when invoked post-commit.

Comment thread api/pkg/api/handler/expectedrack.go Outdated
@chet chet force-pushed the expected-rack-apis branch 4 times, most recently from a32bcaf to 81db0a0 Compare April 30, 2026 23:19
@thossain-nv thossain-nv changed the title feat: Add ExpectedRack APIs for Create, Update, and Get feat: Add API model and CRUD endpoints for ExpectedRack May 1, 2026
Copy link
Copy Markdown
Contributor

@thossain-nv thossain-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the implementation @chet! Left some notes on a few topics.

Comment thread db/pkg/db/model/expectedrack.go Outdated
Comment on lines +61 to +64
RackID string `bun:"rack_id,pk"`
SiteID uuid.UUID `bun:"site_id,type:uuid,notnull"`
Site *Site `bun:"rel:belongs-to,join:site_id=id"`
RackProfileID string `bun:"rack_profile_id,notnull"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Rack ID be the PK of Expected Rack object or can it have a UUID? The other option is to make (rack_id, site_id) unique

Advantage of UUID:

  • We can have same Rack ID in more than one Sites, allowing decisions to move Racks between data halls
  • URLs don't have to nest under /site/<site-id>/expected-rack
  • rack_id can still be marked unique per Site

Comment thread db/pkg/db/model/expectedrack.go Outdated
Comment thread api/pkg/api/model/expectedrack.go Outdated
@chet chet force-pushed the expected-rack-apis branch from 81db0a0 to e19825e Compare May 1, 2026 20:19
@chet
Copy link
Copy Markdown
Contributor Author

chet commented May 1, 2026

Should Rack ID be the PK of Expected Rack object or can it have a UUID? The other option is to make (rack_id, site_id) unique
Advantage of UUID:
We can have same Rack ID in more than one Sites, allowing decisions to move Racks between data halls
URLs don't have to nest under /site//expected-rack
rack_id can still be marked unique per Site

Oh yeah, definitely UUID. It's what we do for switches and power shelves too. Fixing!

@chet chet force-pushed the expected-rack-apis branch from e19825e to 9ec707b Compare May 1, 2026 20:57
@thossain-nv
Copy link
Copy Markdown
Contributor

Thank you for the changes @chet! We're missing one last part:

@chet
Copy link
Copy Markdown
Contributor Author

chet commented May 6, 2026

Thank you for the changes @chet! We're missing one last part:

Oops, my bad, sorry! ...adding! I need a checklist of things to keep track of. You'd think I would have known this from the previous Expected* calls I added. 😆

This adds `ExpectedRack` REST APIs to create, update, and get `ExpectedRack` instances, which is similar to the work for adding expected switch and power shelf API calls as well in NVIDIA#220. Tried to pattern match as much as possible again.

Tests included!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet chet force-pushed the expected-rack-apis branch from 9ec707b to 533c123 Compare May 6, 2026 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants