feat: Add API model and CRUD endpoints for ExpectedRack#444
feat: Add API model and CRUD endpoints for ExpectedRack#444chet wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds full Expected Rack functionality: API handlers and models, DB model/DAO and migration, OpenAPI and generated SDK, Temporal workflows/activities, site-agent manager wiring, gRPC test helpers, and extensive unit/integration tests across layers. Changes
Sequence DiagramsequenceDiagram
participant Client
participant APIHandler
participant DB as DB (Tx)
participant Temporal as Temporal (Workflow)
participant Carbide as Carbide gRPC
participant RLA as RLA gRPC
Client->>APIHandler: POST /.../expected-rack (CreateRequest)
APIHandler->>APIHandler: Validate request, authorize, verify Site state
APIHandler->>DB: BEGIN TX / INSERT ExpectedRack
DB-->>APIHandler: Inserted row
APIHandler->>Temporal: Start CreateExpectedRack workflow (site task queue)
Temporal->>Carbide: Activity CreateExpectedRackOnSite (AddExpectedRack RPC)
Carbide-->>Temporal: Success / Error
Temporal->>RLA: Activity CreateExpectedRackOnRLA (best-effort)
RLA-->>Temporal: Success or Warning
Temporal-->>APIHandler: Workflow completes (success or logged RLA warning)
APIHandler->>DB: COMMIT TX
DB-->>APIHandler: Committed
APIHandler-->>Client: 201 Created (ExpectedRack JSON)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-04-30 06:09:47 UTC | Commit: 68157bf |
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (3)
site-agent/pkg/components/managers/expectedrack/init.go (1)
26-27: ⚡ Quick winMake the deferred validation a searchable
TODO.This is clearly intentional deferred work, but the repo guidance asks for explicit TODOs when something is being left for later.
Suggested tweak
- // Validate the expectedrack config later + // TODO: validate the expectedrack config.As per coding guidelines, "Add TODO comments for features or nuances not important to implement right away".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-agent/pkg/components/managers/expectedrack/init.go` around lines 26 - 27, Replace the informal comment "// Validate the expectedrack config later" with a searchable TODO comment such as "// TODO: validate expectedrack config" so the deferred work is explicitly discoverable; update the same section where ManagerAccess.Data.EB.Log.Info().Msg("ExpectedRack: Initializing ExpectedRack API") is called to keep the TODO adjacent to that initialization call and, if possible, include a short owner or ticket reference in the TODO (e.g., TODO: validate expectedrack config `#ticket/owner`).site-workflow/pkg/workflow/expectedrack_test.go (1)
46-111: ⚡ Quick winAdd nil-request regression tests for workflow entrypoints.
Please add explicit nil-input cases (at least for
CreateExpectedRack,UpdateExpectedRack,DeleteExpectedRack, andReplaceAllExpectedRacks) so panics from pre-validation field access are caught by tests.Also applies to: 132-168, 189-223, 244-288
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/workflow/expectedrack_test.go` around lines 46 - 111, Add nil-request regression tests to the CreateExpectedRackTestSuite to ensure the workflow entrypoints handle nil inputs without panicking: create new tests (e.g., Test_CreateExpectedRack_NilRequest, and analogous tests for UpdateExpectedRack, DeleteExpectedRack, ReplaceAllExpectedRacks) that call certs.env.ExecuteWorkflow(CreateExpectedRack, nil) and assert the workflow completes and returns an error (or the expected validation error) instead of panicking; register any activities used by the workflows (or mock them) so the test runs the workflow path that performs the pre-validation checks, and mirror this pattern for UpdateExpectedRack, DeleteExpectedRack, and ReplaceAllExpectedRacks to cover the ranges noted in the comment.site-workflow/pkg/workflow/expectedrack.go (1)
23-25: ⚡ Quick winPrefer
workflow.GetLogger(ctx)inside workflows for replay-aware logging.Using the package logger in workflow code can produce duplicate/non-contextual logs during replay. Switching to Temporal’s workflow logger keeps logs aligned with workflow execution semantics.
Also applies to: 52-53, 82-83, 105-106, 127-128, 149-150
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/workflow/expectedrack.go` around lines 23 - 25, Inside the workflow code in expectedrack.go replace uses of the package logger (github.com/rs/zerolog/log) with Temporal's replay-aware logger: in each workflow function (where you currently call log.Info/Debug/Error at the noted locations) obtain logger := workflow.GetLogger(ctx) at the top of the function and change log.<Level>(...) calls to logger.<Level>(...) (mapping zerolog methods to the workflow logger methods as appropriate); also remove the zerolog import if it becomes unused. Ensure you do this for all occurrences mentioned (the instances using log at the start of the file and the other locations) so logs are emitted via workflow.GetLogger(ctx) and are replay-safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 447-466: The handler currently looks up ExpectedRack by bare
rackId (using c.Param("rackId") and erDAO.Get) which can return/modify the wrong
record when rackId is only unique per site; modify the handler to read optional
siteId via c.QueryParam("siteId") and use it when present (pass siteId into the
DAO lookup so erDAO.Get uses rackId+siteId). When siteId is omitted, query the
DAO for all matches for that rackId (e.g., via a new or existing
erDAO.ListByRackID / FindByRackID method) and if more than one match is
returned, respond with cutil.NewAPIErrorResponse(http.StatusBadRequest,
"...ambiguous rackId...") rejecting the request; if exactly one match, use that
record's siteId/rack and proceed to call erDAO.Get as before. Ensure the same
pattern is applied to the other handlers referenced (lines ~552-586 and
~744-754).
- Around line 313-316: The tenant-only branch currently rejects callers when
tenant != nil even if they also have provider identity; update the condition to
enforce the "Site ID required for tenant" check only when there is no provider
identity (e.g., change the guard to "tenant != nil && provider == nil" or
equivalent), so dual-role principals follow the provider path if provider
identity exists; ensure the error still uses cutil.NewAPIErrorResponse and the
same message when the tenant-only condition truly applies.
In `@api/pkg/api/model/expectedrack_test.go`:
- Line 231: The site-mismatch test uses an invalid UUID which causes validation
to fail early; update the test to use a valid UUID string for the otherSiteID
variable (and the other occurrences in the same test block around the later
case) so validation proceeds to the top-level/entry SiteID equality check —
replace the current otherSiteID value with a well-formed UUID (or generate one
via uuid.NewString()) in the site-mismatch test and the corresponding instances
referenced in the same test.
In `@api/pkg/api/model/expectedrack.go`:
- Around line 97-124: The Validate method on APIExpectedRackUpdateRequest
currently allows empty updates; add a guard in
APIExpectedRackUpdateRequest.Validate (after existing field validations and
before returning nil) that ensures at least one mutable field is present (e.g.,
RackProfileID != nil, Metadata != nil, or any other mutable pointer/field on
APIExpectedRackUpdateRequest) and return a validation error like "no fields to
update" if none are set; reference the Validate method on
APIExpectedRackUpdateRequest and the fields RackProfileID and Metadata when
implementing this check.
In `@api/pkg/api/routes.go`:
- Around line 487-490: The route currently reserves the literal "all" by
registering Path "/expected-rack/all" which shadows the single-resource DELETE
"/expected-rack/:rackId" and prevents deleting a rack with id "all"; fix by
either (A) renaming the bulk-delete route to a non-conflicting path (e.g. change
Path used with NewDeleteAllExpectedRacksHandler to use a plural or explicit
suffix like "/expected-racks" or "/expected-rack/bulk-delete"), or (B) add
explicit request validation in the single-rack handler (the handler constructed
by NewDeleteExpectedRackHandler) to reject rackId == "all" and reflect that
reserved value in the OpenAPI spec and request validation schema; update the
route registration for NewDeleteAllExpectedRacksHandler and the OpenAPI docs
accordingly so the behavior is explicit.
In `@db/pkg/db/model/expectedrack.go`:
- Around line 61-64: The ExpectedRack model currently uses RackID as the sole
primary key (see struct ExpectedRack and field RackID) which makes rack IDs
global; change the persistence and DAO surfaces to be site-scoped by making the
primary key composite (SiteID + RackID) and update CRUD signatures (e.g., DAO
Get/Delete methods that currently accept rackID alone) to require both siteID
and rackID; ensure the bun tags on ExpectedRack (RackID and SiteID) reflect a
composite PK, and update any queries and unique constraints referenced around
the other occurences (lines referenced near 124-144) to filter by site_id as
well.
- Around line 388-455: UpdateMultiple currently unions all non-nil fields across
inputs (columnsSet -> columns) and calls Bulk(), which will overwrite missing
fields with zero values; fix by ensuring all inputs share the same field-mask or
by grouping inputs by identical field-mask and issuing one Bulk() per group. In
UpdateMultiple, compute each input's field mask (set of non-nil fields from
ExpectedRackUpdateInput) and either (a) validate that every mask equals the
first mask and return an error if they differ, or (b) partition inputs by mask
and for each partition build its own expectedRacks slice, columns list and call
NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(ctx) so only
the intended columns are updated for each subgroup (use existing symbols:
ExpectedRackUpdateInput, columnsSet/columns, expectedRacks, ids, UpdateMultiple,
Bulk()).
In `@openapi/spec.yaml`:
- Around line 4274-4291: The PATCH operation update-expected-rack is keyed by
{rackId} but the request schema ExpectedRackUpdateRequest exposes an `id`
property, causing a contract mismatch; update the OpenAPI spec so the update
request schema uses `rackId` (or removes `id` entirely if server derives it from
the path) and ensure the path/parameter name `rackId` and any references to
ExpectedRackUpdateRequest (including the other occurrence referenced around
lines 18593-18618) are consistent with that change; also update any
examples/required lists and generated component schemas that mention `id` so
generated SDKs reflect the path key instead of an `id` body field.
- Around line 4185-4221: The OpenAPI paths
'/v2/org/{org}/carbide/expected-rack/all' and
'/v2/org/{org}/carbide/expected-rack/{rackId}' conflict because a rack with id
"all" would collide; fix by constraining the rackId path parameter (the
parameter named rackId in the '{rackId}' operation) with a regex pattern that
excludes the literal "all" (e.g. pattern: '^(?!all$).+' ) or by renaming the
bulk path to a non-identifier segment, and also update the create/POST
validation that currently allows any non-empty string so it enforces the same
rackId pattern; ensure the change is applied consistently for the same pair of
routes elsewhere (the other occurrences noted).
In `@site-workflow/pkg/activity/expectedrack_test.go`:
- Around line 136-149: The test case "test update expected rack success" (and
the similar cases around lines 222-234) must pre-seed the expected rack in the
test setup so the forge test server won’t return NotFound; before calling the
update/delete path, create the record using the same client/store used by the
handler (seed a cwssaws.ExpectedRack with RackId {Id: "test-update-rack-001"}
and RackProfileId {Id: "test-update-rack-profile-001"} via the
CarbideAtomicClient mock or test server helper), then perform the update/delete
assertions against that existing resource to validate the happy path (ensure you
reference the test’s args.request, CarbideAtomicClient, and cwssaws.ExpectedRack
identifiers).
In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1223-1313: The mock methods (AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks, DeleteAllExpectedRacks) must mirror the server test
double: avoid dereferencing in before nil-checking, validate inputs and return
codes.InvalidArgument for bad requests, persist expected racks in an in-memory
store on MockForgeClient (e.g., c.expectedRacks map[string]*wflows.ExpectedRack)
and update that store in Add/Update/Delete/ReplaceAll/DeleteAll, have Get return
codes.NotFound when an ID is missing, and have GetAll reflect the current
in-memory state (and honor ctx "wantError"/"wantCount" semantics only if
needed). Update the mentioned functions to perform nil checks first, use the
same status codes (InvalidArgument/NotFound), and read/write the shared
in-memory map so Get/GetAll return state consistent with prior mutations.
In `@site-workflow/pkg/workflow/expectedrack.go`:
- Line 50: The logger construction in expectedrack.go dereferences request
(e.g., request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
validating the payload which can panic on nil; modify the relevant handlers (the
Create path where logger is built and the similar sites around the other
occurrences at the same file) to first validate that request and its subfields
are non-nil (or return a well-defined error) before building the logger, and if
you still want to include IDs in logs when absent, use safe extraction
(nil-checks or default empty values) for request.GetRackId,
request.GetRackProfileId (and analogous fields at the other occurrences) so no
dereference can cause a panic.
---
Nitpick comments:
In `@site-agent/pkg/components/managers/expectedrack/init.go`:
- Around line 26-27: Replace the informal comment "// Validate the expectedrack
config later" with a searchable TODO comment such as "// TODO: validate
expectedrack config" so the deferred work is explicitly discoverable; update the
same section where ManagerAccess.Data.EB.Log.Info().Msg("ExpectedRack:
Initializing ExpectedRack API") is called to keep the TODO adjacent to that
initialization call and, if possible, include a short owner or ticket reference
in the TODO (e.g., TODO: validate expectedrack config `#ticket/owner`).
In `@site-workflow/pkg/workflow/expectedrack_test.go`:
- Around line 46-111: Add nil-request regression tests to the
CreateExpectedRackTestSuite to ensure the workflow entrypoints handle nil inputs
without panicking: create new tests (e.g., Test_CreateExpectedRack_NilRequest,
and analogous tests for UpdateExpectedRack, DeleteExpectedRack,
ReplaceAllExpectedRacks) that call certs.env.ExecuteWorkflow(CreateExpectedRack,
nil) and assert the workflow completes and returns an error (or the expected
validation error) instead of panicking; register any activities used by the
workflows (or mock them) so the test runs the workflow path that performs the
pre-validation checks, and mirror this pattern for UpdateExpectedRack,
DeleteExpectedRack, and ReplaceAllExpectedRacks to cover the ranges noted in the
comment.
In `@site-workflow/pkg/workflow/expectedrack.go`:
- Around line 23-25: Inside the workflow code in expectedrack.go replace uses of
the package logger (github.com/rs/zerolog/log) with Temporal's replay-aware
logger: in each workflow function (where you currently call log.Info/Debug/Error
at the noted locations) obtain logger := workflow.GetLogger(ctx) at the top of
the function and change log.<Level>(...) calls to logger.<Level>(...) (mapping
zerolog methods to the workflow logger methods as appropriate); also remove the
zerolog import if it becomes unused. Ensure you do this for all occurrences
mentioned (the instances using log at the start of the file and the other
locations) so logs are emitted via workflow.GetLogger(ctx) and are replay-safe.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0a71edc9-15b2-4c32-a1f4-c59b3236c5fd
⛔ Files ignored due to path filters (4)
workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/site-agent/workflows/v1/common_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.protoworkflow-schema/site-agent/workflows/v1/forge_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (33)
api/pkg/api/handler/expectedrack.goapi/pkg/api/model/expectedrack.goapi/pkg/api/model/expectedrack_test.goapi/pkg/api/model/util/rack_labels.goapi/pkg/api/routes.godb/pkg/db/model/expectedrack.godb/pkg/db/model/expectedrack_test.godb/pkg/migrations/20260429100000_expected_rack.godocs/index.htmlopenapi/spec.yamlsdk/standard/api_expected_rack.gosdk/standard/client.gosdk/standard/model_expected_rack.gosdk/standard/model_expected_rack_create_request.gosdk/standard/model_expected_rack_list.gosdk/standard/model_expected_rack_metadata.gosdk/standard/model_expected_rack_update_request.gosite-agent/pkg/components/managers/expectedrack/access.gosite-agent/pkg/components/managers/expectedrack/init.gosite-agent/pkg/components/managers/expectedrack/subscriber.gosite-agent/pkg/components/managers/manager.gosite-agent/pkg/components/managers/manageraccess.gosite-agent/pkg/components/managers/managerapi/expectedrack_api.gosite-agent/pkg/components/managers/managerapi/managerapi.gosite-agent/pkg/components/managers/workflow/orchestrator.gosite-agent/pkg/datatypes/managertypes/workflow/workflowtypes.gosite-workflow/pkg/activity/expectedrack.gosite-workflow/pkg/activity/expectedrack_test.gosite-workflow/pkg/activity/rack_labels.gosite-workflow/pkg/grpc/client/testing.gosite-workflow/pkg/grpc/server/forge_test_server.gosite-workflow/pkg/workflow/expectedrack.gosite-workflow/pkg/workflow/expectedrack_test.go
| summary: Update Expected Rack | ||
| operationId: update-expected-rack | ||
| description: |- | ||
| Update an existing Expected Rack identified by its operator-supplied `rackId`. | ||
|
|
||
| Org must have an Infrastructure Provider entity. User must have `FORGE_PROVIDER_ADMIN` role. | ||
|
|
||
| Infrastructure Provider must own the Expected Rack. | ||
|
|
||
| Alternatively, Tenant Admins with `TargetedInstanceCreation` capability can also update Expected Racks if they have an account with the Site's Infrastructure Provider. | ||
| parameters: | ||
| - schema: | ||
| type: string | ||
| format: uuid | ||
| name: siteId | ||
| in: query | ||
| description: ID of the Site the Expected Rack belongs to. Required when multiple Sites have an Expected Rack with the same `rackId`. | ||
| requestBody: |
There was a problem hiding this comment.
PATCH schema should not use id when the API contract uses rackId
The endpoint is keyed by {rackId}, but ExpectedRackUpdateRequest exposes id. This inconsistency leaks into generated SDKs and creates an avoidable contract mismatch.
Suggested schema alignment
ExpectedRackUpdateRequest:
description: |-
Request data to update an existing Expected Rack.
- For single updates (PATCH /expected-rack/{rackId}), the `id` field is optional in the body and will be ignored if provided (the `rackId` from the URL path is used).
+ For single updates (PATCH /expected-rack/{rackId}), the `rackId` in the URL path is authoritative.
properties:
- id:
- type:
- - string
- - 'null'
- description: |-
- Operator-supplied identifier of the Expected Rack to update (string, not UUID).
- Optional for single Expected Rack update (must be empty or match the `rackId` from the URL path).As per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness."
Also applies to: 18593-18618
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openapi/spec.yaml` around lines 4274 - 4291, The PATCH operation
update-expected-rack is keyed by {rackId} but the request schema
ExpectedRackUpdateRequest exposes an `id` property, causing a contract mismatch;
update the OpenAPI spec so the update request schema uses `rackId` (or removes
`id` entirely if server derives it from the path) and ensure the path/parameter
name `rackId` and any references to ExpectedRackUpdateRequest (including the
other occurrence referenced around lines 18593-18618) are consistent with that
change; also update any examples/required lists and generated component schemas
that mention `id` so generated SDKs reflect the path key instead of an `id` body
field.
| { | ||
| name: "test update expected rack success", | ||
| fields: fields{ | ||
| CarbideAtomicClient: carbideAtomicClient, | ||
| }, | ||
| args: args{ | ||
| ctx: context.Background(), | ||
| request: &cwssaws.ExpectedRack{ | ||
| RackId: &cwssaws.RackId{Id: "test-update-rack-001"}, | ||
| RackProfileId: &cwssaws.RackProfileId{Id: "test-update-rack-profile-001"}, | ||
| }, | ||
| }, | ||
| wantErr: false, | ||
| }, |
There was a problem hiding this comment.
Seed the rack before asserting the update/delete happy paths.
These “success” cases do not currently prove the happy path. The backing forge test server returns NotFound for unknown racks on update/delete, so these assertions only validate success if the mock client bypasses the new server semantics. Pre-create the rack in test setup, then run update/delete against that existing record.
Also applies to: 222-234
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@site-workflow/pkg/activity/expectedrack_test.go` around lines 136 - 149, The
test case "test update expected rack success" (and the similar cases around
lines 222-234) must pre-seed the expected rack in the test setup so the forge
test server won’t return NotFound; before calling the update/delete path, create
the record using the same client/store used by the handler (seed a
cwssaws.ExpectedRack with RackId {Id: "test-update-rack-001"} and RackProfileId
{Id: "test-update-rack-profile-001"} via the CarbideAtomicClient mock or test
server helper), then perform the update/delete assertions against that existing
resource to validate the happy path (ensure you reference the test’s
args.request, CarbideAtomicClient, and cwssaws.ExpectedRack identifiers).
| // CreateExpectedRackOnSite activity, then also creates the rack in RLA via | ||
| // CreateExpectedRackOnRLA (best-effort). | ||
| func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error { | ||
| logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger() |
There was a problem hiding this comment.
Guard request before logger field extraction to avoid workflow panics.
These lines dereference request before any validation. A nil payload will panic in workflow code and bypass the activity-layer invalid-request handling.
Proposed fix
func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
- logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+ rackID, rackProfileID := "", ""
+ if request != nil {
+ rackID = request.GetRackId().GetId()
+ rackProfileID = request.GetRackProfileId().GetId()
+ }
+ logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
func UpdateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error {
- logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger()
+ rackID, rackProfileID := "", ""
+ if request != nil {
+ rackID = request.GetRackId().GetId()
+ rackProfileID = request.GetRackProfileId().GetId()
+ }
+ logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger()
@@
func DeleteExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRackRequest) error {
- logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", request.GetRackId()).Logger()
+ rackID := ""
+ if request != nil {
+ rackID = request.GetRackId()
+ }
+ logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", rackID).Logger()
@@
func ReplaceAllExpectedRacks(ctx workflow.Context, request *cwssaws.ExpectedRackList) error {
- logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", len(request.GetExpectedRacks())).Logger()
+ count := 0
+ if request != nil {
+ count = len(request.GetExpectedRacks())
+ }
+ logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", count).Logger()Also applies to: 80-80, 103-103, 125-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@site-workflow/pkg/workflow/expectedrack.go` at line 50, The logger
construction in expectedrack.go dereferences request (e.g.,
request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
validating the payload which can panic on nil; modify the relevant handlers (the
Create path where logger is built and the similar sites around the other
occurrences at the same file) to first validate that request and its subfields
are non-nil (or return a well-defined error) before building the logger, and if
you still want to include IDs in logs when absent, use safe extraction
(nil-checks or default empty values) for request.GetRackId,
request.GetRackProfileId (and analogous fields at the other occurrences) so no
dereference can cause a panic.
68157bf to
eda6086
Compare
eda6086 to
230d6f9
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
api/pkg/api/handler/expectedrack_test.go (1)
1411-1419: 💤 Low valueMinor: Fragile rack ID generation using rune arithmetic.
The expression
string(rune('a'+i))works for small loop counts but becomes non-portable if the loop extends beyond 25 iterations. Consider usingfmt.Sprintffor clarity.♻️ Suggested improvement
for i := 0; i < 3; i++ { _, err := erDAO.Create(ctx, nil, cdbm.ExpectedRackCreateInput{ - RackID: "delete-all-rack-" + string(rune('a'+i)), + RackID: fmt.Sprintf("delete-all-rack-%d", i), SiteID: site.ID, RackProfileID: "profile-delete-all", CreatedBy: dbUser.ID, })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack_test.go` around lines 1411 - 1419, The rack ID generation in the test uses string(rune('a'+i)) which is fragile; update the loop that calls erDAO.Create with ExpectedRackCreateInput to build RackID using fmt.Sprintf (e.g., fmt.Sprintf("delete-all-rack-%d" or fmt.Sprintf("delete-all-rack-%c", 'a'+i)) instead of rune arithmetic) so the IDs are clearer and safe for larger loop counts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 165-181: The duplicate RackID check is currently global; scope it
to the target site by adding the resolved site ID to the GetAll filter: when
calling erDAOForCheck.GetAll (used on cdbm.NewExpectedRackDAO and passing
cdbm.ExpectedRackFilterInput with RackIDs: []string{apiRequest.RackID}), include
the site identifier (e.g., SiteIDs: []string{site.ID} or the appropriate SiteID
field on ExpectedRackFilterInput) so the preflight lookup only considers racks
within the same site before returning the 409 conflict.
In `@site-workflow/pkg/activity/expectedrack.go`:
- Around line 69-71: The code is dereferencing
mer.CarbideAtomicClient.GetClient() and calling Carbide() without checking for
nil; change each occurrence (e.g., where you call
mer.CarbideAtomicClient.GetClient() then .Carbide()) to first call client :=
mer.CarbideAtomicClient.GetClient(), check if client == nil and if so return a
connection error (not panic) from the enclosing function, and only then call
forgeClient := client.Carbide(); apply the same nil-guard and error return
pattern to all occurrences (the instances around the existing Carbide() calls
mentioned in the review).
---
Nitpick comments:
In `@api/pkg/api/handler/expectedrack_test.go`:
- Around line 1411-1419: The rack ID generation in the test uses
string(rune('a'+i)) which is fragile; update the loop that calls erDAO.Create
with ExpectedRackCreateInput to build RackID using fmt.Sprintf (e.g.,
fmt.Sprintf("delete-all-rack-%d" or fmt.Sprintf("delete-all-rack-%c", 'a'+i))
instead of rune arithmetic) so the IDs are clearer and safe for larger loop
counts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a77584db-6649-48db-88fd-23afc5d7e261
⛔ Files ignored due to path filters (4)
workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/site-agent/workflows/v1/common_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.protoworkflow-schema/site-agent/workflows/v1/forge_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
api/pkg/api/handler/expectedrack.goapi/pkg/api/handler/expectedrack_test.goapi/pkg/api/model/expectedrack.goapi/pkg/api/model/expectedrack_test.goapi/pkg/api/model/util/rack_labels.goapi/pkg/api/routes.godb/pkg/db/model/expectedrack.godb/pkg/db/model/expectedrack_test.godb/pkg/migrations/20260429100000_expected_rack.godocs/index.htmlopenapi/spec.yamlsdk/standard/api_expected_rack.gosdk/standard/client.gosdk/standard/model_expected_rack.gosdk/standard/model_expected_rack_create_request.gosdk/standard/model_expected_rack_list.gosdk/standard/model_expected_rack_metadata.gosdk/standard/model_expected_rack_update_request.gosite-agent/pkg/components/managers/expectedrack/access.gosite-agent/pkg/components/managers/expectedrack/init.gosite-agent/pkg/components/managers/expectedrack/subscriber.gosite-agent/pkg/components/managers/manager.gosite-agent/pkg/components/managers/manageraccess.gosite-agent/pkg/components/managers/managerapi/expectedrack_api.gosite-agent/pkg/components/managers/managerapi/managerapi.gosite-agent/pkg/components/managers/workflow/orchestrator.gosite-agent/pkg/datatypes/managertypes/workflow/workflowtypes.gosite-workflow/pkg/activity/expectedrack.gosite-workflow/pkg/activity/expectedrack_test.gosite-workflow/pkg/activity/rack_labels.gosite-workflow/pkg/grpc/client/testing.gosite-workflow/pkg/grpc/server/forge_test_server.gosite-workflow/pkg/workflow/expectedrack.gosite-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (5)
- api/pkg/api/model/util/rack_labels.go
- site-agent/pkg/components/managers/managerapi/expectedrack_api.go
- site-workflow/pkg/activity/rack_labels.go
- db/pkg/db/model/expectedrack_test.go
- site-workflow/pkg/workflow/expectedrack_test.go
🚧 Files skipped from review as they are similar to previous changes (12)
- site-agent/pkg/components/managers/workflow/orchestrator.go
- site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
- api/pkg/api/routes.go
- sdk/standard/client.go
- db/pkg/migrations/20260429100000_expected_rack.go
- site-agent/pkg/components/managers/manageraccess.go
- site-agent/pkg/components/managers/expectedrack/subscriber.go
- site-workflow/pkg/grpc/server/forge_test_server.go
- site-agent/pkg/components/managers/manager.go
- site-workflow/pkg/grpc/client/testing.go
- sdk/standard/model_expected_rack_list.go
- openapi/spec.yaml
| // Call Site Controller gRPC endpoint | ||
| carbideClient := mer.CarbideAtomicClient.GetClient() | ||
| forgeClient := carbideClient.Carbide() |
There was a problem hiding this comment.
Return a connection error instead of panicking on an unavailable Carbide client.
Each Site activity dereferences mer.CarbideAtomicClient and GetClient() without a guard. If the Carbide client is not configured or not connected, this will panic the worker instead of failing the activity cleanly. Please mirror the defensive pattern already used for RLA and return a normal connection error before calling Carbide().
Suggested fix
- carbideClient := mer.CarbideAtomicClient.GetClient()
- forgeClient := carbideClient.Carbide()
+ if mer.CarbideAtomicClient == nil {
+ return cclient.ErrClientNotConnected
+ }
+ carbideClient := mer.CarbideAtomicClient.GetClient()
+ if carbideClient == nil {
+ return cclient.ErrClientNotConnected
+ }
+ forgeClient := carbideClient.Carbide()Also applies to: 106-107, 140-141, 181-182, 202-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@site-workflow/pkg/activity/expectedrack.go` around lines 69 - 71, The code is
dereferencing mer.CarbideAtomicClient.GetClient() and calling Carbide() without
checking for nil; change each occurrence (e.g., where you call
mer.CarbideAtomicClient.GetClient() then .Carbide()) to first call client :=
mer.CarbideAtomicClient.GetClient(), check if client == nil and if so return a
connection error (not panic) from the enclosing function, and only then call
forgeClient := client.Carbide(); apply the same nil-guard and error return
pattern to all occurrences (the instances around the existing Carbide() calls
mentioned in the review).
230d6f9 to
47d035f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (6)
site-workflow/pkg/activity/expectedrack.go (1)
69-71:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard the Carbide client before dereferencing it.
Each Site activity assumes
ManageExpectedRack.CarbideAtomicClientandManageExpectedRack.CarbideAtomicClient.GetClient()are non-nil. If the client is unconfigured or disconnected, the worker will panic instead of returning a clean activity error. The RLA path in this file already shows the safer pattern.Suggested fix
- carbideClient := mer.CarbideAtomicClient.GetClient() - forgeClient := carbideClient.Carbide() + if mer.CarbideAtomicClient == nil { + return cclient.ErrClientNotConnected + } + carbideClient := mer.CarbideAtomicClient.GetClient() + if carbideClient == nil { + return cclient.ErrClientNotConnected + } + forgeClient := carbideClient.Carbide()Repeat this at each Carbide call site in this file.
Also applies to: 105-107, 139-141, 180-182, 201-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/activity/expectedrack.go` around lines 69 - 71, The code dereferences ManageExpectedRack.CarbideAtomicClient and its GetClient() result (e.g., creating carbideClient and calling carbideClient.Carbide()) without nil checks; update each call site (including the instances around carbideClient/forgeClient at lines shown and other occurrences) to first verify ManageExpectedRack.CarbideAtomicClient != nil and that GetClient() returns a non-nil client, and if either is nil return a proper activity error (same pattern used in the RLA path) instead of panicking—apply this guard to all Carbide call sites (GetClient, Carbide()) in this file.site-workflow/pkg/grpc/client/testing.go (1)
1223-1313:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMirror the Forge test server contract in this mock.
This block still diverges from
site-workflow/pkg/grpc/server/forge_test_server.go:1165-1249: it can panic onnilinput, returnscodes.Internalinstead ofcodes.InvalidArgumentfor bad requests, never returnscodes.NotFoundfor missing racks, andGetExpectedRack/GetAllExpectedRackssynthesize data instead of reflecting prior mutations. That makes the workflow tests materially less trustworthy because the mock accepts flows the in-repo server double rejects.Back these methods with shared in-memory ExpectedRack state and reuse the same validation /
InvalidArgument/NotFoundsemantics as the server test double.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/grpc/client/testing.go` around lines 1223 - 1313, The mock methods AddExpectedRack, UpdateExpectedRack, DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks, ReplaceAllExpectedRacks and DeleteAllExpectedRacks currently diverge from the server test double: they can panic on nil inputs, return codes.Internal instead of codes.InvalidArgument, never return codes.NotFound, and synthesize data instead of reflecting prior mutations. Fix by adding shared in-memory state on MockForgeClient (e.g., an expectedRacks map[string]*wflows.ExpectedRack protected by a mutex) and update each method to: validate inputs nil-safe and return status.Error(codes.InvalidArgument, ...) for bad requests, behave like the server by returning status.Error(codes.NotFound, ...) when an ID is missing, mutate/read the shared map for AddExpectedRack/UpdateExpectedRack/DeleteExpectedRack/GetExpectedRack/GetAllExpectedRacks/ReplaceAllExpectedRacks/DeleteAllExpectedRacks so queries reflect prior mutations, and keep existing ctx-based wantError/wantCount behavior for error simulation.api/pkg/api/handler/expectedrack.go (1)
331-334:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not force dual-role callers down the tenant-only path.
This branch rejects requests whenever
tenant != nil, even when the caller also has provider identity. That turns a valid provider-scoped list into a400for mixed-role service accounts.Suggested fix
- } else if tenant != nil { + } else if tenant != nil && infrastructureProvider == nil { // Tenants must specify a Site ID return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Site ID must be specified in query when retrieving Expected Racks as a Tenant", nil) } else {Based on learnings: RBAC should not short-circuit on the first matching role; evaluate both authorization paths independently for dual-role service accounts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack.go` around lines 331 - 334, The current branch unconditionally returns a 400 when tenant != nil, which blocks callers who also have provider identity; change the check to only reject when tenant is set and the caller lacks provider rights (e.g., if tenant != nil && provider == nil) so provider-scoped requests still proceed; modify the logic around filterInput.SiteIDs and the cutil.NewAPIErrorResponse call so you only enforce "Tenants must specify a Site ID" for pure-tenant callers and evaluate tenant and provider authorization paths independently.db/pkg/db/model/expectedrack.go (2)
408-455:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject mixed field masks in
UpdateMultiple.
columnsis built as the union of all non-nil fields acrossinputs, then applied to every row. If one item only updatesRackProfileIDand another only updatesMetadataName, the missing fields on each row are written as zero values.Suggested guard
func (erd ExpectedRackSQLDAO) UpdateMultiple(ctx context.Context, tx *db.Tx, inputs []ExpectedRackUpdateInput) ([]ExpectedRack, error) { // Create a child span and set the attributes for current request ctx, expectedRackDAOSpan := erd.tracerSpan.CreateChildInCurrentContext(ctx, "ExpectedRackDAO.UpdateMultiple") if expectedRackDAOSpan != nil { defer expectedRackDAOSpan.End() erd.tracerSpan.SetAttribute(expectedRackDAOSpan, "batch_size", len(inputs)) } if len(inputs) == 0 { return []ExpectedRack{}, nil } expectedRacks := make([]*ExpectedRack, 0, len(inputs)) ids := make([]string, 0, len(inputs)) columnsSet := make(map[string]bool) + var expectedMask struct { + rackProfileID bool + metadataName bool + metadataDescription bool + metadataLabels bool + } - for _, input := range inputs { + for i, input := range inputs { + mask := struct { + rackProfileID bool + metadataName bool + metadataDescription bool + metadataLabels bool + }{ + rackProfileID: input.RackProfileID != nil, + metadataName: input.MetadataName != nil, + metadataDescription: input.MetadataDescription != nil, + metadataLabels: input.MetadataLabels != nil, + } + if i == 0 { + expectedMask = mask + } else if mask != expectedMask { + return nil, fmt.Errorf("all updates in a batch must modify the same fields") + } + er := &ExpectedRack{ RackID: input.RackID, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/pkg/db/model/expectedrack.go` around lines 408 - 455, The bulk update builds a union of non-nil fields across inputs and applies that same column list to every ExpectedRack, which will overwrite missing fields with zero values; in UpdateMultiple validate that every item in inputs has the same set of non-nil fields before constructing columns (compare per-item field masks derived from RackProfileID, MetadataName, MetadataDescription, MetadataLabels etc.), and if any input's mask differs return an error (or fall back to per-row updates) instead of proceeding with the union; reference the inputs slice, columnsSet/columns, expectedRacks, and the NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(...) call to locate where to add this guard.
61-64:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMake ExpectedRack identity site-scoped end-to-end.
RackIDis still the sole primary key, and the DAO contract still addresses single records by barerackID. That keeps rack IDs globally unique and makes single-record CRUD ambiguous, which conflicts with the public contract in this PR whererackIdis site-scoped. Please move persistence and DAO lookup/delete signatures to(site_id, rack_id)or explicitly change the API contract before merge.Also applies to: 124-144
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/pkg/db/model/expectedrack.go` around lines 61 - 64, ExpectedRack is still using RackID as the sole primary key which makes IDs global; change persistence and DAO lookups to be site-scoped by making the composite primary key (site_id, rack_id) in the ExpectedRack struct (update struct tags on SiteID and RackID to include `pk` on both), and update all DAO signatures and implementations that currently accept a single rackID (e.g., any Get/Delete/Find methods that take rackID) to take both siteID (uuid.UUID) and rackID (string) and ensure their queries filter by both site_id and rack_id; also mirror this change in any related methods referenced around lines 124-144 so all single-record CRUD is unambiguous and site-scoped.api/pkg/api/model/expectedrack.go (1)
103-130:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject empty ExpectedRack updates.
Validate()still accepts{}and{ "id": "rack-1" }. Downstream, that becomes a write that only bumpsupdatedand still triggers the update workflow. Require at least one mutable field before returningnil.Suggested guard
func (erur *APIExpectedRackUpdateRequest) Validate() error { if erur.ID != nil { if *erur.ID == "" { return validation.Errors{ "id": errors.New("ID cannot be empty"), } } } err := validation.ValidateStruct(erur, validation.Field(&erur.RackProfileID, validation.NilOrNotEmpty.Error("RackProfileID cannot be empty"), validation.When(erur.RackProfileID != nil && *erur.RackProfileID != "", validation.Match(util.NotAllWhitespaceRegexp).Error("RackProfileID consists only of whitespace"))), ) if err != nil { return err } if erur.Metadata != nil { if err := util.ValidateLabels(erur.Metadata.Labels); err != nil { return err } } + + if erur.RackProfileID == nil && erur.Metadata == nil { + return validation.Errors{ + "body": errors.New("at least one field must be provided"), + } + } return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/model/expectedrack.go` around lines 103 - 130, APIExpectedRackUpdateRequest.Validate currently allows empty update requests (e.g., {} or only id) which should be rejected; after the existing field validations (including util.ValidateLabels) add a guard that ensures at least one mutable field is being updated: check that RackProfileID is non-nil and not the empty string (and not whitespace) OR Metadata is non-nil with at least one label (or other mutable metadata content). If neither condition is true, return a validation error (e.g., validation.Errors with a message like "must specify at least one updatable field") so empty/no-op updates are rejected; update the logic in APIExpectedRackUpdateRequest.Validate to perform this presence check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@site-workflow/pkg/grpc/server/forge_test_server.go`:
- Line 81: The test server's ForgeServerImpl.er map is accessed concurrently by
multiple RPC handlers causing race conditions; add a sync.RWMutex field (e.g.,
erMu) to ForgeServerImpl and wrap all accesses to er with appropriate locking:
use erMu.Lock()/erMu.Unlock() for writes (Add/Update/Delete/ReplaceAll) and
erMu.RLock()/erMu.RUnlock() for reads (GetAll and any other readers). Update all
methods that reference ForgeServerImpl.er (including the handlers mentioned
around Add/Update/Delete/GetAll/ReplaceAll) to use these locks to prevent
concurrent map read/write panics.
---
Duplicate comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 331-334: The current branch unconditionally returns a 400 when
tenant != nil, which blocks callers who also have provider identity; change the
check to only reject when tenant is set and the caller lacks provider rights
(e.g., if tenant != nil && provider == nil) so provider-scoped requests still
proceed; modify the logic around filterInput.SiteIDs and the
cutil.NewAPIErrorResponse call so you only enforce "Tenants must specify a Site
ID" for pure-tenant callers and evaluate tenant and provider authorization paths
independently.
In `@api/pkg/api/model/expectedrack.go`:
- Around line 103-130: APIExpectedRackUpdateRequest.Validate currently allows
empty update requests (e.g., {} or only id) which should be rejected; after the
existing field validations (including util.ValidateLabels) add a guard that
ensures at least one mutable field is being updated: check that RackProfileID is
non-nil and not the empty string (and not whitespace) OR Metadata is non-nil
with at least one label (or other mutable metadata content). If neither
condition is true, return a validation error (e.g., validation.Errors with a
message like "must specify at least one updatable field") so empty/no-op updates
are rejected; update the logic in APIExpectedRackUpdateRequest.Validate to
perform this presence check.
In `@db/pkg/db/model/expectedrack.go`:
- Around line 408-455: The bulk update builds a union of non-nil fields across
inputs and applies that same column list to every ExpectedRack, which will
overwrite missing fields with zero values; in UpdateMultiple validate that every
item in inputs has the same set of non-nil fields before constructing columns
(compare per-item field masks derived from RackProfileID, MetadataName,
MetadataDescription, MetadataLabels etc.), and if any input's mask differs
return an error (or fall back to per-row updates) instead of proceeding with the
union; reference the inputs slice, columnsSet/columns, expectedRacks, and the
NewUpdate().Model(&expectedRacks).Column(columns...).Bulk().Exec(...) call to
locate where to add this guard.
- Around line 61-64: ExpectedRack is still using RackID as the sole primary key
which makes IDs global; change persistence and DAO lookups to be site-scoped by
making the composite primary key (site_id, rack_id) in the ExpectedRack struct
(update struct tags on SiteID and RackID to include `pk` on both), and update
all DAO signatures and implementations that currently accept a single rackID
(e.g., any Get/Delete/Find methods that take rackID) to take both siteID
(uuid.UUID) and rackID (string) and ensure their queries filter by both site_id
and rack_id; also mirror this change in any related methods referenced around
lines 124-144 so all single-record CRUD is unambiguous and site-scoped.
In `@site-workflow/pkg/activity/expectedrack.go`:
- Around line 69-71: The code dereferences
ManageExpectedRack.CarbideAtomicClient and its GetClient() result (e.g.,
creating carbideClient and calling carbideClient.Carbide()) without nil checks;
update each call site (including the instances around carbideClient/forgeClient
at lines shown and other occurrences) to first verify
ManageExpectedRack.CarbideAtomicClient != nil and that GetClient() returns a
non-nil client, and if either is nil return a proper activity error (same
pattern used in the RLA path) instead of panicking—apply this guard to all
Carbide call sites (GetClient, Carbide()) in this file.
In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1223-1313: The mock methods AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks and DeleteAllExpectedRacks currently diverge from the
server test double: they can panic on nil inputs, return codes.Internal instead
of codes.InvalidArgument, never return codes.NotFound, and synthesize data
instead of reflecting prior mutations. Fix by adding shared in-memory state on
MockForgeClient (e.g., an expectedRacks map[string]*wflows.ExpectedRack
protected by a mutex) and update each method to: validate inputs nil-safe and
return status.Error(codes.InvalidArgument, ...) for bad requests, behave like
the server by returning status.Error(codes.NotFound, ...) when an ID is missing,
mutate/read the shared map for
AddExpectedRack/UpdateExpectedRack/DeleteExpectedRack/GetExpectedRack/GetAllExpectedRacks/ReplaceAllExpectedRacks/DeleteAllExpectedRacks
so queries reflect prior mutations, and keep existing ctx-based
wantError/wantCount behavior for error simulation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 618be74c-159e-4f4b-8344-6818bfab56ba
⛔ Files ignored due to path filters (4)
workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/site-agent/workflows/v1/common_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.protoworkflow-schema/site-agent/workflows/v1/forge_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
api/pkg/api/handler/expectedrack.goapi/pkg/api/handler/expectedrack_test.goapi/pkg/api/model/expectedrack.goapi/pkg/api/model/expectedrack_test.goapi/pkg/api/model/util/rack_labels.goapi/pkg/api/routes.godb/pkg/db/model/expectedrack.godb/pkg/db/model/expectedrack_test.godb/pkg/migrations/20260429100000_expected_rack.godocs/index.htmlopenapi/spec.yamlsdk/standard/api_expected_rack.gosdk/standard/client.gosdk/standard/model_expected_rack.gosdk/standard/model_expected_rack_create_request.gosdk/standard/model_expected_rack_list.gosdk/standard/model_expected_rack_metadata.gosdk/standard/model_expected_rack_update_request.gosite-agent/pkg/components/managers/expectedrack/access.gosite-agent/pkg/components/managers/expectedrack/init.gosite-agent/pkg/components/managers/expectedrack/subscriber.gosite-agent/pkg/components/managers/manager.gosite-agent/pkg/components/managers/manageraccess.gosite-agent/pkg/components/managers/managerapi/expectedrack_api.gosite-agent/pkg/components/managers/managerapi/managerapi.gosite-agent/pkg/components/managers/workflow/orchestrator.gosite-agent/pkg/datatypes/managertypes/workflow/workflowtypes.gosite-workflow/pkg/activity/expectedrack.gosite-workflow/pkg/activity/expectedrack_test.gosite-workflow/pkg/activity/rack_labels.gosite-workflow/pkg/grpc/client/testing.gosite-workflow/pkg/grpc/server/forge_test_server.gosite-workflow/pkg/workflow/expectedrack.gosite-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (13)
- site-agent/pkg/components/managers/workflow/orchestrator.go
- site-agent/pkg/components/managers/manageraccess.go
- api/pkg/api/model/util/rack_labels.go
- site-agent/pkg/components/managers/manager.go
- site-workflow/pkg/activity/rack_labels.go
- site-agent/pkg/components/managers/managerapi/expectedrack_api.go
- site-agent/pkg/components/managers/expectedrack/access.go
- sdk/standard/model_expected_rack_metadata.go
- sdk/standard/model_expected_rack.go
- sdk/standard/model_expected_rack_list.go
- sdk/standard/model_expected_rack_create_request.go
- site-workflow/pkg/workflow/expectedrack.go
- sdk/standard/model_expected_rack_update_request.go
🚧 Files skipped from review as they are similar to previous changes (11)
- site-agent/pkg/components/managers/managerapi/managerapi.go
- sdk/standard/client.go
- site-agent/pkg/datatypes/managertypes/workflow/workflowtypes.go
- site-agent/pkg/components/managers/expectedrack/init.go
- site-agent/pkg/components/managers/expectedrack/subscriber.go
- db/pkg/migrations/20260429100000_expected_rack.go
- db/pkg/db/model/expectedrack_test.go
- site-workflow/pkg/workflow/expectedrack_test.go
- site-workflow/pkg/activity/expectedrack_test.go
- api/pkg/api/handler/expectedrack_test.go
- openapi/spec.yaml
| @@ -78,6 +78,7 @@ type ForgeServerImpl struct { | |||
| em map[string]*cwssaws.ExpectedMachine | |||
| eps map[string]*cwssaws.ExpectedPowerShelf | |||
| es map[string]*cwssaws.ExpectedSwitch | |||
| er map[string]*cwssaws.ExpectedRack | |||
There was a problem hiding this comment.
Protect ForgeServerImpl.er with a mutex.
These RPCs run on a gRPC server and all access the same map directly. Parallel Add/Update/Delete/GetAll/ReplaceAll calls can trigger concurrent map writes or concurrent map read and map write, which will make the test server flaky under parallel workflow tests.
Suggested direction
import (
"context"
"fmt"
"net"
+ "sync"
"time"
@@
type ForgeServerImpl struct {
cwssaws.UnimplementedForgeServer
+ erMu sync.RWMutex
v map[string]*cwssaws.Vpc
@@
er map[string]*cwssaws.ExpectedRack
@@
func (f *ForgeServerImpl) AddExpectedRack(ctx context.Context, req *cwssaws.ExpectedRack) (*emptypb.Empty, error) {
if req == nil || req.RackId == nil || req.RackId.Id == "" {
return nil, status.Errorf(codes.InvalidArgument, "ID not provided for AddExpectedRack")
}
if req.RackProfileId == nil || req.RackProfileId.Id == "" {
return nil, status.Errorf(codes.InvalidArgument, "Rack Profile ID not provided for AddExpectedRack")
}
+ f.erMu.Lock()
+ defer f.erMu.Unlock()
f.er[req.RackId.Id] = req
return &emptypb.Empty{}, nil
}Apply the same Lock/Unlock or RLock/RUnlock pattern to the other ExpectedRack handlers as well.
Also applies to: 1165-1249
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@site-workflow/pkg/grpc/server/forge_test_server.go` at line 81, The test
server's ForgeServerImpl.er map is accessed concurrently by multiple RPC
handlers causing race conditions; add a sync.RWMutex field (e.g., erMu) to
ForgeServerImpl and wrap all accesses to er with appropriate locking: use
erMu.Lock()/erMu.Unlock() for writes (Add/Update/Delete/ReplaceAll) and
erMu.RLock()/erMu.RUnlock() for reads (GetAll and any other readers). Update all
methods that reference ForgeServerImpl.er (including the handlers mentioned
around Add/Update/Delete/GetAll/ReplaceAll) to use these locks to prevent
concurrent map read/write panics.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (7)
db/pkg/db/model/expectedrack.go (2)
61-63:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftScope
ExpectedRackby(site_id, rack_id), notrack_idalone.
RackIDis the sole primary key here, and the DAO surface exposesGet/DeletebyrackIDonly. That makes rack IDs global even though the public API contract describes them as site-scoped, so two sites cannot both contain the same rack ID and single-record operations are not site-safe.Also applies to: 135-143
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/pkg/db/model/expectedrack.go` around lines 61 - 63, The ExpectedRack model is currently using RackID as the sole primary key which makes racks global; change the model to use a composite primary key of (SiteID, RackID) by updating the struct tags on SiteID and RackID in ExpectedRack (refer to fields RackID and SiteID) to mark both as primary keys, then update any DAO methods that fetch or delete by RackID alone (e.g., Get/Delete methods referenced later around lines 135-143) to accept both siteID and rackID parameters and scope their bun queries with a WHERE on site_id AND rack_id so single-record operations are site-scoped.
404-455:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject mixed field masks in
UpdateMultiple.
columnsSetis built as the union of all non-nil fields acrossinputs, then that full column list is applied to every row in the bulk update. If one item only setsRackProfileIDand another only setsMetadataName, the unset fields on each row are written as zero values.Suggested guard
func (erd ExpectedRackSQLDAO) UpdateMultiple(ctx context.Context, tx *db.Tx, inputs []ExpectedRackUpdateInput) ([]ExpectedRack, error) { @@ expectedRacks := make([]*ExpectedRack, 0, len(inputs)) ids := make([]string, 0, len(inputs)) columnsSet := make(map[string]bool) + type fieldMask struct { + rackProfileID bool + metadataName bool + metadataDescription bool + metadataLabels bool + } + var expectedMask *fieldMask - for _, input := range inputs { + for _, input := range inputs { + mask := &fieldMask{ + rackProfileID: input.RackProfileID != nil, + metadataName: input.MetadataName != nil, + metadataDescription: input.MetadataDescription != nil, + metadataLabels: input.MetadataLabels != nil, + } + if expectedMask == nil { + expectedMask = mask + } else if *expectedMask != *mask { + return nil, fmt.Errorf("all updates in a batch must modify the same fields") + } + er := &ExpectedRack{ RackID: input.RackID, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@db/pkg/db/model/expectedrack.go` around lines 404 - 455, The bulk update builds a unioned columnsSet from inputs and applies it to every row, causing unset fields to be zeroed for rows that didn't include those fields; in UpdateMultiple, validate that all items in inputs have the same non-nil field mask before performing the bulk update (compute a per-input set of non-nil field names from inputs, compare each to the first item's set and return an error if any differ), and only then build columns from that common set and proceed to create expectedRacks and call NewUpdate().Model(&expectedRacks).Column(...).Bulk().Exec so no mixed field masks are applied; reference inputs, columnsSet, expectedRacks and the bulk update call to locate the code.site-workflow/pkg/grpc/server/forge_test_server.go (1)
81-81:⚠️ Potential issue | 🟠 Major | ⚡ Quick winProtect
ForgeServerImpl.erwith a mutex.These RPC handlers all access the same map without synchronization. Parallel
Add/Update/Delete/GetAll/ReplaceAllcalls can panic withconcurrent map read and map write, which makes the test server flaky under parallel workflow tests.Suggested direction
import ( "context" "fmt" "net" + "sync" "time" @@ type ForgeServerImpl struct { cwssaws.UnimplementedForgeServer + erMu sync.RWMutex v map[string]*cwssaws.Vpc @@ er map[string]*cwssaws.ExpectedRack @@ func (f *ForgeServerImpl) AddExpectedRack(ctx context.Context, req *cwssaws.ExpectedRack) (*emptypb.Empty, error) { @@ + f.erMu.Lock() + defer f.erMu.Unlock() f.er[req.RackId.Id] = req return &emptypb.Empty{}, nil }Apply
Lock/Unlockto writes andRLock/RUnlockto reads across the other ExpectedRack handlers as well.Also applies to: 1165-1249
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/grpc/server/forge_test_server.go` at line 81, The ForgeServerImpl.er map is not synchronized; add a mutex field (e.g., erMu sync.RWMutex) to ForgeServerImpl and protect all accesses: use erMu.Lock()/erMu.Unlock() for mutating handlers (Add, Update, Delete, ReplaceAll) and erMu.RLock()/erMu.RUnlock() for read handlers (GetAll, any getters), ensuring every reference to the er map in methods around ExpectedRack handling is guarded to prevent concurrent map read/write panics.api/pkg/api/model/expectedrack.go (1)
104-130:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject empty ExpectedRack updates.
Validate()currently accepts{}and{ "id": "rack-1" }. Downstream, the DAO still performs anupdated-only write for that payload, which can trigger workflow and dual-write side effects without changing any mutable field.Suggested guard
func (erur *APIExpectedRackUpdateRequest) Validate() error { @@ if erur.Metadata != nil { if err := util.ValidateLabels(erur.Metadata.Labels); err != nil { return err } } + + if erur.RackProfileID == nil && erur.Metadata == nil { + return validation.Errors{ + "body": errors.New("at least one field must be provided"), + } + } return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/model/expectedrack.go` around lines 104 - 130, The Validate method on APIExpectedRackUpdateRequest currently allows empty update payloads; update APIExpectedRackUpdateRequest.Validate to reject requests that contain no mutable fields to change by returning a validation error (e.g., "no fields to update"). After the existing checks for ID, RackProfileID, and Metadata, add a guard that inspects the updatable fields (at minimum RackProfileID and Metadata.Labels as referenced in Validate) and if RackProfileID is nil or empty and Metadata is nil or Metadata.Labels is empty, return a validation.Errors entry (e.g., "update": errors.New("no fields to update")). Ensure this logic is implemented in APIExpectedRackUpdateRequest.Validate so empty `{}` or `{ "id": "rack-1" }` are rejected.site-workflow/pkg/workflow/expectedrack.go (1)
50-50:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard request field access before logger enrichment to avoid workflow panics.
These lines dereference
requestfields before any nil-safe extraction. A nil payload will panic in workflow code and bypass activity-layer validation/error handling.Proposed fix
func CreateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error { - logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger() + rackID, rackProfileID := "", "" + if request != nil { + rackID = request.GetRackId().GetId() + rackProfileID = request.GetRackProfileId().GetId() + } + logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Create").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger() @@ func UpdateExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRack) error { - logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", request.GetRackId().GetId()).Str("RackProfileID", request.GetRackProfileId().GetId()).Logger() + rackID, rackProfileID := "", "" + if request != nil { + rackID = request.GetRackId().GetId() + rackProfileID = request.GetRackProfileId().GetId() + } + logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Update").Str("ID", rackID).Str("RackProfileID", rackProfileID).Logger() @@ func DeleteExpectedRack(ctx workflow.Context, request *cwssaws.ExpectedRackRequest) error { - logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", request.GetRackId()).Logger() + rackID := "" + if request != nil { + rackID = request.GetRackId() + } + logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "Delete").Str("ID", rackID).Logger() @@ func ReplaceAllExpectedRacks(ctx workflow.Context, request *cwssaws.ExpectedRackList) error { - logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", len(request.GetExpectedRacks())).Logger() + count := 0 + if request != nil { + count = len(request.GetExpectedRacks()) + } + logger := log.With().Str("Workflow", "ExpectedRack").Str("Action", "ReplaceAll").Int("Count", count).Logger()Also applies to: 80-80, 103-103, 125-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/workflow/expectedrack.go` at line 50, The logger enrichment code is dereferencing request fields (request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before checking request for nil, which can panic in the ExpectedRack workflow; update the Create path (and the similar occurrences at the other spots) to safely extract IDs with nil checks or helper getters before building logger: first validate request != nil and obtain rackID and rackProfileID into local strings using guarded logic (e.g., if request != nil && request.GetRackId() != nil { rackID = request.GetRackId().GetId() } ), then call log.With() to create the logger using those local values so no nil deref occurs and activity-layer validation still runs.openapi/spec.yaml (2)
4185-4234:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReserve
rackId = allto avoid path collisionLine 4185 introduces a literal
/allroute, but Line 18490 still allows any non-emptyrackId. That makesrackId: "all"valid and conflicts with/expected-rack/{rackId}semantics.Suggested OpenAPI hardening
ExpectedRackCreateRequest: properties: rackId: type: string minLength: 1 + pattern: '^(?!all$).+' - description: Operator-supplied identifier for the rack (string, not UUID). Must be non-empty and unique within the Site. + description: Operator-supplied identifier for the rack (string, not UUID). Must be non-empty, unique within the Site, and must not be `all` (reserved). '/v2/org/{org}/carbide/expected-rack/{rackId}': parameters: - schema: type: string + pattern: '^(?!all$).+' name: rackIdAs per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness, check for misspellings and grammatical errors."
Also applies to: 18490-18494
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openapi/spec.yaml` around lines 4185 - 4234, The path '/v2/org/{org}/carbide/expected-rack/all' collides with the parameterized path '/v2/org/{org}/carbide/expected-rack/{rackId}' because rackId currently allows the literal "all"; update the rackId path parameter definition (used at '/v2/org/{org}/carbide/expected-rack/{rackId}' and the duplicate at lines ~18490-18494) to explicitly forbid "all" by adding a validation pattern such as ^(?!all$).+ (or an equivalent negative lookahead) so that "all" is not a valid rackId and the two routes no longer collide. Ensure you update both occurrences of the rackId parameter.
4274-4296:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAlign PATCH request schema with path key (
rackId)Line 4274 defines PATCH by
{rackId}, but Line 18610 still exposesidinExpectedRackUpdateRequest. This creates an avoidable contract mismatch for generated clients.Suggested schema alignment
ExpectedRackUpdateRequest: description: |- Request data to update an existing Expected Rack. - For single updates (PATCH /expected-rack/{rackId}), the `id` field is optional in the body and will be ignored if provided (the `rackId` from the URL path is used). + For single updates (PATCH /expected-rack/{rackId}), the `rackId` in the URL path is authoritative. properties: - id: - type: - - string - - 'null' - description: |- - Operator-supplied identifier of the Expected Rack to update (string, not UUID). - Optional for single Expected Rack update (must be empty or match the `rackId` from the URL path). rackProfileId: type: - string - 'null'As per coding guidelines, "Review the OpenAPI specification, check for consistency and correctness, check for misspellings and grammatical errors."
Also applies to: 18593-18618
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openapi/spec.yaml` around lines 4274 - 4296, The PATCH operation update-expected-rack uses a path key rackId but the request schema ExpectedRackUpdateRequest still exposes an id field; update the schema so the request body does not require or advertise an id mismatch — either remove the id property entirely from ExpectedRackUpdateRequest (preferred) or rename it to rackId to match the path parameter, and update any examples and validations accordingly; alternatively mark id as readOnly so clients don't send it. Ensure consistency between operationId update-expected-rack, the path parameter rackId, and the schema/examples to avoid contract mismatch.
🧹 Nitpick comments (1)
api/pkg/api/handler/expectedrack_test.go (1)
160-167: ⚡ Quick winAssert Temporal mock expectations to verify workflow dispatch paths.
These mocks set
ExecuteWorkflow/Getexpectations but never assert them, so success-path tests can pass even if workflow invocation regresses. Add per-subtestAssertExpectations(t)(or explicitAssertCalled) for cases that should trigger workflows.Also applies to: 843-850, 1039-1046, 1169-1175, 1422-1428
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack_test.go` around lines 160 - 167, The test sets up temporal mocks (mockTemporalClient, mockWorkflowRun) and configures ExecuteWorkflow and Get but never asserts expectations; update the relevant subtests that are expected to dispatch workflows to call mockTemporalClient.AssertExpectations(t) and mockWorkflowRun.AssertExpectations(t) (or use mockTemporalClient.AssertCalled/AssertCalled for ExecuteWorkflow and mockWorkflowRun.AssertCalled for Get) at the end of each such subtest so the test fails if ExecuteWorkflow/Get are not invoked as expected; apply the same pattern for the other occurrences referenced (lines around 843-850, 1039-1046, 1169-1175, 1422-1428) to ensure workflow dispatch paths are verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sdk/standard/api_expected_rack.go`:
- Around line 134-154: The 4xx error branches decode into CarbideAPIError but
call formatErrorMessage (which expects Title/Detail), causing the server's
Carbide `message` to be lost; update the generator/template so after decoding
into CarbideAPIError (symbol: CarbideAPIError) you prefer the error's message
field when building the returned error (symbol: newErr.error) — e.g., set
newErr.error = v.Message (or v.message) when present, falling back to
formatErrorMessage(localVarHTTPResponse.Status, &v) only if the message is
empty; ensure this change is applied in the shared formatter or generation
template used to emit the branches that call formatErrorMessage so all endpoints
under sdk/standard/ inherit the fix.
---
Duplicate comments:
In `@api/pkg/api/model/expectedrack.go`:
- Around line 104-130: The Validate method on APIExpectedRackUpdateRequest
currently allows empty update payloads; update
APIExpectedRackUpdateRequest.Validate to reject requests that contain no mutable
fields to change by returning a validation error (e.g., "no fields to update").
After the existing checks for ID, RackProfileID, and Metadata, add a guard that
inspects the updatable fields (at minimum RackProfileID and Metadata.Labels as
referenced in Validate) and if RackProfileID is nil or empty and Metadata is nil
or Metadata.Labels is empty, return a validation.Errors entry (e.g., "update":
errors.New("no fields to update")). Ensure this logic is implemented in
APIExpectedRackUpdateRequest.Validate so empty `{}` or `{ "id": "rack-1" }` are
rejected.
In `@db/pkg/db/model/expectedrack.go`:
- Around line 61-63: The ExpectedRack model is currently using RackID as the
sole primary key which makes racks global; change the model to use a composite
primary key of (SiteID, RackID) by updating the struct tags on SiteID and RackID
in ExpectedRack (refer to fields RackID and SiteID) to mark both as primary
keys, then update any DAO methods that fetch or delete by RackID alone (e.g.,
Get/Delete methods referenced later around lines 135-143) to accept both siteID
and rackID parameters and scope their bun queries with a WHERE on site_id AND
rack_id so single-record operations are site-scoped.
- Around line 404-455: The bulk update builds a unioned columnsSet from inputs
and applies it to every row, causing unset fields to be zeroed for rows that
didn't include those fields; in UpdateMultiple, validate that all items in
inputs have the same non-nil field mask before performing the bulk update
(compute a per-input set of non-nil field names from inputs, compare each to the
first item's set and return an error if any differ), and only then build columns
from that common set and proceed to create expectedRacks and call
NewUpdate().Model(&expectedRacks).Column(...).Bulk().Exec so no mixed field
masks are applied; reference inputs, columnsSet, expectedRacks and the bulk
update call to locate the code.
In `@openapi/spec.yaml`:
- Around line 4185-4234: The path '/v2/org/{org}/carbide/expected-rack/all'
collides with the parameterized path
'/v2/org/{org}/carbide/expected-rack/{rackId}' because rackId currently allows
the literal "all"; update the rackId path parameter definition (used at
'/v2/org/{org}/carbide/expected-rack/{rackId}' and the duplicate at lines
~18490-18494) to explicitly forbid "all" by adding a validation pattern such as
^(?!all$).+ (or an equivalent negative lookahead) so that "all" is not a valid
rackId and the two routes no longer collide. Ensure you update both occurrences
of the rackId parameter.
- Around line 4274-4296: The PATCH operation update-expected-rack uses a path
key rackId but the request schema ExpectedRackUpdateRequest still exposes an id
field; update the schema so the request body does not require or advertise an id
mismatch — either remove the id property entirely from ExpectedRackUpdateRequest
(preferred) or rename it to rackId to match the path parameter, and update any
examples and validations accordingly; alternatively mark id as readOnly so
clients don't send it. Ensure consistency between operationId
update-expected-rack, the path parameter rackId, and the schema/examples to
avoid contract mismatch.
In `@site-workflow/pkg/grpc/server/forge_test_server.go`:
- Line 81: The ForgeServerImpl.er map is not synchronized; add a mutex field
(e.g., erMu sync.RWMutex) to ForgeServerImpl and protect all accesses: use
erMu.Lock()/erMu.Unlock() for mutating handlers (Add, Update, Delete,
ReplaceAll) and erMu.RLock()/erMu.RUnlock() for read handlers (GetAll, any
getters), ensuring every reference to the er map in methods around ExpectedRack
handling is guarded to prevent concurrent map read/write panics.
In `@site-workflow/pkg/workflow/expectedrack.go`:
- Line 50: The logger enrichment code is dereferencing request fields
(request.GetRackId().GetId(), request.GetRackProfileId().GetId()) before
checking request for nil, which can panic in the ExpectedRack workflow; update
the Create path (and the similar occurrences at the other spots) to safely
extract IDs with nil checks or helper getters before building logger: first
validate request != nil and obtain rackID and rackProfileID into local strings
using guarded logic (e.g., if request != nil && request.GetRackId() != nil {
rackID = request.GetRackId().GetId() } ), then call log.With() to create the
logger using those local values so no nil deref occurs and activity-layer
validation still runs.
---
Nitpick comments:
In `@api/pkg/api/handler/expectedrack_test.go`:
- Around line 160-167: The test sets up temporal mocks (mockTemporalClient,
mockWorkflowRun) and configures ExecuteWorkflow and Get but never asserts
expectations; update the relevant subtests that are expected to dispatch
workflows to call mockTemporalClient.AssertExpectations(t) and
mockWorkflowRun.AssertExpectations(t) (or use
mockTemporalClient.AssertCalled/AssertCalled for ExecuteWorkflow and
mockWorkflowRun.AssertCalled for Get) at the end of each such subtest so the
test fails if ExecuteWorkflow/Get are not invoked as expected; apply the same
pattern for the other occurrences referenced (lines around 843-850, 1039-1046,
1169-1175, 1422-1428) to ensure workflow dispatch paths are verified.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 797fa65c-5050-41ff-8691-e9a9d39a0b52
⛔ Files ignored due to path filters (4)
workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/site-agent/workflows/v1/common_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.protoworkflow-schema/site-agent/workflows/v1/forge_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (34)
api/pkg/api/handler/expectedrack.goapi/pkg/api/handler/expectedrack_test.goapi/pkg/api/model/expectedrack.goapi/pkg/api/model/expectedrack_test.goapi/pkg/api/model/util/rack_labels.goapi/pkg/api/routes.godb/pkg/db/model/expectedrack.godb/pkg/db/model/expectedrack_test.godb/pkg/migrations/20260429100000_expected_rack.godocs/index.htmlopenapi/spec.yamlsdk/standard/api_expected_rack.gosdk/standard/client.gosdk/standard/model_expected_rack.gosdk/standard/model_expected_rack_create_request.gosdk/standard/model_expected_rack_list.gosdk/standard/model_expected_rack_metadata.gosdk/standard/model_expected_rack_update_request.gosite-agent/pkg/components/managers/expectedrack/access.gosite-agent/pkg/components/managers/expectedrack/init.gosite-agent/pkg/components/managers/expectedrack/subscriber.gosite-agent/pkg/components/managers/manager.gosite-agent/pkg/components/managers/manageraccess.gosite-agent/pkg/components/managers/managerapi/expectedrack_api.gosite-agent/pkg/components/managers/managerapi/managerapi.gosite-agent/pkg/components/managers/workflow/orchestrator.gosite-agent/pkg/datatypes/managertypes/workflow/workflowtypes.gosite-workflow/pkg/activity/expectedrack.gosite-workflow/pkg/activity/expectedrack_test.gosite-workflow/pkg/activity/rack_labels.gosite-workflow/pkg/grpc/client/testing.gosite-workflow/pkg/grpc/server/forge_test_server.gosite-workflow/pkg/workflow/expectedrack.gosite-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (8)
- site-agent/pkg/components/managers/manageraccess.go
- site-agent/pkg/components/managers/managerapi/expectedrack_api.go
- api/pkg/api/model/util/rack_labels.go
- site-workflow/pkg/activity/rack_labels.go
- sdk/standard/model_expected_rack_metadata.go
- site-workflow/pkg/workflow/expectedrack_test.go
- sdk/standard/model_expected_rack_create_request.go
- api/pkg/api/handler/expectedrack.go
🚧 Files skipped from review as they are similar to previous changes (6)
- site-agent/pkg/components/managers/workflow/orchestrator.go
- sdk/standard/client.go
- db/pkg/migrations/20260429100000_expected_rack.go
- site-agent/pkg/components/managers/expectedrack/init.go
- site-workflow/pkg/activity/expectedrack_test.go
- api/pkg/api/model/expectedrack_test.go
| if localVarHTTPResponse.StatusCode == 400 { | ||
| var v CarbideAPIError | ||
| err = a.client.decode(&v, localVarBody, localVarHTTPResponse.Header.Get("Content-Type")) | ||
| if err != nil { | ||
| newErr.error = err.Error() | ||
| return localVarReturnValue, localVarHTTPResponse, newErr | ||
| } | ||
| newErr.error = formatErrorMessage(localVarHTTPResponse.Status, &v) | ||
| newErr.model = v | ||
| return localVarReturnValue, localVarHTTPResponse, newErr | ||
| } | ||
| if localVarHTTPResponse.StatusCode == 403 { | ||
| var v CarbideAPIError | ||
| err = a.client.decode(&v, localVarBody, localVarHTTPResponse.Header.Get("Content-Type")) | ||
| if err != nil { | ||
| newErr.error = err.Error() | ||
| return localVarReturnValue, localVarHTTPResponse, newErr | ||
| } | ||
| newErr.error = formatErrorMessage(localVarHTTPResponse.Status, &v) | ||
| newErr.model = v | ||
| } |
There was a problem hiding this comment.
Use the Carbide message field when building 4xx errors.
All of these branches decode into CarbideAPIError and then call formatErrorMessage, but that formatter expects RFC7807-style Title/Detail fields. The result is that every new Expected Rack SDK method collapses 400/403/404 responses into a generic status string instead of surfacing the server’s message, which makes client-side diagnosis much harder. Please fix this in the shared formatter or generator template so regenerated endpoints inherit the correct behavior. Based on learnings: files under sdk/standard/ are auto-generated output from the SDK generation tooling.
Also applies to: 273-293, 406-437, 591-611, 744-775, 898-918, 1055-1085
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/standard/api_expected_rack.go` around lines 134 - 154, The 4xx error
branches decode into CarbideAPIError but call formatErrorMessage (which expects
Title/Detail), causing the server's Carbide `message` to be lost; update the
generator/template so after decoding into CarbideAPIError (symbol:
CarbideAPIError) you prefer the error's message field when building the returned
error (symbol: newErr.error) — e.g., set newErr.error = v.Message (or v.message)
when present, falling back to formatErrorMessage(localVarHTTPResponse.Status,
&v) only if the message is empty; ensure this change is applied in the shared
formatter or generation template used to emit the branches that call
formatErrorMessage so all endpoints under sdk/standard/ inherit the fix.
47d035f to
1f2fb8d
Compare
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
api/pkg/api/handler/expectedrack.go (3)
167-171:⚠️ Potential issue | 🟠 Major | ⚡ Quick winScope the duplicate
rackIdcheck to the target site.
rackIdis site-scoped, but this preflight lookup filters only byRackIDs. A create in Site B will incorrectly return409when the same rack ID already exists only in Site A.Suggested fix
existingRacks, count, err := erDAOForCheck.GetAll(ctx, nil, cdbm.ExpectedRackFilterInput{ - RackIDs: []string{apiRequest.RackID}, + RackIDs: []string{apiRequest.RackID}, + SiteIDs: []uuid.UUID{site.ID}, }, paginator.PageInput{ Limit: cdb.GetIntPtr(1), }, nil)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack.go` around lines 167 - 171, The duplicate rackId check is missing site scoping: when calling erDAOForCheck.GetAll to preflight the create, include the target site filter (e.g., apiRequest.SiteID) in the ExpectedRackFilterInput so the lookup becomes site-scoped rather than global; update the GetAll call that currently filters only by RackIDs (and references apiRequest.RackID) to also set SiteID (or SiteIDs) from apiRequest.SiteID so only racks within the same site are considered for the 409 conflict check.
332-334:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not force dual-role callers down the tenant-only branch.
This branch still rejects callers when
tenant != nileven if the caller also has provider identity. The “siteId required” check should apply only to tenant-only requests.Suggested fix
- } else if tenant != nil { + } else if tenant != nil && infrastructureProvider == nil { // Tenants must specify a Site ID return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Site ID must be specified in query when retrieving Expected Racks as a Tenant", nil) } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack.go` around lines 332 - 334, The current branch rejects callers when tenant != nil even if they also have provider identity; update the conditional that enforces "Site ID must be specified" so it only runs for tenant-only callers (e.g., change the branch from "else if tenant != nil" to "else if tenant != nil && provider == nil" or the equivalent check used in this file), leaving the existing API error response unchanged so multi-role (tenant+provider) callers are not forced into the tenant-only path.
483-485:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy liftResolve and mutate ExpectedRacks with site scope, not bare
rackId.These handlers still look up and delete by
rackIdalone. If two sites both contain the same rack ID, GET/PATCH can read the wrong row and DELETE can remove the wrong site's record. Read the optionalsiteIdquery parameter and make the DAO resolution/deletion site-scoped, or reject ambiguousrackIdvalues whensiteIdis omitted.Also applies to: 603-605, 771-773, 809-810
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/pkg/api/handler/expectedrack.go` around lines 483 - 485, Handlers currently resolve ExpectedRack by bare rackID (see NewExpectedRackDAO and erDAO.Get / erDAO.Delete calls), which can cross sites; change the handlers to read the optional siteId query parameter from the request and use a site-scoped DAO lookup/delete: pass siteId into the ExpectedRack DAO resolution (or into Get/Delete calls) so the query is constrained to that site, or if siteId is omitted detect ambiguity (multiple rows for the same rackID across sites) and return an error asking the caller to supply siteId. Update all occurrences (the shown Get call and the other similar calls around lines ~603-605, ~771-773, ~809-810) to use the site-scoped resolution or reject ambiguous rackId when siteId is missing.site-workflow/pkg/grpc/client/testing.go (1)
1224-1313:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMirror the Forge test-server contract in this mock.
These methods still dereference
inbefore nil-checking, returncodes.Internalfor request validation failures, and makeGet/GetAllsynthesize data instead of reflecting prior add/update/delete/replace calls. That means the workflow tests can pass against a client mock that accepts and returns states the in-repo server test double would not.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site-workflow/pkg/grpc/client/testing.go` around lines 1224 - 1313, The mock methods (AddExpectedRack, UpdateExpectedRack, DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks, ReplaceAllExpectedRacks, DeleteAllExpectedRacks) must mirror the test-server contract: perform nil checks on the request object before dereferencing, return codes.InvalidArgument (not codes.Internal) for client request validation errors, and maintain an in-memory store so state changes persist across calls. Update MockForgeClient to hold a concurrency-safe map (e.g., map[string]*wflows.ExpectedRack plus a mutex) and change each method to (1) check in != nil before accessing fields, (2) validate IDs and return status.Error(codes.InvalidArgument, "...") on bad input, and (3) mutate or read the in-memory store for AddExpectedRack (insert), UpdateExpectedRack (update existing; error if missing), DeleteExpectedRack (remove), GetExpectedRack (lookup and return stored value), GetAllExpectedRacks (return all stored values, honoring ctx "wantError" for simulated errors), ReplaceAllExpectedRacks (replace store with provided list after validating entries), and DeleteAllExpectedRacks (clear the store).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 220-224: Workflow IDs are currently built only from
expectedRack.RackID which can collide across sites; update the
tclient.StartWorkflowOptions.ID construction to include the site identifier
(e.g., use "expected-rack-create-"+expectedRack.SiteID+"-"+expectedRack.RackID)
and apply the same pattern for the update and delete workflows (replace the ID
values in the StartWorkflowOptions for create/update/delete to include
expectedRack.SiteID or the local siteID variable so IDs are unique per site).
- Around line 183-246: The current code runs common.ExecuteSyncWorkflow
(triggering the site workflow) while the SQL transaction (tx) is still open,
creating a dual-write/inconsistency risk if tx.Commit() later fails; fix by
moving the workflow invocation (common.ExecuteSyncWorkflow with stc,
"CreateExpectedRack", workflowOptions, createExpectedRackRequest) to after the
DB commit and after setting txCommitted = true so the transaction is
closed/committed before contacting the site, keeping the DB creation flow
(erDAO.Create, buildExpectedRackProto) intact and preserving error handling
around tx.Commit() and the retrieval of the Temporal client
(cerh.scp.GetClientByID) but ensure you still return appropriate API errors from
ExecuteSyncWorkflow when invoked post-commit.
---
Duplicate comments:
In `@api/pkg/api/handler/expectedrack.go`:
- Around line 167-171: The duplicate rackId check is missing site scoping: when
calling erDAOForCheck.GetAll to preflight the create, include the target site
filter (e.g., apiRequest.SiteID) in the ExpectedRackFilterInput so the lookup
becomes site-scoped rather than global; update the GetAll call that currently
filters only by RackIDs (and references apiRequest.RackID) to also set SiteID
(or SiteIDs) from apiRequest.SiteID so only racks within the same site are
considered for the 409 conflict check.
- Around line 332-334: The current branch rejects callers when tenant != nil
even if they also have provider identity; update the conditional that enforces
"Site ID must be specified" so it only runs for tenant-only callers (e.g.,
change the branch from "else if tenant != nil" to "else if tenant != nil &&
provider == nil" or the equivalent check used in this file), leaving the
existing API error response unchanged so multi-role (tenant+provider) callers
are not forced into the tenant-only path.
- Around line 483-485: Handlers currently resolve ExpectedRack by bare rackID
(see NewExpectedRackDAO and erDAO.Get / erDAO.Delete calls), which can cross
sites; change the handlers to read the optional siteId query parameter from the
request and use a site-scoped DAO lookup/delete: pass siteId into the
ExpectedRack DAO resolution (or into Get/Delete calls) so the query is
constrained to that site, or if siteId is omitted detect ambiguity (multiple
rows for the same rackID across sites) and return an error asking the caller to
supply siteId. Update all occurrences (the shown Get call and the other similar
calls around lines ~603-605, ~771-773, ~809-810) to use the site-scoped
resolution or reject ambiguous rackId when siteId is missing.
In `@site-workflow/pkg/grpc/client/testing.go`:
- Around line 1224-1313: The mock methods (AddExpectedRack, UpdateExpectedRack,
DeleteExpectedRack, GetExpectedRack, GetAllExpectedRacks,
ReplaceAllExpectedRacks, DeleteAllExpectedRacks) must mirror the test-server
contract: perform nil checks on the request object before dereferencing, return
codes.InvalidArgument (not codes.Internal) for client request validation errors,
and maintain an in-memory store so state changes persist across calls. Update
MockForgeClient to hold a concurrency-safe map (e.g.,
map[string]*wflows.ExpectedRack plus a mutex) and change each method to (1)
check in != nil before accessing fields, (2) validate IDs and return
status.Error(codes.InvalidArgument, "...") on bad input, and (3) mutate or read
the in-memory store for AddExpectedRack (insert), UpdateExpectedRack (update
existing; error if missing), DeleteExpectedRack (remove), GetExpectedRack
(lookup and return stored value), GetAllExpectedRacks (return all stored values,
honoring ctx "wantError" for simulated errors), ReplaceAllExpectedRacks (replace
store with provided list after validating entries), and DeleteAllExpectedRacks
(clear the store).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bd806cf1-36a5-45b8-8e5b-3bf979cd1507
⛔ Files ignored due to path filters (4)
workflow-schema/schema/site-agent/workflows/v1/common_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/schema/site-agent/workflows/v1/forge_carbide.pb.gois excluded by!**/*.pb.go,!**/*.pb.goworkflow-schema/site-agent/workflows/v1/common_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.protoworkflow-schema/site-agent/workflows/v1/forge_carbide.protois excluded by!workflow-schema/site-agent/workflows/v1/*_carbide.proto
📒 Files selected for processing (35)
api/pkg/api/handler/expectedrack.goapi/pkg/api/handler/expectedrack_test.goapi/pkg/api/model/expectedrack.goapi/pkg/api/model/expectedrack_test.goapi/pkg/api/model/util/rack_labels.goapi/pkg/api/routes.goapi/pkg/api/routes_test.godb/pkg/db/model/expectedrack.godb/pkg/db/model/expectedrack_test.godb/pkg/migrations/20260429100000_expected_rack.godocs/index.htmlopenapi/spec.yamlsdk/standard/api_expected_rack.gosdk/standard/client.gosdk/standard/model_expected_rack.gosdk/standard/model_expected_rack_create_request.gosdk/standard/model_expected_rack_list.gosdk/standard/model_expected_rack_metadata.gosdk/standard/model_expected_rack_update_request.gosite-agent/pkg/components/managers/expectedrack/access.gosite-agent/pkg/components/managers/expectedrack/init.gosite-agent/pkg/components/managers/expectedrack/subscriber.gosite-agent/pkg/components/managers/manager.gosite-agent/pkg/components/managers/manageraccess.gosite-agent/pkg/components/managers/managerapi/expectedrack_api.gosite-agent/pkg/components/managers/managerapi/managerapi.gosite-agent/pkg/components/managers/workflow/orchestrator.gosite-agent/pkg/datatypes/managertypes/workflow/workflowtypes.gosite-workflow/pkg/activity/expectedrack.gosite-workflow/pkg/activity/expectedrack_test.gosite-workflow/pkg/activity/rack_labels.gosite-workflow/pkg/grpc/client/testing.gosite-workflow/pkg/grpc/server/forge_test_server.gosite-workflow/pkg/workflow/expectedrack.gosite-workflow/pkg/workflow/expectedrack_test.go
✅ Files skipped from review due to trivial changes (10)
- api/pkg/api/routes_test.go
- api/pkg/api/model/util/rack_labels.go
- site-agent/pkg/components/managers/managerapi/expectedrack_api.go
- api/pkg/api/model/expectedrack.go
- sdk/standard/model_expected_rack.go
- sdk/standard/model_expected_rack_create_request.go
- sdk/standard/model_expected_rack_update_request.go
- site-workflow/pkg/workflow/expectedrack_test.go
- site-workflow/pkg/activity/rack_labels.go
- sdk/standard/api_expected_rack.go
🚧 Files skipped from review as they are similar to previous changes (9)
- site-agent/pkg/components/managers/workflow/orchestrator.go
- site-workflow/pkg/grpc/server/forge_test_server.go
- site-agent/pkg/components/managers/expectedrack/init.go
- db/pkg/migrations/20260429100000_expected_rack.go
- api/pkg/api/routes.go
- site-agent/pkg/components/managers/expectedrack/subscriber.go
- api/pkg/api/model/expectedrack_test.go
- site-agent/pkg/components/managers/expectedrack/access.go
- api/pkg/api/handler/expectedrack_test.go
| // Start a db transaction | ||
| tx, err := cdb.BeginTx(ctx, cerh.dbSession, &sql.TxOptions{}) | ||
| if err != nil { | ||
| logger.Error().Err(err).Msg("unable to start transaction") | ||
| return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB transaction error", nil) | ||
| } | ||
| // this variable is used in cleanup actions to indicate if this transaction committed | ||
| txCommitted := false | ||
| defer common.RollbackTx(ctx, tx, &txCommitted) | ||
|
|
||
| // Build create input from request, defaulting metadata to zero values when omitted | ||
| createInput := cdbm.ExpectedRackCreateInput{ | ||
| RackID: apiRequest.RackID, | ||
| SiteID: site.ID, | ||
| RackProfileID: apiRequest.RackProfileID, | ||
| CreatedBy: dbUser.ID, | ||
| } | ||
| if apiRequest.Metadata != nil { | ||
| createInput.MetadataName = apiRequest.Metadata.Name | ||
| createInput.MetadataDescription = apiRequest.Metadata.Description | ||
| createInput.MetadataLabels = apiRequest.Metadata.Labels | ||
| } | ||
|
|
||
| // Create the ExpectedRack in DB | ||
| erDAO := cdbm.NewExpectedRackDAO(cerh.dbSession) | ||
| expectedRack, err := erDAO.Create(ctx, tx, createInput) | ||
| if err != nil { | ||
| logger.Error().Err(err).Msg("error creating ExpectedRack record in DB") | ||
| return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB error", nil) | ||
| } | ||
|
|
||
| // Build the create request for workflow | ||
| createExpectedRackRequest := buildExpectedRackProto(expectedRack.RackID, expectedRack.RackProfileID, apiRequest.Metadata) | ||
|
|
||
| logger.Info().Msg("triggering Expected Rack create workflow on Site") | ||
|
|
||
| // Create workflow options | ||
| workflowOptions := tclient.StartWorkflowOptions{ | ||
| ID: "expected-rack-create-" + expectedRack.RackID, | ||
| WorkflowExecutionTimeout: cutil.WorkflowExecutionTimeout, | ||
| TaskQueue: queue.SiteTaskQueue, | ||
| } | ||
|
|
||
| // Get the temporal client for the site we are working with | ||
| stc, err := cerh.scp.GetClientByID(site.ID) | ||
| if err != nil { | ||
| logger.Error().Err(err).Msg("failed to retrieve Temporal client for Site") | ||
| return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to retrieve client for Site", nil) | ||
| } | ||
|
|
||
| // Run workflow | ||
| apiErr := common.ExecuteSyncWorkflow(ctx, logger, stc, "CreateExpectedRack", workflowOptions, createExpectedRackRequest) | ||
| if apiErr != nil { | ||
| return cutil.NewAPIErrorResponse(c, apiErr.Code, apiErr.Message, apiErr.Data) | ||
| } | ||
|
|
||
| // Commit transaction | ||
| err = tx.Commit() | ||
| if err != nil { | ||
| logger.Error().Err(err).Msg("error committing ExpectedRack transaction to DB") | ||
| return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to create Expected Rack due to DB transaction error", nil) | ||
| } | ||
| // Set committed so, deferred cleanup functions will do nothing | ||
| txCommitted = true |
There was a problem hiding this comment.
Do not execute the site workflow inside the open SQL transaction.
Each mutation writes the DB row, runs the synchronous Temporal workflow, and only then commits. If the workflow succeeds and tx.Commit() fails, the site agent has already applied the change while the DB rolls it back. This adds a real dual-write inconsistency path to every ExpectedRack mutation.
Also applies to: 631-703, 799-850, 951-1021, 1114-1156
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/pkg/api/handler/expectedrack.go` around lines 183 - 246, The current code
runs common.ExecuteSyncWorkflow (triggering the site workflow) while the SQL
transaction (tx) is still open, creating a dual-write/inconsistency risk if
tx.Commit() later fails; fix by moving the workflow invocation
(common.ExecuteSyncWorkflow with stc, "CreateExpectedRack", workflowOptions,
createExpectedRackRequest) to after the DB commit and after setting txCommitted
= true so the transaction is closed/committed before contacting the site,
keeping the DB creation flow (erDAO.Create, buildExpectedRackProto) intact and
preserving error handling around tx.Commit() and the retrieval of the Temporal
client (cerh.scp.GetClientByID) but ensure you still return appropriate API
errors from ExecuteSyncWorkflow when invoked post-commit.
a32bcaf to
81db0a0
Compare
thossain-nv
left a comment
There was a problem hiding this comment.
Thank you for the implementation @chet! Left some notes on a few topics.
| RackID string `bun:"rack_id,pk"` | ||
| SiteID uuid.UUID `bun:"site_id,type:uuid,notnull"` | ||
| Site *Site `bun:"rel:belongs-to,join:site_id=id"` | ||
| RackProfileID string `bun:"rack_profile_id,notnull"` |
There was a problem hiding this comment.
Should Rack ID be the PK of Expected Rack object or can it have a UUID? The other option is to make (rack_id, site_id) unique
Advantage of UUID:
- We can have same Rack ID in more than one Sites, allowing decisions to move Racks between data halls
- URLs don't have to nest under
/site/<site-id>/expected-rack rack_idcan still be marked unique per Site
Oh yeah, definitely UUID. It's what we do for switches and power shelves too. Fixing! |
|
Thank you for the changes @chet! We're missing one last part:
|
Oops, my bad, sorry! ...adding! I need a checklist of things to keep track of. You'd think I would have known this from the previous Expected* calls I added. 😆 |
This adds `ExpectedRack` REST APIs to create, update, and get `ExpectedRack` instances, which is similar to the work for adding expected switch and power shelf API calls as well in NVIDIA#220. Tried to pattern match as much as possible again. Tests included! Signed-off-by: Chet Nichols III <chetn@nvidia.com>
Description
This adds
ExpectedRackREST APIs to create, update, and getExpectedRackinstances, which is similar to the work for addingExpectedSwitchandExpectedPowerShelfAPI calls as done in #220 (with similar dual-write approach as implemented in #303. Tried to pattern match as much as possible again.This also folds in the same fixes applied in #448, as well as the
ToProtopattern from #467 and #456.Tests included!
Signed-off-by: Chet Nichols III chetn@nvidia.com
Type of Change
Services Affected
Related Issues (Optional)
None
Breaking Changes
Testing
Additional Notes
None