refactor: Migrate networksecuritygroup handler to WithTx #478

Open
chet wants to merge 1 commit into NVIDIA:main from chet:with-tx-networksecuritygroup

@chet chet commented May 4, 2026

Description

Applies WithTx (and WithTxResult!) from #462 to the Create/Update/Delete NSG handlers.

Implements our "timeoutResp pattern" (introduced in #472; @coderabbitai then suggested we apply it everywhere for consistency). TL;DR: the existing code calls common.TerminateWorkflowOnTimeOut as soon as the timeout is detected, but we want to defer that call until after the transaction has been unwound and the DB connection returned, so that we aren't holding either while blocked waiting on the network.

The adjustment (which we've done before, but figured I'd call it out more explicitly here) is effectively:

    var timeoutResp func() error

    err = cdb.WithTx(ctx, ..., func(tx *cdb.Tx) error {
      ...
      if /* workflow timeout detected */ {
        // capture the terminate work, but DON'T do it yet
        timeoutResp = func() error {
          return common.TerminateWorkflowOnTimeOut(...)
        }
        return cutil.NewAPIError(...)   // forces rollback
      }
      ...
    })

    // rollback has now completed, now we do potentially blocking network work
    if timeoutResp != nil {
      return timeoutResp()
    }
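
For anyone who wants to see the ordering end to end, here is a minimal, self-contained sketch of the same pattern. Everything in it is a stand-in and an assumption: `withTx`, `fakeTx`, and `handle` are hypothetical names for illustration, not the real `cdb.WithTx`, `cdb.Tx`, or handler code, and the "terminate" step merely records itself instead of calling `common.TerminateWorkflowOnTimeOut`.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeTx is a hypothetical stand-in for the real transaction type.
type fakeTx struct{}

// withTx is a hypothetical stand-in for cdb.WithTx. It records "commit" when
// fn returns nil and "rollback" otherwise, appending each step to log.
func withTx(log *[]string, fn func(tx *fakeTx) error) error {
	*log = append(*log, "begin")
	if err := fn(&fakeTx{}); err != nil {
		*log = append(*log, "rollback")
		return err
	}
	*log = append(*log, "commit")
	return nil
}

// handle mimics a handler that may hit a workflow timeout inside the
// transaction closure. It returns the ordered list of steps it performed.
func handle(timedOut bool) ([]string, error) {
	var log []string
	var timeoutResp func() error

	err := withTx(&log, func(tx *fakeTx) error {
		log = append(log, "db-work")
		if timedOut {
			// Capture the terminate work, but do not run it yet.
			timeoutResp = func() error {
				log = append(log, "terminate")
				return errors.New("workflow terminated after timeout")
			}
			// Returning an error makes withTx roll back.
			return errors.New("workflow timed out")
		}
		return nil
	})

	// The transaction is finished here, so blocking work is now safe.
	if timeoutResp != nil {
		err = timeoutResp()
	}
	return log, err
}

func main() {
	steps, err := handle(true)
	fmt.Println(steps, err)
}
```

In the timeout case the recorded steps are begin, db-work, rollback, terminate, in that order, which is the whole point: the terminate call never runs while the transaction is still open.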

Also preemptively addressed some @coderabbitai feedback around log messages.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • RLA - RLA service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@chet chet requested a review from a team as a code owner May 4, 2026 16:57
coderabbitai Bot commented May 4, 2026

Summary by CodeRabbit

Release Notes

No user-facing changes – This release includes internal code refactoring and optimization of transaction handling for improved reliability and maintainability. End-users will not notice any visible differences in functionality or behavior.

Walkthrough

Manual database transaction handling in the NetworkSecurityGroup Create, Delete, and Update handler methods is replaced with transaction helper functions (cdb.WithTxResult and cdb.WithTx). Workflow execution and timeout termination are relocated to ensure proper ordering relative to transaction boundaries. The database/sql import is removed as a consequence of eliminating direct sql.TxOptions usage.

Changes

NetworkSecurityGroup Transaction Management Refactoring

| Layer | File(s) | Summary |
|---|---|---|
| Import Cleanup | api/pkg/api/handler/networksecuritygroup.go (lines 20–21) | Removes database/sql import, no longer required after eliminating direct sql.TxOptions instantiation. |
| Create Handler Transaction Refactoring | api/pkg/api/handler/networksecuritygroup.go (lines 218–360) | Replaces explicit BeginTx/Commit/Rollback with cdb.WithTxResult. Moves NSG creation, status-detail creation, and synchronous Temporal workflow execution into the transaction closure. Captures timeoutResp outside closure for deferred execution after transaction finalization. |
| Delete Handler Transaction Refactoring | api/pkg/api/handler/networksecuritygroup.go (lines 907–1018) | Replaces manual transaction management with cdb.WithTx. Executes NSG status update, deletion, status-detail creation, and synchronous Temporal delete workflow within the closure. Defers timeoutResp callback until after closure returns, preserving rollback semantics on error. |
| Update Handler Transaction Refactoring | api/pkg/api/handler/networksecuritygroup.go (lines 1193–1325) | Replaces explicit transaction handling with cdb.WithTxResult. Captures updated NSG and status details from within the closure for response construction. Moves synchronous Temporal update workflow and timeout logic into the closure with deferred termination callback. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely describes the primary refactoring work: migrating the networksecuritygroup handler to use the WithTx pattern. |
| Description check | ✅ Passed | The description provides comprehensive context about the refactoring, explains the timeoutResp pattern with code examples, and references related PRs (#462, #472), clearly relating to the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



github-actions Bot commented May 4, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🕐 Last updated: 2026-05-04 16:58:16 UTC | Commit: 064594a

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
api/pkg/api/handler/networksecuritygroup.go (1)

937-940: 💤 Low value

Inconsistent error handling for status detail creation.

In the Create handler (lines 253-256), a failed sdDAO.CreateFromParams call returns an error and aborts the transaction. Here in Delete, the error is logged but silently ignored, allowing the deletion to proceed. If this is intentional—treating status detail as non-critical for deletes—consider adding a brief comment to document this design decision; otherwise, align with the Create handler's behavior.

💡 Suggested documentation if intentional
 		// Create status detail
+		// NOTE: Status detail creation is non-critical for deletes; log and continue.
 		if _, derr := sdDAO.CreateFromParams(ctx, tx, nsg.ID, *cdb.GetStrPtr(cdbm.NetworkSecurityGroupStatusDeleting),
 			cdb.GetStrPtr("received request for deletion, pending processing")); derr != nil {
 			logger.Error().Err(derr).Msg("error creating Status Detail DB entry")
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/networksecuritygroup.go` around lines 937 - 940, The
Delete handler currently calls sdDAO.CreateFromParams and only logs errors while
Create handler treats that error as fatal and aborts the transaction; make them
consistent by either (A) propagating the error from sdDAO.CreateFromParams in
the Delete handler (roll back/abort the current transaction and return the error
exactly like the Create handler does), or (B) if the status-detail write is
intentionally non-critical on delete, add a concise comment above the
sdDAO.CreateFromParams call explaining this design decision so future readers
know the difference; reference sdDAO.CreateFromParams, the Delete handler in
networksecuritygroup.go, and the Create handler behavior when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 32121a4c-e1eb-42a6-8182-d78927e24bf4

📥 Commits

Reviewing files that changed from the base of the PR and between bce7503 and 064594a.

📒 Files selected for processing (1)
  • api/pkg/api/handler/networksecuritygroup.go

github-actions Bot commented May 4, 2026

🔍 Container Scan Summary

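| Service | Total | Critical | High | Medium | Low | Other |
|---|---|---|---|---|---|---|
| carbide-nsm | 66 | 4 | 20 | 33 | 9 | 0 |
| carbide-psm | 58 | 6 | 29 | 13 | 2 | 8 |
| carbide-rest-api | 60 | 6 | 31 | 13 | 2 | 8 |
| carbide-rest-cert-manager | 54 | 4 | 28 | 13 | 1 | 8 |
| carbide-rest-db | 58 | 6 | 29 | 13 | 2 | 8 |
| carbide-rest-site-agent | 55 | 5 | 28 | 13 | 1 | 8 |
| carbide-rest-site-manager | 54 | 4 | 28 | 13 | 1 | 8 |
| carbide-rest-workflow | 59 | 6 | 30 | 13 | 2 | 8 |
| carbide-rla | 57 | 6 | 28 | 13 | 2 | 8 |
| TOTAL | 521 | 47 | 251 | 137 | 22 | 64 |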

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.
