Skip to content

feat: Allow updating Site capabilities using Site update API endpoint#470

Open
thossain-nv wants to merge 3 commits intomainfrom
feat/site-cap-update
Open

feat: Allow updating Site capabilities using Site update API endpoint#470
thossain-nv wants to merge 3 commits intomainfrom
feat/site-cap-update

Conversation

@thossain-nv
Copy link
Copy Markdown
Contributor

Description

Allows Provider to enable or disable supported Site capabilities through Site update API endpoint.

Type of Change

  • Feature - New feature or functionality (feat:)

Services Affected

  • API - API models or endpoints updated
  • DB - DB DAOs or migrations updated

Related Issues (Optional)

Addresses: #425

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated

Additional Notes

None

@thossain-nv thossain-nv requested a review from a team as a code owner May 2, 2026 01:43
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 2, 2026

Summary by CodeRabbit

  • New Features
    • Site capabilities can now be updated by providers, including NativeNetworking, NetworkSecurityGroup, NVLink Partition, Rack Level Administration, and Image-Based Operating System settings.
    • Added validation to prevent disabling NativeNetworking when FNN virtualization resources exist.

Walkthrough

Adds provider-only partial Site capability updates (nativeNetworking, networkSecurityGroup, nvLinkPartition, rackLevelAdministration, imageBasedOperatingSystem), enforces a precondition preventing disabling NativeNetworking when FNN-virtualized VPCs exist, updates persistence to refresh updated timestamp on config merges, and adjusts test helpers and many tests to support parameterized VPC virtualization type and SiteConfig persistence.

Changes

Site Capabilities Update

Layer / File(s) Summary
API Model / Types
api/pkg/api/model/site.go, api/pkg/api/model/site_test.go
Added APISiteCapabilitiesUpdateRequest with ToSiteConfig(existing *cdbm.SiteConfig) and added Capabilities *APISiteCapabilitiesUpdateRequest to APISiteUpdateRequest; validation disallows Capabilities for non-provider requesters.
Handler Implementation
api/pkg/api/handler/site.go
UpdateSiteHandler.Handle performs a pre-check: if Capabilities present and NativeNetworking is being disabled, it queries VPCs for FNN virtualization type and rejects the update (400) if any exist; when provided, maps capabilities into SiteUpdateInput.Config.
Database Persistence
db/pkg/db/model/site.go, db/pkg/db/model/site_test.go
SiteSQLDAO.Update now sets updated = current_timestamp when merging config JSONB to ensure update timestamp changes on partial config updates; minor test literal ordering adjusted.
OpenAPI / SDK
openapi/spec.yaml, sdk/standard/model_site_capabilities_update_request.go, sdk/standard/model_site_update_request.go, sdk/standard/api_site.go
OpenAPI: added SiteCapabilitiesUpdateRequest schema and capabilities on SiteUpdateRequest. SDK: generated SiteCapabilitiesUpdateRequest model (NullableBool tri-state fields, accessors, JSON behavior) and added Capabilities field and accessors to SiteUpdateRequest. Minor API doc wording updated.
Tests: Site-related
api/pkg/api/handler/site_test.go
Site tests extended: Site creation now persists Config, defaulting nil to empty SiteConfig; new test sites exercise capability flags; added test cases covering capability updates including rejection when VPCs/FNN exist; test DB resets updated to include NVLinkLogicalPartition, NetworkSecurityGroup, Vpc tables.

Test VPC Helper Signature & Call Sites

Layer / File(s) Summary
Test Helper
api/pkg/api/handler/util/common/testing.go
TestBuildVPC signature extended with networkVirtualizationType *string; defaults to Ethernet when nil; VpcCreateInput uses the provided pointer.
Test Call Sites
api/pkg/api/handler/ipblock_test.go, api/pkg/api/handler/dpuextensionservice_test.go, api/pkg/api/handler/instancetype_test.go, api/pkg/api/handler/machineinstancetype_test.go, api/pkg/api/handler/vpcpeering_test.go
Updated all TestBuildVPC invocations to pass the extra parameter (mostly nil) to match the new signature; several test site-builder calls adjusted to accept an optional SiteConfig parameter.
Test Adjustments
api/pkg/api/handler/ipblock_test.go
Updated testSiteBuildSite invocations to include additional trailing nil for new signature; no assertion or logic changes.

Sequence Diagram

sequenceDiagram
    participant Client as Client (API Caller)
    participant Handler as UpdateSiteHandler
    participant VPCDB as VPC Query (DB)
    participant Config as SiteConfig Model
    participant SiteDB as SiteSQLDAO (DB)

    Client->>Handler: PATCH /site/{id} with capabilities
    Handler->>Handler: Validate requester may set capabilities
    alt requester not allowed
        Handler-->>Client: 400 Bad Request
    else requester allowed and NativeNetworking->false specified
        Handler->>VPCDB: Query VPCs for Site with virtualizationType == "FNN"
        VPCDB-->>Handler: count > 0 ?
        alt FNN VPCs exist
            Handler-->>Client: 400 Bad Request (cannot disable NativeNetworking)
        else no FNN VPCs
            Handler->>Config: ToSiteConfig(existing) merge capabilities
            Config-->>Handler: merged SiteConfig
            Handler->>SiteDB: Update Site (config merge, set updated = current_timestamp)
            SiteDB-->>Handler: updated site
            Handler-->>Client: 200 OK (updated site)
        end
    end
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.77% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the primary change: enabling Site capability updates via the Site update API endpoint, which is the core feature introduced across multiple handler, model, and SDK files.
Description check ✅ Passed The description clearly explains the feature: allowing Providers to enable/disable Site capabilities through the update API, with related issue reference and testing confirmation. It aligns with the comprehensive changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/site-cap-update

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-05-02 01:45:55 UTC | Commit: 37675cc

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

Test Results

9 055 tests  +4   9 055 ✅ +4   7m 13s ⏱️ ±0s
  151 suites ±0       0 💤 ±0 
   14 files   ±0       0 ❌ ±0 

Results for commit df31ef9. ± Comparison against base commit e7e194a.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
api/pkg/api/model/site_test.go (1)

383-402: ⚡ Quick win

Add a mixed-role validation test case (isProvider=true, isTenant=true).

You now cover provider-only and tenant-only capability paths, but not dual-role service accounts. Add one case to ensure provider-authorized capability updates are not rejected when tenant role is also present.

Based on learnings: dual-role accounts should evaluate both authorization paths and only fail when all applicable paths fail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/model/site_test.go` around lines 383 - 402, The test suite lacks
a case for dual-role accounts: add a test entry where both isProvider and
isTenant are true (e.g., name "validate update request success, Provider+Tenant
configuring capabilities"), with fields.Capabilities set to an
APISiteCapabilitiesUpdateRequest (NativeNetworking false) and wantErr false;
this ensures the validation logic for ValidateUpdateRequest evaluates both
authorization paths (APISiteCapabilitiesUpdateRequest validation when isProvider
and isTenant flags are both set) and accepts the request if any applicable path
authorizes it rather than rejecting when one role would deny it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/pkg/api/handler/site.go`:
- Around line 498-506: The code sets siteUpdateInput.Config when
apiRequest.Capabilities is non-nil but later unconditionally dereferences *us;
before entering the transactional update flow you must reject tenant-path
updates by enforcing provider != nil. Add an explicit check on the provider
variable (e.g., if provider == nil { return error(...) }) prior to
creating/applying siteUpdateInput.Config and starting the transaction so
capability-only updates are allowed only for provider-backed sites and avoid a
nil-deref of *us during response creation.

In `@db/pkg/db/model/site.go`:
- Line 598: The update uses the SQL fragment "config = config || ?::jsonb" which
drops updates when Site.Config is NULL; change the Set(...) call that currently
embeds that fragment (the one passing input.Config) to coalesce the existing
config to an empty jsonb before the merge (i.e., use COALESCE on the existing
config to '{}'::jsonb so NULL becomes an empty object) so the merge always
applies and updated timestamp still sets; update the Set invocation string
accordingly while keeping the same input.Config parameter.

In `@sdk/standard/model_site_capabilities_update_request.go`:
- Around line 1-10: The file model_site_capabilities_update_request.go is
missing the required SPDX license header; add a Go comment block at the very top
of the file containing the current copyright year range and the
SPDX-License-Identifier: Apache-2.0 (matching repository guidelines) so
generated SDK files include the standard header; ensure the header precedes the
existing OpenAPI banner and is applied to this generated file
(model_site_capabilities_update_request.go) and any similar generated SDK files.

---

Nitpick comments:
In `@api/pkg/api/model/site_test.go`:
- Around line 383-402: The test suite lacks a case for dual-role accounts: add a
test entry where both isProvider and isTenant are true (e.g., name "validate
update request success, Provider+Tenant configuring capabilities"), with
fields.Capabilities set to an APISiteCapabilitiesUpdateRequest (NativeNetworking
false) and wantErr false; this ensures the validation logic for
ValidateUpdateRequest evaluates both authorization paths
(APISiteCapabilitiesUpdateRequest validation when isProvider and isTenant flags
are both set) and accepts the request if any applicable path authorizes it
rather than rejecting when one role would deny it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d4656517-5262-4c5d-9f83-9873f2f6aba7

📥 Commits

Reviewing files that changed from the base of the PR and between bce7503 and 37675cc.

📒 Files selected for processing (239)
  • api/pkg/api/handler/ipblock_test.go
  • api/pkg/api/handler/site.go
  • api/pkg/api/handler/site_test.go
  • api/pkg/api/model/site.go
  • api/pkg/api/model/site_test.go
  • db/pkg/db/model/site.go
  • db/pkg/db/model/site_test.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_allocation.go
  • sdk/standard/api_audit.go
  • sdk/standard/api_dpu_extension_service.go
  • sdk/standard/api_expected_machine.go
  • sdk/standard/api_expected_power_shelf.go
  • sdk/standard/api_expected_switch.go
  • sdk/standard/api_infini_band_partition.go
  • sdk/standard/api_infrastructure_provider.go
  • sdk/standard/api_instance.go
  • sdk/standard/api_instance_type.go
  • sdk/standard/api_ip_block.go
  • sdk/standard/api_machine.go
  • sdk/standard/api_metadata.go
  • sdk/standard/api_network_security_group.go
  • sdk/standard/api_nv_link_logical_partition.go
  • sdk/standard/api_operating_system.go
  • sdk/standard/api_rack.go
  • sdk/standard/api_service_account.go
  • sdk/standard/api_site.go
  • sdk/standard/api_sku.go
  • sdk/standard/api_ssh_key.go
  • sdk/standard/api_ssh_key_group.go
  • sdk/standard/api_subnet.go
  • sdk/standard/api_tenant.go
  • sdk/standard/api_tenant_account.go
  • sdk/standard/api_tray.go
  • sdk/standard/api_user.go
  • sdk/standard/api_vpc.go
  • sdk/standard/api_vpc_peering.go
  • sdk/standard/api_vpc_prefix.go
  • sdk/standard/client.go
  • sdk/standard/configuration.go
  • sdk/standard/model_allocation.go
  • sdk/standard/model_allocation_constraint.go
  • sdk/standard/model_allocation_constraint_create_request.go
  • sdk/standard/model_allocation_constraint_update_request.go
  • sdk/standard/model_allocation_create_request.go
  • sdk/standard/model_allocation_stats.go
  • sdk/standard/model_allocation_status.go
  • sdk/standard/model_allocation_update_request.go
  • sdk/standard/model_audit_entry.go
  • sdk/standard/model_batch_bring_up_rack_request.go
  • sdk/standard/model_batch_instance_create_request.go
  • sdk/standard/model_batch_rack_firmware_update_request.go
  • sdk/standard/model_batch_tray_firmware_update_request.go
  • sdk/standard/model_batch_update_rack_power_state_request.go
  • sdk/standard/model_batch_update_tray_power_state_request.go
  • sdk/standard/model_bmc_info.go
  • sdk/standard/model_bring_up_rack_request.go
  • sdk/standard/model_bring_up_rack_response.go
  • sdk/standard/model_carbide_api_error.go
  • sdk/standard/model_component_diff.go
  • sdk/standard/model_deprecation.go
  • sdk/standard/model_dpu_extension_service.go
  • sdk/standard/model_dpu_extension_service_create_request.go
  • sdk/standard/model_dpu_extension_service_credentials.go
  • sdk/standard/model_dpu_extension_service_deployment.go
  • sdk/standard/model_dpu_extension_service_deployment_request.go
  • sdk/standard/model_dpu_extension_service_deployment_status.go
  • sdk/standard/model_dpu_extension_service_observability.go
  • sdk/standard/model_dpu_extension_service_observability_config.go
  • sdk/standard/model_dpu_extension_service_observability_logging.go
  • sdk/standard/model_dpu_extension_service_observability_prometheus.go
  • sdk/standard/model_dpu_extension_service_status.go
  • sdk/standard/model_dpu_extension_service_summary.go
  • sdk/standard/model_dpu_extension_service_update_request.go
  • sdk/standard/model_dpu_extension_service_version_info.go
  • sdk/standard/model_expected_machine.go
  • sdk/standard/model_expected_machine_create_request.go
  • sdk/standard/model_expected_machine_update_request.go
  • sdk/standard/model_expected_power_shelf.go
  • sdk/standard/model_expected_power_shelf_create_request.go
  • sdk/standard/model_expected_power_shelf_update_request.go
  • sdk/standard/model_expected_switch.go
  • sdk/standard/model_expected_switch_create_request.go
  • sdk/standard/model_expected_switch_update_request.go
  • sdk/standard/model_field_diff.go
  • sdk/standard/model_firmware_update_request.go
  • sdk/standard/model_firmware_update_response.go
  • sdk/standard/model_infini_band_interface.go
  • sdk/standard/model_infini_band_interface_create_request.go
  • sdk/standard/model_infini_band_interface_status.go
  • sdk/standard/model_infini_band_partition.go
  • sdk/standard/model_infini_band_partition_create_request.go
  • sdk/standard/model_infini_band_partition_status.go
  • sdk/standard/model_infini_band_partition_update_request.go
  • sdk/standard/model_infrastructure_provider.go
  • sdk/standard/model_infrastructure_provider_stats.go
  • sdk/standard/model_instance.go
  • sdk/standard/model_instance_count_by_status.go
  • sdk/standard/model_instance_create_request.go
  • sdk/standard/model_instance_delete_request.go
  • sdk/standard/model_instance_status.go
  • sdk/standard/model_instance_type.go
  • sdk/standard/model_instance_type_allocation_stats.go
  • sdk/standard/model_instance_type_capability_create_request.go
  • sdk/standard/model_instance_type_create_request.go
  • sdk/standard/model_instance_type_stats.go
  • sdk/standard/model_instance_type_status.go
  • sdk/standard/model_instance_type_summary.go
  • sdk/standard/model_instance_type_update_request.go
  • sdk/standard/model_instance_update_request.go
  • sdk/standard/model_interface.go
  • sdk/standard/model_interface_create_request.go
  • sdk/standard/model_interface_status.go
  • sdk/standard/model_ip_block.go
  • sdk/standard/model_ip_block_count_by_status.go
  • sdk/standard/model_ip_block_create_request.go
  • sdk/standard/model_ip_block_status.go
  • sdk/standard/model_ip_block_summary.go
  • sdk/standard/model_ip_block_update_request.go
  • sdk/standard/model_ip_block_usage_stats.go
  • sdk/standard/model_machine.go
  • sdk/standard/model_machine_bmc_info.go
  • sdk/standard/model_machine_capability.go
  • sdk/standard/model_machine_count_by_status.go
  • sdk/standard/model_machine_dmi_data.go
  • sdk/standard/model_machine_gpu_info.go
  • sdk/standard/model_machine_gpu_stats.go
  • sdk/standard/model_machine_health.go
  • sdk/standard/model_machine_health_issue.go
  • sdk/standard/model_machine_health_probe_alert.go
  • sdk/standard/model_machine_health_probe_success.go
  • sdk/standard/model_machine_infini_band_interface.go
  • sdk/standard/model_machine_instance_type.go
  • sdk/standard/model_machine_instance_type_create_request.go
  • sdk/standard/model_machine_instance_type_stats.go
  • sdk/standard/model_machine_instance_type_summary.go
  • sdk/standard/model_machine_interface.go
  • sdk/standard/model_machine_metadata.go
  • sdk/standard/model_machine_network_interface.go
  • sdk/standard/model_machine_status.go
  • sdk/standard/model_machine_status_breakdown.go
  • sdk/standard/model_machine_summary.go
  • sdk/standard/model_machine_update_request.go
  • sdk/standard/model_metadata.go
  • sdk/standard/model_network_security_group.go
  • sdk/standard/model_network_security_group_create_request.go
  • sdk/standard/model_network_security_group_propagation_details.go
  • sdk/standard/model_network_security_group_propagation_status.go
  • sdk/standard/model_network_security_group_rule.go
  • sdk/standard/model_network_security_group_status.go
  • sdk/standard/model_network_security_group_update_request.go
  • sdk/standard/model_nv_link_interface.go
  • sdk/standard/model_nv_link_interface_create_request.go
  • sdk/standard/model_nv_link_interface_status.go
  • sdk/standard/model_nv_link_logical_partition.go
  • sdk/standard/model_nv_link_logical_partition_create_request.go
  • sdk/standard/model_nv_link_logical_partition_status.go
  • sdk/standard/model_nv_link_logical_partition_update_request.go
  • sdk/standard/model_operating_system.go
  • sdk/standard/model_operating_system_create_request.go
  • sdk/standard/model_operating_system_site_association.go
  • sdk/standard/model_operating_system_status.go
  • sdk/standard/model_operating_system_update_request.go
  • sdk/standard/model_rack.go
  • sdk/standard/model_rack_component.go
  • sdk/standard/model_rack_filter.go
  • sdk/standard/model_rack_location.go
  • sdk/standard/model_rack_task.go
  • sdk/standard/model_rack_validation_result.go
  • sdk/standard/model_service_account.go
  • sdk/standard/model_site.go
  • sdk/standard/model_site_capabilities.go
  • sdk/standard/model_site_capabilities_update_request.go
  • sdk/standard/model_site_contact.go
  • sdk/standard/model_site_create_request.go
  • sdk/standard/model_site_location.go
  • sdk/standard/model_site_machine_stats.go
  • sdk/standard/model_site_machine_stats_by_allocation.go
  • sdk/standard/model_site_machine_stats_by_health.go
  • sdk/standard/model_site_machine_stats_by_status.go
  • sdk/standard/model_site_machine_stats_by_status_and_health.go
  • sdk/standard/model_site_status.go
  • sdk/standard/model_site_summary.go
  • sdk/standard/model_site_update_request.go
  • sdk/standard/model_sku.go
  • sdk/standard/model_sku_chassis.go
  • sdk/standard/model_sku_components.go
  • sdk/standard/model_sku_cpu.go
  • sdk/standard/model_sku_ethernet_device.go
  • sdk/standard/model_sku_gpu.go
  • sdk/standard/model_sku_infiniband_device.go
  • sdk/standard/model_sku_memory.go
  • sdk/standard/model_sku_storage.go
  • sdk/standard/model_sku_tpm.go
  • sdk/standard/model_ssh_key.go
  • sdk/standard/model_ssh_key_create_request.go
  • sdk/standard/model_ssh_key_group.go
  • sdk/standard/model_ssh_key_group_create_request.go
  • sdk/standard/model_ssh_key_group_site_association.go
  • sdk/standard/model_ssh_key_group_site_association_status.go
  • sdk/standard/model_ssh_key_group_status.go
  • sdk/standard/model_ssh_key_group_update_request.go
  • sdk/standard/model_ssh_key_update_request.go
  • sdk/standard/model_status_detail.go
  • sdk/standard/model_subnet.go
  • sdk/standard/model_subnet_count_by_status.go
  • sdk/standard/model_subnet_create_request.go
  • sdk/standard/model_subnet_status.go
  • sdk/standard/model_subnet_update_request.go
  • sdk/standard/model_tenant.go
  • sdk/standard/model_tenant_account.go
  • sdk/standard/model_tenant_account_count_by_status.go
  • sdk/standard/model_tenant_account_create_request.go
  • sdk/standard/model_tenant_account_status.go
  • sdk/standard/model_tenant_capabilities.go
  • sdk/standard/model_tenant_instance_type_stats.go
  • sdk/standard/model_tenant_stats.go
  • sdk/standard/model_tray.go
  • sdk/standard/model_tray_filter.go
  • sdk/standard/model_tray_position.go
  • sdk/standard/model_update_power_state_request.go
  • sdk/standard/model_update_power_state_response.go
  • sdk/standard/model_user.go
  • sdk/standard/model_vpc.go
  • sdk/standard/model_vpc_count_by_status.go
  • sdk/standard/model_vpc_create_request.go
  • sdk/standard/model_vpc_peering.go
  • sdk/standard/model_vpc_peering_create_request.go
  • sdk/standard/model_vpc_peering_status.go
  • sdk/standard/model_vpc_prefix.go
  • sdk/standard/model_vpc_prefix_create_request.go
  • sdk/standard/model_vpc_prefix_status.go
  • sdk/standard/model_vpc_prefix_update_request.go
  • sdk/standard/model_vpc_status.go
  • sdk/standard/model_vpc_update_request.go
  • sdk/standard/model_vpc_virtualization_update_request.go
  • sdk/standard/response.go
  • sdk/standard/utils.go

Comment on lines +498 to +506
if apiRequest.Capabilities != nil {
siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
NativeNetworking: apiRequest.Capabilities.NativeNetworking,
NetworkSecurityGroup: apiRequest.Capabilities.NetworkSecurityGroup,
NVLinkPartition: apiRequest.Capabilities.NVLinkPartition,
RackLevelAdministration: apiRequest.Capabilities.RackLevelAdministration,
ImageBasedOperatingSystem: apiRequest.Capabilities.ImageBasedOperatingSystem,
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce provider-only access before update flow to prevent nil-dereference path.

siteUpdateInput.Config is applied only inside the provider branch, but response creation later dereferences *us unconditionally. Add an explicit provider == nil rejection before transactional update to avoid tenant-path nil dereference and to align with provider-only capability updates.

Proposed fix
 func (ush UpdateSiteHandler) Handle(c echo.Context) error {
   org, dbUser, ctx, logger, handlerSpan := common.SetupHandler("Site", "Update", c, ush.tracerSpan)
@@
   provider, tenant, apiErr := common.IsProviderOrTenant(ctx, logger, ush.dbSession, org, dbUser, false, false)
   if apiErr != nil {
     return c.JSON(apiErr.Code, apiErr)
   }
+  if provider == nil {
+    logger.Warn().Msg("user does not have Provider Admin role with org, access denied")
+    return cutil.NewAPIErrorResponse(c, http.StatusForbidden, "User does not have Provider Admin role with org", nil)
+  }
@@
-  if provider != nil {
-    siteUpdateInput := cdbm.SiteUpdateInput{
+  siteUpdateInput := cdbm.SiteUpdateInput{
       SiteID:                        siteID,
       Name:                          apiRequest.Name,
       Description:                   apiRequest.Description,
@@
-    if apiRequest.Capabilities != nil {
-      siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
-        NativeNetworking:          apiRequest.Capabilities.NativeNetworking,
-        NetworkSecurityGroup:      apiRequest.Capabilities.NetworkSecurityGroup,
-        NVLinkPartition:           apiRequest.Capabilities.NVLinkPartition,
-        RackLevelAdministration:   apiRequest.Capabilities.RackLevelAdministration,
-        ImageBasedOperatingSystem: apiRequest.Capabilities.ImageBasedOperatingSystem,
-      }
-    }
-
-    // Update Site
-    us, err = stDAO.Update(ctx, tx, siteUpdateInput)
-    if err != nil {
-      logger.Error().Err(err).Msg("error updating Site")
-      return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to update Site, DB error", nil)
-    }
+  if apiRequest.Capabilities != nil {
+    siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
+      NativeNetworking:          apiRequest.Capabilities.NativeNetworking,
+      NetworkSecurityGroup:      apiRequest.Capabilities.NetworkSecurityGroup,
+      NVLinkPartition:           apiRequest.Capabilities.NVLinkPartition,
+      RackLevelAdministration:   apiRequest.Capabilities.RackLevelAdministration,
+      ImageBasedOperatingSystem: apiRequest.Capabilities.ImageBasedOperatingSystem,
+    }
+  }
+
+  // Update Site
+  us, err = stDAO.Update(ctx, tx, siteUpdateInput)
+  if err != nil {
+    logger.Error().Err(err).Msg("error updating Site")
+    return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to update Site, DB error", nil)
   }
- }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/pkg/api/handler/site.go` around lines 498 - 506, The code sets
siteUpdateInput.Config when apiRequest.Capabilities is non-nil but later
unconditionally dereferences *us; before entering the transactional update flow
you must reject tenant-path updates by enforcing provider != nil. Add an
explicit check on the provider variable (e.g., if provider == nil { return
error(...) }) prior to creating/applying siteUpdateInput.Config and starting the
transaction so capability-only updates are allowed only for provider-backed
sites and avoid a nil-deref of *us during response creation.

Comment thread db/pkg/db/model/site.go
_, err := db.GetIDB(tx, ssd.dbSession).NewUpdate().
Model(st).
Set("config = config || ?::jsonb", input.Config).
Set("config = config || ?::jsonb, updated = current_timestamp", input.Config).
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot May 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Confirm current update expression
rg -n -C2 "config = config \\|\\| \\?::jsonb" db/pkg/db/model/site.go

# Inspect schema/migrations for site.config nullability/default
rg -n --type=sql -C3 "create table[[:space:]]+site|alter table[[:space:]]+site|\\bconfig\\b.*jsonb" .

Repository: NVIDIA/infra-controller-rest

Length of output: 282


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find SQL/migration files in the repository
find . -type f \( -name "*.sql" -o -name "*migration*" -o -name "*schema*" \) | head -20

# Look for database initialization or setup files
find . -type f -name "*.go" | xargs grep -l "CREATE TABLE.*site" 2>/dev/null | head -5

# Check common migration directories
ls -la db/ 2>/dev/null || true
ls -la migrations/ 2>/dev/null || true
ls -la schema/ 2>/dev/null || true

Repository: NVIDIA/infra-controller-rest

Length of output: 1267


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# List files in db/pkg/migrations/
ls -la db/pkg/migrations/ 2>/dev/null | head -20

# Search for site table references in migration files
find db/pkg/migrations -type f \( -name "*.sql" -o -name "*.go" \) -exec grep -l "site" {} \; 2>/dev/null

# Read setup.sql to find site.config definition
cat db/scripts/setup.sql | grep -A 10 -B 2 "CREATE TABLE.*site"

Repository: NVIDIA/infra-controller-rest

Length of output: 1680


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Read the initial migration file to find site table schema
cat db/pkg/migrations/20220526234258_initial.go

# Also search for config column definition in migrations
grep -n "config" db/pkg/migrations/*.go | grep -i site

Repository: NVIDIA/infra-controller-rest

Length of output: 6108


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Read the Site model definition to check for NOT NULL constraint on config
head -150 db/pkg/db/model/site.go

# Check if there are any NOT NULL constraints mentioned
grep -A 5 -B 5 "config" db/pkg/db/model/site.go | head -30

Repository: NVIDIA/infra-controller-rest

Length of output: 8229


Guard JSONB merge against NULL config values in site update operations.

The Site.Config field is nullable (pointer type without NOT NULL constraint), and while migrations added a DEFAULT value, existing rows created before the config column was introduced may retain NULL values. The expression config = config || ?::jsonb will fail to persist updates when config is NULL, as PostgreSQL evaluates NULL || value as NULL. This results in silent data loss—the update query succeeds without error, but the modification is discarded.

Recommended fix
-			Set("config = config || ?::jsonb, updated = current_timestamp", input.Config).
+			Set("config = COALESCE(config, '{}'::jsonb) || ?::jsonb, updated = current_timestamp", input.Config).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Set("config = config || ?::jsonb, updated = current_timestamp", input.Config).
Set("config = COALESCE(config, '{}'::jsonb) || ?::jsonb, updated = current_timestamp", input.Config).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@db/pkg/db/model/site.go` at line 598, The update uses the SQL fragment
"config = config || ?::jsonb" which drops updates when Site.Config is NULL;
change the Set(...) call that currently embeds that fragment (the one passing
input.Config) to coalesce the existing config to an empty jsonb before the merge
(i.e., use COALESCE on the existing config to '{}'::jsonb so NULL becomes an
empty object) so the merge always applies and updated timestamp still sets;
update the Set invocation string accordingly while keeping the same input.Config
parameter.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai In PostgreSQL, if you add a new column with a DEFAULT clause, existing rows will automatically get that default value. A 'capabilities' column was added with a default value. Later, a 'config' column was added with a default value and existing values in 'capabilities' were migrated to 'config'. Review migrations for the two columns to confirm.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# List all migration files
ls db/pkg/migrations/

# Find migrations that add capabilities or config column to site table
grep -rn -l "capabilities\|config" db/pkg/migrations/ | sort

Length of output: 5105


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for additions of capabilities or config columns in migrations
grep -n -A 10 "capabilities\|\"config\"" db/pkg/migrations/*.go | grep -i "ALTER TABLE\|ADD COLUMN\|DEFAULT\|capabilities\|config" | head -60

Length of output: 5716


@bcavnvidia, you are correct — my concern was unfounded. The migration history confirms it definitively:

  1. 20250320162954_site_capabilities.go added the capabilities column as jsonb DEFAULT '{}'::jsonb. PostgreSQL immediately backfills that default for all existing rows, so no row can be NULL from that point.

  2. 20250625142649_site_capabilities_to_config_step1.go then executed UPDATE site SET config=capabilities WHERE deleted IS NULL, explicitly migrating every live row's value from capabilities to config.

  3. 20260325032528_site_capabilities_to_config_step2.go added config jsonb DEFAULT '{}'::jsonb, set a richer column-level default, and dropped the legacy capabilities column.

Because config was always introduced with a DEFAULT '{}'::jsonb clause and a full-table backfill was performed, the column is never NULL for any live row. The COALESCE guard I suggested is unnecessary. I apologise for the noise.

(*ᵔᵕᵔ*) (reconsidering carefully next time)


✏️ Learnings added
Learnt from: bcavnvidia
Repo: NVIDIA/infra-controller-rest PR: 470
File: db/pkg/db/model/site.go:598-598
Timestamp: 2026-05-02T02:26:24.365Z
Learning: In the NVIDIA/infra-controller-rest repository, the `config` column on the `site` table (db/pkg/db/model/site.go, type SiteConfig) can never be NULL for live rows. Migration 20250320162954_site_capabilities.go added the `capabilities` column with `DEFAULT '{}'::jsonb`, migration 20250625142649_site_capabilities_to_config_step1.go backfilled `config` from `capabilities` for all non-deleted rows, and migration 20260325032528_site_capabilities_to_config_step2.go added `config` with `DEFAULT '{}'::jsonb` and dropped `capabilities`. Therefore, expressions like `config = config || ?::jsonb` in SQL UPDATE statements are safe without a COALESCE guard, and such suggestions should not be flagged as bugs.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: bcavnvidia
Repo: NVIDIA/ncx-infra-controller-rest PR: 300
File: db/pkg/db/model/instance.go:539-541
Timestamp: 2026-03-24T13:42:22.945Z
Learning: When reviewing this repo’s Go code under `db/pkg`, don’t require explicit nil checks before calling `(*TracerSpan).SetAttribute(...)` on a `TracerSpan` instance. In `db/pkg/tracer/tracer.go`, `SetAttribute(cspan *CurrentContextSpan, ...)` is nil-safe: it internally checks whether `cspan` is nil and returns immediately. Therefore, wrapping `SetAttribute(instanceDAOSpan, ...)` with `if instanceDAOSpan != nil` is redundant and should not be flagged as a bug.

Learnt from: parmani-nv
Repo: NVIDIA/ncx-infra-controller-rest PR: 372
File: api/pkg/api/handler/fabric.go:137-143
Timestamp: 2026-04-15T23:49:00.903Z
Learning: In this repository, `cdb.ErrDoesNotExist` (from `db/pkg/db`) is a DB-layer sentinel error that is never wrapped. Therefore, in Go code that calls DAO/DB methods (e.g., `TenantSiteDAO.GetByTenantIDAndSiteID`), direct equality checks like `err == cdb.ErrDoesNotExist` / `err != cdb.ErrDoesNotExist` are intentional and safe. Do not flag these comparisons as robustness issues and do not suggest replacing them with `errors.Is()` for this specific sentinel.

Comment thread sdk/standard/model_site_capabilities_update_request.go
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-nsm 64 2 20 33 9 0
nico-psm 56 4 29 13 2 8
nico-rest-api 57 4 30 13 2 8
nico-rest-cert-manager 54 4 28 13 1 8
nico-rest-db 55 4 28 13 2 8
nico-rest-site-agent 54 4 28 13 1 8
nico-rest-site-manager 54 4 28 13 1 8
nico-rest-workflow 56 4 29 13 2 8
nico-rla 55 4 28 13 2 8
TOTAL 505 34 248 137 22 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.


if apiRequest.Capabilities != nil {
siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
NativeNetworking: apiRequest.Capabilities.NativeNetworking,
Copy link
Copy Markdown
Contributor

@bcavnvidia bcavnvidia May 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we block a user from switching a VPC to FNN when FNN isn't enabled, do we still allow disabling FNN if there are already FNN VPCs?

Just curious. Still makes sense to me if this is just a gate against new things being created and we'd just document it as such. E.g., disabling FNN is a way to prevent new FNN VPCs for whatever reason the provider has.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can block them from disabling Native Networking if they have VPC Prefixes or FNN VPCs. Maybe we ask users that they delete those before disabling the flag?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we only allow them ETHER based to FNN but not other

@kfelternv
Copy link
Copy Markdown
Contributor

You should be able to update with the latest main, then regenerate the sdk files. I've added a script that should automatically add these headers when they are generated using the make command.

@thossain-nv I think this should knock down the diff significantly for this PR.

@thossain-nv
Copy link
Copy Markdown
Contributor Author

thossain-nv commented May 4, 2026

@kfelternv Is the comment you quoted from another PR? Nevermind I see what you're saying, just need to rebase.

if apiRequest.Capabilities != nil {
siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
NativeNetworking: apiRequest.Capabilities.NativeNetworking,
NetworkSecurityGroup: apiRequest.Capabilities.NetworkSecurityGroup,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a similar vein to the previous comments, we should probably prevent the following and ask them to delete the relevant records first

  • disabling NSG if there are already NSGs on the site
  • disabling NVLink if there are already NVLink partitions
  • disabling RLA if there are already racks, trays, expected switches, expected power shelves or expected racks (though that last one depends on this PR)
  • disabling image based operating systems if there are image-based OSes on the site

Copy link
Copy Markdown
Contributor Author

@thossain-nv thossain-nv May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling these flags only affect create/update operations, still allows deleting existing resource. We should allow the Provider to modify the flags to temporarily or permanently disable creation/update of these resources if they want.

For example: disabling network security group flag only prevents creation/update. Existing ones can still be used to create Instances. Similarly disabling image based flag prevents using image based OS creation or using them in for new Instances.

Disabling them and continuing with regular operations won't cause any issues, except for FNN - where we rely on VPC's virtualization type to make some decisions instead of checking Site capability. So we treat that as a special case.

In general I think we should allow users to make these changes even if there are existing objects. Most Providers will likely use this to enable certain flags.

@thossain-nv thossain-nv force-pushed the feat/site-cap-update branch from 37675cc to 2f5bea6 Compare May 6, 2026 00:43
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
api/pkg/api/handler/site_test.go (1)

642-679: ⚡ Quick win

Add a tenant-side negative case for Capabilities updates.

These additions cover the provider happy path and the FNN precondition, but they do not lock down the new provider-only authorization rule for Capabilities. If the handler accidentally accepts a tenant update carrying Capabilities, this suite would still go green because the existing tenant failures exercise different fields.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/site_test.go` around lines 642 - 679, Add a tenant-side
negative test case to ensure tenant users cannot update Capabilities: in the
test table in site_test.go add a new case similar to the provider "modifying
capabilities" case but set args.user/org to a tenant user/org (not provider),
provide reqData with Capabilities:
&model.APISiteCapabilitiesUpdateRequest{NativeNetworking:
cdb.GetBoolPtr(false)}, set csmEnabled accordingly (same as provider case), and
expect wantErr: true and verifyChildSpanner: false so the suite fails if tenant
updates are accepted; reference the
APISiteUpdateRequest/APISiteCapabilitiesUpdateRequest payload and the test case
fields/args struct to locate where to add it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/handler/site_test.go`:
- Around line 661-679: The test case that updates a Site with
NativeNetworking=false (site_test.go case using model.APISiteUpdateRequest and
APISiteCapabilitiesUpdateRequest, with csmEnabled true) currently only asserts a
generic error; change the assertions to expect the precise HTTP status (403) and
assert the response error string contains a stable FNN precondition message
(e.g., "cannot disable native networking: site has VPC(s) with FNN
virtualization") so the test checks both status and a deterministic substring
rather than any non-200; update the test's wantErr/wantStatus or equivalent
assertions to verify these exact values for the failing path.

In `@api/pkg/api/handler/site.go`:
- Around line 459-466: The code reads es.Config.NativeNetworking without
guarding es.Config; update the branch that handles apiRequest.Capabilities (when
apiRequest.Capabilities != nil and apiRequest.Capabilities.NativeNetworking !=
nil) to first ensure es.Config is non-nil (e.g., if es.Config == nil { return a
sensible validation error or treat missing Config as default config with
NativeNetworking=true/false }) before dereferencing es.Config.NativeNetworking;
modify the conditional that compares *apiRequest.Capabilities.NativeNetworking
with es.Config.NativeNetworking to handle es.Config == nil safely (use a local
var like currentNativeNetworking := false/true when es.Config==nil) and then
proceed to the existing VPC check using vpcDAO.GetAll as before.

---

Nitpick comments:
In `@api/pkg/api/handler/site_test.go`:
- Around line 642-679: Add a tenant-side negative test case to ensure tenant
users cannot update Capabilities: in the test table in site_test.go add a new
case similar to the provider "modifying capabilities" case but set args.user/org
to a tenant user/org (not provider), provide reqData with Capabilities:
&model.APISiteCapabilitiesUpdateRequest{NativeNetworking:
cdb.GetBoolPtr(false)}, set csmEnabled accordingly (same as provider case), and
expect wantErr: true and verifyChildSpanner: false so the suite fails if tenant
updates are accepted; reference the
APISiteUpdateRequest/APISiteCapabilitiesUpdateRequest payload and the test case
fields/args struct to locate where to add it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d83f60b1-0f6e-4c09-b21a-e6210c658611

📥 Commits

Reviewing files that changed from the base of the PR and between 37675cc and 2f5bea6.

📒 Files selected for processing (10)
  • api/pkg/api/handler/dpuextensionservice_test.go
  • api/pkg/api/handler/instancetype_test.go
  • api/pkg/api/handler/ipblock_test.go
  • api/pkg/api/handler/machineinstancetype_test.go
  • api/pkg/api/handler/site.go
  • api/pkg/api/handler/site_test.go
  • api/pkg/api/handler/util/common/testing.go
  • api/pkg/api/handler/vpcpeering_test.go
  • api/pkg/api/model/site.go
  • api/pkg/api/model/site_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/pkg/api/model/site.go

Comment on lines +661 to +679
{
name: "test Site update API endpoint failure when disabling Native Networking while Site has one or more VPCs with virtualization type set to FNN",
fields: fields{
dbSession: dbSession,
tc: &tmocks.Client{},
cfg: cfg,
},
args: args{
site: st6,
org: ipOrg,
user: ipu,
reqData: &model.APISiteUpdateRequest{
Capabilities: &model.APISiteCapabilitiesUpdateRequest{NativeNetworking: cdb.GetBoolPtr(false)},
},
},
csmEnabled: true,
wantErr: true,
verifyChildSpanner: false,
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert the exact validation failure for the FNN precondition.

This new case only proves “non-200”. A 403 or 500 would still pass, which leaves the new guard’s API contract under-specified. Please pin the expected status and a stable error substring here so regressions in the validation path do not slip through unnoticed.

Suggested tightening
 type tests := []struct {
   name               string
   fields             fields
   args               args
   wantErr            bool
+  wantRespCode       int
+  wantErrMsg         string
   query              url.Values
   wantStatus         *string
   siteMgrErr         bool
   csmEnabled         bool
   verifyTenantUpdate bool
   verifyChildSpanner bool
 }{
   {
     name: "test Site update API endpoint failure when disabling Native Networking while Site has one or more VPCs with virtualization type set to FNN",
     fields: fields{
       dbSession: dbSession,
       tc:        &tmocks.Client{},
       cfg:       cfg,
     },
     args: args{
       site: st6,
       org:  ipOrg,
       user: ipu,
       reqData: &model.APISiteUpdateRequest{
         Capabilities: &model.APISiteCapabilitiesUpdateRequest{NativeNetworking: cdb.GetBoolPtr(false)},
       },
     },
-    csmEnabled:         true,
-    wantErr:            true,
-    verifyChildSpanner: false,
+    csmEnabled:         true,
+    wantErr:            true,
+    wantRespCode:       http.StatusBadRequest,
+    wantErrMsg:         "Native Networking",
+    verifyChildSpanner: false,
   },
 }
 ...
 require.Equal(t, tt.wantErr, rec.Code != http.StatusOK)
+if tt.wantErr {
+  require.Equal(t, tt.wantRespCode, rec.Code)
+  if tt.wantErrMsg != "" {
+    assert.Contains(t, rec.Body.String(), tt.wantErrMsg)
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/site_test.go` around lines 661 - 679, The test case that
updates a Site with NativeNetworking=false (site_test.go case using
model.APISiteUpdateRequest and APISiteCapabilitiesUpdateRequest, with csmEnabled
true) currently only asserts a generic error; change the assertions to expect
the precise HTTP status (403) and assert the response error string contains a
stable FNN precondition message (e.g., "cannot disable native networking: site
has VPC(s) with FNN virtualization") so the test checks both status and a
deterministic substring rather than any non-200; update the test's
wantErr/wantStatus or equivalent assertions to verify these exact values for the
failing path.

Comment thread api/pkg/api/handler/site.go
Signed-off-by: Tareque Hossain <thossain@nvidia.com>
Signed-off-by: Tareque Hossain <thossain@nvidia.com>
@thossain-nv thossain-nv force-pushed the feat/site-cap-update branch from 2f5bea6 to 0ffa7f0 Compare May 6, 2026 01:00
Signed-off-by: Tareque Hossain <thossain@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
api/pkg/api/handler/site_test.go (2)

849-869: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

This row can fail before it reaches the Site Manager error path.

Line 860 renames st to "test-site-4", but st4 already owns that name, so the handler can legitimately return a conflict before it ever hits the mocked CSM failure branch. Use a unique name here, or avoid renaming in this case so the test actually covers the Site Manager error flow.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/site_test.go` around lines 849 - 869, The failing test
case sets the APISiteUpdateRequest.Name to "test-site-4" which conflicts with
existing st4 and causes a pre-check conflict before the Site Manager (mocked
CSM) error path; update the test case (the args.reqData in the table entry that
uses st and st4) to use a unique name not owned by any fixture (or omit the Name
field / set it to nil) so the request proceeds to the Site Manager flow and
triggers the mocked CSM failure path in the handler under test (refer to
APISiteUpdateRequest, st, st4, and the table-driven test entry).

813-829: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

This same name case is no longer exercising a no-op rename.

By the time this row runs, st has already been renamed in the earlier success case, so setting Name to "test-site" is a real rename back to the original value. Seed a fresh site for this row, or reset st before asserting the “same name” behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/site_test.go` around lines 813 - 829, The "same name"
test row is not a no-op because the shared variable st was mutated earlier;
update the test to seed a fresh site or reset st before this table row so the
APISiteUpdateRequest{Name: cdb.GetStrPtr("test-site")} truly matches the current
site's name. Specifically, in the table-driven tests around the site update
cases, ensure the entry that checks "test Site update success with same name"
either constructs a new site value (new st) or reassigns st to the original
name/state before using it in args.site and in the assertion, so the handler
function (being exercised via model.APISiteUpdateRequest and the test's
fields/dbSession) observes an unchanged-name update rather than a real rename.
♻️ Duplicate comments (1)
api/pkg/api/handler/site.go (1)

362-366: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Authorize the update by the successful site-specific path, not by raw role presence.

Validate(provider != nil, tenant != nil) and the later if provider != nil branch let any dual-role caller act as a provider updater even when only the tenant association matched this Site. That permits provider-only mutations on a Site owned by another provider, and tenant-only callers can still fall through with us == nil and panic at model.NewAPISite(*us, ...).

Track providerAssociated and tenantAssociated separately, reject the update unless the provider path succeeded for this Site, and add coverage for both a tenant-only empty PATCH and a dual-role service account whose tenant path succeeds while provider ownership does not.

Suggested fix
-	// Validate request attributes
-	verr := apiRequest.Validate(provider != nil, tenant != nil)
-	if verr != nil {
-		logger.Warn().Err(verr).Msg("error validating Site update request data")
-		return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Error validating Site update request data", verr)
-	}
-
 	// Check that Site exists
 	es, err := stDAO.GetByID(ctx, nil, siteID, nil, false)
 	if err != nil {
 		logger.Warn().Err(err).Msg("error retrieving Site DB entity")
 		return cutil.NewAPIErrorResponse(c, http.StatusNotFound, "Could not retrieve Site to update", nil)
 	}
 
-	isAssociated := false
+	providerAssociated := provider != nil && provider.ID == es.InfrastructureProviderID
+	tenantAssociated := false
+	isAssociated := providerAssociated
 
-	// Check that Site is associated with Provider
-	if provider != nil {
-		isAssociated = provider.ID == es.InfrastructureProviderID
-	}
-
 	tsDAO := cdbm.NewTenantSiteDAO(ush.dbSession)
 	var ts *cdbm.TenantSite
 
 	if !isAssociated && tenant != nil {
 		// Check if Tenant is associated with Site
@@
 		}
 		if len(tss) > 0 {
 			ts = &tss[0]
+			tenantAssociated = true
 			isAssociated = true
 		}
 	}
 
 	if !isAssociated {
 		logger.Warn().Msg("Site is not associated with org's Infrastructure Provider or Tenant")
 		return cutil.NewAPIErrorResponse(c, http.StatusForbidden, "Current org is not allowed to modify Site attributes", nil)
 	}
+
+	verr := apiRequest.Validate(providerAssociated, tenantAssociated)
+	if verr != nil {
+		logger.Warn().Err(verr).Msg("error validating Site update request data")
+		return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, "Error validating Site update request data", verr)
+	}
+
+	if !providerAssociated {
+		logger.Warn().Msg("only the owning provider may update Site attributes")
+		return cutil.NewAPIErrorResponse(c, http.StatusForbidden, "Only the owning Provider Admin may update Site attributes", nil)
+	}
@@
-	if provider != nil {
-		siteUpdateInput := cdbm.SiteUpdateInput{
-			SiteID:                        siteID,
-			Name:                          apiRequest.Name,
-			Description:                   apiRequest.Description,
-			RegistrationToken:             registrationToken,
-			RegistrationTokenExpiration:   registrationTokenExpires,
-			SerialConsoleHostname:         apiRequest.SerialConsoleHostname,
-			IsSerialConsoleEnabled:        apiRequest.IsSerialConsoleEnabled,
-			SerialConsoleIdleTimeout:      apiRequest.SerialConsoleIdleTimeout,
-			SerialConsoleMaxSessionLength: apiRequest.SerialConsoleMaxSessionLength,
-			Status:                        status,
-		}
-		if apiRequest.Location != nil {
-			siteUpdateInput.Location = &cdbm.SiteLocation{
-				City:    apiRequest.Location.City,
-				State:   apiRequest.Location.State,
-				Country: apiRequest.Location.Country,
-			}
-		}
-		if apiRequest.Contact != nil {
-			siteUpdateInput.Contact = &cdbm.SiteContact{
-				Email: apiRequest.Contact.Email,
-			}
-		}
-
-		if apiRequest.Capabilities != nil {
-			siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
-				NativeNetworking:          apiRequest.Capabilities.NativeNetworking,
-				NetworkSecurityGroup:      apiRequest.Capabilities.NetworkSecurityGroup,
-				NVLinkPartition:           apiRequest.Capabilities.NVLinkPartition,
-				RackLevelAdministration:   apiRequest.Capabilities.RackLevelAdministration,
-				ImageBasedOperatingSystem: apiRequest.Capabilities.ImageBasedOperatingSystem,
-			}
-		}
-
-		// Update Site
-		us, err = stDAO.Update(ctx, tx, siteUpdateInput)
-		if err != nil {
-			logger.Error().Err(err).Msg("error updating Site")
-			return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to update Site, DB error", nil)
-		}
+	siteUpdateInput := cdbm.SiteUpdateInput{
+		SiteID:                        siteID,
+		Name:                          apiRequest.Name,
+		Description:                   apiRequest.Description,
+		RegistrationToken:             registrationToken,
+		RegistrationTokenExpiration:   registrationTokenExpires,
+		SerialConsoleHostname:         apiRequest.SerialConsoleHostname,
+		IsSerialConsoleEnabled:        apiRequest.IsSerialConsoleEnabled,
+		SerialConsoleIdleTimeout:      apiRequest.SerialConsoleIdleTimeout,
+		SerialConsoleMaxSessionLength: apiRequest.SerialConsoleMaxSessionLength,
+		Status:                        status,
+	}
+	if apiRequest.Location != nil {
+		siteUpdateInput.Location = &cdbm.SiteLocation{
+			City:    apiRequest.Location.City,
+			State:   apiRequest.Location.State,
+			Country: apiRequest.Location.Country,
+		}
+	}
+	if apiRequest.Contact != nil {
+		siteUpdateInput.Contact = &cdbm.SiteContact{
+			Email: apiRequest.Contact.Email,
+		}
+	}
+	if apiRequest.Capabilities != nil {
+		siteUpdateInput.Config = &cdbm.SiteConfigUpdateInput{
+			NativeNetworking:          apiRequest.Capabilities.NativeNetworking,
+			NetworkSecurityGroup:      apiRequest.Capabilities.NetworkSecurityGroup,
+			NVLinkPartition:           apiRequest.Capabilities.NVLinkPartition,
+			RackLevelAdministration:   apiRequest.Capabilities.RackLevelAdministration,
+			ImageBasedOperatingSystem: apiRequest.Capabilities.ImageBasedOperatingSystem,
+		}
+	}
+
+	us, err = stDAO.Update(ctx, tx, siteUpdateInput)
+	if err != nil {
+		logger.Error().Err(err).Msg("error updating Site")
+		return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to update Site, DB error", nil)
 	}

Based on learnings: “RBAC should not short-circuit on the first matching role … Evaluate both authorization paths independently … and only reject if all applicable paths fail.”

Also applies to: 375-401, 492-562

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/handler/site.go` around lines 362 - 366, The validation currently
treats provider != nil/tenant != nil as authorization success; change
apiRequest.Validate usage to return two explicit booleans (e.g.,
providerAssociated, tenantAssociated) or compute those flags after validation,
then use those flags to gate provider-specific updates and the
model.NewAPISite(...) call: reject the request unless the provider path truly
succeeded when performing provider mutations, and ensure tenant-only empty
PATCHs are handled separately; update the authorization branches that reference
provider, tenant, and us (e.g., the code around apiRequest.Validate, the
subsequent if provider != nil branch, and the model.NewAPISite(...) invocation)
to check providerAssociated and tenantAssociated independently and add tests for
tenant-only empty PATCH and dual-role cases where tenant succeeds but provider
ownership fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@api/pkg/api/handler/site_test.go`:
- Around line 849-869: The failing test case sets the APISiteUpdateRequest.Name
to "test-site-4" which conflicts with existing st4 and causes a pre-check
conflict before the Site Manager (mocked CSM) error path; update the test case
(the args.reqData in the table entry that uses st and st4) to use a unique name
not owned by any fixture (or omit the Name field / set it to nil) so the request
proceeds to the Site Manager flow and triggers the mocked CSM failure path in
the handler under test (refer to APISiteUpdateRequest, st, st4, and the
table-driven test entry).
- Around line 813-829: The "same name" test row is not a no-op because the
shared variable st was mutated earlier; update the test to seed a fresh site or
reset st before this table row so the APISiteUpdateRequest{Name:
cdb.GetStrPtr("test-site")} truly matches the current site's name. Specifically,
in the table-driven tests around the site update cases, ensure the entry that
checks "test Site update success with same name" either constructs a new site
value (new st) or reassigns st to the original name/state before using it in
args.site and in the assertion, so the handler function (being exercised via
model.APISiteUpdateRequest and the test's fields/dbSession) observes an
unchanged-name update rather than a real rename.

---

Duplicate comments:
In `@api/pkg/api/handler/site.go`:
- Around line 362-366: The validation currently treats provider != nil/tenant !=
nil as authorization success; change apiRequest.Validate usage to return two
explicit booleans (e.g., providerAssociated, tenantAssociated) or compute those
flags after validation, then use those flags to gate provider-specific updates
and the model.NewAPISite(...) call: reject the request unless the provider path
truly succeeded when performing provider mutations, and ensure tenant-only empty
PATCHs are handled separately; update the authorization branches that reference
provider, tenant, and us (e.g., the code around apiRequest.Validate, the
subsequent if provider != nil branch, and the model.NewAPISite(...) invocation)
to check providerAssociated and tenantAssociated independently and add tests for
tenant-only empty PATCH and dual-role cases where tenant succeeds but provider
ownership fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5f42eb0f-6b6b-403c-aa57-e9bd1ebb5093

📥 Commits

Reviewing files that changed from the base of the PR and between 2f5bea6 and df31ef9.

📒 Files selected for processing (17)
  • api/pkg/api/handler/dpuextensionservice_test.go
  • api/pkg/api/handler/instancetype_test.go
  • api/pkg/api/handler/ipblock_test.go
  • api/pkg/api/handler/machineinstancetype_test.go
  • api/pkg/api/handler/site.go
  • api/pkg/api/handler/site_test.go
  • api/pkg/api/handler/util/common/testing.go
  • api/pkg/api/handler/vpcpeering_test.go
  • api/pkg/api/model/site.go
  • api/pkg/api/model/site_test.go
  • db/pkg/db/model/site.go
  • db/pkg/db/model/site_test.go
  • docs/index.html
  • openapi/spec.yaml
  • sdk/standard/api_site.go
  • sdk/standard/model_site_capabilities_update_request.go
  • sdk/standard/model_site_update_request.go
✅ Files skipped from review due to trivial changes (1)
  • sdk/standard/api_site.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • db/pkg/db/model/site_test.go
  • openapi/spec.yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants