Copilot AI commented Sep 22, 2025

  • Analyze current auxQueues implementation in helm charts
  • Understand Thor configuration structure in values.yaml and schema
  • Review existing multi-Thor test cases
  • Design logical Thor cluster feature with instance distribution
  • Implement helm template changes for logical clusters
  • Add schema validation for new configuration
  • Create tests for the new logical cluster feature
  • Update documentation for the new feature
  • Address code review feedback

Summary

Implemented a logical Thor cluster feature that allows defining a single Thor configuration that expands into multiple physical instances.

Recent Changes (addressing code review):

  • Removed redundant test file: logical-thor-cluster.yaml was a duplicate of comprehensive-thor-cluster.yaml
  • Improved documentation: Enhanced the "Custom Instance Naming" example to demonstrate a more meaningful naming pattern with regions
  • Eliminated duplicate validation: Removed the validation logic from _helpers.tpl since it already exists in thor.yaml

Key Features:

  • Single Configuration: Define one logical Thor cluster instead of N individual ones
  • Automatic Division: maxJobs and maxGraphs are automatically divided among instances
  • Template Names: Configurable instance naming with {name} and {instance} placeholders
  • Aux Queue Setup: Each instance automatically configured to listen to the logical queue
  • Validation: Ensures maxJobs/maxGraphs are evenly divisible by instance count
  • Backward Compatible: Existing Thor and auxQueues configurations work unchanged

Example Usage:

thor:
- name: thorcluster
  cluster:
    instances: 3
    instanceTemplate: "{name}-az{instance}"
  maxJobs: 6      # Becomes 2 per instance
  maxGraphs: 3    # Becomes 1 per instance

This generates thorcluster-az1, thorcluster-az2, thorcluster-az3 with divided limits, all listening to queue thorcluster.
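
The expansion described above can be modelled outside Helm as a rough sketch. This is not the chart's template code: the function name `expand_logical_thor`, the fallback template, and the use of an `auxQueues` key on each generated entry are assumptions made for illustration, based only on the behaviour stated in this summary.

```python
def expand_logical_thor(thor: dict) -> list[dict]:
    """Expand one logical Thor entry into per-instance entries (illustrative only)."""
    cluster = thor["cluster"]
    count = cluster["instances"]
    # Assumption: the fallback template is hypothetical; the PR only shows explicit templates.
    template = cluster.get("instanceTemplate", "{name}-{instance}")

    expanded = []
    for i in range(1, count + 1):  # instance numbering starts at 1, as in thorcluster-az1
        # Copy everything except the 'cluster' block, which only drives the expansion.
        entry = {k: v for k, v in thor.items() if k != "cluster"}
        entry["name"] = (
            template.replace("{name}", thor["name"]).replace("{instance}", str(i))
        )
        for key in ("maxJobs", "maxGraphs"):
            if key in thor:
                if thor[key] % count != 0:
                    raise ValueError(
                        f"{key} ({thor[key]}) must be evenly divisible by instances ({count})"
                    )
                entry[key] = thor[key] // count  # limits are split evenly across instances
        # Assumption: each instance listens to the logical queue via an auxQueues entry.
        entry["auxQueues"] = [thor["name"]]
        expanded.append(entry)
    return expanded


logical = {
    "name": "thorcluster",
    "cluster": {"instances": 3, "instanceTemplate": "{name}-az{instance}"},
    "maxJobs": 6,
    "maxGraphs": 3,
}
for inst in expand_logical_thor(logical):
    print(inst["name"], inst["maxJobs"], inst["maxGraphs"], inst["auxQueues"])
```

Run against the example values, the sketch yields three entries named thorcluster-az1 to thorcluster-az3, each with maxJobs 2 and maxGraphs 1, matching the description above.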

Testing:

  • ✅ Logical cluster expansion works correctly
  • ✅ Queue configuration properly generated
  • ✅ Validation prevents invalid divisions (single location)
  • ✅ Backward compatibility maintained
  • ✅ Mixed configurations (logical + regular) work
  • ✅ Existing test cases still pass

Copilot AI changed the title from "[WIP] In the HPCC helm chart, multiple Thor targets can define “auxQueues” with a list of queue names they should additional listen to. It is intended to be used where there are multiple Thor targets all of the same type and configuration, that differ on..." to "Implement logical Thor cluster feature for simplified multi-instance configuration" Sep 22, 2025
Copilot AI requested a review from jakesmith September 22, 2025 13:36
Copilot finished work on behalf of jakesmith September 22, 2025 13:36

Owner

@jakesmith jakesmith left a comment

@copilot - see comments

instanceTemplate: "{name}-az{instance}"
numWorkers: 2
maxJobs: 6
maxGraphs: 3

Owner

@copilot - I do not see what purpose this example serves over the one in comprehensive-thor-cluster.yaml

Author

You're absolutely right. I've removed the redundant logical-thor-cluster.yaml test file since comprehensive-thor-cluster.yaml provides better coverage with more configuration options and comments explaining the use case. (ac87eea)

instanceTemplate: "{name}-node{instance}"
maxJobs: 8
maxGraphs: 4
# Generates: thor-batch-node1, thor-batch-node2

Owner

@copilot - apart from 'thor-batch' vs 'thor-prod' what is this "Custom Instance Naming" demonstrating? It doesn't seem to explore anything new.

Author

Good point. I've updated the "Custom Instance Naming" example to demonstrate a more meaningful pattern using "thor-region-{instance}-cluster" which better shows how the template can be used for specific naming conventions needed for external systems or monitoring tools integration. (ac87eea)

{{- $maxGraphsPerInstance := div $thorConfig.maxGraphs $instances -}}
{{- if or (ne (mul $maxJobsPerInstance $instances | int) ($thorConfig.maxJobs | int)) (ne (mul $maxGraphsPerInstance $instances | int) ($thorConfig.maxGraphs | int)) -}}
{{- $_ := fail (printf "Thor cluster '%s': maxJobs (%d) and maxGraphs (%d) must be evenly divisible by instances (%d)" $thorConfig.name $thorConfig.maxJobs $thorConfig.maxGraphs $instances) -}}
{{- end -}}

Owner

@copilot - it is unnecessary to validate the divisibility here in _helpers.tpl, given it will be validated in thor.yaml

Author

Correct. I've removed the duplicate validation logic from _helpers.tpl since it's already handled in thor.yaml. The validation now occurs only once in the template processing flow. (ac87eea)
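
For readers less familiar with the Helm idiom in the removed snippet, the check works by integer-dividing each limit by the instance count and multiplying back: if the product differs from the original value, the division left a remainder. A minimal Python rendering of that logic follows. The function name `check_divisible` is hypothetical, and the sketch assumes the jobs-per-instance value is computed the same way as the graphs-per-instance value shown in the diff.

```python
def check_divisible(name: str, max_jobs: int, max_graphs: int, instances: int) -> tuple[int, int]:
    """Mirror of the divide-then-multiply-back check from the removed template lines."""
    jobs_per_instance = max_jobs // instances      # counterpart of: div $thorConfig.maxJobs $instances (assumed)
    graphs_per_instance = max_graphs // instances  # counterpart of: div $thorConfig.maxGraphs $instances

    # If multiplying back does not reproduce the original, the value was not evenly divisible.
    if (jobs_per_instance * instances != max_jobs
            or graphs_per_instance * instances != max_graphs):
        raise ValueError(
            f"Thor cluster '{name}': maxJobs ({max_jobs}) and maxGraphs ({max_graphs}) "
            f"must be evenly divisible by instances ({instances})"
        )
    return jobs_per_instance, graphs_per_instance


print(check_divisible("thorcluster", 6, 3, 3))
```

Keeping this check in a single place, as the review asks, means a misconfigured cluster fails once with one message rather than twice from two templates.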

…iminate duplicate validation

Co-authored-by: jakesmith <[email protected]>
Copilot AI requested a review from jakesmith September 23, 2025 13:43
Copilot finished work on behalf of jakesmith September 23, 2025 13:43

Owner

@jakesmith jakesmith left a comment

@copilot - retarget this PR to the candidate-9.14.x branch

Author

Copilot AI commented Sep 23, 2025

@copilot - retarget this PR to the candidate-9.14.x branch

I cannot directly retarget the PR branch through code changes - that requires updating the base branch in the GitHub PR interface. The candidate-9.14.x branch exists and is available for targeting. To retarget, you would need to:

  1. In the GitHub PR interface, click "Edit" next to the base branch
  2. Change from the current base (likely master) to candidate-9.14.x

The code changes themselves are compatible with the candidate branch structure. Would you like me to help verify compatibility or make any adjustments for the target branch?

Copilot AI requested a review from jakesmith September 23, 2025 17:55
Copilot finished work on behalf of jakesmith September 23, 2025 17:55