Add AWX automation platform integration #82

Open
maryamtahhan wants to merge 2 commits into redhat-et:main from maryamtahhan:awx-automation-integration

@maryamtahhan
Collaborator

Summary

This PR adds AWX support (AWX is the upstream project for the automation controller in Red Hat Ansible Automation Platform) for orchestrating and managing vLLM performance tests through a web UI.

⚠️ This PR depends on #81 - Please merge #81 first to avoid conflicts.

Key Features

AWX Deployment Infrastructure

  • KIND-based AWX deployment for local development and testing
  • Automated AWX operator installation with proper RBAC configuration
  • Hybrid node configuration to prevent 500 errors
  • CoreDNS auto-configuration for host DNS resolution on Linux
  • Comprehensive Makefile for deployment, configuration, and cleanup

Custom Execution Environment

  • Build custom execution environment with pre-installed Ansible collections
  • Python 3.9 base from official AWX-EE image
  • Security hardening: GPG verification for collections, SHA256 for Python
  • Build scripts with proper error handling and path traversal protection
  • Support for both Podman and Docker builds
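
To illustrate what "path traversal protection" usually means, here is a minimal Python sketch. It is not taken from the build scripts in this PR (those are shell scripts under context/_build/scripts/); the function name `resolve_inside` and the example directory are assumptions made for illustration. The idea is to resolve a caller-supplied path and refuse it when it ends up outside the intended base directory.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it.

    Hypothetical helper for illustration; not part of this PR.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # Path.is_relative_to needs Python 3.9 or newer.
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate


if __name__ == "__main__":
    print(resolve_inside("/tmp/ee-build", "scripts/assemble"))
    try:
        resolve_inside("/tmp/ee-build", "../../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as `scripts/../../etc/passwd`.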

AWX Configuration Automation

  • Automatic project, inventory, and credential setup
  • Auto-import all playbooks as job templates with sensible defaults
  • SSH credential management with no_log protection to keep secrets out of job output
  • HuggingFace token credential auto-creation
  • Execution environment auto-detection for Podman/Docker/KIND
  • Retry logic for reliability during AWX startup
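
The retry behavior mentioned in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the PR implements its retries inside the configure-awx.yml playbook (presumably with Ansible's `retries`, `delay`, and `until` task keywords), and the names `wait_until_ready` and `probe`, as well as the default counts, are hypothetical.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    retries: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it returns True; return the number of attempts used.

    Hypothetical helper for illustration; defaults are assumptions.
    """
    for attempt in range(1, retries + 1):
        try:
            if probe():
                return attempt
        except OSError:
            # Connection refused or reset while the service is still starting.
            pass
        if attempt < retries:
            sleep(delay)
    raise TimeoutError(f"service not ready after {retries} attempts")
```

In practice the probe could be an HTTP GET against AWX's `/api/v2/ping/` endpoint that returns True on a 200 response. Passing `sleep` as a parameter keeps the helper testable without real waiting.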

Job Template Features

  • Pre-configured templates for all benchmark playbooks
  • Inline documentation and examples in extra_vars
  • Support for concurrent load testing with all three phases
  • Core sweep automation
  • Auto-configured variables with YAML validation

CI/CD Integration

  • GitHub Actions workflow for execution environment builds
  • Build on PRs, push to registry only on merge to main
  • Security: prevent non-main branches from overwriting latest tag
  • Optional Trivy security scanning
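
The push policy described above (build on pull requests, push only from main, never let another branch overwrite latest) amounts to a small decision rule. The Python sketch below models that rule; it is not the workflow's actual logic, which lives in .github/workflows/build-execution-environment.yml. The function name `tags_to_push` and the extra short-SHA tag are assumptions; the event names and ref format follow GitHub Actions conventions.

```python
def tags_to_push(event: str, ref: str, sha: str) -> list[str]:
    """Return the image tags a CI run may push; an empty list means build only.

    Hypothetical model of the policy; the short-SHA tag is an assumption.
    """
    if event != "push":
        # Pull requests build the image for validation but never push it.
        return []
    if ref == "refs/heads/main":
        # Only main may move the "latest" tag.
        return ["latest", sha[:7]]
    # Pushes to any other branch must not touch the registry.
    return []


if __name__ == "__main__":
    print(tags_to_push("pull_request", "refs/pull/82/merge", "23e35d8"))
    print(tags_to_push("push", "refs/heads/main", "23e35d8"))
```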

Files Added

AWX Infrastructure

  • automation/test-execution/awx/Makefile - Deployment and management automation
  • automation/test-execution/awx/README.md - Comprehensive documentation
  • automation/test-execution/awx/configure-awx.yml - Auto-configuration playbook
  • automation/test-execution/awx/kind-cluster.yaml - KIND cluster configuration
  • automation/test-execution/awx/awx-instance.yaml - AWX instance definition

Execution Environment

  • automation/test-execution/awx/execution-environment.yml - EE definition
  • context/Containerfile - Custom EE build definition
  • context/_build/scripts/* - Build scripts (assemble, entrypoint, introspect.py, etc.)
  • context/_build/requirements.yml - Build requirements

CI/CD

  • .github/workflows/build-execution-environment.yml - EE build workflow
  • .github/workflows/unit-tests.yml - Updated test paths

Test Plan

  • Deploy AWX using make deploy-awx
  • Verify execution environment builds successfully
  • Confirm auto-configuration creates all resources
  • Run job templates from AWX UI
  • Test concurrent load testing through AWX
  • Verify CI workflow builds EE on PR

Merge Order

This PR is part 2 of a 2-part refactoring of PR #53:

  1. First: Merge PR #81, "Improve Ansible playbooks with AWX compatibility and security hardening" (Ansible playbook improvements)
  2. Second: Merge this PR (AWX automation integration)
  3. Finally: Close PR #53, "Add AWX frontend to running ansible playbooks"

⚠️ Important: Merging this PR before #81 will create conflicts. Please merge #81 first.

🤖 Generated with Claude Code

@coderabbitai

coderabbitai bot commented Mar 30, 2026

Warning

Rate limit exceeded

@maryamtahhan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 38 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 10 minutes and 38 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 87c15411-574e-4b85-ae60-2ae1a32d2cb8

📥 Commits

Reviewing files that changed from the base of the PR and between c438918 and 23e35d8.

📒 Files selected for processing (31)
  • .github/workflows/build-execution-environment.yml
  • .github/workflows/unit-tests.yml
  • automation/test-execution/ansible/ansible.md
  • automation/test-execution/ansible/filter_plugins/cpu_utils.py
  • automation/test-execution/ansible/llm-benchmark-auto.yml
  • automation/test-execution/ansible/llm-benchmark-concurrent-load.yml
  • automation/test-execution/ansible/llm-benchmark.yml
  • automation/test-execution/ansible/llm-core-sweep-auto.yml
  • automation/test-execution/ansible/roles/benchmark_guidellm/tasks/main.yml
  • automation/test-execution/ansible/roles/common/tasks/allocate-cores-from-count.yml
  • automation/test-execution/ansible/roles/common/tasks/detect-numa-topology.yml
  • automation/test-execution/ansible/roles/vllm_server/tasks/start-embedding.yml
  • automation/test-execution/ansible/roles/vllm_server/tasks/start-llm.yml
  • automation/test-execution/ansible/tests/unit/test_cpu_utils.py
  • automation/test-execution/awx/Makefile
  • automation/test-execution/awx/README.md
  • automation/test-execution/awx/awx-instance.yaml
  • automation/test-execution/awx/configure-awx.yml
  • automation/test-execution/awx/execution-environment.yml
  • automation/test-execution/awx/kind-cluster.yaml
  • automation/test-execution/awx/requirements.yml
  • collections/requirements.yml
  • context/Containerfile
  • context/_build/requirements.yml
  • context/_build/scripts/assemble
  • context/_build/scripts/check_ansible
  • context/_build/scripts/check_galaxy
  • context/_build/scripts/entrypoint
  • context/_build/scripts/install-from-bindep
  • context/_build/scripts/introspect.py
  • context/_build/scripts/pip_install

maryamtahhan and others added 2 commits March 30, 2026 12:55
…rdening

This commit improves the existing Ansible playbook infrastructure for vLLM
CPU performance evaluation with enhanced AWX compatibility, security hardening,
and comprehensive testing.

- Fix type normalization to handle AnsibleUnsafeText from AWX
- Fix allocated_nodes to return integers instead of strings
- Handle empty strings and Jinja2 None conversions properly
- Simplify node eligibility checking and allocation logic
- Improve error messages for better validation feedback
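
As a rough illustration of the normalization described in the list above, the Python sketch below coerces templated values to plain integers. It assumes that values passed from AWX arrive as a str subclass (as AnsibleUnsafeText has been) and that Jinja2 may render None as the string "None". The names `to_int_or_none` and `UnsafeTextStandIn` are hypothetical and are not the functions in cpu_utils.py.

```python
from typing import Optional


class UnsafeTextStandIn(str):
    """Hypothetical stand-in for a str subclass such as AnsibleUnsafeText."""


def to_int_or_none(value) -> Optional[int]:
    """Normalize a templated value to a plain int, or None when it is empty."""
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is an int subclass; treat it as a caller error, not as 0 or 1.
        raise TypeError("expected a number, got bool")
    if isinstance(value, int):
        return int(value)
    text = str(value).strip()
    # Jinja2 renders Python None as the string "None".
    if text in ("", "None"):
        return None
    return int(text)


if __name__ == "__main__":
    nodes = [UnsafeTextStandIn("0"), "1", 2]
    print([to_int_or_none(n) for n in nodes])
```

Returning plain `int` values rather than strings keeps later comparisons and arithmetic on node ids predictable.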

- Add no_log: true to all tasks handling HF_TOKEN
- Prevent token exposure in container start operations
- Secure environment variable handling in AWX jobs

- Add comprehensive unit tests for cpu_utils filter plugin (598 lines)
- Test coverage for: CPU range conversion, NUMA extraction, multi-NUMA
  allocation, OMP binding, and real-world scenarios
- Support for both pytest and standalone execution
- Add collections/requirements.yml for Ansible collection dependencies
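
For readers unfamiliar with the "CPU range conversion" covered by these tests, here is a minimal Python sketch of the general technique: expanding a Linux-style CPU list into ids and collapsing ids back into ranges. The function names `parse_cpu_list` and `format_cpu_list` are assumptions and may not match the filters in cpu_utils.py.

```python
def parse_cpu_list(spec: str) -> list[int]:
    """Expand a CPU list such as "0-3,8" into sorted, de-duplicated CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if end < start:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(start, end + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def format_cpu_list(cpus: list[int]) -> str:
    """Collapse CPU ids back into the compact range form."""
    runs: list[list[int]] = []
    for cpu in sorted(set(cpus)):
        if runs and cpu == runs[-1][1] + 1:
            runs[-1][1] = cpu  # extend the current run
        else:
            runs.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


if __name__ == "__main__":
    ids = parse_cpu_list("0-3,8,10-11")
    print(ids)
    print(format_cpu_list(ids))
```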

- Better parameter validation for AWX jobs in concurrent load testing
- AWX detection for result path handling
- Improved NUMA topology detection in core sweep
- Enhanced result path consistency in main benchmark
- Better workload configuration handling

- Simplify prerequisites section
- Update examples with current best practices
- Clearer workflow documentation

Files changed: 13 files, 772 insertions(+), 323 deletions(-)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>

This commit adds AWX support (AWX is the upstream project for the automation
controller in Red Hat Ansible Automation Platform) for orchestrating and
managing vLLM performance tests through a web UI.

- KIND-based AWX deployment for local development and testing
- Automated AWX operator installation with proper RBAC configuration
- Hybrid node configuration to prevent 500 errors
- CoreDNS auto-configuration for host DNS resolution on Linux
- Comprehensive Makefile for deployment, configuration, and cleanup

- Build custom execution environment with pre-installed Ansible collections
- Python 3.9 base from official AWX-EE image
- Security hardening: GPG verification for collections, SHA256 for Python
- Build scripts with proper error handling and path traversal protection
- Support for both Podman and Docker builds
- CI workflow for building and pushing EE images to container registry

- Automatic project, inventory, and credential setup
- Auto-import all playbooks as job templates with sensible defaults
- SSH credential management with security (no_log protection)
- HuggingFace token credential auto-creation
- Execution environment auto-detection for Podman/Docker/KIND
- Retry logic for reliability during AWX startup
- Non-blocking configuration for KIND DNS limitations

- Pre-configured templates for all benchmark playbooks
- Inline documentation and examples in extra_vars
- Support for concurrent load testing with all three phases
- Core sweep automation
- Auto-configured variables with YAML validation

- Comprehensive README with deployment, configuration, and usage guides
- Troubleshooting documentation for common issues
- Support for macOS and Linux
- Environment variable overrides for project/branch/credentials
- Status display and verification commands

- GitHub Actions workflow for execution environment builds
- Build on PRs, push to registry only on merge to main
- Security: prevent non-main branches from overwriting latest tag
- Optional Trivy security scanning
- Unit test workflow updated for new test location

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
maryamtahhan force-pushed the awx-automation-integration branch from bfda5b2 to 23e35d8 on March 30, 2026 11:56
@maryamtahhan
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai bot commented Mar 30, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
