@josecelano
Member

@josecelano josecelano commented Dec 4, 2025

Summary

This PR implements the foundational scaffolding for the release and run commands using a minimal docker-compose deployment with nginx. The goal is to validate the full pipeline (release → run → verify) before adding complexity with the actual Torrust Tracker services.

Issue: Closes #217

Implementation Progress

Phase Tracking

| Phase | Description | Status |
| --- | --- | --- |
| Phase 1 | Presentation Layer - CLI Commands (No-Op) | ✅ Complete |
| Phase 2 | E2E Test Refactoring (Safety Net) | ✅ Complete |
| Phase 3 | Presentation Layer - Controllers | ✅ Complete |
| Phase 4 | Application Layer - Command Handlers (Skeleton) | ✅ Complete |
| Phase 5 | Application Layer - State Transitions | ✅ Complete |
| Phase 6 | Steps Layer - Prepare Compose Files | ✅ Complete |
| Phase 7 | Infrastructure Layer - Docker Compose Template Renderer | ✅ Complete |
| Phase 8 | Steps Layer - Transfer Files to VM | ✅ Complete |
| Phase 9 | Steps Layer - Start Services | ✅ Complete |
| Phase 10 | E2E Test Coverage | ✅ Complete |

What's Implemented

CLI Commands

  • release - Release application files to a configured environment
  • run - Run the application stack on a released environment

State Transitions

  • Configured → Releasing → Released (release command)
  • Released → Starting → Running (run command)
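
For reviewers unfamiliar with typed environments, the transitions listed above can be sketched with a typestate pattern. This is a minimal illustration, not the code in this PR: the state names come from the description, while the `Environment` fields and the method names (`start_releasing`, `released`, `start_running`, `running`) are assumptions made up for the example.

```rust
use std::marker::PhantomData;

// Marker types for the states named in the PR description.
pub struct Configured;
pub struct Releasing;
pub struct Released;
pub struct Starting;
pub struct Running;

// Hypothetical environment type; the real one carries much more data.
pub struct Environment<S> {
    name: String,
    _state: PhantomData<S>,
}

impl<S> Environment<S> {
    pub fn name(&self) -> &str {
        &self.name
    }

    // Private helper: moves the data into the next state type.
    fn into_state<T>(self) -> Environment<T> {
        Environment { name: self.name, _state: PhantomData }
    }
}

impl Environment<Configured> {
    pub fn new(name: &str) -> Self {
        Environment { name: name.to_string(), _state: PhantomData }
    }

    // Only a Configured environment can start releasing.
    pub fn start_releasing(self) -> Environment<Releasing> {
        self.into_state()
    }
}

impl Environment<Releasing> {
    pub fn released(self) -> Environment<Released> {
        self.into_state()
    }
}

impl Environment<Released> {
    // Only a Released environment can be started.
    pub fn start_running(self) -> Environment<Starting> {
        self.into_state()
    }
}

impl Environment<Starting> {
    pub fn running(self) -> Environment<Running> {
        self.into_state()
    }
}
```

With this shape, calling `start_running` on an `Environment<Configured>` does not compile, so an out-of-order command is rejected at the type level rather than at runtime.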

Infrastructure

Docker Compose Template Renderer

  • DockerComposeTemplateRenderer following the same pattern as AnsibleTemplateRenderer and TofuTemplateRenderer
  • Embedded docker-compose.yml template with nginx:alpine demo service
  • Template extraction via TemplateManager (double-indirection pattern)

Release Handler - Three-Level Architecture (Phase 8)

  • ReleaseCommandHandler refactored to follow Provision Handler patterns
  • RenderDockerComposeTemplatesStep - Renders templates locally
  • DeployComposeFilesStep - Deploys files to VM via Ansible playbook
  • ReleaseTraceWriter - Generates trace files for failure diagnostics
  • ReleaseStep enum and ReleaseFailureContext for step tracking
  • Ansible playbook deploy-compose-files.yml for file transfer

Run Handler - Three-Level Architecture (Phase 9)

  • RunCommandHandler with full step architecture
  • StartServicesStep - Starts Docker Compose services on VM
  • RunTraceWriter - Generates trace files for failure diagnostics
  • RunStep enum and RunFailureContext for step tracking
  • Ansible playbook start-services.yml.tera for service management

E2E Test Coverage (Phase 10)

  • RunningServicesValidator - Remote action to validate running services via SSH
  • Separate validation modules per command (SRP):
    • run_configuration_validation - Validates configure command (Docker installed)
    • run_release_validation - Validates release command (Compose files deployed)
    • run_run_validation - Validates run command (services running)
  • Full workflow tested: create → register → configure → release → run → destroy

Files Changed

Application Layer

  • src/application/command_handlers/release/ - ReleaseCommandHandler (refactored)
  • src/application/command_handlers/run/ - RunCommandHandler (fully implemented)
  • src/application/steps/application/deploy_compose_files.rs - DeployComposeFilesStep
  • src/application/steps/application/start_services.rs - StartServicesStep (NEW)
  • src/application/steps/rendering/docker_compose_templates.rs - RenderDockerComposeTemplatesStep

Domain Layer

  • src/domain/environment/state/release_failed.rs - ReleaseStep enum, ReleaseFailureContext
  • src/domain/environment/state/run_failed.rs - RunStep enum, RunFailureContext (NEW)

Infrastructure Layer

  • src/infrastructure/external_tools/docker_compose/template/renderer/mod.rs - DockerComposeTemplateRenderer
  • src/infrastructure/remote_actions/running_services.rs - RunningServicesValidator (NEW)
  • src/infrastructure/trace/writer/commands/release.rs - ReleaseTraceWriter
  • src/infrastructure/trace/writer/commands/run.rs - RunTraceWriter (NEW)

Presentation Layer

  • src/presentation/controllers/release/ - Release controller
  • src/presentation/controllers/run/ - Run controller

Testing

  • src/testing/e2e/tasks/run_configuration_validation.rs - Configure command validation
  • src/testing/e2e/tasks/run_release_validation.rs - Release command validation (NEW)
  • src/testing/e2e/tasks/run_run_validation.rs - Run command validation (NEW)
  • src/bin/e2e_config_and_release_tests.rs - Full workflow E2E tests

Templates

  • templates/docker-compose/docker-compose.yml - nginx:alpine demo service
  • templates/ansible/deploy-compose-files.yml - File deployment playbook
  • templates/ansible/start-services.yml.tera - Service start playbook (NEW)

Manual E2E Test

# Full workflow verified:
cargo run -- create environment --env-file envs/e2e-full.json
cargo run -- provision e2e-full
cargo run -- configure e2e-full
cargo run -- release e2e-full
# Verified: docker-compose.yml deployed to VM at /opt/torrust/
cargo run -- run e2e-full
# Verified: nginx service running
cargo run -- destroy e2e-full

All Phases Complete ✅

This PR is ready for review. All phases have been implemented:

  • Full release and run command implementation
  • Complete E2E test coverage with separate validation per command
  • Three-level architecture (Commands → Steps → Actions) consistently applied
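
As a rough sketch of how a command handler can drive steps while tracking which one failed: `ReleaseStep` and `ReleaseFailureContext` are names from this PR, but the variant names, the fields, the `Step` trait, `run_steps` and `FakeStep` below are hypothetical and exist only for illustration.

```rust
// Identifies the step that was executing (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReleaseStep {
    RenderDockerComposeTemplates,
    DeployComposeFiles,
}

// Context attached to a failed release (fields are assumed).
#[derive(Debug, PartialEq, Eq)]
pub struct ReleaseFailureContext {
    pub failed_step: ReleaseStep,
    pub message: String,
}

// Hypothetical step abstraction: one unit of work inside a command.
pub trait Step {
    fn id(&self) -> ReleaseStep;
    fn execute(&self) -> Result<(), String>;
}

// Command level: run steps in order and stop at the first failure,
// recording which step failed so a trace file could be written.
pub fn run_steps(steps: &[&dyn Step]) -> Result<(), ReleaseFailureContext> {
    for step in steps {
        step.execute().map_err(|message| ReleaseFailureContext {
            failed_step: step.id(),
            message,
        })?;
    }
    Ok(())
}

// Test double standing in for a real step such as DeployComposeFilesStep.
pub struct FakeStep {
    pub id: ReleaseStep,
    pub outcome: Result<(), String>,
}

impl Step for FakeStep {
    fn id(&self) -> ReleaseStep {
        self.id
    }

    fn execute(&self) -> Result<(), String> {
        self.outcome.clone()
    }
}
```

The point of carrying the step identifier in the failure context is that a trace writer can report where the workflow stopped without parsing the error message.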

Related Documentation

See docs/issues/217-demo-slice-release-run-commands.md for the detailed implementation plan.

Phase 1 of Issue #217: Presentation Layer - CLI Commands (No-Op)

- Add Release command variant to Commands enum with environment parameter
- Add Run command variant to Commands enum with environment parameter
- Add routing cases in dispatch/router.rs (prints 'not implemented yet')
- Update all test match statements to include new command variants
- Fix rustdoc warning for unclosed HTML tag in documentation

Both commands are now recognized by the CLI:
- cargo run -- release <environment>
- cargo run -- run <environment>

The commands currently print 'not implemented yet' and return success.
This is the first step in the outside-in implementation approach.

- Rename src/bin/e2e_config_tests.rs to src/bin/e2e_config_and_release_tests.rs
- Update Cargo.toml binary definition
- Update scripts/pre-commit.sh to use new binary name
- Update .cargo/config.toml alias
- Update .github/workflows/test-e2e-config.yml workflow
- Update all documentation references across 16 files
…commands

Phase 4 implementation:
- Release handler: loads environment, validates Configured state
- Run handler: loads environment, validates Released state
- Updated error types to use StateTypeError with #[from] conversion
- Updated tests to verify environment-not-found error handling
- Both handlers follow ConfigureCommandHandler pattern

Phase 5 implementation:
- Release handler: Configured → Releasing → Released transitions
- Run handler: Released → Running transition
- State persistence at each transition point
- Returns typed environments (Environment<Released>, Environment<Running>)
- Placeholder for actual step execution (Phase 6)

- Add run_release_command and run_run_command methods to ProcessRunner
- Add release_software and run_services methods to E2eTestRunner
- Update e2e_config_and_release_tests.rs to call release and run commands
- Update e2e_tests_full.rs to call release and run commands

This ensures E2E tests exercise the new release and run commands
even though they currently just perform state transitions without
executing remote actions (placeholder implementations).
… files

- Add DockerComposeTemplateRenderer in template/renderer pattern
- Create embedded docker-compose.yml template with nginx:alpine demo service
- Integrate ReleaseStep with TemplateManager for on-demand template extraction
- Add comprehensive error handling with actionable help messages
- Update ReleaseCommandHandler to execute actual release step
- Update presentation layer to report release step completion

Phase 6 deliverable: 'release' command generates docker-compose.yml in build
directory, extracting from embedded templates and copying to build location.

Manual E2E test verified:
- create environment → provision → configure → release → run → destroy
- docker-compose.yml generated correctly in build/e2e-full/docker-compose/
- Environment state transitions to Released after release command
…tatus

- Mark Phase 6 (Steps Layer - Prepare Compose Files) as ✅ COMPLETE
- Mark Phase 7 (Infrastructure Layer - Docker Compose Template Renderer) as ✅ COMPLETE
- Update folder structure to reflect template/renderer pattern
- Update source template location to templates/docker-compose/ (embedded)
- Update Files to Create section with actual implemented files
- Rename DockerComposeFileManager references to DockerComposeTemplateRenderer
- Add implementation notes about embedded template system integration
@josecelano josecelano self-assigned this Dec 4, 2025
…three-level architecture

- Add ReleaseStep enum and ReleaseFailureContext for step tracking
- Add ReleaseTraceWriter for failure trace files
- Add DeployComposeFilesStep with Ansible playbook integration
- Add RenderDockerComposeTemplatesStep for template rendering
- Add deploy-compose-files.yml Ansible playbook
- Remove deprecated write_remote_file from SSH client
- Remove old ReleaseStep class (replaced by new steps)
- Update module exports across application and infrastructure layers
- Add passwordless sudo to Docker SSH server for deployment testing
…lease

- Add MissingInstanceIp error variant with troubleshooting help
- Validate instance IP upfront before transitioning to Releasing state
- Pass validated IP explicitly to workflow methods
- Remove conditional deployment logic (fail fast instead of silent skip)
- Enhance logging with instance_ip field
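
The fail-fast idea in the list above can be sketched roughly as follows. Only the `MissingInstanceIp` variant name comes from this PR; `ReleaseError`, `EnvironmentData`, `require_instance_ip`, `release` and `deploy_to` are hypothetical stand-ins.

```rust
use std::net::IpAddr;

#[derive(Debug, PartialEq, Eq)]
pub enum ReleaseError {
    // Variant name from the PR; the field is an assumption.
    MissingInstanceIp { environment: String },
}

// Hypothetical, simplified view of a loaded environment.
pub struct EnvironmentData {
    pub name: String,
    pub instance_ip: Option<IpAddr>,
}

// Validate the IP upfront instead of silently skipping deployment later.
pub fn require_instance_ip(env: &EnvironmentData) -> Result<IpAddr, ReleaseError> {
    env.instance_ip.ok_or_else(|| ReleaseError::MissingInstanceIp {
        environment: env.name.clone(),
    })
}

pub fn release(env: &EnvironmentData) -> Result<String, ReleaseError> {
    // Fails here, before any state transition would be persisted.
    let instance_ip = require_instance_ip(env)?;
    // The validated IP is passed explicitly to the workflow.
    Ok(deploy_to(instance_ip))
}

// Placeholder for the real deployment workflow.
fn deploy_to(instance_ip: IpAddr) -> String {
    format!("deployed to {instance_ip}")
}
```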
…r across all handlers

This commit ensures consistent error handling across all command handlers
when an environment is not found. Previously, most handlers returned
StatePersistence(RepositoryError::NotFound) which was semantically
incorrect - a missing environment is a configuration/usage error, not a
persistence error.

Changes:
- Configure handler: Added EnvironmentNotFound variant with full
  Traceable implementation, help text, and tests
- Destroy handler: Added EnvironmentNotFound variant with full
  Traceable implementation, help text, and tests
- Provision handler: Added EnvironmentNotFound variant with full
  Traceable implementation, help text, and tests
- Release handler: Use existing EnvironmentNotFound instead of
  StatePersistence(NotFound)
- Run handler: Use existing EnvironmentNotFound instead of
  StatePersistence(NotFound)
- Test handler: Added EnvironmentNotFound variant with full
  Traceable implementation, help text, and tests

All EnvironmentNotFound errors now:
- Use ErrorKind::Configuration (not StatePersistence)
- Include the environment name for context
- Provide comprehensive troubleshooting guidance in help()
- Have proper test coverage
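
A simplified sketch of the resulting error shape is shown here for orientation. `ErrorKind::Configuration`, `StatePersistence` and `EnvironmentNotFound` are names from this commit; the `HandlerError` enum, its fields, and the help text are illustrative assumptions, and the real handlers also implement Traceable, which is omitted.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    Configuration,
    StatePersistence,
}

// Hypothetical handler error; each real handler has its own enum.
#[derive(Debug)]
pub enum HandlerError {
    EnvironmentNotFound { name: String },
    StatePersistence { reason: String },
}

impl HandlerError {
    // A missing environment is a usage problem, not a storage failure.
    pub fn kind(&self) -> ErrorKind {
        match self {
            HandlerError::EnvironmentNotFound { .. } => ErrorKind::Configuration,
            HandlerError::StatePersistence { .. } => ErrorKind::StatePersistence,
        }
    }

    // Troubleshooting guidance (wording is made up for this sketch).
    pub fn help(&self) -> String {
        match self {
            HandlerError::EnvironmentNotFound { name } => format!(
                "Environment '{name}' was not found. Check the name for typos \
                 or create the environment first."
            ),
            HandlerError::StatePersistence { reason } => format!(
                "The environment state could not be read or written: {reason}."
            ),
        }
    }
}

impl fmt::Display for HandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HandlerError::EnvironmentNotFound { name } => {
                write!(f, "environment not found: {name}")
            }
            HandlerError::StatePersistence { reason } => {
                write!(f, "state persistence error: {reason}")
            }
        }
    }
}
```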

This follows the pattern established by RegisterCommandHandler, which
correctly distinguishes between environment not found (configuration
error) and actual persistence failures.

- provision/errors.rs: Add #[allow(clippy::too_many_lines)] to help()
  function (104 lines exceed the 100-line limit)
- test/errors.rs: Merge match arms with identical Configuration
  error kind (EnvironmentNotFound and MissingInstanceIp)
- Extract load_environment methods in all command handlers
- Merge load + state validation into state-specific methods:
  - load_configured_environment (release handler)
  - load_released_environment (run handler)
  - load_created_environment (register, provision handlers)
  - load_provisioned_environment (configure handler)
  - load_environment (test, destroy - use AnyEnvironmentState)
- Remove redundant step-by-step numbered comments
- Remove redundant entry info! logs (handled by #[instrument])
- Move ansible_client creation inside execute_configuration_with_tracking
- Add symmetric logging: log first in both Ok and Err branches
- Remove unused build_ansible_client method
- Ensure logging happens before repository operations that might fail
- Log success/error first in both Ok and Err branches (before repository ops)
- Ensures errors are captured even if subsequent persistence fails
- Applied to release and provision handlers (only ones with match blocks)
- run and register handlers use the ? operator and will get logging when they add failure handling
- Create run-compose-services.yml Ansible playbook for starting docker compose
- Register playbook in AnsibleTemplateRenderer::copy_static_templates()
- Create StartServicesStep in src/application/steps/application/start_services.rs
- Add RunStep enum and RunFailureContext to domain state
- Update run_failed() signature to accept RunFailureContext
- Create RunTraceWriter for run command failure tracing
- Wire StartServicesStep into RunCommandHandler with step tracking
- Add TypedEnvironmentRepository to run handler
- Add MissingInstanceIp and StartServicesFailed error variants
- Update all tests to use new RunFailureContext pattern
- Update RunCommandController to delegate to RunCommandHandler
- Remove scaffolding code, implement real service start workflow
- Add RunCommandHandlerError to RunSubcommandError conversion
- Update controller tests to expect real behavior (environment not found)
- Add TORRUST_TD_SKIP_RUN_IN_CONTAINER env var for E2E tests
- Skip run command in container-based E2E tests (no Docker-in-Docker)
- Add Docker CE installation to provisioned-instance Dockerfile
- Add dockerd process to supervisord with vfs storage driver
- Enable privileged mode in testcontainers for DinD support
- Add TORRUST_TD_SKIP_DOCKER_INSTALL_IN_CONTAINER env var to skip
  Docker installation when pre-installed (avoids package conflicts)
- Update wait condition to wait for dockerd RUNNING state
- Create ADR documenting the Docker-in-Docker decision
- Remove TORRUST_TD_SKIP_RUN_IN_CONTAINER since run command now works
In the Rust 2024 edition, std::env::set_var is an unsafe function because
it can cause undefined behavior in multithreaded programs. Added an unsafe
block with a safety comment explaining that the environment variables are
set before any async runtime or threads are created.
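
The pattern described above looks roughly like this. `configure_process_env` is a hypothetical helper, not a function from this PR; the only real API used is `std::env::set_var`. The `unsafe` block is required when compiling under the 2024 edition and is merely redundant under earlier editions, hence the `allow`.

```rust
use std::env;

/// Hypothetical helper that sets a process environment variable.
/// Intended to be called once at startup, before spawning any threads.
#[allow(unused_unsafe)] // the block is redundant on editions before 2024
pub fn configure_process_env(key: &str, value: &str) {
    // SAFETY: the caller runs this at the very start of the program,
    // before any async runtime or additional threads are created, so no
    // other thread can be reading or writing the environment concurrently.
    unsafe {
        env::set_var(key, value);
    }
}
```

Keeping the call in one small function makes the safety argument easy to audit: there is a single place where the "no other threads yet" assumption has to hold.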
- Change status from Proposed to Accepted (Implemented)
- Update implementation plan with actual code and configurations
- Add Phase 4 (Docker package conflict handling) and Phase 5
- Add Lessons Learned section with key discoveries:
  - vfs storage driver requirement for nested Docker
  - Package conflicts between Docker CE and docker.io
  - Rust 2024 edition unsafe requirement for set_var
  - Wait condition best practices
- Add references to Docker storage drivers and Rust docs
…commands

Create proper separation of concerns for E2E validations:

- run_configuration_validation.rs: Validates 'configure' command
  (Docker and Docker Compose installed correctly)

- run_release_validation.rs: Validates 'release' command
  (Docker Compose files deployed to /opt/torrust)

- run_run_validation.rs: Validates 'run' command
  (Docker Compose services running and healthy)

- RunningServicesValidator: New remote action that checks:
  - Services listed in 'docker compose ps' output
  - Services in running status (not exited/restarting)
  - Health check status if configured
  - HTTP accessibility for web services (optional)
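
The first three checks above could be expressed along these lines. This is a sketch under assumptions: it operates on already-parsed status records rather than on real `docker compose ps` output, and `ServiceStatus`, `Health`, `validate_services` and the state strings are hypothetical, not the actual RunningServicesValidator API. The optional HTTP check is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Unhealthy,
    Starting,
}

// Hypothetical parsed view of one service as reported on the VM.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServiceStatus {
    pub name: String,
    pub state: String,          // assumed values: "running", "exited", ...
    pub health: Option<Health>, // None when no health check is configured
}

// Returns Ok(()) when every expected service passes all checks,
// otherwise the list of problems found.
pub fn validate_services(
    expected: &[&str],
    reported: &[ServiceStatus],
) -> Result<(), Vec<String>> {
    let mut problems = Vec::new();
    for name in expected {
        match reported.iter().find(|s| s.name == *name) {
            // Check 1: the service must be listed at all.
            None => problems.push(format!("{name}: not listed")),
            Some(service) => {
                // Check 2: it must be running (not exited or restarting).
                if service.state != "running" {
                    problems.push(format!("{name}: state is {}", service.state));
                }
                // Check 3: if a health check is configured, it must pass.
                if let Some(health) = service.health {
                    if health != Health::Healthy {
                        problems.push(format!("{name}: health is {health:?}"));
                    }
                }
            }
        }
    }
    if problems.is_empty() {
        Ok(())
    } else {
        Err(problems)
    }
}
```

Collecting all problems instead of returning on the first one gives a more useful E2E failure message when several services are unhealthy at once.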

E2E test now validates each command's output:
configure → validate config → release → validate release → run → validate run

Added documentation notes about future external validation:
- Current scope: Demo slice with nginx, internal validation via SSH
- Future: External accessibility testing for real Torrust services
- Future: Firewall rule verification through external tests

This completes Phase 10 of issue #217 with proper validation architecture.
…ase_run_tests

The function now tests the complete configure → release → run workflow,
not just configuration. Renamed to accurately reflect what is being tested:

- run_configuration_tests → run_configure_release_run_tests
- Updated doc comments to describe the full workflow
- Updated log messages to match

This binary (e2e_config_and_release_tests) tests:
create → register → configure → release → run

While e2e_provision_and_destroy_tests tests:
create → provision → destroy

The split exists because GitHub Actions runners have LXD networking
limitations that prevent running commands inside provisioned VMs.
@josecelano josecelano marked this pull request as ready for review December 4, 2025 20:12
@josecelano
Member Author

ACK fd53b7f

@josecelano josecelano merged commit 0e85a6d into main Dec 4, 2025
34 checks passed


Development

Successfully merging this pull request may close these issues.

Demo Slice: Release and Run Commands Scaffolding

2 participants