Releases: stackhpc/ansible-slurm-appliance
v2.15
What's Changed
Security-related changes
- Use upstream munge packages with fix for CVE-2026-25506 by @elelaysh in #912
- NB: This CVE was previously fixed using an adhoc playbook to install StackHPC-built packages. This is no longer necessary as fixed upstream packages are included in the new images (see below).
- Update jupyter to v1.1.1 to fix CVE-2025-7783 by @sjpb in #916
- Update ondemand to v4.1.4 to fix GHSA-353f-x4gh-cqq8 by @sjpb in #918
- Fix trivy warning for deepdiff package CVE-2025-58367 by @sjpb in #919
New features
- Enable persistent journaling by @jovial in #763
- Integrate opkssh for ssh using OIDC by @sjpb in #893
Other fixes and improvements
- Fix permissions for sshd config when using compute-init by @jovial in #909
- Fix secrets_openhpc_mungekey_default to be a plain value by @elelaysh in #907
- Make filters to_ood_regex, prometheus_node_exporter_targets stable by @elelaysh in #900
- Default NFS mounts to requiring working networking by @sjpb in #894
- Run pulp CLI on pulp_server instead of localhost by @sjpb in #910
- Configure MaxStartups in sshd and set for control by @MoteHue in #901
- Use new OpenTofu-based jumphost for Leafcloud CI by @sjpb in #886
Images
Two new images are available:
- RockyLinux 8: openhpc-RL8-260306-0934-45b9a589
- RockyLinux 9: openhpc-RL9-260306-0934-45b9a589
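
The image names in these release notes all share one shape. As a minimal sketch, assuming (this is inferred from the names listed here, not stated by the project) that the fields are an OS tag, a build date as YYMMDD, a build time as HHMM, and an 8-character commit hash, a name can be split up to compare build times. `parse_image_name` and `ImageName` are hypothetical helpers, not part of the appliance:

```python
import re
from datetime import datetime
from typing import NamedTuple

# Assumed layout: openhpc-<OS tag>-<YYMMDD>-<HHMM>-<8 hex chars>
IMAGE_RE = re.compile(
    r"^openhpc-(?P<os>RL\d+)-(?P<date>\d{6})-(?P<time>\d{4})-(?P<commit>[0-9a-f]{8})$"
)


class ImageName(NamedTuple):
    os_tag: str
    built: datetime
    commit: str


def parse_image_name(name: str) -> ImageName:
    """Split an image name into OS tag, build timestamp and commit hash."""
    match = IMAGE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised image name: {name!r}")
    # Interpret the two numeric fields as a two-digit-year timestamp.
    built = datetime.strptime(match["date"] + match["time"], "%y%m%d%H%M")
    return ImageName(match["os"], built, match["commit"])


if __name__ == "__main__":
    newer = parse_image_name("openhpc-RL9-260306-0934-45b9a589")
    older = parse_image_name("openhpc-RL9-260217-0837-79ae25d1")
    print(newer.os_tag, newer.built.isoformat(), newer.built > older.built)
```

If the naming scheme differs from this assumption, the regular expression raises `ValueError` rather than returning a wrong timestamp.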
Full Changelog: v2.14...v2.15
v2.14
Important
The new images include our build of munge 0.5.18 to fix CVE-2026-25506.
As a precaution, you should rotate MUNGE keys in your cluster. Please see instructions here
What's Changed
- Configurable administrator user groups for ssh by @elelaysh in #899
- Provide fix for CVE-2026-25506: MUNGE buffer overflow allowing key leakage by @elelaysh in #903
- Update StackHPC images to fix CVE-2026-25506 (upgrade to munge 0.5.18) by @elelaysh in #906
- Upgrade pillow to 12.1.1 by @elelaysh in #908
Images
Two new images are available:
- RL8: openhpc-RL8-260217-0837-79ae25d1
- RL9: openhpc-RL9-260217-0837-79ae25d1
Full Changelog: v2.13...v2.14
v2.13
v2.12
What's Changed
- Fix CI rebuild check for cases where image hasn't changed since latest release by @sjpb in #895
- Ensure tmp.mount mask status matches fstab by @sjpb in #896 (Fixes failure for RL8 instances to complete reboot)
- Fix OpenSSL bug RLSA-2026:1473 by @claudia-lola in #898 (NB: includes OS package update)
- Bump python cryptography for CVE-2026-26007 by @sjpb in #902
Full Changelog: v2.11...v2.12
Images
Two new images are available:
- RL8: openhpc-RL8-260210-0941-ccf0e76b
- RL9: openhpc-RL9-260210-0941-ccf0e76b
v2.11
What's Changed
Important
By default, the control node will now run a squid service as a proxy for EESSI clients. See #876 and review the defaults in docs/eessi.md#eessi-proxy-configuration to ensure that they are suitable for your situation and control node sizing.
Updates
- Update DNF packages & upgrade to Rocky Linux 9.7 by @elelaysh in #858
- Bump CUDA to 13.1.1 and NVIDIA driver to 590.48.01 by @priteau in #880
- Enable EESSI proxy by default by @sjpb in #876
New features and improvements
- Re-enable gpgchecks for dnf packages, Manila patchlevel bump & docs improvements by @bertiethorpe in #873
- Use Ark repos for Open Ondemand installs (with in-repo GPG key) by @sjpb in #831
- alertmanager: support tuning systemd sandboxing by @priteau in #877
- Add DEX support to openondemand role by @sjpb in #862
- Add mounts role with /tmp defaulted to tmpfs on login & compute nodes by @sjpb in #888
- Install bash-completion, for slurm command completions by @elelaysh in #892
Fixes and CI changes
- Fix permissions issues for slurm-controlled rebuilds by @sjpb in #889
- Bump python version for current release in CI by @sjpb in #885
- Enable yamllint + key ordering + refresh super-linter by @elelaysh in #874
- Provide useful error from topology if no instances found by @sjpb in #884
Images
Two new images are available:
- RockyLinux 8: openhpc-RL8-260127-1007-d7ed9234
- RockyLinux 9: openhpc-RL9-260127-1007-d7ed9234
Full Changelog: v2.10...v2.11
v2.10
What's Changed
Updates
Important
Python 3.12 is now required on the deploy host. The venv/ will be upgraded automatically on running dev/setup-env.sh (with no venv activated).
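
Before upgrading, it can be worth confirming which interpreter the deploy host provides. The following is a minimal sketch, assuming "required" means version 3.12 or later; `python_ok` is a hypothetical helper and is not part of dev/setup-env.sh:

```python
import sys

# Assumption: any interpreter at or above this version is acceptable.
REQUIRED = (3, 12)


def python_ok(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True if the (major, minor) version meets the requirement."""
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_ok():
        print(f"Python {found} meets the requirement")
    else:
        print(f"Python {found} is older than {REQUIRED[0]}.{REQUIRED[1]}")
```

Run it with the interpreter that will be used to create the venv, since a host may have several Python versions installed.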
- Use ansible-core over ansible and upgrade to v2.16 by @bertiethorpe in #853
- Bump CUDA to 13.1.0 by @priteau in #860
New features and improvements
Important
Running the site.yml playbook in production environments will now prompt for confirmation by default. See #883 below for details.
- Write summary from hpctests pingpong to repo by @sjpb in #870
- Open OnDemand: Add dropdown to select partition by @elelaysh in #869
- Support templating and ansible_ssh_common_args in ansible-ssh by @sjpb in #872
- Protect production environments when running Ansible by @sjpb in #883
Fixes & CI changes
- Fix appliances_extra_packages_default missing stackhpc fatimage by @elelaysh in #856
- Fix empty nodelist in ondemand apps by @sjpb in #861
- Attempt to fix permissions for trivy scan on main by @JohnGarbutt in #867
- Fix sssd & sshd for slurm-controlled rebuild by @sjpb in #866
- .caas: fix missing path in docs for ANSIBLE_INVENTORY by @elelaysh in #865
- CI: Cleanup OpenStack resources from previous attempts by @sjpb in #875
- Fix TOFU ssh key prompt in OpenOnDemand web shell for IPA hosts by @sjpb in #864
- alertmanager: Fix Slack integration by @priteau in #878
- Update tf s3 backend instructions for better ec2 credential behaviour by @sjpb in #871
- Fix hostkeys in IPA not matching host when persisting keys by @sjpb in #863
- Fix Markdown formatting issues by @priteau in #881
Full Changelog: v2.9...v2.10
Images
Two new images are available:
- RL8: openhpc-RL8-251213-1133-31273766
- RL9: openhpc-RL9-251213-1133-31273766
v2.9
What's Changed
- Add stub of manual workflow for CI cluster cleanup by @claudia-lola in #850
- Ensure OnDemand app installs during image build do not need a cluster to be deployed by @bertiethorpe in #843
- Add manual workflow for cleanup of CI clusters by @claudia-lola in #852
- Fix image-set-properties.sh: hw_disk_bus=virtio by @elelaysh in #857
- Bump v0.7.0 azimuth-cloud terraform collection by @bertiethorpe in #855
Images
Two new images are available:
- RockyLinux 8: openhpc-RL8-251119-1833-cb477455
- RockyLinux 9: openhpc-RL9-251119-1834-cb477455
Full Changelog: v2.8.2...v2.9
v2.8.2
v2.8.1
What's Changed
This is a minor update on v2.8 to avoid unnecessary delete/recreate of compute nodes:
- Avoid compute node replacements due to optional user_data by @MoteHue in #847
- Use 'nodegroups' param in controlled rebuild docs by @MoteHue in #849
For images see release v2.8.
Full Changelog: v2.8...v2.8.1
v2.8
What's Changed
Improved GPU configuration/support
- Support automatic GRES configuration for NVIDIA GPUs by @sjpb in #820
- Add option to install nvidia-fabricmanager by @claudia-lola in #836
- Add support for GRES to ondemand apps by @sjpb in #837
- Add bandwidth.yml playbook for NVIDIA nvbandwidth by @claudia-lola in #834
- Make EESSI configure GPU nodes automatically by @claudia-lola in #841
- Bump CUDA to 13.0.2 and NVIDIA driver to 580.105.08 by @priteau in #823
Slurm configuration
- Bump OpenHPC role to v1.4.0 by @sjpb in #818. Adds:
  - Enable use of custom Slurm builds by @sjpb in stackhpc/ansible-role-openhpc#163
  - CI: Switch to latest rockylinux/rockylinux images by @priteau in stackhpc/ansible-role-openhpc#198
  - Add support for mpi.conf templating by @bertiethorpe in stackhpc/ansible-role-openhpc#201
- Bump OpenHPC role to v1.4.1 by @bertiethorpe in #822 - fixes mpi.conf templating
New/improved features
- Add support for InfiniBand interfaces to NHC by @sjpb in #821
- Add tool to set image properties by @sjpb in #829
Docs and other
- Improve pulp docs by @sjpb in #819
- Fix gpg check for cernvmfs installs by @bertiethorpe in #816
- Remove ansible-lint warnings by @bertiethorpe in #817
- Replace whitespace in NHC mount checks by @sjpb in #824
- Allow fixed ip lists to be longer than nodes list by @sjpb in #830
- Add retries to CI tofu apply by @bertiethorpe in #833
- Don't install hpl source during extra builds by @sjpb in #828
- Add docs for eessi by @claudia-lola in #827
- Fix ansible-ssh changes due to linting by @sjpb in #838
- Use Ark repofiles for additional repos by @bertiethorpe in #832
- Set image properties for CI image build and sync by @bertiethorpe in #839
- Run trivy scans on main, to help reporting by @JohnGarbutt in #842
- Describe buildenv in EESSI docs by @claudia-lola in #845
Full Changelog: v2.7...v2.8
Images
Two new images are available:
- RL8: openhpc-RL8-251119-1202-332ac921
- RL9: openhpc-RL9-251119-1202-332ac921