- CEEMS (Compute Energy & Emissions Monitoring Stack) monitors the performance, energy, and emissions metrics of individual compute workloads managed by the SLURM, OpenStack, and Kubernetes resource managers.
- CEEMS provides three principal components:
  - A Prometheus exporter and profiling agent that runs on every compute node, exporting usage metrics to Prometheus and application profiling data to Pyroscope servers.
  - An API server that abstracts over the supported resource managers and stores a list of the compute workloads from each of them.
  - An optional load balancer that supports multi-tenancy and imposes strict access control on the Prometheus and Pyroscope servers.
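
Because the exporter publishes usage metrics to Prometheus, they can be read back through the standard Prometheus HTTP API (`/api/v1/query`). The following is a minimal sketch of that read path, not the CEEMS client itself. The server address, the metric name `hypothetical_unit_power_watts`, and the `uuid` label are assumptions made for illustration; consult the docs for the metric and label names that CEEMS actually exports.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a Prometheus-compatible server."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload: str) -> dict[str, float]:
    """Map each series' 'uuid' label (an assumed label name) to its sample value."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    # Each instant-vector entry carries its sample as [timestamp, "value-as-string"].
    return {
        series["metric"].get("uuid", ""): float(series["value"][1])
        for series in body["data"]["result"]
    }


# Hypothetical metric name, used only to show the shape of a query.
url = build_query_url("http://localhost:9090", "hypothetical_unit_power_watts")

# Hand-written sample in the Prometheus instant-query response format.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"uuid": "1234"}, "value": [1700000000, "215.5"]},
            {"metric": {"uuid": "5678"}, "value": [1700000000, "98.0"]},
        ],
    },
})
usage_by_workload = parse_instant_vector(sample)
```

In a deployment that uses the optional load balancer, `base_url` would presumably point at the load balancer rather than at Prometheus directly, so that access control is applied to each query.
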
Demo: https://ceems-demo.myaddr.tools
Docs: https://ceems-dev.github.io/ceems/docs/
API Docs: https://ceems-dev.github.io/ceems/docs/category/api