Add autoscaling module by PizieDust · Pull Request #250 · robur-coop/mollymawk

PizieDust · 2026-05-29T06:21:57Z

This module implements the logic for monitoring the CPU usage of the currently running unikernels and deciding when to scale up or down.

Configuration

poll_interval: how frequently the system expects to evaluate the unikernel stats.
scale_up_threshold_percent: CPU load (e.g., 90.0%) that triggers a scale up.
scale_up_trigger_ticks: how many consecutive stats reports must exceed the scale up threshold before a clone is created, to avoid false positives.
scale_down_threshold_percent: CPU load (e.g., 40.0%) that triggers a scale down, provided the group already has clones.
scale_down_trigger_ticks: how many consecutive stats reports must fall below the scale down threshold before a clone is destroyed.
cooldown_period: the grace period after any scaling action during which the system pauses scaling to prevent rapid creation/destruction of clones.
death_timeout: the maximum time to wait for stats from a VM. If a VM stays silent longer than this, it is considered dead and pruned from the system.
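As a rough illustration, the options above could be grouped in a record like the one below. This is only a sketch: the field names come from the list, but the types (plain floats in seconds instead of `Ptime.Span.t`), the default values, and the `validate` helper are assumptions and not taken from the PR.

```ocaml
(* Hypothetical configuration record. Field names follow the list above;
   types, units and defaults are assumptions made for this sketch. *)
type config = {
  poll_interval : float; (* seconds between evaluations *)
  scale_up_threshold_percent : float;
  scale_up_trigger_ticks : int;
  scale_down_threshold_percent : float;
  scale_down_trigger_ticks : int;
  cooldown_period : float; (* seconds *)
  death_timeout : float; (* seconds *)
}

(* Illustrative defaults only; 90.0 and 40.0 are the examples from the text. *)
let default_config =
  {
    poll_interval = 10.0;
    scale_up_threshold_percent = 90.0;
    scale_up_trigger_ticks = 3;
    scale_down_threshold_percent = 40.0;
    scale_down_trigger_ticks = 3;
    cooldown_period = 60.0;
    death_timeout = 30.0;
  }

(* Hypothetical sanity check: the thresholds must leave a gap between them,
   tick counts must be positive, and a VM must get at least one poll
   interval before it is declared dead. *)
let validate c =
  if c.scale_down_threshold_percent >= c.scale_up_threshold_percent then
    Error "scale_down_threshold_percent must be below scale_up_threshold_percent"
  else if c.scale_up_trigger_ticks <= 0 || c.scale_down_trigger_ticks <= 0 then
    Error "trigger ticks must be positive"
  else if c.death_timeout <= c.poll_interval then
    Error "death_timeout should exceed poll_interval"
  else Ok c
```

Keeping a gap between the two thresholds is what prevents a group from oscillating between scale up and scale down at a steady load.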

Modules

`Cpu_monitor`

This module converts the raw stats from albatross into a float that can be compared against the thresholds to decide whether to scale.

`Cluster_manager`

This module tracks "groups" (a primary unikernel and all of its clones) and evaluates their combined load.

get_or_create: finds an existing group for a primary unikernel or creates an empty one.
find_group_by_name: finds an existing group using the primary unikernel's name.
extract_name_and_clone_id: parses a VM name to separate the primary name from the clone ID.
find_or_create_group: the main entry point when stats arrive. It parses the incoming VM's name to route the stats to the correct group.
next_clone_name: generates a unique name for a new clone based on the group's next_id.
register_clone: adds a newly created clone to the group and triggers the cooldown period.
remove_clone: removes a dead or intentionally destroyed clone from the group and triggers the cooldown period.
prune_dead_clusters: prunes individual dead clones or destroys the entire group if the primary dies.
in_cooldown: checks if the group recently had a scale operation and should ignore load spikes/drops.
check_group_average: calculates the average CPU load across all active instances (the primary and all of its active clones) to determine the true load of the cluster.
check_group_status: takes the average load and compares it against the thresholds to decide whether the cluster should scale.
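The interplay of group average, consecutive ticks and cooldown described above can be sketched as a small pure function. Everything here is hypothetical: the `decision` and `counters` types, the `step` function and its labelled arguments are illustration only, and resetting the counters during cooldown is an assumption rather than the PR's documented behaviour.

```ocaml
type decision = Scale_up | Scale_down | Hold

(* Hypothetical per-group state: how many consecutive reports were
   above the scale up threshold or below the scale down threshold. *)
type counters = { above : int; below : int }

let zero = { above = 0; below = 0 }

(* Average load over the primary and its active clones; None if empty. *)
let average = function
  | [] -> None
  | loads ->
      Some (List.fold_left ( +. ) 0.0 loads /. float_of_int (List.length loads))

(* One evaluation tick: returns the updated counters and the decision.
   Assumption: counters are reset while the group is in cooldown. *)
let step ~up ~down ~up_ticks ~down_ticks ~in_cooldown ~has_clones counters avg =
  if in_cooldown then (zero, Hold)
  else if avg > up then
    let above = counters.above + 1 in
    if above >= up_ticks then (zero, Scale_up)
    else ({ above; below = 0 }, Hold)
  else if avg < down && has_clones then
    let below = counters.below + 1 in
    if below >= down_ticks then (zero, Scale_down)
    else ({ above = 0; below }, Hold)
  else (zero, Hold)
```

With `up_ticks = 2`, a single report above the threshold only increments the counter; the second consecutive one yields `Scale_up`, and any report in between that is not above the threshold resets the count.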

hannesm · 2026-06-11T18:12:15Z

+      let pct = cpu_delta /. elasped_time_in_seconds *. 100.0 in
+      (* TODO: use numcpus to cap it at 100.0% if the vm has more than 1 cpu. Now most
+         vms use 1 cpu, so capping at 100% is fine. *)
+      Float.min 100.0 pct


I don't quite understand why floating point values are used everywhere here, and what would be wrong with using microseconds as int instead. But I guess we have other things to do than argue about that.
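For illustration, an integer-only variant of the percentage computation might look like the sketch below. It assumes a timeval is a pair of seconds (`int64`) and microseconds (`int`); the function names and labelled arguments are hypothetical and do not appear in the PR. Like the diff above, it caps the result at 100%, which only suits single-CPU VMs.

```ocaml
(* Assumption: a timeval is (seconds, microseconds). *)
let timeval_to_us ((sec, usec) : int64 * int) : int64 =
  Int64.add (Int64.mul sec 1_000_000L) (Int64.of_int usec)

(* CPU usage in whole percent between two samples, using only integers.
   All three arguments are in microseconds. Returns 0 when no time has
   elapsed or when the CPU counter went backwards. *)
let cpu_percent ~prev_cpu_us ~cur_cpu_us ~elapsed_us =
  let delta = Int64.sub cur_cpu_us prev_cpu_us in
  if elapsed_us <= 0L || delta < 0L then 0
  else
    let pct = Int64.div (Int64.mul delta 100L) elapsed_us in
    (* Cap at 100, matching the single-CPU assumption in the diff. *)
    Int64.to_int (min 100L pct)
```

The trade-off is resolution: integer division truncates to whole percent, which is probably sufficient for thresholds such as 90 and 40 but would need scaling (e.g. per mille) if finer steps matter.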

hannesm · 2026-06-11T18:13:22Z

+      Ptime.Span.to_float_s elasped_time_difference
+    in
+    if elasped_time_in_seconds <= 0.000001 then 0.0
+    else if cpu_delta < 0.0 then 0.0


how can this happen?

hannesm · 2026-06-11T18:14:43Z

+type t = {
+  mutable monitor : Cpu_monitor.t;
+  mutable last_cpu_usage : float;
+  mutable last_stats_received : Ptime.t;
+}


A type with all-mutable fields... this smells a bit... Could this instead be pure immutable values, where you pass a t in and get a t out (i.e. always construct a fresh one when you want to modify a field)?
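A minimal sketch of the immutable style suggested here, under stated assumptions: `Cpu_monitor.t` and `Ptime.t` are replaced by simple stand-in types so the example is self-contained, and the helpers `record_usage` and `is_dead` are hypothetical names.

```ocaml
(* Stand-in for Cpu_monitor.t (assumption, not the PR's definition). *)
type monitor = { prev_cpu_time : float }

type t = {
  monitor : monitor;
  last_cpu_usage : float;
  last_stats_received : float; (* seconds, standing in for Ptime.t *)
}

(* Returns a fresh value; the argument is left untouched. *)
let record_usage t ~now ~usage =
  { t with last_cpu_usage = usage; last_stats_received = now }

(* A VM is considered dead if it has been silent longer than death_timeout. *)
let is_dead t ~now ~death_timeout =
  now -. t.last_stats_received > death_timeout
```

Because every update returns a new `t`, a caller can compute the next state, decide whether to keep it, and only then store it; no other holder of the old value sees a partial update.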

hannesm · 2026-06-11T18:25:24Z

+            m "[Cluster Manager] Invalid clone name '%s'." (fst clone));
+        Error "Invalid clone name"
+
+  let check_group_average group key now rusage =


Would you mind elaborating on what this function is supposed to do, and what the input arguments are?

I had the impression from the name, it should compute the average CPU usage!?

But then, what does the key argument do? And why is there an if String.equal ...? In which case does your group.primary :: group.clones contain an element for which name = key does not hold?

Thanks for your comment. I'm still wondering what the actions should be and when they should be triggered.

So, for a new measurement that arrives, we certainly want to compute the CPU load.

Now, iterating over all clones and the primary: this should be done once, when all measurements have arrived, right? So shouldn't 1 and 2-4 be separate? Maybe the moment the primary's measurement is received is the time to compute the group average?

hannesm · 2026-06-11T18:27:59Z

+        in
+        Ok (average_usage, state)
+
+  let check_group_status group key now rusage =


I've no clue what check_group_status is supposed to do with the input arguments beyond the group. Why is there a key, a now, and a rusage?

hannesm · 2026-06-11T18:32:12Z

+        Logs.debug ~src:a_logs (fun m ->
+            m "[Cluster Manager] Pruning dead cluster: %s" key);
+        Hashtbl.remove clusters key)
+      dead_keys


As discussed on matrix, I'm not a big fan of Hashtbl and background tasks. Can we design something that is a bit more robust and doesn't rely on "prune_dead_clusters"?

I don't quite understand the semantics. What should happen if a unikernel (the primary) disappears, which has been scaled up? It looks like it is then removed from the clusters hash table, but what happens with all the clones? Won't they be re-inserted? How's the code dealing with a "group" that doesn't have a "primary"?
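One possible direction, sketched under heavy assumptions: keep the groups in an immutable `Map` and derive liveness when the data is read, so that no background pruning task is needed. The types, the function names, and the policy that stats from a clone without a known primary are dropped are all hypothetical; the thread does not settle on a design.

```ocaml
module SM = Map.Make (String)

(* Hypothetical per-instance state; timestamps are seconds as floats. *)
type instance = { last_seen : float; usage : float }
type group = { primary : instance; clones : instance SM.t }

let alive ~now ~death_timeout i = now -. i.last_seen <= death_timeout

(* Live view of one group: None if the primary is dead, otherwise the
   group restricted to its live clones. *)
let live_group ~now ~death_timeout g =
  if not (alive ~now ~death_timeout g.primary) then None
  else
    Some { g with clones = SM.filter (fun _ i -> alive ~now ~death_timeout i) g.clones }

(* Live view of all groups, computed on demand instead of by a pruner. *)
let live_groups ~now ~death_timeout groups =
  SM.filter_map (fun _ g -> live_group ~now ~death_timeout g) groups

(* Assumed policy: clone stats are recorded only if the primary's group
   exists, so an orphaned clone cannot re-create a group on its own. *)
let record_clone ~primary_name ~clone_name inst groups =
  match SM.find_opt primary_name groups with
  | None -> groups
  | Some g ->
      SM.add primary_name { g with clones = SM.add clone_name inst g.clones } groups
```

This leaves open what should happen to the still-running clones of a dead primary (for example, destroying them), which is a policy decision rather than a data-structure one.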

hannesm · 2026-06-17T13:43:18Z

+  let get_total_cpu_time (r : Vmm_core.Stats.rusage) =
+    let user_t = timeval_to_float r.utime in
+    let sys_t = timeval_to_float r.stime in
+    user_t +. sys_t


There's also a runtime field in kinfo_mem. Any reason why rusage is used here? (I'm curious, there's no need to change any code.)

So stats is type t = rusage * kinfo_mem option * vmm option * ifdata list. Since kinfo_mem is optional, I decided to use rusage, because I was sure it's always present.

hannesm · 2026-06-17T13:54:00Z

I think this is fine. Still, I am missing a more high-level view of what should happen, what is being computed when, and what is being kept in memory...

PizieDust requested a review from reynir May 29, 2026 06:34

autoscaling core

ba040e8

PizieDust force-pushed the scale_up branch from c274d51 to ba040e8 May 29, 2026 14:27

PizieDust added 3 commits May 30, 2026 07:57

refactor autoscaling core

708f0d4

use clearer terminology

c48be28

remove redundant next-clone-id

9c8511c

PizieDust mentioned this pull request Jun 4, 2026

Module for autoscaling logic of unikernels #203

Closed