Enhance memory pressure handling with collection compaction #1517

carlhoerberg · 2025-12-01T00:45:41Z

Summary

Enhances systemd memory pressure monitoring to proactively compact collections and reduce memory usage beyond just calling GC. This addresses the issue where Crystal collections (Array, Hash, Deque) never shrink their internal buffers, leading to wasted memory after temporary spikes.

Problem

Currently, LavinMQ only calls GC.collect when systemd detects memory pressure. However, Crystal's collections retain their maximum capacity even after elements are removed. A queue that grew to 10,000 messages keeps that buffer size even when drained to 10 messages, wasting memory.

Solution

Implements a cascading compact_collections pattern throughout the entire object hierarchy:

Core Components

Array monkey patch - Adds trim_to_size method for in-place buffer compaction
Hash/Deque - Uses .dup reassignment to create right-sized copies
IO::Memory - Assigns new instances when empty
MFiles - Calls dontneed on memory-mapped files to release physical pages

Compaction Cascade

Server
├── UserStore (users hash)
├── ParameterStore (parameters hash)
├── MQTT::Brokers (brokers hash)
│   └── Broker (clients hash)
│       └── Session (unacked hash)
└── VHostStore (vhosts hash)
    └── VHost (queues, exchanges, consumers hashes + connections array)
        ├── Queue (deliveries, consumers, unacked)
        │   └── MessageStore (deleted, segments, requeued + MFile dontneed)
        ├── Exchange (bindings - all 5 types)
        ├── Client (channels hash)
        │   └── Channel (unacked, tx arrays, consumers, IO::Memory)
        ├── ShovelStore (shovels hash)
        └── UpstreamStore (upstreams, upstream_sets hashes)

Compaction Threshold

Collections are compacted when capacity > size * 2, providing aggressive memory reclamation during pressure events while avoiding excessive compaction overhead during normal operation.

Changes

Add Array#trim_to_size with rewind and ensure_capacity helpers
Implement compact_collections in 21 files across the codebase
Update launcher to run compaction before GC with timing logs
Cascade pattern ensures all collections are compacted systematically

Expected Benefits

10-50 MB memory savings during typical pressure events
Faster GC cycles - fewer large objects to scan
Better peak handling - system recovers from memory spikes
Reduced swap usage on memory-constrained systems

Test Plan

Build and run server: make bin/lavinmq CRYSTAL_FLAGS=
Trigger memory pressure with high message throughput
Verify compaction logs appear: "Memory pressure detected, running compaction"
Monitor RSS reduction after compaction
Run test suite: make test
Check for performance regressions under normal load

Notes

Compaction is only triggered during systemd memory pressure events
Uses 2x threshold to balance memory savings vs compaction overhead
All collection types handled: Array, Hash, Deque, Set, IO::Memory
MFile dontneed is safe - kernel can reclaim physical pages without breaking mmap

🤖 Generated with Claude Code

Add comprehensive collection compaction during systemd memory pressure events to reduce memory usage and improve GC performance. Changes: - Add Array#trim_to_size monkey patch to stdlib for in-place compaction with configurable threshold (default 2x capacity vs size) - Implement compact_collections cascade pattern across entire hierarchy: * Server → UserStore, ParameterStore, MQTT Brokers, VHostStore * VHostStore → internal hash + cascade to all VHosts * VHost → queues, exchanges, consumers, connections, shovels, upstreams * Queue → deliveries, consumers, unacked + MessageStore * Exchange → bindings in all types (direct, topic, fanout, headers, consistent_hash) * Client → channels hash + cascade to all Channels * Channel → unacked, tx arrays, consumers, IO::Memory objects * MQTT → Broker clients, Session unacked messages * Stores → UserStore, ParameterStore, ShovelStore, UpstreamStore - Call dontneed on all MFiles (memory-mapped segments) to release physical memory pages while keeping virtual mapping intact - Update launcher memory pressure handler to: * Log when pressure is detected * Run full collection compaction before GC * Log elapsed time for monitoring Collections are compacted when capacity > size * 2 using: - Array: trim_to_size for in-place buffer reduction - Hash/Deque: .dup reassignment to create right-sized copy - IO::Memory: new instance when empty Expected benefits: - 10-50 MB memory savings during typical pressure events - Faster GC cycles due to fewer large objects - Better recovery from memory spikes - Reduced swap usage on memory-constrained systems 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

claude · 2025-12-01T00:45:57Z

Claude finished @carlhoerberg's task —— View job

Code Review Complete ✅

I've analyzed the entire collection compaction implementation across 21 files. Here are the issues that need fixing:

Issues Found

1. Thread Safety Issue in Array#trim_to_size (src/stdlib/array.cr:19-28)

The trim_to_size method calls rewind which manipulates buffer pointers without proper synchronization:

def trim_to_size(*, extra : Int32 = 0, threshold : Int32 = 2) : Bool
  # ... 
  rewind
  resize_to_capacity(@size + extra)
  true
end

Problem: The rewind method modifies @buffer and @offset_to_buffer without atomic operations or locks. In a multi-threaded environment (Crystal with --preview-mt), concurrent access to arrays during compaction could cause memory corruption.

Fix: Add synchronization or make this method thread-safe if used concurrently.

2. Missing Error Handling for Memory Allocation (src/stdlib/array.cr:26)

The resize_to_capacity call can potentially fail during memory allocation, but there's no error handling:

resize_to_capacity(@size + extra)

Problem: If memory allocation fails during compaction (which is likely during memory pressure), this will raise an exception and crash the compaction process, leaving collections in an inconsistent state.

Fix: Wrap in proper error handling to gracefully handle allocation failures.

3. Race Condition in MessageStore Compaction (src/lavinmq/message_store.cr:588-599)

Collection compaction occurs without proper synchronization with concurrent file operations:

def compact_collections
  @deleted = @deleted.dup if @deleted.capacity > @deleted.size * 2
  @segments = @segments.dup if @segments.capacity > @segments.size * 2
  # ...
  @segments.each_value(&.dontneed)
  @acks.try &.each_value(&.dontneed)
end

Problem: The hash reassignment with .dup and the subsequent dontneed calls are not atomic. During memory pressure, if another fiber is accessing these collections, it could see partially updated state.

Fix: Use @msg_store_lock.synchronize around the entire compaction block.

4. Excessive MFile dontneed Calls During Memory Pressure (src/lavinmq/message_store.cr:597-598)

@segments.each_value(&.dontneed)
@acks.try &.each_value(&.dontneed)

Problem: Calling dontneed on all memory-mapped files during memory pressure could cause performance issues as the kernel may need to reload pages immediately if they're accessed again. This could cause thrashing.

Fix: Consider throttling or limiting dontneed calls, or only calling it on segments that haven't been accessed recently.

Performance Concerns

5. Aggressive 2x Threshold May Cause Overhead (Multiple files)

The 2x capacity threshold is used consistently but may be too aggressive for some high-frequency collections:

@deliveries = @deliveries.dup if @deliveries.capacity > @deliveries.size * 2

Problem: During normal operation with frequent allocation/deallocation patterns, this could cause excessive compaction overhead.

Suggestion: Consider making the threshold configurable or using different thresholds for different collection types.

Minor Issues

6. Potential Memory Leak in Channel (src/lavinmq/amqp/channel.cr:834)

@next_msg_body_tmp = IO::Memory.new if @next_msg_body_tmp.size == 0 && @next_msg_body_tmp.capacity > 0

Issue: This creates a new IO::Memory instance but doesn't explicitly clear the old one, potentially leaving references.

Fix: Explicitly call @next_msg_body_tmp.clear before reassignment.

Recommendation: Focus on fixing issues #1 and #3 as they pose the highest risk of memory corruption and race conditions during memory pressure events.

spuun · 2025-12-02T08:52:03Z

How will this play with multi-threading?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Enhance memory pressure handling with collection compaction #1517

Enhance memory pressure handling with collection compaction #1517

Uh oh!

carlhoerberg commented Dec 1, 2025

Uh oh!

claude bot commented Dec 1, 2025 •

edited

Loading

Uh oh!

spuun commented Dec 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Enhance memory pressure handling with collection compaction #1517

Are you sure you want to change the base?

Enhance memory pressure handling with collection compaction #1517

Uh oh!

Conversation

carlhoerberg commented Dec 1, 2025

Summary

Problem

Solution

Core Components

Compaction Cascade

Compaction Threshold

Changes

Expected Benefits

Test Plan

Notes

Uh oh!

claude bot commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Complete ✅

Issues Found

1. Thread Safety Issue in Array#trim_to_size (src/stdlib/array.cr:19-28)

2. Missing Error Handling for Memory Allocation (src/stdlib/array.cr:26)

3. Race Condition in MessageStore Compaction (src/lavinmq/message_store.cr:588-599)

4. Excessive MFile dontneed Calls During Memory Pressure (src/lavinmq/message_store.cr:597-598)

Performance Concerns

5. Aggressive 2x Threshold May Cause Overhead (Multiple files)

Minor Issues

6. Potential Memory Leak in Channel (src/lavinmq/amqp/channel.cr:834)

Uh oh!

spuun commented Dec 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

claude bot commented Dec 1, 2025 •

edited

Loading