Add Roaring Bitmap support (issue #1270)#1741

Open
crprashant wants to merge 1 commit into microsoft:dev from crprashant:feature/issue-1270-roaring-bitmap
@crprashant

Add Roaring Bitmap support (issue #1270)

Resolves #1270.

Summary

Adds a Roaring Bitmaps extension to Garnet that introduces a new compressed
bitmap object type plus four R.* RESP commands. Implemented entirely as a
host extension in main/GarnetServer/Extensions/RoaringBitmap/, with zero
changes to libs/server/, so this is a clean, reviewable foundation that
can be deepened in follow-up PRs.

Why Roaring?

A naive bitmap over the full uint32 universe (2³² bits) is 512 MiB. Roaring partitions the universe into
65 536 chunks of 65 536 bits and represents each chunk as either:

  • Array container: sorted ushort[], used while a chunk holds ≤ 4 096
    set bits (~2·count bytes).
  • Bitmap container: ulong[1024] (8 KiB exactly), used once a chunk
    exceeds the threshold.

Empty chunks consume zero memory. Chunks promote (array → bitmap) and demote
(bitmap → array) automatically as cardinality changes.
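
The chunking arithmetic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: the `RoaringMath` class and all of its member names are hypothetical, and the sizes count container payload only (no object headers or dictionary overhead).

```csharp
using System;

// Hypothetical helper for illustration only; none of these names come from the PR.
public static class RoaringMath
{
    // Threshold from the description: an array container holds up to 4 096 set bits.
    public const int ArrayMaxCardinality = 4096;

    // A bitmap container is ulong[1024]: 1024 * 8 bytes = 8 192 bytes.
    public const int BitmapContainerBytes = 1024 * sizeof(ulong);

    // The high 16 bits of an offset select one of the 65 536 chunks.
    public static ushort ChunkKey(uint offset) => (ushort)(offset >> 16);

    // The low 16 bits select the bit inside that chunk.
    public static ushort LowBits(uint offset) => (ushort)(offset & 0xFFFF);

    // Approximate payload size of one chunk with the given number of set bits.
    public static int ContainerBytes(int cardinality)
    {
        if (cardinality < 0 || cardinality > 65536)
            throw new ArgumentOutOfRangeException(nameof(cardinality));
        if (cardinality == 0)
            return 0; // empty chunks are not stored at all
        return cardinality <= ArrayMaxCardinality
            ? cardinality * sizeof(ushort) // sorted ushort[]: 2 bytes per set bit
            : BitmapContainerBytes;        // fixed-size ulong[1024]
    }

    // Size in bytes of a flat bitmap covering the whole uint32 universe.
    public static long NaiveBitmapBytes() => (1L << 32) / 8;
}
```

The 4 096 threshold is the break-even point: an array container with 4 096 entries costs 4 096 × 2 = 8 192 bytes, the same as a bitmap container, so any denser chunk is cheaper as a bitmap.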

Commands

| Command | Description |
|---|---|
| R.SETBIT key offset value | Set bit at offset in [0, 2³²-1] to 0/1. Returns previous bit. |
| R.GETBIT key offset | Returns bit at offset. 0 for missing keys. |
| R.BITCOUNT key | Population count. 0 for missing keys. |
| R.BITPOS key bit [from] | First bit (0/1) at or after from. -1 if none. |
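
The R.BITPOS semantics in the table can be modelled with a small reference oracle. This is a sketch only, assuming the set bits are held in a `SortedSet<uint>`; the `BitPosOracle` class, its `BitPos` method, and the `start` parameter (standing in for `from`) are hypothetical names and do not reflect how the PR's containers answer the query.

```csharp
using System.Collections.Generic;

// Hypothetical reference model of R.BITPOS; not the PR's implementation.
public static class BitPosOracle
{
    // Returns the first position >= start whose bit equals `bit`, or -1 if none exists.
    public static long BitPos(SortedSet<uint> setBits, bool bit, uint start = 0)
    {
        if (bit)
        {
            // SortedSet enumerates in ascending order, so the first match is the answer.
            foreach (uint v in setBits)
            {
                if (v >= start) return v;
            }
            return -1;
        }

        // Looking for an unset bit: walk forward until a position is absent from the set.
        // A long counter avoids wrapping past uint.MaxValue.
        for (long candidate = start; candidate <= uint.MaxValue; candidate++)
        {
            if (!setBits.Contains((uint)candidate)) return candidate;
        }
        return -1;
    }
}
```

Because the universe is all of uint32, a search for an unset bit only returns -1 when every position from `start` to 2³²-1 is set.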

Docs: website/docs/commands/roaring-bitmap.md.

Design notes

  • The data structure (RoaringBitmap.cs, Containers/*) is a pure C# library
    with no Garnet dependencies → independently unit-testable.
  • RoaringBitmapObject (CustomObjectBase) wraps the structure and tracks
    size deltas via bitmap.ByteSize for the per-object size accounting.
  • All four commands are registered as CommandType.ReadModifyWrite. The reads
    (R.GETBIT / R.BITCOUNT / R.BITPOS) do not mutate state, but the RMW
    path is required so that NeedInitialUpdate is invoked on missing keys —
    the framework's Read path simply returns nil otherwise. Missing-key
    responses (0 / -1) are written from NeedInitialUpdate which then
    returns false to decline key creation.
  • NeedInitialUpdate error paths use writer.WriteError(...) + return false
    rather than AbortWithErrorMessage (which returns true and would cause the
    framework to proceed into InitialUpdater/Updater, double-writing the
    response and corrupting the protocol stream).

Tests

| Suite | Count | Status |
|---|---|---|
| RoaringBitmapDataTests (pure data structure) | 27 | ✅ Pass |
| RespRoaringBitmapTests (RESP integration via SE.Redis) | 14 | ✅ Pass |

Coverage highlights:

  • Empty bitmap, single bit, idempotent set/clear.
  • Promotion threshold (4096 → 4097) and demotion across both directions.
  • 100 K random ops vs HashSet<uint> oracle.
  • Boundary offsets 0, 65535, 65536, 2³¹, 2³²-1.
  • Serialize → deserialize round-trip equality (empty / sparse / dense / mixed).
  • RESP-level: R.SETBIT/R.GETBIT parity with oracle, R.BITCOUNT,
    R.BITPOS (set / unset / from offset), large offsets, persistence across
    restart, concurrent setbits from multiple clients, error paths
    (bad offset, bad bit, bad value, wrong arity), and the two-store key
    separation property.
$ dotnet test test\Garnet.test\Garnet.test.csproj -c Debug -f net8.0 \
    --filter "FullyQualifiedName~RoaringBitmap" --nologo
Passed!  - Failed: 0, Passed: 43, Skipped: 0, Total: 43

Known limitations (intentional v1 scope)

  • Run container is not implemented. It would add ~30% extra compression on
    contiguous ranges; the array/bitmap pair captures the bulk of real-world
    savings.
  • R.BITOP AND/OR/XOR/NOT is not exposed. The data structure supports
    these natively; only command surface is needed.
  • Empty-key removal: clearing the last bit leaves an empty bitmap object
    rather than removing the key. This is a property of the custom-object
    framework's tombstone path (output.HasRemoveKey is honoured only on the
    built-in path) and is best fixed in libs/server/Storage/Functions/ObjectStore
    in a separate PR.

Files

main/GarnetServer/Extensions/RoaringBitmap/
  Containers/IContainer.cs
  Containers/ArrayContainer.cs
  Containers/BitmapContainer.cs
  RoaringBitmap.cs
  RoaringBitmapObject.cs
  RoaringBitmapCommands.cs
main/GarnetServer/Program.cs                    (registration only)
test/Garnet.test/RoaringBitmapDataTests.cs      (27 tests)
test/Garnet.test/RespRoaringBitmapTests.cs      (14 tests)
website/docs/commands/roaring-bitmap.md         (docs)

Copilot AI review requested due to automatic review settings April 27, 2026 18:56
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new host-level Roaring Bitmap custom object extension to Garnet, including a compressed bitmap data structure, a RoaringBitmapObject wrapper, and four new R.* RESP commands, along with docs and tests.

Changes:

  • Introduces a pure C# Roaring Bitmap implementation with array/bitmap containers and versioned serialization.
  • Adds a Garnet custom object + RESP command implementations for R.SETBIT, R.GETBIT, R.BITCOUNT, and R.BITPOS, and registers them in the default server host.
  • Adds end-to-end RESP tests and data-structure unit tests, plus documentation for the new commands.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| website/docs/commands/roaring-bitmap.md | Adds user-facing documentation for the new Roaring Bitmap object and R.* commands. |
| test/Garnet.test/RoaringBitmapDataTests.cs | Adds unit tests for the standalone RoaringBitmap data structure (promotion/demotion, bitpos, serialization). |
| test/Garnet.test/RespRoaringBitmapTests.cs | Adds RESP-level integration tests for the new commands via StackExchange.Redis. |
| main/GarnetServer/Program.cs | Registers the Roaring Bitmap custom type and R.* commands in the default host. |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapObject.cs | Implements the Garnet custom-object wrapper (clone/serialize/size tracking). |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs | Implements argument parsing and the four RESP commands. |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmap.cs | Implements the roaring bitmap core, bit operations, enumeration, and serialization format. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/IContainer.cs | Defines the internal container abstraction and serialization kind enum. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/BitmapContainer.cs | Implements dense bitmap container behavior, popcount, and serialization. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/ArrayContainer.cs | Implements sparse sorted-array container behavior, promotion logic, and serialization. |

// uint32 universe — past the set range any value qualifies, so
// accept any value >= from that's not in the bits array.
ClassicAssert.GreaterOrEqual(actual0, from);
ClassicAssert.IsTrue(actual0 == 0 || (actual0 < bits.LongLength ? !bits[actual0] : true));

Copilot AI Apr 27, 2026

bits[actual0] will not compile because array indices must be int but actual0 is a long. Cast to int after checking bounds (or keep actual0 as an int for this bounded-universe test).

Suggested change
- ClassicAssert.IsTrue(actual0 == 0 || (actual0 < bits.LongLength ? !bits[actual0] : true));
+ ClassicAssert.IsTrue(actual0 == 0 || (actual0 >= 0 && actual0 < bits.LongLength ? !bits[(int)actual0] : true));

Comment on lines +181 to +189
/// <summary>Estimated heap byte cost. Excludes the .NET object header overhead and the SortedDictionary node overhead.</summary>
public long ByteSize
{
get
{
long sum = 24; // base object overhead estimate
foreach (var kv in chunks)
{
// Per-entry: key (2B), reference (8B), red-black tree node overhead (~40B), and container.

Copilot AI Apr 27, 2026

The ByteSize comment says it excludes SortedDictionary node overhead, but the implementation adds an estimate per entry (sum += 50 + kv.Value.ByteSize, including an RB-tree node estimate). Update the comment (or the calculation) so documentation matches what is reported/used for size accounting.

Suggested change
- /// <summary>Estimated heap byte cost. Excludes the .NET object header overhead and the SortedDictionary node overhead.</summary>
- public long ByteSize
- {
-     get
-     {
-         long sum = 24; // base object overhead estimate
-         foreach (var kv in chunks)
-         {
-             // Per-entry: key (2B), reference (8B), red-black tree node overhead (~40B), and container.
+ /// <summary>Estimated heap byte cost, including an approximate base-object cost and approximate per-entry <see cref="SortedDictionary{TKey, TValue}"/> node overhead.</summary>
+ public long ByteSize
+ {
+     get
+     {
+         long sum = 24; // Approximate RoaringBitmap instance/base-object cost.
+         foreach (var kv in chunks)
+         {
+             // Per-entry estimate: key (2B), reference (8B), red-black tree node overhead (~40B), and container.

Comment on lines +50 to +51
public override bool NeedInitialUpdate(ReadOnlyMemory<byte> key, ref ObjectInput input, ref RespMemoryWriter writer) => true;

Copilot AI Apr 27, 2026

RSetBit.NeedInitialUpdate always returns true. For a missing key, this means the object is created before Updater runs; if parsing fails, AbortWithErrorMessage still returns true, so the command errors but the empty key can remain created/size-tracked. Validate args in NeedInitialUpdate and return false (after writer.WriteError(...)) on bad input to avoid creating keys on invalid commands.

Suggested change
- public override bool NeedInitialUpdate(ReadOnlyMemory<byte> key, ref ObjectInput input, ref RespMemoryWriter writer) => true;
+ public override bool NeedInitialUpdate(ReadOnlyMemory<byte> key, ref ObjectInput input, ref RespMemoryWriter writer)
+ {
+     var validationInput = input;
+     int offset = 0;
+     var offsetArg = GetNextArg(ref validationInput, ref offset);
+     var bitArg = GetNextArg(ref validationInput, ref offset);
+     if (!RoaringBitmapArgs.TryParseUInt32(offsetArg, out _))
+     {
+         writer.WriteError(ErrOffset);
+         return false;
+     }
+     if (!RoaringBitmapArgs.TryParseBit(bitArg, out _))
+     {
+         writer.WriteError(ErrValue);
+         return false;
+     }
+     return true;
+ }

Comment thread main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs
Comment thread modules/RoaringBitmap/RoaringBitmapObject.cs
@crprashant
Author

Thanks for the thorough review! Pushed 65ac9f1d4 addressing each comment. Summary:

| # | File / line | Resolution |
|---|---|---|
| 1 | RoaringBitmapDataTests.cs:175 (bits[actual0] index) | Applied. (int)actual0 cast plus an explicit actual0 >= 0 guard. (Note: array indexers in C# do accept long per the language spec, which is why the original compiled, but the cast makes intent obvious and removes the runtime OverflowException risk.) |
| 2 | RoaringBitmap.cs:189 (ByteSize doc/impl mismatch) | Applied. Updated the XML doc to state that the per-entry SortedDictionary node overhead is included (the implementation was already correct; only the comment was misleading). |
| 3 | RoaringBitmapCommands.cs:51 (RSetBit.NeedInitialUpdate always returns true) | Applied. NeedInitialUpdate now validates offset and value against a copy of the ObjectInput (it's a struct, so var validation = input; snapshots it without disturbing what Updater sees). On bad input it writes the error and returns false, so a malformed R.SETBIT no longer creates an empty tombstone-style key. |
| 4 | RoaringBitmapCommands.cs:6 (using Garnet.common; flagged as unused) | Disagreed (with evidence). RespMemoryWriter lives in Garnet.common, and removing the using produces 8 × CS0246: The type or namespace name 'RespMemoryWriter' could not be found errors across all NeedInitialUpdate / Updater / Reader signatures. Restored. |
| 5 | RoaringBitmapObject.cs:29 (inconsistent default-ctor Size) | Applied. Default ctor now sets this.Size = ObjectOverhead + bitmap.ByteSize, matching the deserialized constructor so freshly-created and round-tripped objects report identical memory baselines and ByteSize-based mutation deltas don't double-count the empty-bitmap baseline. |

All 27 data-structure tests + 14 RESP integration tests still pass:

Passed!  - Failed: 0, Passed: 43, Skipped: 0, Total: 43

@crprashant
Author

@microsoft-github-policy-service agree company="Microsoft"

@badrishc
Collaborator

Thanks for your contribution! This extension is interesting, but instead of putting it in main, we should place it in https://github.com/microsoft/garnet/tree/main/modules (where e.g., GarnetJSON is kept) so that it is not bundled by default in the server.

@badrishc
Collaborator

badrishc commented Apr 28, 2026

Also, main is closed to new features, so we would request that you retarget your PR to the dev (v2) branch.

@crprashant crprashant force-pushed the feature/issue-1270-roaring-bitmap branch from 02743e5 to 66af335 Compare April 29, 2026 02:45
@crprashant crprashant changed the base branch from main to dev April 29, 2026 02:46
@crprashant
Author

Thanks for the review! Both points addressed in the latest force-push:

  1. Moved extension into modules/RoaringBitmap/, mirroring GarnetJSON: new GarnetRoaringBitmap.csproj, new RoaringBitmapModule : ModuleBase entry point that registers the factory + R.SETBIT / R.GETBIT / R.BITCOUNT / R.BITPOS, namespace renamed Garnet.Extensions.RoaringBitmap → GarnetRoaringBitmap to avoid the namespace/class collision, wired into Garnet.slnx and test/Garnet.test/Garnet.test.csproj. Nothing in main/GarnetServer references it anymore, so it is no longer bundled by default.
  2. Retargeted PR base to dev. Branch was rebased onto current upstream/dev and the CustomObjectFunctions / CustomObjectBase overrides updated to the new scoped ReadOnlySpan<byte> and HeapMemorySize APIs.

Local validation: all 43 RoaringBitmap tests (29 data + 14 RESP) pass on net8.0, dotnet format is clean.

Addresses PR review feedback from @badrishc:
- Move the extension from main/GarnetServer/Extensions/RoaringBitmap to
  modules/RoaringBitmap so it isn't bundled by default (mirrors GarnetJSON).
- Retarget the PR to dev (companion change).

Implementation changes for the move:
- New modules/RoaringBitmap/GarnetRoaringBitmap.csproj (mirrors GarnetJSON.csproj,
  signs assembly, exposes InternalsVisibleTo Garnet.test).
- New RoaringBitmapModule : ModuleBase entry point that registers the
  factory and the four R.SETBIT/R.GETBIT/R.BITCOUNT/R.BITPOS commands.
- Renamed namespace Garnet.Extensions.RoaringBitmap -> GarnetRoaringBitmap
  to avoid the namespace/class collision with class RoaringBitmap.
- Updated CustomObjectFunctions overrides to dev-branch
  scoped ReadOnlySpan<byte> signatures for NeedInitialUpdate / Updater.
- Updated RoaringBitmapObject to dev-branch CustomObjectBase ctor and
  HeapMemorySize accounting.
- Wired the module into Garnet.slnx and Garnet.test.csproj.
- Tests still register via server.Register.NewCommand in [SetUp] (in-process),
  matching the existing custom-object test pattern.
- Updated StringKeyAndCustomObjectKey_AreSeparate to expect WRONGTYPE on the
  unified store on dev.
@crprashant crprashant force-pushed the feature/issue-1270-roaring-bitmap branch from 66af335 to dd3e2da Compare April 30, 2026 02:27

<ItemGroup>
<ProjectReference Include="..\..\libs\server\Garnet.server.csproj" />
</ItemGroup>

Contributor

Let's also include our threading analyzers for consistency:
<PackageReference Include="Microsoft.VisualStudio.Threading.Analyzers" PrivateAssets="all" IncludeAssets="analyzers"/>

/// ascending order by high-key for deterministic serialization and efficient
/// scans (e.g., bit-position queries).
///
/// This class is NOT thread-safe; the parent RoaringBitmapObject (added in a

Contributor

Outdated.

/// <summary>
/// Sets bit <paramref name="value"/> to 1. Returns the previous value (0 or 1).
/// </summary>
public int Add(uint value)

Contributor

Strange to use an int for this, I'd switch to bool.

/// <summary>
/// Clears bit <paramref name="value"/>. Returns the previous value (0 or 1).
/// </summary>
public int Remove(uint value)

Contributor

Same note, prefer bool.

}

/// <summary>Convenience wrapper used by RESP SETBIT: dispatches to <see cref="Add"/> or <see cref="Remove"/>.</summary>
public int SetBit(uint offset, bool set) => set ? Add(offset) : Remove(offset);

Contributor

Same note, prefer bool.

@@ -0,0 +1,313 @@
// Copyright (c) Microsoft Corporation.

Contributor

I'd like to see a test that does concurrent reads while writes are in progress - I see a test for concurrent writes, which is not quite enough.

[Test]
public void EmptyBitmap_HasZeroCardinalityAndIsEmpty()
{
var rb = new global::GarnetRoaringBitmap.RoaringBitmap();

Contributor

nit: remove all this global:: stuff.

{
private static ReadOnlySpan<byte> ErrOffset => "ERR bit offset is not an unsigned 32-bit integer"u8;

public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer)

Contributor

This is more a question for @badrishc as I haven't played around with custom objects too much - is validation expected to go in NeedInitialUpdate like this?

It's unfortunate as it forces this read command to act like a write which will hurt throughput.

/// </summary>
public sealed class RBitCount : CustomObjectFunctions
{
public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer)

Contributor

Similar Q for @badrishc (and again in RBitPos) - using NeedInitialUpdate for "missing" is messy; can this be phrased as a Reader op instead?

remaining fast for membership and population-count queries.

The extension lives in `main/GarnetServer/Extensions/RoaringBitmap/` and is wired
into the default `GarnetServer` host. It introduces a new object type and four

Contributor

This is incorrect: RoaringBitmaps must be loaded manually (which is the correct behavior).


Development

Successfully merging this pull request may close these issues.

Bitmap compression

5 participants