
Garnet attribute query optimization for inline filter#1727

Draft
hailangx wants to merge 2 commits into main from two_queue_filter

Conversation


@hailangx hailangx commented Apr 22, 2026

Garnet-Side: Attribute Storage Design for Inline Filtering (Current Change)

Existing Attribute Store

The existing Garnet attribute store was designed for general-purpose access — attributes are stored as raw JSON keyed by external (user-facing) ID. This is the natural choice for a key-value store: the user inserts a vector with key "doc:42" and attributes {"year": 2021, "genre": "action"}, so the attributes are stored under that same key. This store serves RESP command operations (e.g., VGETATTR) and remains unchanged.

However, this store creates a mismatch with how DiskANN's graph traversal operates during inline filtering. DiskANN works entirely in internal ID space — every candidate is a uint32 internal ID. To evaluate a filter using only the existing store, the callback must:

  1. Read ExternalIdMap[internal_id] → translate the internal ID to the external key (one Garnet store read)
  2. Read Attributes[external_key] → fetch the raw JSON payload (second Garnet store read)
  3. Parse JSON at query time: ExtractFields() runs a JSON tokenizer to locate and parse the fields referenced by the filter expression

With inline filtering, this callback runs on every candidate the graph traversal considers (potentially thousands per query). The two store reads and JSON parsing per candidate become the dominant cost on the hot path.
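
To make the shape of that hot path concrete, here is a minimal C# sketch of the baseline evaluation. The dictionaries stand in for the two Garnet store reads, `System.Text.Json` stands in for ExtractFields(), and the hard-coded `.year >= 2021` predicate is purely illustrative; none of this is the actual callback implementation.

```csharp
using System.Collections.Generic;
using System.Text.Json;

static class BaselineFilterSketch
{
    // Returns true if the candidate passes a hard-coded `.year >= 2021` filter.
    public static bool PassesYearFilter(
        uint internalId,
        Dictionary<uint, string> externalIdMap,   // internal ID -> external key
        Dictionary<string, string> attributes)    // external key -> raw JSON
    {
        // 1. ID translation: internal DiskANN ID -> external (user-facing) key.
        if (!externalIdMap.TryGetValue(internalId, out var externalKey)) return false;

        // 2. Second store read: raw JSON payload keyed by the external key.
        if (!attributes.TryGetValue(externalKey, out var json)) return false;

        // 3. JSON parse at query time to pull out the field the filter references.
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.TryGetProperty("year", out var year)
            && year.ValueKind == JsonValueKind.Number
            && year.GetDouble() >= 2021;
    }
}
```

Every line of this runs again for every candidate the traversal touches, which is exactly the cost the new store is meant to remove.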

Solution: Add a second attribute store optimized for query-time filter evaluation

The current change adds a new attribute store alongside the existing one. The two stores serve different purposes:

| Store | Keyed by | Format | Purpose |
| --- | --- | --- | --- |
| Existing | External ID (user key) | Raw JSON | RESP command operations (VGETATTR, VSETATTR, etc.) |
| New | Internal ID (DiskANN ID) | Binary | Inline filter evaluation at query time |

The existing external ID keyed JSON store is untouched — it continues to serve all RESP command operations. The new internal ID keyed binary store is a write-time derived projection of the same data, optimized purely for the inline filter callback's access pattern.

Why key by internal ID

DiskANN hands the callback an internal ID; the existing attribute store expects an external key. Bridging this gap requires reading the ExternalIdMap — a store read that exists purely because of the keying mismatch. By adding a store keyed by internal ID, the filter callback can look up attributes directly without any ID translation. This eliminates the ExternalIdMap read entirely — one fewer store read per candidate.

Why store in binary format

Raw JSON forces parsing on every candidate at query time. Extracting a numeric field like .year requires scanning for the key, skipping whitespace, and parsing a number string into a double. This work is repeated identically for every candidate, every query. The JSON structure does not change between queries — this is wasted work.

The binary store shifts the cost of JSON parsing from query time to ingestion time:

  • At ingestion (vector insert/update): JSON is parsed once and converted to binary via ConvertJsonToBinary(). The binary format is [0xFF marker][field count][per-field: name_len, name, type_tag, value_len, value_bytes], with numbers pre-converted to 8-byte LE f64. This is a one-time cost, written to the new store alongside the existing JSON store (a sketch of this layout follows the list).
  • At query time (per-candidate): ExtractFieldsBinary() performs a direct scan over length-prefixed fields. No JSON tokenizer. Field names compared as raw byte spans. Numbers read directly as f64 — no string parsing. ~10× faster than JSON extraction.
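
Below is a minimal sketch of the write-time conversion under the layout described above, not the actual ConvertJsonToBinary(). The 0xFF marker and the 8-byte LE f64 number encoding come from the description; the UInt16 count/length prefixes and the concrete type-tag values are assumptions made for illustration, and the JSON is assumed to be already parsed into (name, value) pairs.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class BinaryAttributeSketch
{
    const byte Marker = 0xFF;     // distinguishes binary payloads from raw JSON
    const byte TagNumber = 0x01;  // assumed type-tag values
    const byte TagString = 0x02;

    // Encode already-parsed (name, value) pairs as
    // [0xFF][field count][per-field: name_len, name, type_tag, value_len, value_bytes].
    // Field count and length prefixes are assumed to be little-endian UInt16.
    public static byte[] Encode(IReadOnlyList<(string Name, object Value)> fields)
    {
        using var ms = new MemoryStream();
        using var w = new BinaryWriter(ms);   // BinaryWriter emits little-endian
        w.Write(Marker);
        w.Write((ushort)fields.Count);
        foreach (var (name, value) in fields)
        {
            byte[] nameBytes = Encoding.UTF8.GetBytes(name);
            w.Write((ushort)nameBytes.Length);
            w.Write(nameBytes);
            if (value is double d)
            {
                // Numbers are converted to 8-byte LE f64 once, at ingestion,
                // so the query-time reader never parses a number string.
                w.Write(TagNumber);
                w.Write((ushort)sizeof(double));
                w.Write(d);
            }
            else
            {
                byte[] valueBytes = Encoding.UTF8.GetBytes((string)value);
                w.Write(TagString);
                w.Write((ushort)valueBytes.Length);
                w.Write(valueBytes);
            }
        }
        return ms.ToArray();
    }
}
```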

Since each vector is inserted once but may be evaluated as a candidate across thousands of queries, this tradeoff — pay more at write, pay less at read — is the correct one for a read-heavy similarity search workload.

Per-candidate callback comparison

Without binary attribute store (2 store reads + JSON parse per candidate):
  1. Read ExternalIdMap[internal_id] → external key       ← ID translation
  2. Read Attributes[external_key] → JSON bytes           ← existing JSON store
  3. ExtractFields(json, selectors) → field values         ← JSON parse at query time
  4. ExprRunner.Run(program) → bool

With binary attribute store (1 store read + binary scan per candidate):
  1. Read BinaryAttributes[internal_id] → binary bytes     ← new store, direct lookup
  2. ExtractFieldsBinary(binary, selectors) → field values ← pre-parsed, ~10× faster
  3. ExprRunner.Run(program) → bool
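
A matching sketch of the query-time scan, an ExtractFieldsBinary() analogue rather than the real one, using the same assumed prefix widths and tag values as the encoding sketch: it walks the length-prefixed fields, compares names as raw byte spans, and reads numbers directly as little-endian doubles.

```csharp
using System;
using System.Buffers.Binary;

static class BinaryAttributeScanSketch
{
    // Returns the numeric value of `fieldName` (UTF-8 bytes), or null if the field
    // is absent or not a number. Offsets and widths match the encoding sketch above.
    public static double? ExtractNumber(ReadOnlySpan<byte> payload, ReadOnlySpan<byte> fieldName)
    {
        if (payload.Length < 3 || payload[0] != 0xFF) return null;  // not a binary payload
        int fieldCount = BinaryPrimitives.ReadUInt16LittleEndian(payload.Slice(1));
        int pos = 3;
        for (int i = 0; i < fieldCount; i++)
        {
            int nameLen = BinaryPrimitives.ReadUInt16LittleEndian(payload.Slice(pos)); pos += 2;
            ReadOnlySpan<byte> name = payload.Slice(pos, nameLen);                     pos += nameLen;
            byte tag = payload[pos];                                                   pos += 1;
            int valueLen = BinaryPrimitives.ReadUInt16LittleEndian(payload.Slice(pos)); pos += 2;
            // Field names are compared as raw byte spans; numbers are read directly
            // as little-endian doubles, with no string parsing on the hot path.
            if (tag == 0x01 /* number */ && name.SequenceEqual(fieldName))
                return BinaryPrimitives.ReadDoubleLittleEndian(payload.Slice(pos, 8));
            pos += valueLen;  // skip the value of a non-matching field
        }
        return null;
    }
}
```

With the payload already keyed by internal ID, this scan plus the expression evaluation is the entire per-candidate attribute work: no ID translation, no JSON tokenizer, no number-string parsing.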

Summary of inline filter per-candidate cost

| Aspect | Only external ID keyed JSON attribute store | Current change (internal ID keyed binary attribute) | Further optimization (co-locate binary attribute with vector data) |
| --- | --- | --- | --- |
| Store reads per candidate | 2 (ExternalIdMap + Attributes) | 1 (Attributes only) | 0 (already accessible during traversal) |
| ID translation | Required (internal → external) | Eliminated (keyed by internal ID) | Eliminated |
| Field extraction | JSON parse at query time | Binary scan (~10× faster) | Binary scan (~10× faster) |
| Parse cost paid at | Query time (per candidate, per query) | Ingestion time (once per insert) | Ingestion time (once per insert) |
| Total per-candidate overhead | 2 reads + JSON parse + eval | 1 read + binary scan + eval | Binary scan + eval |

Further optimization: Co-locate attributes with vector data

@hailangx hailangx changed the title commit the draft for two queue filter Inline filter attribute query optimizaiton Apr 22, 2026
@hailangx hailangx changed the title Inline filter attribute query optimizaiton attribute query optimizaiton for inline filter Apr 22, 2026
@hailangx hailangx changed the title attribute query optimizaiton for inline filter Garnet attribute query optimization for inline filter Apr 22, 2026