Introduce RFC for RDB format #28
base: main
Conversation
> ## Design considerations
>
> - If no new features are used, we should support full forwards compatibility
> If no new features are used, we should support full forwards compatibility

This statement has an implication that we shouldn't change the encoding outside of the new features, and would preclude future occurrences of `ziplist` to `listpack` conversion.
Just noticed that the RFC is filed under the valkey-search repo so I assume the scope is limited to this particular module? Or are you thinking about other modules or even perhaps RDB in general?
With a thoughtful design of RDB format, we can support both backward and forward compatibility even when new features are introduced. For example, we can save data section by section. Each section is identified by a unique opcode. Each section's data format could be: opcode-count-blobs, where count means number of following blobs. There will be one loader function per opcode. Backward compatibility can be achieved by always maintaining the past opcodes and loader functions. Forward compatibility can be achieved by skipping the section whose opcode is unknown.
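To make the shape concrete, here is a minimal sketch of that loading loop (the reader type and names below are placeholders, not an actual API):

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Placeholder reader interface; not the module's actual RDB API.
struct RdbReader {
  uint32_t ReadOpcode();
  uint32_t ReadCount();
  std::string ReadBlob();
  bool AtEnd();
};

using SectionLoader = std::function<void(const std::vector<std::string>&)>;

// Load sections written as opcode-count-blobs. Known opcodes dispatch to
// their loader function; unknown opcodes have their blobs consumed and are
// skipped, which is what gives forward compatibility.
void LoadSections(RdbReader& in,
                  const std::unordered_map<uint32_t, SectionLoader>& loaders) {
  while (!in.AtEnd()) {
    const uint32_t opcode = in.ReadOpcode();
    const uint32_t count = in.ReadCount();
    std::vector<std::string> blobs;
    blobs.reserve(count);
    for (uint32_t i = 0; i < count; ++i) blobs.push_back(in.ReadBlob());
    if (auto it = loaders.find(opcode); it != loaders.end()) {
      it->second(blobs);
    }
  }
}
```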
> Just noticed that the RFC is filed under the valkey-search repo so I assume the scope is limited to this particular module? Or are you thinking about other modules or even perhaps RDB in general?

Right - this is just for ValkeySearch. The Valkey engine RDB format is not included in this proposal.

> This statement has an implication that we shouldn't change the encoding outside of the new features

Right - I think for ValkeySearch this will be the default stance. If we need to change the encoding in a way that would prevent a clean downgrade - we will have to bump the semantic version.
rfc/rdb-format.md (Outdated)
> ## Motivation
>
> Our existing RDB format is a good start, but it also is fairly rigid, not supporting new types of data being stored in the RDB other than index definitions and index contents.
I think we need to articulate the real use cases here. Being rigid on its own is neutral IMO - we do want the car's frame to be rigid :)
There are items that have been discussed that could all lead to changes in the RDB format.
- Save operations that retain the in-flight state of the ingestion pipeline.
- Support for sharing of data between the search module and core (initially Vector data, but potentially other data) -- aka "dedup".
Unlike a car frame which is replaced when you buy a new car - we will need to support the same RDB format for the foreseeable future. It is more like if the same car frame for the 2000 Honda Civic had to work for the 2001, 2002, 2003, etc :)
But let me add some examples here. Allen has a good start.
rfc/rdb-format.md (Outdated)
> ## Design considerations
>
> - If no new features are used, we should support full forwards compatibility
> - If new features are used, we should ensure this lack of compatibility is understood on previous versions, and fast fail the RDB load
We should support the concept of mandatory and non-mandatory fields. Presence of a mandatory field leads to load failure on older instances, while loading passes for non-mandatory fields.
I think it's always difficult to predict the future. Having an extensible format for storing things like index definitions allows old code to ignore fields it doesn't understand -- here using protobufs for index defs was a good choice as it explicitly includes this design concept. Extending this concept to enable additional ignorable sections of the RDB area on a global, per index-schema, and per-field-index basis will also increase flexibility there.

I would also propose that we include a version number that's structured according to the principles of semantic versioning. Thus, when new fields or sections are added which can be safely ignored by older versions, we can simply increment a minor version number. Changes or additions that can't be read by old code can increment the major version number.
> I would also propose that we include a version number that's structured according to the principles of semantic versioning. Thus, when new fields or sections are added which can be safely ignored by older versions, we can simply increment a minor version number. Changes or additions that can't be read by old code can increment the major version number.

I agree. I didn't call out semver here but I think we should follow it. We also should have the flexibility to change the index representation as we see fit, but it may require an index rebuild in the case of downgrade. I think this is fine so long as semver is bumped and we provide a way to mark index contents as "not required" (perhaps via a config, so as to not accidentally result in an expensive loss of the index?)
>     IndexSchema index_schema_contents = 2;
>     ...
>   };
>   uint32 supplemental_count = 3;
Should it be `uint64`?
rfc/rdb-format.md (Outdated)
>   required: true,
>   enc_version: 1,
> }
> <key_to_id_mapping_dump>
- How would the reader know when the `key_to_id_mapping_dump` ends, especially if the reader may not support the SupplementalContentHeader type? It may make sense for the dump to be chunked, with the chunk length kept in each chunk header; a chunk length of zero indicates the last chunk (see the sketch after this list).
- Say that a new field is added to each map value. How would this schema maintain backward/forward compatibility?
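A rough sketch of the chunked framing (placeholder types, not a concrete API):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Placeholder writer/reader interfaces; not the module's actual RDB API.
struct RdbWriter {
  void WriteLength(size_t len);
  void WriteBytes(const void* data, size_t len);
};
struct RdbReader {
  size_t ReadLength();
  void Skip(size_t len);
};

// Each chunk is <length><payload>; a zero length terminates the dump.
void WriteChunkedDump(RdbWriter& out, const std::vector<std::string>& chunks) {
  for (const auto& chunk : chunks) {
    out.WriteLength(chunk.size());
    out.WriteBytes(chunk.data(), chunk.size());
  }
  out.WriteLength(0);  // end-of-dump marker
}

// A reader that does not understand the payload can still skip past it.
void SkipChunkedDump(RdbReader& in) {
  for (size_t len = in.ReadLength(); len != 0; len = in.ReadLength()) {
    in.Skip(len);
  }
}
```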
My proposed format easily supports this.
In general I'd like to see this approached more abstractly. I think as outlined here it's relatively easy to add new 1-dimensional sections. But I believe we might need the ability to add multi-dimensional sections in the future; for example, suppose I wanted to insert a two-dimensional table into the RDB. Am I forced to manually serialize this into a series of strings, or can we have a format that's more extensible? Examples include RESP, JSON, etc.
rfc/rdb-format.md (Outdated)
> message RDBSection {
>   RDBSectionType type = 1;
>   bool required;
>   uint32 encoding_version;
>   oneof contents {
>     IndexSchema index_schema_contents = 2;
>     ...
>   };
>   uint32 supplemental_count = 3;
I think the per-section definition should be more generic than a protobuf. For example, this could make streaming of a new section essentially impossible. I think it makes more sense to generically design a new section as something easily extensible like an array of RDB strings. This is easily skipped by any code that doesn't understand the leading enum type-marker. It easily supports streaming and nothing prevents a new section from being 1 string that's interpreted as a protobuf when that's the right encoding.
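Roughly what I have in mind, as a sketch (placeholder names; a protobuf-encoded section is just the single-string case):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Placeholder writer interface; not the module's actual RDB API.
struct RdbWriter {
  void WriteUnsigned(uint64_t v);
  void WriteString(const std::string& s);
};

// A section is a leading type marker followed by an array of RDB strings.
// Code that does not recognize the marker can read and discard the strings;
// code that does can interpret them however it likes, including treating a
// single string as a serialized protobuf.
void WriteSection(RdbWriter& out, uint64_t type_marker,
                  const std::vector<std::string>& strings) {
  out.WriteUnsigned(type_marker);
  out.WriteUnsigned(strings.size());
  for (const auto& s : strings) out.WriteString(s);
}
```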
I'll start by saying I am not opposed to dropping protobuf from the top level. But for this proposal my goal was to encourage use of protocol buffers whenever possible to force contributors to think about backwards and forwards compatibility as we add to these RDB contents. This design is based on the assumption that having it all wrapped in a protocol buffer will make using protocol buffers easier than not using them. If the top-level concept is a binary blob, the path of least resistance (and thus the de facto approach) for new content will be just to dump some strings into the RDB. If the top-level concept is a protocol buffer, the path of least resistance will be including the contents in that protocol buffer, and adding binary blobs will be the exceptional case for when that doesn't make sense. For the reasons previously stated, I would prefer the de facto approach to be using protocol buffers, but I am open to counterpoints.

> this could make streaming of a new section essentially impossible

The supplemental section is supposed to address the concerns about streaming of new sections. The supplemental section is essentially what you are proposing with the array of RDB strings (although the design uses an EOF marker for simplicity, it is conceptually similar).
First, I'm fully in favor of having protobufs in certain places -- they are a good solution for a wide range of problems. I'm objecting to having them be required everywhere, because there are known places where this will result in hundreds of millions of useless data copies, serializations, etc., which I think will ultimately have a material impact on load times -- which is bad for availability for all of us.

For example, the HNSW key fixup table will have one or two small values for each HNSW slot. Doing a protobuf for each and every slot will cost substantial performance. This is a case where the flexibility of protobufs is unlikely to be advantageous. With the current proposal the only way to recover this performance would be to batch up blocks of slots into a single blob, which complicates code unnecessarily, when the simplicity of using the raw RDB I/O functions is much faster and clearer.

If we found out that in the future we wanted to add to this struct (maybe by pre-computing 1/sqrt per slot -- just an example) we could easily just use a different supplemental section opcode and write the table out twice.
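For illustration, a sketch of the raw-I/O path (the slot struct is made up for this example, and it assumes the standard module RDB save calls):

```cpp
#include <cstdint>
#include <vector>

#include "valkeymodule.h"  // ValkeyModuleIO / ValkeyModule_SaveUnsigned

// Hypothetical fixed-shape per-slot entry, just for the example.
struct SlotEntry {
  uint64_t key_id;
  uint64_t internal_id;
};

// Raw save path: two unsigned writes per slot, with no per-slot protobuf
// construction, serialization, or extra copying.
void SaveSlotTable(ValkeyModuleIO* io, const std::vector<SlotEntry>& slots) {
  ValkeyModule_SaveUnsigned(io, slots.size());
  for (const auto& slot : slots) {
    ValkeyModule_SaveUnsigned(io, slot.key_id);
    ValkeyModule_SaveUnsigned(io, slot.internal_id);
  }
}
```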
rfc/rdb-format.md (Outdated)
> `message IndexSchema` follows from the existing definition in src/index_schema.proto.
>
> The goal of breaking into sections is to support skipping optional sections if they are not understood. New sections should be introduced in a manner where failure to understand the new section will generally load fine without loss. Any time that failure to load the section would result in some form of lossiness or inconsistency, we will mark `required` as true and it will result in a downgrade failure. This would only be desired in cases where operators have used new features, and need to think about the downgrade more critically, potentially removing (or, once supported, altering) indexes that will not downgrade gracefully. Similarly, for section types that would like to introduce new required fields, we will include an encoding version, which is conditionally bumped when these new fields are written.
Conceptually this is nice. However, at the implementation level I am concerned that as the number of "required" fields multiplies, the testing matrix becomes unaffordable. IMO, semantic versioning works well enough here.
Can you explain how you would remove "required" by introducing semantic versioning? I would also like to reduce the section header bloat if possible (for test matrix and for complexity reduction)
Here's one thought. Suppose that we tagged ALL headers with a minimum semver, under the assumption that all headers -- even headers for sections that are skipped because they are unknown -- are visited. Then we can do a global MIN on all of the semvers that we encounter. If the current code base is below that semver, then it's not able to process this RDB.

I do dislike the section header bloat BUT it collapses the testing matrix for backward compatibility.

It also solves another problem. Imagine a section that has a feature added that, when used, forces the 'required' bit for that section. So you have versions A and B. What if you add even MORE functionality to that section, i.e., version C? The old code that supports A and B can't tell that it doesn't support C. Replacing the required bit with a semver solves that problem.
Yeah I think this is a good simplification. Basically, synthesize all the "Required" and "encoding version" headers into a single "minimum semantic version". Can we do the "max of all mins" on the writer side though? I think it should be pretty easy to synthesize it once per dump, rather than once per section. WDYT?
In practice, it probably makes sense to just have a ComputeMinimumSemanticVersion function that iterates all indexes and computes the minimum semantic version based on feature usage. This centralization of the logic probably makes it easier to understand than having to venture into each section's logic and understand how it computes the semantic version.
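Something like the following, as a sketch (the index hook and the names here are hypothetical):

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

struct SemVer {
  int major = 0, minor = 0, patch = 0;
  bool operator<(const SemVer& other) const {
    return std::tie(major, minor, patch) <
           std::tie(other.major, other.minor, other.patch);
  }
};

// Placeholder stand-in for the real index schema object.
struct IndexSchema { /* fields elided */ };

// Hypothetical per-index hook: reports the lowest version that can still
// load this index, based on which features it actually uses.
SemVer MinimumVersionForIndex(const IndexSchema& index);

// Writer side: "max of all mins", computed once per dump.
SemVer ComputeMinimumSemanticVersion(const std::vector<IndexSchema>& indexes) {
  SemVer required{};  // nothing new used -> oldest supported version
  for (const auto& index : indexes) {
    required = std::max(required, MinimumVersionForIndex(index));
  }
  return required;
}

// Reader side: refuse the load if this binary is older than the dump needs.
bool CanLoad(SemVer required, SemVer supported) { return !(supported < required); }
```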
I think the "save" framework should force the individual sections to declare a semver. Then the framework can take care of doing the "max of all mins" and dump a value at the end. Similarly, the restore framework should tell each section the version it found as well as doing a global check at the end.
> `message IndexSchema` follows from the existing definition in src/index_schema.proto.
>
> The goal of breaking into sections is to support skipping optional sections if they are not understood. New sections should be introduced in a manner where failure to understand the new section will generally load fine without loss. Any time that failure to load the section would result in some form of lossiness or inconsistency, we will mark `required` as true and it will result in a downgrade failure. This would only be desired in cases where operators have used new features, and need to think about the downgrade more critically, potentially removing (or, once supported, altering) indexes that will not downgrade gracefully. Similarly, for section types that would like to introduce new required fields, we will include an encoding version, which is conditionally bumped when these new fields are written.
I don't see the advantage in differentiating between protobuf content and supplemental content when it's trivially easy to have one uber-format that easily supports both types of data.
We could flatten the two - but my assumption here is that 90% of our data will belong to the index schema. What I like about the current proposal is that adding new sections to the index schema is supported natively by adding supplemental content sections. If we flatten this and have everything at the top level, we need some way to map payloads to index schemas. It's not hard to just dump the index schema name and DB number (which together compose a unique key for us to look up the index schema), but given that the majority of changes are likely to be part of the index schema contents, having first-class support for composition in the RDB format through supplemental sections will reduce the complexity.

An example might help. To me, the nested example seems less complex. But it may be a matter of preference:

Nested | Flattened
rfc/rdb-format.md (Outdated)
> #### Example: Adding Vector Quantization
>
> With the above design, suppose that we are substantially changing the index to support a vector quantization option on `FT.CREATE`. For simplicity, suppose this is just a boolean "on" or "off" flag.
I think if we do VQ, it'll present itself as a new algo, i.e., HNSW, FLAT and VQ (or something like that). That new vector index sub-type will likely have lots of blobs and special data structures. I doubt we can accurately predict that. Rather, we should focus on the more generic ability to add sections on a per-field-index basis.
re:VQ - The goal was just to demonstrate what happens when the index format may change. But it is a simplified example. I am not proposing this is how quantization would be added.
If you have a better example, happy to change it :)