Introduce RFC for RDB format #28
base: main
Conversation
> ## Design considerations
>
> - If no new features are used, we should support full forwards compatibility
> If no new features are used, we should support full forwards compatibility

This statement has an implication that we shouldn't change the encoding outside of the new features, and would preclude future occurrences of `ziplist` to `listpack` conversion.
Just noticed that the RFC is filed under the valkey-search repo so I assume the scope is limited to this particular module? Or are you thinking about other modules or even perhaps RDB in general?
With a thoughtful design of RDB format, we can support both backward and forward compatibility even when new features are introduced. For example, we can save data section by section. Each section is identified by a unique opcode. Each section's data format could be: opcode-count-blobs, where count means number of following blobs. There will be one loader function per opcode. Backward compatibility can be achieved by always maintaining the past opcodes and loader functions. Forward compatibility can be achieved by skipping the section whose opcode is unknown.
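To make the shape concrete, here is a minimal sketch of that loading loop (the reader type and names below are placeholders, not an actual API):

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Placeholder reader interface; not the module's actual RDB API.
struct RdbReader {
  uint32_t ReadOpcode();
  uint32_t ReadCount();
  std::string ReadBlob();
  bool AtEnd();
};

using SectionLoader = std::function<void(const std::vector<std::string>&)>;

// Load sections written as opcode-count-blobs. Known opcodes dispatch to
// their loader function; unknown opcodes have their blobs consumed and are
// skipped, which is what gives forward compatibility.
void LoadSections(RdbReader& in,
                  const std::unordered_map<uint32_t, SectionLoader>& loaders) {
  while (!in.AtEnd()) {
    const uint32_t opcode = in.ReadOpcode();
    const uint32_t count = in.ReadCount();
    std::vector<std::string> blobs;
    blobs.reserve(count);
    for (uint32_t i = 0; i < count; ++i) blobs.push_back(in.ReadBlob());
    if (auto it = loaders.find(opcode); it != loaders.end()) {
      it->second(blobs);
    }
  }
}
```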
> Just noticed that the RFC is filed under the valkey-search repo so I assume the scope is limited to this particular module? Or are you thinking about other modules or even perhaps RDB in general?

Right - this is just for ValkeySearch. The Valkey engine RDB format is not included in this proposal.

> This statement has an implication that we shouldn't change the encoding outside of the new features

Right - I think for ValkeySearch this will be the default stance. If we need to change the encoding in a way that would prevent a clean downgrade - we will have to bump the semantic version.
rfc/rdb-format.md (Outdated)
> ## Motivation
>
> Our existing RDB format is a good start, but it also is fairly rigid, not supporting new types of data being stored in the RDB other than index definitions and index contents.
I think we need to articulate the real use cases here. Being rigid on its own is neutral IMO - we do want the car's frame to be rigid :)
There are items that have been discussed that could all lead to changes in the RDB format.
- Save operations that retain the in-flight state of the ingestion pipeline.
- Support for sharing of data between the search module and core (initially Vector data, but potentially other data) -- aka "dedup".
Unlike a car frame which is replaced when you buy a new car - we will need to support the same RDB format for the foreseeable future. It is more like if the same car frame for the 2000 Honda Civic had to work for the 2001, 2002, 2003, etc :)
But let me add some examples here. Allen has a good start.
rfc/rdb-format.md (Outdated)
> ## Design considerations
>
> - If no new features are used, we should support full forwards compatibility
> - If new features are used, we should ensure this lack of compatibility is understood on previous versions, and fast fail the RDB load
We should support the concept of mandatory and non-mandatory fields. Presence of a mandatory field leads to load failure on older instances, while loading passes for non-mandatory fields.
I think it's always difficult to predict the future. Having an extensible format for storing things like index definitions allows old code to ignore fields it doesn't understand -- here using protobufs for index defs was a good choice as it explicitly includes this design concept. Extending this concept to enable additional ignorable sections of the RDB area on a global, per index-schema, and per-field-index basis will also increase flexibility there.

I would also propose that we include a version number that's structured according to the principles of semantic versioning. Thus, when new fields or sections are added which can be safely ignored by older versions, we can simply increment a minor version number. Changes or additions that can't be read by old code can increment the major version number.
> I would also propose that we include a version number that's structured according to the principles of semantic versioning. Thus, when new fields or sections are added which can be safely ignored by older versions, we can simply increment a minor version number. Changes or additions that can't be read by old code can increment the major version number.

I agree. I didn't call out semver here but I think we should follow it. We also should have the flexibility to change the index representation as we see fit, but it may require an index rebuild in the case of downgrade. I think this is fine so long as semver is bumped and we provide a way to mark index contents as "not required" (perhaps via a config, so as to not accidentally result in an expensive loss of the index?)
>     IndexSchema index_schema_contents = 2;
>     ...
>   };
>   uint32 supplemental_count = 3;
Should it be `uint64`?
rfc/rdb-format.md (Outdated)
>   required: true,
>   enc_version: 1,
> }
> <key_to_id_mapping_dump>
- How would the reader know when the `key_to_id_mapping_dump` ends, especially if the reader may not support the SupplementalContentHeader type? It may make sense for the dump to be chunked, with the chunk length kept in each chunk header; a chunk length of zero indicates the last chunk (see the sketch after this list).
- Say that a new field is added to each map value. How would this schema maintain backward/forward compatibility?
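A rough sketch of the chunked framing (placeholder types, not a concrete API):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Placeholder writer/reader interfaces; not the module's actual RDB API.
struct RdbWriter {
  void WriteLength(size_t len);
  void WriteBytes(const void* data, size_t len);
};
struct RdbReader {
  size_t ReadLength();
  void Skip(size_t len);
};

// Each chunk is <length><payload>; a zero length terminates the dump.
void WriteChunkedDump(RdbWriter& out, const std::vector<std::string>& chunks) {
  for (const auto& chunk : chunks) {
    out.WriteLength(chunk.size());
    out.WriteBytes(chunk.data(), chunk.size());
  }
  out.WriteLength(0);  // end-of-dump marker
}

// A reader that does not understand the payload can still skip past it.
void SkipChunkedDump(RdbReader& in) {
  for (size_t len = in.ReadLength(); len != 0; len = in.ReadLength()) {
    in.Skip(len);
  }
}
```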
My proposed format easily supports this.
In general I'd like to see this approached more abstractly. I think as outlined here it's relatively easy to add new 1-dimensional sections. But I believe we might need the ability to add multi-dimensional sections in the future; for example, suppose I wanted to insert a two-dimensional table into the RDB. Am I forced to manually serialize this into a series of strings, or can we have a format that's more extensible? Examples include RESP, JSON, etc.
rfc/rdb-format.md (Outdated)
> message RDBSection {
>   RDBSectionType type = 1;
>   bool required;
>   uint32 encoding_version;
>   oneof contents {
>     IndexSchema index_schema_contents = 2;
>     ...
>   };
>   uint32 supplemental_count = 3;
I think the per-section definition should be more generic than a protobuf. For example, this could make streaming of a new section essentially impossible. I think it makes more sense to generically design a new section as something easily extensible like an array of RDB strings. This is easily skipped by any code that doesn't understand the leading enum type-marker. It easily supports streaming and nothing prevents a new section from being 1 string that's interpreted as a protobuf when that's the right encoding.
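Roughly what I have in mind, as a sketch (placeholder names; a protobuf-encoded section is just the single-string case):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Placeholder writer interface; not the module's actual RDB API.
struct RdbWriter {
  void WriteUnsigned(uint64_t v);
  void WriteString(const std::string& s);
};

// A section is a leading type marker followed by an array of RDB strings.
// Code that does not recognize the marker can read and discard the strings;
// code that does can interpret them however it likes, including treating a
// single string as a serialized protobuf.
void WriteSection(RdbWriter& out, uint64_t type_marker,
                  const std::vector<std::string>& strings) {
  out.WriteUnsigned(type_marker);
  out.WriteUnsigned(strings.size());
  for (const auto& s : strings) out.WriteString(s);
}
```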
I'll start by saying I am not opposed to dropping protobuf from the top level. But for this proposal my goal was to encourage use of protocol buffers whenever possible to force contributors to think about backwards and forwards compatibility as we add to these RDB contents. This design is based on the assumption that having it all wrapped in a protocol buffer will make using protocol buffers easier than not using them. If the top-level concept is a binary blob, the path of least resistance (and thus the de facto approach) for new content will be just to dump some strings into the RDB. If the top-level concept is a protocol buffer, the path of least resistance will be including the contents in that protocol buffer, and adding binary blobs will be the exceptional case for when that doesn't make sense. For the reasons previously stated, I would prefer the de facto approach to be using protocol buffers, but I am open to counterpoints.

> this could make streaming of a new section essentially impossible

The supplemental section is supposed to address the concerns about streaming of new sections. The supplemental section is essentially what you are proposing with the array of RDB strings (although the design uses an EOF marker for simplicity, it is conceptually similar).
First, I'm fully in favor of having protobufs in certain places -- they are a good solution for a wide range of problems. I'm objecting to having them be required everywhere, because there are known places where this will result in hundreds of millions of useless data copies, serializations, etc., which I think will ultimately have a material impact on load times -- which is bad for availability for all of us.

For example, the HNSW key fixup table will have one or two small values for each HNSW slot. Doing a protobuf for each and every slot will cost substantial performance. This is a case where the flexibility of protobufs is unlikely to be advantageous. With the current proposal the only way to recover this performance would be to batch up blocks of slots into a single blob, which complicates code unnecessarily, when the simplicity of using the raw RDB I/O functions is much faster and clearer.

If we found out that in the future we wanted to add to this struct (maybe by pre-computing 1/sqrt per slot -- just an example) we could easily just use a different supplemental section opcode and write the table out twice.
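For illustration, a sketch of the raw-I/O path (the slot struct is made up for this example, and it assumes the standard module RDB save calls):

```cpp
#include <cstdint>
#include <vector>

#include "valkeymodule.h"  // ValkeyModuleIO / ValkeyModule_SaveUnsigned

// Hypothetical fixed-shape per-slot entry, just for the example.
struct SlotEntry {
  uint64_t key_id;
  uint64_t internal_id;
};

// Raw save path: two unsigned writes per slot, with no per-slot protobuf
// construction, serialization, or extra copying.
void SaveSlotTable(ValkeyModuleIO* io, const std::vector<SlotEntry>& slots) {
  ValkeyModule_SaveUnsigned(io, slots.size());
  for (const auto& slot : slots) {
    ValkeyModule_SaveUnsigned(io, slot.key_id);
    ValkeyModule_SaveUnsigned(io, slot.internal_id);
  }
}
```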
rfc/rdb-format.md (Outdated)
> `message IndexSchema` follows from the existing definition in src/index_schema.proto.
>
> The goal of breaking into sections is to support skipping optional sections if they are not understood. New sections should be introduced in a manner where failure to understand the new section will generally load fine without loss. Any time that failure to load the section would result in some form of lossiness or inconsistency, we will mark `required` as true and it will result in a downgrade failure. This would only be desired in cases where operators have used new features, and need to think about the downgrade more critically, potentially removing (or, once supported, altering) indexes that will not downgrade gracefully. Similarly, for section types that would like to introduce new required fields, we will include an encoding version, which is conditionally bumped when these new fields are written.
Conceptually this is nice. However, at the implementation level I am concerned that as the number of "required" fields multiplies, the testing matrix becomes unaffordable. IMO, semantic versioning works well enough here.
Can you explain how you would remove "required" by introducing semantic versioning? I would also like to reduce the section header bloat if possible (for test matrix and for complexity reduction)
Here's one thought. Suppose that we tagged ALL headers with a minimum semver, under the assumption that all headers -- even headers for sections that are skipped because they are unknown -- are visited. Then we can do a global MIN on all of the semvers that we encounter. If the current code base is below that semver, then it's not able to process this RDB.

I do dislike the section header bloat BUT it collapses the testing matrix for backward compatibility.

It also solves another problem. Imagine a section that has a feature added that, when used, forces the 'required' bit for that section. So you have versions A and B. What if you add even MORE functionality to that section, i.e., version C? The old code that supports A and B can't tell that it doesn't support C. Replacing the required bit with a semver solves that problem.
Yeah I think this is a good simplification. Basically, synthesize all the "Required" and "encoding version" headers into a single "minimum semantic version". Can we do the "max of all mins" on the writer side though? I think it should be pretty easy to synthesize it once per dump, rather than once per section. WDYT?
In practice, it probably makes sense to just have a ComputeMinimumSemanticVersion function that iterates all indexes and computes the minimum semantic version based on feature usage. This centralization of the logic probably makes it easier to understand than having to venture into each section's logic and understand how it computes the semantic version.
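Something like the following, as a sketch (the index hook and the names here are hypothetical):

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

struct SemVer {
  int major = 0, minor = 0, patch = 0;
  bool operator<(const SemVer& other) const {
    return std::tie(major, minor, patch) <
           std::tie(other.major, other.minor, other.patch);
  }
};

// Placeholder stand-in for the real index schema object.
struct IndexSchema { /* fields elided */ };

// Hypothetical per-index hook: reports the lowest version that can still
// load this index, based on which features it actually uses.
SemVer MinimumVersionForIndex(const IndexSchema& index);

// Writer side: "max of all mins", computed once per dump.
SemVer ComputeMinimumSemanticVersion(const std::vector<IndexSchema>& indexes) {
  SemVer required{};  // nothing new used -> oldest supported version
  for (const auto& index : indexes) {
    required = std::max(required, MinimumVersionForIndex(index));
  }
  return required;
}

// Reader side: refuse the load if this binary is older than the dump needs.
bool CanLoad(SemVer required, SemVer supported) { return !(supported < required); }
```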
I think the "save" framework should force the individual sections to declare a semver. Then the framework can take care of doing the "max of all mins" and dump a value at the end. Similarly, the restore framework should tell each section the version it found as well as doing a global check at the end.
> `message IndexSchema` follows from the existing definition in src/index_schema.proto.
>
> The goal of breaking into sections is to support skipping optional sections if they are not understood. New sections should be introduced in a manner where failure to understand the new section will generally load fine without loss. Any time that failure to load the section would result in some form of lossiness or inconsistency, we will mark `required` as true and it will result in a downgrade failure. This would only be desired in cases where operators have used new features, and need to think about the downgrade more critically, potentially removing (or, once supported, altering) indexes that will not downgrade gracefully. Similarly, for section types that would like to introduce new required fields, we will include an encoding version, which is conditionally bumped when these new fields are written.
I don't see the advantage in differentiating between protobuf content and supplemental content when it's trivially easy to have one uber-format that easily supports both types of data.
We could flatten the two - but my assumption here is that 90% of our data will belong to the index schema. What I like about the current proposal is that adding new sections to the index schema is supported natively by adding supplemental content sections. If we flatten this and have everything at the top level, we need some way to map payloads to index schemas. It's not hard to just dump the index schema name and DB number (which together compose a unique key for us to look up the index schema), but given that the majority of changes are likely to be part of the index schema contents, having first-class support for composition in the RDB format through supplemental sections will reduce the complexity.

An example might help. To me, the nested example seems less complex. But it may be a matter of preference:

Nested | Flattened
rfc/rdb-format.md (Outdated)
> #### Example: Adding Vector Quantization
>
> With the above design, suppose that we are substantially changing the index to support a vector quantization option on `FT.CREATE`. For simplicity, suppose this is just a boolean "on" or "off" flag.
I think if we do VQ, it'll present itself as a new algo, i.e., HNSW, FLAT and VQ (or something like that). That new vector index sub-type will likely have lots of blobs and special data structures. I doubt we can accurately predict that. Rather, we should focus on the more generic ability to add sections on a per-field-index basis.
re:VQ - The goal was just to demonstrate what happens when the index format may change. But it is a simplified example. I am not proposing this is how quantization would be added.
If you have a better example, happy to change it :)