Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ts): initialize metadata encoding for time series #2573

Open
wants to merge 7 commits into
base: unstable
Choose a base branch
from

Conversation

jonathanc-n
Copy link
Contributor

@jonathanc-n jonathanc-n commented Oct 3, 2024

references: #2421

This uses @Beihao-Zhou 's proposal:
https://beihao-rebecca.notion.site/TimeSeries-Proposal-6c5d2f9bc0624a50a209cb8f517109a8

Time Series Main Data (TS.INFO) - in Metadata Column:
flags | expire | version | size | num_samples | total_memory | first_ts | last_ts | retention_time | chunk_count | chunk_size | chunk_type | duplicate_policy (8 bit enum) | source_key
size wasn't used to represent num_samples here because they are inherently different

Primary Subkey (Data point key):
ns | TYPE | key | version | timestamp => value

Secondary Subkey (Label Index Key):
ns | TYPE | label_key | label_value | key => null

Secondary Subkey (Aggregation Rule Key):
ns | TYPE | key | version | dest_key => aggregator | bucket_duration

  • The primary sub key were implemented for the regular operations
  • The Label Index subkey is for fast search/aggregation commands over labels.
  • The aggregation index sub key was implemented for rule related commands
    Is to be noted I didn't include the value range index (which was also suggested by beihao) from the proposal. I looked into how much the command was being used on redis, and it wasn't worth the excess storage that would be taken up.

Let me know if there are any changes or formatting changes that should be done! (I am not confident that the separate file was done properly)

A new search column will be added as it allows for isolated search and aggregation over time series data.

@jonathanc-n
Copy link
Contributor Author

@PragmaTwice @Beihao-Zhou @LindaSummer Feel free to leave changes!

@PragmaTwice PragmaTwice changed the title feat(time): Added Time Series Metadata feat(ts): initialize metadata encoding for time series Oct 3, 2024
@LindaSummer
Copy link
Contributor

LindaSummer commented Oct 3, 2024

Hi @jonathanc-n ,

For improving review efficiency, we could enable our fork repo's github actions.

It will be triggered once we push to our own fork repo and after our own repo's github action passed, we can push our changes to the PR branch.

This can accelerate our own development process and reduce review time. 😊

Best Regards,
Edward

@jonathanc-n
Copy link
Contributor Author

Took out column temporarily as it was failing tests. Will put it back in the full implementation.

@Beihao-Zhou
Copy link
Member

Thanks! Would it be possible to add details about the encoding and outline the next steps in this PR, similar to how it was done in this PR? I think it would help make everything clearer. :)

@Beihao-Zhou
Copy link
Member

Also can we move the encoding under folder types instead of storage? Maybe similar to https://github.com/apache/kvrocks/blob/0a43bade1a4c3c885947fdd42eacefac934bf1bd/src/types/redis_stream_base.h

@jonathanc-n
Copy link
Contributor Author

@Beihao-Zhou I could, but I assumed that creating the time_encoding in storage is fine as it needs its own search and querying capabilities over subkeys unlike the stream data.

Copy link

sonarcloud bot commented Oct 5, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants