The JSONBench dataset contains fields with a large number of unique values (high-cardinality fields):
did (aka user_id)
commit.cid (aka commit_id)
commit.record.subject.cid
Sometimes you need to find all rows that contain a particular, rarely seen value of some field, for example all rows generated by a given user. The following query does this for JSONBench data:
SELECT count(*) FROM bluesky WHERE data.did = 'did:plc:stwikwzlk2mepaagokthylry'
Another practical query is to select a row for the given commit_id:
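As a rough illustration of the two lookup patterns described here (not the exact queries proposed for the benchmark), the sketch below uses Python's built-in `sqlite3` module on a toy table. The flat two-column layout, the column names `did` and `commit_cid`, the helper function names, and the sample values are all assumptions made for the example; the real dataset stores these fields inside JSON documents.

```python
import sqlite3

# Toy stand-in for the bluesky table. Column names and values are
# hypothetical; JSONBench keeps these fields inside a JSON document.
rows = [
    ("did:plc:aaa", "cid-1"),
    ("did:plc:bbb", "cid-2"),
    ("did:plc:aaa", "cid-3"),
    ("did:plc:ccc", "cid-4"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bluesky (did TEXT, commit_cid TEXT)")
conn.executemany("INSERT INTO bluesky VALUES (?, ?)", rows)


def count_rows_for_user(conn, did):
    """Count all rows generated by one user (lookup on a high-cardinality field)."""
    cur = conn.execute("SELECT count(*) FROM bluesky WHERE did = ?", (did,))
    return cur.fetchone()[0]


def find_row_by_commit(conn, cid):
    """Return the single row for a commit id, or None if it is absent."""
    cur = conn.execute(
        "SELECT did, commit_cid FROM bluesky WHERE commit_cid = ?", (cid,)
    )
    return cur.fetchone()


user_rows = count_rows_for_user(conn, "did:plc:aaa")
commit_row = find_row_by_commit(conn, "cid-4")
```

Both helpers are selective point lookups: each touches only a handful of rows out of the whole table, which is the access pattern this issue proposes to measure.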
This will be a good addition, although it is outside the main focus of the benchmark (data analytics). Let's evaluate the systems on this query and see how they compare.