Description
The JSONBench dataset contains fields with a large number of unique values (high-cardinality fields):
- did (aka user_id)
- commit.cid (aka commit_id)
- commit.record.subject.cid
Sometimes you need to find all rows that contain a particular, rarely seen value of such a field, for example all rows generated by a single user. The following query does this on JSONBench data:
SELECT count(*) FROM bluesky WHERE data.did = 'did:plc:stwikwzlk2mepaagokthylry'
Another practical query selects the row for a given commit_id:
SELECT * FROM bluesky WHERE data.commit.cid = 'bafyreielfqkpggsdqwtbtg5tyh7iqytp64paevfjbeufnw6kc7sgmjemhm'
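To make the intent of the two queries concrete, here is a minimal Python sketch of the same point lookups over a few in-memory JSON records. The sample records, their values, and the helper names (`get_path`, `count_where`, `select_where`) are hypothetical and only mimic the field layout named above; this is not how any particular database executes the queries.

```python
import json

# Hypothetical records shaped like JSONBench (Bluesky) events; all values are made up.
RAW_LINES = [
    '{"did": "did:plc:aaa", "commit": {"cid": "cid-1", "record": {"subject": {"cid": "cid-9"}}}}',
    '{"did": "did:plc:bbb", "commit": {"cid": "cid-2"}}',
    '{"did": "did:plc:aaa", "commit": {"cid": "cid-3"}}',
]


def get_path(doc, path):
    """Walk a dotted path such as 'commit.cid'; return None if any key is missing."""
    cur = doc
    for key in path.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur


def count_where(rows, path, value):
    """Rough analogue of: SELECT count(*) FROM bluesky WHERE data.<path> = value."""
    return sum(1 for row in rows if get_path(row, path) == value)


def select_where(rows, path, value):
    """Rough analogue of: SELECT * FROM bluesky WHERE data.<path> = value."""
    return [row for row in rows if get_path(row, path) == value]


rows = [json.loads(line) for line in RAW_LINES]

# All rows generated by one user (high-cardinality field `did`).
print(count_where(rows, "did", "did:plc:aaa"))

# The single row for a given commit id (high-cardinality field `commit.cid`).
print(select_where(rows, "commit.cid", "cid-2"))
```

This sketch scans every record for each lookup, so its cost grows with the total number of rows even though only a few of them match.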