Input validation, especially for t-digest fields #141686

@not-napoleon

Currently, the t-digest field, like the histogram field, accepts whatever data the user sends without validating it. We should consider re-encoding incoming data with the digest settings the field was defined with. This would protect against users mistakenly loading very large digests into what is meant to be a compact format.

I think there are two options we can adopt:

  • Always re-encode the data. This incurs a memory and compute cost on every ingest, which is wasted if the user sent well-formed data in the first place.
  • Apply some heuristic, such as centroids.length > compression * 10, which makes a reasonable guess about the shape of the sketch and only re-encodes if the supplied data looks very different from that guess.
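
To make the second option concrete, here is a minimal sketch in Python of what the check might look like. It is illustrative only and not the actual field mapper code: the names looks_oversized and validate_digest, the (mean, count) centroid representation, and the reencode callback are all assumptions made for this example. The factor of 10 is the guess from the heuristic above.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: each centroid is a (mean, count) pair.
Centroid = Tuple[float, int]


def looks_oversized(centroids: List[Centroid],
                    compression: float,
                    factor: float = 10.0) -> bool:
    """Return True if the digest has far more centroids than expected.

    Mirrors the heuristic `centroids.length > compression * 10`.
    """
    return len(centroids) > compression * factor


def validate_digest(centroids: List[Centroid],
                    compression: float,
                    reencode: Callable[[List[Centroid], float], List[Centroid]],
                    factor: float = 10.0) -> List[Centroid]:
    """Re-encode only when the supplied digest looks too large.

    `reencode` is a placeholder for whatever routine rebuilds the digest
    with the field's configured settings; it is not defined here.
    """
    if looks_oversized(centroids, compression, factor):
        return reencode(centroids, compression)
    # Well-formed input passes through untouched, so it pays no extra cost.
    return centroids
```

With this shape, the first option (always re-encode) is the degenerate case where the check is skipped and reencode is called unconditionally.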
