-
Notifications
You must be signed in to change notification settings - Fork 119
feat: commit to JSON array values individually #808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
feat: commit to JSON array values individually #808
Conversation
sinui0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some sort of limit on how many values we will commit to by default.
Maybe it would also make sense to only commit to string and object values in an array by default. Redacting individual numbers in an array seems less useful.
…ithout_values() works.
… each element, each separator
I can think of some scenarios where committing to numbers might be useful. If an array is a list of group IDs your user is a member of, you might want to disclose that you're a member of one group, without disclosing all groups you're a member of. Or, if the endpoint you're notarizing is outputting a table of data, the individual arrays may represent rows with columns. Some of the columns values might be numbers, and you may wish to reveal only that column. |
Should I be trying to tackle that here? Having some sort of defense against a malicious server sending a parser bomb is sensible, but it's not actually an array-specific thing. The adversary could send a deeply nested array (e.g. It seems like it would be preferable to have the builder maintain a counter, and raise an exception when you blow out some limit. |
|
If the response payload isn't malicious but nonetheless contains a huge array it could cause problems. Though on second thought the storage multiplier for commitment data is worst case only ~64x for encoding commitments. Given we're likely not going to be committing more than low MB it should be fine. |
That seems like a sensible test, but also, that would create annoyingly fragile tests. I think the tests should instead focus on enforcing backwards compatibility. For instance, in my first pass, I accidentally broke the existing behaviour which commits to without_values(), which would break existing code which reveals without_values(). I output the commits that the current implementation produces, and here they are, ordered by start index. A valid test may be to check that the whole value, the without_values() and the individual elements are committed. However, I'm not 100% sure what a good test for the separators would be. |
IMO, that would be a strong example of where complicating the default implementation for an edge case isn't sensible. If someone encounters that case, they could implement a custom committer which skips particular paths, for instance. |
|
Not sure what the difference is between what I initially said and what you just suggested. But yes, we can add tests which at the very least let us know when the behavior has been modified.
Yeah, dealing with separators or other tokens more generally is harder. When I finally get a time to tackle |
|
Need to run rustfmt |
|
We can move circuit breaker behavior out of scope |
…lues, and every value
|
I think I've implemented the needed tests now. As a matter of interest, I realized that the visitor produces a value which has no span content, and no ranges. I don't think this is an issue that I introduced, and I see no cause for it. I skip that in my commit checker, because it only happens once, and it's an issue out of scope. |
|
Weird that the formatting checks are insisting that the new json fixture macro calls need to be formatted differently from the http ones. |
crates/core/src/transcript/commit.rs
Outdated
| &self, | ||
| direction: Direction, | ||
| range: &dyn ToRangeSet<usize>, | ||
| kind: Option<TranscriptCommitmentKind>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional arguments are typically avoided because you can't at a glance know what is being omitted at the call site (None tells you nothing about the type). Instead it would be typical to add a contains_with_kind function which accepts this argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated it to now look for any commit kind with the specified direction and ranges, and added the _with_kind for looking for specific kinds.
Change its logic to look for any commit type with direction, index. Add contains_with_kind. Tidying.
| .any(|(_, kind)| matches!(kind, TranscriptCommitmentKind::Encoding)) | ||
| } | ||
|
|
||
| /// Returns whether the builder has a commitment for the given direction and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Returns whether the builder has a commitment for the given direction and | |
| /// Returns `true` if the configuration contains the provided ranges. |
| /// Returns whether the builder has a commitment for the given direction and | ||
| /// range, with the given kind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Returns whether the builder has a commitment for the given direction and | |
| /// range, with the given kind. | |
| /// Returns `true` if the configuration contains the provided ranges with the given commitment kind. |
| self.commits | ||
| .iter() | ||
| .any(|((d, i), _)| *d == direction && *i == idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This being O(N) isn't great, but ok for now.
| }; | ||
| use tlsn_data_fixtures::json as fixtures; | ||
|
|
||
| #[rstest] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #[rstest] | |
| // Asserts the behavior that every value in the transcript is committed individually. | |
| #[rstest] |
Closes #807
Need to add a test to check this behaviour. I've run the whole test suite locally, and nothing breaks - but there's no coverage on the new behaviour.