-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List equality in contracts can cause some big slowdowns when applied several times #1930
Comments
Thanks for the report. Beside understanding why the behavior isn't linear, this example should also be the poster child for the contract deduplication optimization, so there's a problem here as well with the optimization not firing. |
So, trying more than two contracts then seems to grow linearly. Unfortunately I think this is expected, because just one contract with a default value won't trigger anything else but just forcing the term and exporting it. However, with contracts, to ensure the impotency of merging, the interpreter needs to insert a It should still be linear in the size of Whether we can optimize this is one question. One possibility is to use physical equality when comparing elements, but this is unfortunately not sound, because equality fails on e.g. functions, so physical equality could work when actual full-fledged equality would fail with an error. Maybe it's still a reasonable trade-off, because I don't think physical equality still always return a "reasonable" answer (a form of strict reflexivity of the equality relation). On the other hand, it doesn't explain why contract de-duplication doesn't fire, which is precisely useful to avoid those kind of stupid self-equality tests. Let me investigate that. |
A bit more investigation. It turns out the contract deduplication doesn't fire, but it's to be expected actually, looking at this again. The idea behind contract dedup is to avoid having several merges combining the same contract many times. However, it's not expected that users are going to write Now, if we split the multiple contract applications across records, they are deduplicated as expected. More on that later - I provide a full example with timing below.
Here are my timing for each output:
Thinking of it, if pure equality is so fast, I don't see any drawback to having |
(EDIT: I've added the missing timings in the table) |
I tried with the fast contract equality, and it indeed makes this example much faster (0.09s for In particular, I tried on the largest codebase that I have and it didn't made any difference (actually was a bit slower, but the numbers vary quite a bit, so it's hard to tell if this is an artifact of something unrelated to the change - but anyway, it didn't got much speed-up). I guess deduplication does it work correctly there, because array contracts used to be an issue on this codebase and deduplication brought a big improvement. I'm not sure what to do here. I don't want to optimize for this example in particular. Maybe |
Thanks for the investigation @yannham ! For completeness (and to justify myself from “it's not expected that users are going to write foo | MyContract | MyContract in an obvious way” 😄 ), the original code was closer to
That's indeed an issue. Maybe custom merge functions could solve this in the long run? Since I know here that I want to equal these lists eagerly I could write a merge function using the “fast equality”, and leave the lazy defaults otherwise |
Oh, I didn't think about custom merge functions for such a use-case, but indeed, you could define an alternate one for arrays that would just use |
The std.contract.Equal contract needs to be lazy, which means that it re-implement equality logic in pure Nickel code, which has proven to be very slow (#1930). This commit replaces the field difference operation, implemented using a left fold, by the new builtin operator `%record/split_pair%`, which shows a performance improvement of around 300% for the example of #1930.
The std.contract.Equal contract needs to be lazy, which means that it re-implement equality logic in pure Nickel code, which has proven to be very slow (#1930). This commit replaces the field difference operation, implemented using a left fold, by the new builtin operator `%record/split_pair%`, which shows a performance improvement of around 300% for the example of #1930.
Describe the bug
When a contract sets a record field to a list, applying that contract several times can have a huge performance impact
To Reproduce
“Minimal” reproducer:
Evaluating
two_contracts
takes vastly longer than evaluatingone_contract
, although the only difference between them is that theMySchema
contract is applied twice instead of once:nickel export project.ncl --field one_contract
nickel export project.ncl --field two_contracts
The time ratio can be made arbitrarily big by making the
complex_record
more complex.Expected behavior
Applying the contract twice should ideally have a negligible impact, and at the very worse make the overall program run twice as slow.
Environment
Additional context
I have no idea whether lists are really the culprit here, but I couldn't reproduce the issue without involving one
The text was updated successfully, but these errors were encountered: