-
Notifications
You must be signed in to change notification settings - Fork 188
Start a fuzzing suite to test for consistency of lints #2818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@DavisVaughan / @lionel-, I'm curious if any of your recent work on {air}/{treesitter.r} & friends could be re-used here. The idea is we want to make random injections/edits to the R AST (and then do stuff), my approach here felt quite manual/labor-intensive. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2818 +/- ##
=======================================
Coverage 99.26% 99.26%
=======================================
Files 127 127
Lines 7115 7116 +1
=======================================
+ Hits 7063 7064 +1
Misses 52 52 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The closest thing I can think of is It performs an incremental reparse and hands you back a new tree with the edit you made. It's quite verbose in terms of what you have to provide (bytes and points) but this is a tree-sitter limitation (and I'm sure they had some reason for it). Note that For Air, I don't think Rowan trees are really intended to be edited. At least, there doesn't seem to be any easy to use API for that (and plus Rowan isn't exposed in any R package right now anyways) |
Interesting, yea the incremental re-parse is very neat, but more useful in an IDE setting. Here the latency of re-parse is not much concern as the edits are only a tiny fraction of the action run time.
Hmm, I guess in my mind something like {styler} / {air} is well-suited here if it exposes a friendly API for transformers, i.e. instead of the usual helpful rule like 'make this line <80 characters', we have a more adversarial rule like "add comments in random but syntactically valid places". I not be understanding where in the stack behind the format-on-save tool such an analog would sit. |
I may be misunderstanding what you mean but AFAIK the red trees in Rowan are modifiable. I believe this is used for refactorings in rust-analyzer. |
Bumping for review :) |
R/expect_lint.R
Outdated
}) | ||
} | ||
if (is.null(file)) on.exit(unlink(file), add = TRUE) | ||
file <- maybe_write_content(file, content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could use a comment explaining that this is a fuzzer entrypoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, or in ast_fuzz_test.R? The latter seems more natural to me. But maybe I'm missing your point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here. The control flow does two back-to-back null-checks on file
, which has no obvious reason without knowing about the fuzzer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the second is.null()
check is inside maybe_write_content()
, so I wouldn't look at this code in a vacuum and say "that doesn't look right".
it's two steps because it's not easy to set an on.exit()
hook in a parent frame (well, we could: https://yihui.org/en/2017/12/on-exit-parent/ but I'm not sure it's worth the complexity)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me I'd at least wonder why the null check wasn't removed from the call and the call moved into the if branch.
I was thinking something like
# this line is replaced by the fuzzer in .dev/ast_fuzz_test.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to have it both ways if you can suggest an alternative -- IIRC my original thinking was this was an independently OK way to rewrite the code here. For now I have added the comment.
lines <- readLines(f) | ||
for (fuzzer in list(function_lambda_fuzzer)) { | ||
if (reparse) { | ||
pd <- getParseData(parse(f, keep.source = TRUE)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to reparse directly from updated_lines
without a round trip to disk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be possible, though it could impose some limits on the type of fuzzes that are allowed. Without looking to deeply, I'd assume we could do so if we're only editing a valid subtree and replacing it with another valid subtree, whereas now we allow generic text replacement.
That said, I didn't see any real performance issue, so I'm not sure there's much to gain from adding this complication. I have in mind things in get_source_expressions()
that operate from the parse data.frame, which require a lot of code:
lintr/R/get_source_expressions.R
Line 679 in afb363c
fix_eq_assigns <- function(pc) { |
Part of #2191. Progress on #2737. Continuation of #2190.
For this initial PR, I've only added a rule about
FUNCTION
(function(...) { ... }
) andOP-LAMBDA
(\(...) { ... }
) equivalency.Hopefully my approach here is extensible for other rules (I tried for that). My plan is for this to only run on
main
, though it's certainly fast enough to run on every PR.The upshot is, it smoked out three linters with inconsistencies already!