Start a fuzzing suite to test for consistency of lints #2818

MichaelChirico · 2025-03-06T22:24:22Z

Part of #2191. Progress on #2737. Continuation of #2190.

For this initial PR, I've only added a rule about FUNCTION (function(...) { ... }) and OP-LAMBDA (\(...) { ... }) equivalency.

Hopefully my approach here is extensible for other rules (I tried for that). My plan is for this to only run on main, though it's certainly fast enough to run on every PR.

The upshot is, it smoked out three linters with inconsistencies already!
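As a rough illustration of the FUNCTION / OP-LAMBDA rule above, here is a minimal sketch of a text-level swap in base R. It is not the fuzzer from this PR: `swap_function_for_lambda()` is a hypothetical name, and the sketch assumes R >= 4.1 and ASCII source without tabs, so that parse-data columns line up with character offsets.

```r
# Hypothetical helper (not part of lintr): replace every `function` keyword
# with the `\` shorthand, using token positions from the parse data.
swap_function_for_lambda <- function(lines) {
  pd <- getParseData(parse(text = lines, keep.source = TRUE))
  hits <- pd[pd$token == "FUNCTION", ]
  # Edit right-to-left within each line so earlier column offsets stay valid.
  hits <- hits[order(hits$line1, -hits$col1), ]
  for (i in seq_len(nrow(hits))) {
    ln <- hits$line1[i]
    lines[ln] <- paste0(
      substr(lines[ln], 1L, hits$col1[i] - 1L),
      "\\",
      substr(lines[ln], hits$col2[i] + 1L, nchar(lines[ln]))
    )
  }
  lines
}

original <- c(
  "f <- function(x) {",
  "  sapply(x, function(y) y + 1)",
  "}"
)
fuzzed <- swap_function_for_lambda(original)

# Both spellings should parse to the same expressions once srcrefs are dropped.
same_ast <- identical(
  parse(text = original, keep.source = FALSE),
  parse(text = fuzzed, keep.source = FALSE)
)
```

Since the two versions are the same code to R, a linter that reports different lints on `original` and `fuzzed` is the kind of inconsistency this suite is meant to surface.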

MichaelChirico · 2025-03-06T22:29:46Z

@DavisVaughan / @lionel-, I'm curious if any of your recent work on {air}/{treesitter.r} & friends could be re-used here.

The idea is that we want to make random injections/edits to the R AST (and then do stuff); my approach here felt quite manual/labor-intensive.

codecov · 2025-03-06T22:29:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.26%. Comparing base (449ed5c) to head (aafc693).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2818   +/-   ##
=======================================
  Coverage   99.26%   99.26%           
=======================================
  Files         127      127           
  Lines        7115     7116    +1     
=======================================
+ Hits         7063     7064    +1     
  Misses         52       52


DavisVaughan · 2025-03-07T14:35:07Z

The closest thing I can think of is parser_reparse() in {treesitter}
https://davisvaughan.github.io/r-tree-sitter/reference/parser-parse.html

It performs an incremental reparse and hands you back a new tree with the edit you made. It's quite verbose in terms of what you have to provide (bytes and points) but this is a tree-sitter limitation (and I'm sure they had some reason for it).

Note that text here in parser_reparse() is the entire original source document with a single contiguous edit already applied to it (it's not the incremental bit of text).

For Air, I don't think Rowan trees are really intended to be edited. At least, there doesn't seem to be any easy to use API for that (and plus Rowan isn't exposed in any R package right now anyways)

MichaelChirico · 2025-03-07T17:24:28Z

Interesting, yeah, the incremental re-parse is very neat, but more useful in an IDE setting. Here the latency of re-parsing is not much of a concern, as the edits are only a tiny fraction of the action run time.

> For Air, I don't think Rowan trees are really intended to be edited. At least, there doesn't seem to be any easy to use API for that (and plus Rowan isn't exposed in any R package right now anyways)

Hmm, I guess in my mind something like {styler} / {air} is well-suited here if it exposes a friendly API for transformers, i.e. instead of the usual helpful rule like "make this line <80 characters", we have a more adversarial rule like "add comments in random but syntactically valid places". I may not be understanding where in the stack behind the format-on-save tool such an analog would sit.

lionel- · 2025-03-10T09:53:38Z

> For Air, I don't think Rowan trees are really intended to be edited. At least, there doesn't seem to be any easy to use API for that (and plus Rowan isn't exposed in any R package right now anyways)

I may be misunderstanding what you mean but AFAIK the red trees in Rowan are modifiable. I believe this is used for refactorings in rust-analyzer.

MichaelChirico · 2025-05-07T23:12:23Z

Bumping for review :)

.github/workflows/ast-fuzz.yaml

AshesITR · 2025-07-10T05:06:23Z

R/expect_lint.R

-    })
-  }
+  if (is.null(file)) on.exit(unlink(file), add = TRUE)
+  file <- maybe_write_content(file, content)


This could use a comment explaining that this is a fuzzer entrypoint.

Here, or in ast_fuzz_test.R? The latter seems more natural to me. But maybe I'm missing your point.

Here. The control flow does two back-to-back null-checks on file, which has no obvious reason without knowing about the fuzzer.

Well, the second is.null() check is inside maybe_write_content(), so I wouldn't look at this code in a vacuum and say "that doesn't look right".

It's two steps because it's not easy to set an on.exit() hook in a parent frame (well, we could: https://yihui.org/en/2017/12/on-exit-parent/ but I'm not sure it's worth the complexity).

To me I'd at least wonder why the null check wasn't removed from the call and the call moved into the if branch.

I was thinking something like

# this line is replaced by the fuzzer in .dev/ast_fuzz_test.R

We might be able to have it both ways if you can suggest an alternative -- IIRC my original thinking was this was an independently OK way to rewrite the code here. For now I have added the comment.
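The two-step pattern discussed in this thread can be sketched in isolation. This is an illustration only: the `maybe_write_content()` below is a hypothetical stand-in (the real helper in lintr may differ), and `lint_like()` is an invented wrapper. The point is that on.exit() stores its expression unevaluated, so the cleanup can be registered while `file` is still NULL and will see the temp path assigned on the next line.

```r
# Hypothetical stand-in for the internal helper; the real one may differ.
maybe_write_content <- function(file, content) {
  if (!is.null(file)) {
    return(file)
  }
  tmp <- tempfile(fileext = ".R")
  writeLines(content, tmp)
  tmp
}

# Invented wrapper mirroring the two lines from the diff above.
lint_like <- function(content, file = NULL) {
  # Registered before `file` is assigned; evaluated only when the function exits.
  if (is.null(file)) on.exit(unlink(file), add = TRUE)
  file <- maybe_write_content(file, content)
  stopifnot(file.exists(file))
  file
}

# Temp file created internally: cleaned up once lint_like() returns.
temp_path <- lint_like("x <- 1")

# Caller-supplied file: no on.exit() hook is registered, so it is left alone.
user_path <- tempfile(fileext = ".R")
writeLines("y <- 2", user_path)
kept_path <- lint_like("ignored", file = user_path)
```

Under this reading, the outer is.null() check decides who owns the cleanup, while the inner one (inside the helper) decides whether anything is written at all.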

AshesITR · 2025-07-13T07:05:29Z

.dev/maybe_fuzz_content.R

+  lines <- readLines(f)
+  for (fuzzer in list(function_lambda_fuzzer)) {
+    if (reparse) {
+      pd <- getParseData(parse(f, keep.source = TRUE))


Is there a way to reparse directly from updated_lines without a round trip to disk?

It might be possible, though it could impose some limits on the type of fuzzes that are allowed. Without looking too deeply, I'd assume we could do so if we're only editing a valid subtree and replacing it with another valid subtree, whereas now we allow generic text replacement.

That said, I didn't see any real performance issue, so I'm not sure there's much to gain from adding this complication. I have in mind things in get_source_expressions() that operate from the parse data.frame, which require a lot of code:

lintr/R/get_source_expressions.R

Line 679 in afb363c

fix_eq_assigns <- function(pc) {
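On the question of reparsing without a disk round trip: base R's parse() accepts source lines through its `text` argument, so a parse-data table can be built straight from a character vector. The sketch below shows only that call. Whether it fits the fuzzer as written is an assumption, and `updated_lines` here is made-up input rather than the variable from the PR.

```r
# Made-up input standing in for lines that a fuzzer has already edited.
updated_lines <- c(
  "f <- \\(x) x + 1",
  "f(2)"
)

# Parse from memory; keep.source = TRUE is needed for getParseData() to work.
pd <- getParseData(parse(text = updated_lines, keep.source = TRUE))

# The result has the same shape as parse data obtained from a file.
parse_cols <- c("line1", "col1", "line2", "col2", "token", "text")
has_cols <- all(parse_cols %in% names(pd))
```

This only works when the edited text still parses, which matches the caveat above about limiting fuzzes to valid replacements.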

.dev/maybe_fuzz_content.R

MichaelChirico added 19 commits March 5, 2025 23:44

use maybe_write_content for easier 'mocking'

3882350

initial progress

c392c53

getting very close i think...

5cef281

skip Rmd files

a4e4a66

caught a live one!

0b1eaf5

need to match original file extension?

868ad30

caught another one!

0ed5cc0

simpler approach, avoid rex() due to bug

99d00a3

also ignore warnings

d3cca7a

finally getting somewhere...

59dc1b0

progressively more complicated :(

a25065f

round of fixes & first working nofuzz

491a340

looks like we got another live one... break time

92f0628

another true positive

d387a71

more ignores, need '.' in file extension, restore test

e150ffe

wrapping up

3d1fc0e

Write up the GHA config

b69b7cd

annotation

b8a06e3

comment for future work

a3dbf27

MichaelChirico marked this pull request as ready for review March 6, 2025 22:26

MichaelChirico added 8 commits March 6, 2025 22:30

vestigial

5a22050

skips on old R

76b869f

expect_no_lint

afec743

new tests

51593e4

NEWS

f4b9481

bad copy-paste

6389d55

need stop_on_failure for batch?

1550ead

delint, fix last skip for R<4.1.0

bbdac43

more extensible structure

523c218

MichaelChirico mentioned this pull request Mar 7, 2025

Add a second fuzzer for pipe equivalency #2819

Open

MichaelChirico mentioned this pull request Mar 12, 2025

Improve the nofuzz system to allow specific exclusions #2832

Open

AshesITR reviewed Jul 10, 2025


MichaelChirico mentioned this pull request Jul 10, 2025

GHA has effectively 'Rscript -e "callr::rscript(...)"' #2873

Open

MichaelChirico added 2 commits July 12, 2025 23:20

Add a comment

7186992

move to end-of-line

89ef5d2

AshesITR reviewed Jul 13, 2025


MichaelChirico added 3 commits July 15, 2025 21:34

Merge branch 'main' into fuzz

f9e1a9d

explicit encoding

2f47853

Merge remote-tracking branch 'origin/fuzz' into fuzz

aafc693


AshesITR Jul 10, 2025
MichaelChirico Jul 10, 2025
AshesITR Jul 11, 2025
MichaelChirico Jul 11, 2025
AshesITR Jul 12, 2025
MichaelChirico Jul 13, 2025
AshesITR Jul 13, 2025
MichaelChirico Jul 14, 2025
