flowR is a sophisticated, static dataflow analyzer for the R programming language, available for VSCode, Positron, RStudio, and Docker. It offers a wide variety of features, for example:
- code linting
Analyze your R scripts for common issues and potential bugs (see the wiki page for more information on the currently supported linters).

Example: Linting code with flowR
To lint your code, you can use the REPL or the Visual Studio Code extension (see vscode-flowr#283).

    $ docker run -it --rm eagleoutice/flowr # or npm run flowr
    flowR repl using flowR v2.6.1, R v4.5.0 (r-shell engine)
    R> :query @linter "read.csv(\"/root/x.txt\")"

Output

    Query: linter (5 ms)
       ╰ **Deprecated Functions** (deprecated-functions):
           ╰ _Metadata_: <code>{"totalCalls":0,"totalFunctionDefinitions":0,"searchTimeMs":3,"processTimeMs":0}</code>
       ╰ **File Path Validity** (file-path-validity):
           ╰ certain:
               ╰ Path `/root/x.txt` at 1.1-23
           ╰ _Metadata_: <code>{"totalReads":1,"totalUnknown":0,"totalWritesBeforeAlways":0,"totalValid":0,"searchTimeMs":1,"processTimeMs":0}</code>
       ╰ **Seeded Randomness** (seeded-randomness):
           ╰ _Metadata_: <code>{"consumerCalls":0,"callsWithFunctionProducers":0,"callsWithAssignmentProducers":0,"callsWithNonConstantProducers":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Absolute Paths** (absolute-file-paths):
           ╰ certain:
               ╰ Path `/root/x.txt` at 1.1-23
           ╰ _Metadata_: <code>{"totalConsidered":1,"totalUnknown":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Unused Definitions** (unused-definitions):
           ╰ _Metadata_: <code>{"totalConsidered":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Naming Convention** (naming-convention):
           ╰ _Metadata_: <code>{"numMatches":0,"numBreak":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Network Functions** (network-functions):
           ╰ _Metadata_: <code>{"totalCalls":0,"totalFunctionDefinitions":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Dataframe Access Validation** (dataframe-access-validation):
           ╰ _Metadata_: <code>{"numOperations":0,"numAccesses":0,"totalAccessed":0,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Dead Code** (dead-code):
           ╰ _Metadata_: <code>{"consideredNodes":5,"searchTimeMs":0,"processTimeMs":0}</code>
       ╰ **Useless Loops** (useless-loop):
           ╰ _Metadata_: <code>{"numOfUselessLoops":0,"searchTimeMs":0,"processTimeMs":0}</code>
    All queries together required ≈5 ms (1ms accuracy, total 9 ms)

The linter will analyze the code and return any issues found. Formatted more nicely, this returns:

    [ { "type": "linter" } ]

(This query can be shortened to `@linter` when used within the REPL command `:query`).

Results (prettified and summarized):

    Query: linter (221 ms)
       ╰ Deprecated Functions (deprecated-functions):
           ╰ Metadata: {"totalCalls":0,"totalFunctionDefinitions":0,"searchTimeMs":207,"processTimeMs":1}
       ╰ File Path Validity (file-path-validity):
           ╰ certain:
               ╰ Path `/root/x.txt` at 1.1-23
           ╰ Metadata: {"totalReads":1,"totalUnknown":0,"totalWritesBeforeAlways":0,"totalValid":0,"searchTimeMs":4,"processTimeMs":1}
       ╰ Seeded Randomness (seeded-randomness):
           ╰ Metadata: {"consumerCalls":0,"callsWithFunctionProducers":0,"callsWithAssignmentProducers":0,"callsWithNonConstantProducers":0,"searchTimeMs":0,"processTimeMs":1}
       ╰ Absolute Paths (absolute-file-paths):
           ╰ certain:
               ╰ Path `/root/x.txt` at 1.1-23
           ╰ Metadata: {"totalConsidered":1,"totalUnknown":0,"searchTimeMs":2,"processTimeMs":0}
       ╰ Unused Definitions (unused-definitions):
           ╰ Metadata: {"totalConsidered":0,"searchTimeMs":0,"processTimeMs":0}
       ╰ Naming Convention (naming-convention):
           ╰ Metadata: {"numMatches":0,"numBreak":0,"searchTimeMs":0,"processTimeMs":0}
       ╰ Network Functions (network-functions):
           ╰ Metadata: {"totalCalls":0,"totalFunctionDefinitions":0,"searchTimeMs":1,"processTimeMs":0}
       ╰ Dataframe Access Validation (dataframe-access-validation):
           ╰ Metadata: {"numOperations":0,"numAccesses":0,"totalAccessed":0,"searchTimeMs":0,"processTimeMs":3}
       ╰ Dead Code (dead-code):
           ╰ Metadata: {"consideredNodes":5,"searchTimeMs":0,"processTimeMs":1}
       ╰ Useless Loops (useless-loop):
           ╰ Metadata: {"numOfUselessLoops":0,"searchTimeMs":0,"processTimeMs":0}
    All queries together required ≈222 ms (1ms accuracy, total 223 ms)

Show Detailed Results as Json
The analysis required 222.6 ms (including parsing and normalization and the query) within the generation environment.
In general, the JSON contains the Ids of the nodes in question as they are present in the normalized AST or the dataflow graph of flowR. Please consult the Interface wiki page for more information on how to get those.

    {
      "linter": {
        "results": {
          "deprecated-functions": {
            "results": [],
            ".meta": { "totalCalls": 0, "totalFunctionDefinitions": 0, "searchTimeMs": 207, "processTimeMs": 1 }
          },
          "file-path-validity": {
            "results": [
              { "range": [ 1, 1, 1, 23 ], "filePath": "/root/x.txt", "certainty": "certain" }
            ],
            ".meta": { "totalReads": 1, "totalUnknown": 0, "totalWritesBeforeAlways": 0, "totalValid": 0, "searchTimeMs": 4, "processTimeMs": 1 }
          },
          "seeded-randomness": {
            "results": [],
            ".meta": { "consumerCalls": 0, "callsWithFunctionProducers": 0, "callsWithAssignmentProducers": 0, "callsWithNonConstantProducers": 0, "searchTimeMs": 0, "processTimeMs": 1 }
          },
          "absolute-file-paths": {
            "results": [
              { "certainty": "certain", "filePath": "/root/x.txt", "range": [ 1, 1, 1, 23 ] }
            ],
            ".meta": { "totalConsidered": 1, "totalUnknown": 0, "searchTimeMs": 2, "processTimeMs": 0 }
          },
          "unused-definitions": {
            "results": [],
            ".meta": { "totalConsidered": 0, "searchTimeMs": 0, "processTimeMs": 0 }
          },
          "naming-convention": {
            "results": [],
            ".meta": { "numMatches": 0, "numBreak": 0, "searchTimeMs": 0, "processTimeMs": 0 }
          },
          "network-functions": {
            "results": [],
            ".meta": { "totalCalls": 0, "totalFunctionDefinitions": 0, "searchTimeMs": 1, "processTimeMs": 0 }
          },
          "dataframe-access-validation": {
            "results": [],
            ".meta": { "numOperations": 0, "numAccesses": 0, "totalAccessed": 0, "searchTimeMs": 0, "processTimeMs": 3 }
          },
          "dead-code": {
            "results": [],
            ".meta": { "consideredNodes": 5, "searchTimeMs": 0, "processTimeMs": 1 }
          },
          "useless-loop": {
            "results": [],
            ".meta": { "numOfUselessLoops": 0, "searchTimeMs": 0, "processTimeMs": 0 }
          }
        },
        ".meta": { "timing": 221 }
      },
      ".meta": { "timing": 222 }
    }

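If you want to post-process the detailed JSON above, the following is a minimal sketch of one way to do it. It assumes only the shape visible in this example: a `linter.results` object keyed by rule name, where each rule carries a `results` array and a `.meta` object. The types `LinterFinding`, `RuleResult`, and `LinterQueryResult` and the helpers `formatRange` and `summarizeFindings` are hypothetical names for this sketch and not part of flowR's API. Reading `range` as `[startLine, startColumn, endLine, endColumn]` is likewise an inference from the `1.1-23` location shown above.

```typescript
// Hypothetical types: they mirror only the fields visible in the example output above.
interface LinterFinding {
    range?:     [number, number, number, number];
    filePath?:  string;
    certainty?: string;
}
interface RuleResult {
    results: LinterFinding[];
    '.meta': Record<string, number>;
}
interface LinterQueryResult {
    linter: { results: Record<string, RuleResult> };
}

// Trimmed copy of the JSON shown above (three of the ten rules).
const sample = JSON.parse(`{
  "linter": { "results": {
    "deprecated-functions": {
      "results": [],
      ".meta": { "totalCalls": 0, "totalFunctionDefinitions": 0, "searchTimeMs": 207, "processTimeMs": 1 }
    },
    "file-path-validity": {
      "results": [ { "range": [ 1, 1, 1, 23 ], "filePath": "/root/x.txt", "certainty": "certain" } ],
      ".meta": { "totalReads": 1, "totalUnknown": 0, "totalWritesBeforeAlways": 0, "totalValid": 0, "searchTimeMs": 4, "processTimeMs": 1 }
    },
    "absolute-file-paths": {
      "results": [ { "certainty": "certain", "filePath": "/root/x.txt", "range": [ 1, 1, 1, 23 ] } ],
      ".meta": { "totalConsidered": 1, "totalUnknown": 0, "searchTimeMs": 2, "processTimeMs": 0 }
    }
  } }
}`) as LinterQueryResult;

// Assumption: range is [startLine, startColumn, endLine, endColumn].
function formatRange(range: [number, number, number, number] | undefined): string {
    if (range === undefined) {
        return 'unknown location';
    }
    const [startLine, startCol, endLine, endCol] = range;
    return startLine === endLine
        ? `${startLine}.${startCol}-${endCol}`
        : `${startLine}.${startCol}-${endLine}.${endCol}`;
}

// Produce one line per finding; rules without findings are skipped.
function summarizeFindings(result: LinterQueryResult): string[] {
    const lines: string[] = [];
    for (const [rule, ruleResult] of Object.entries(result.linter.results)) {
        for (const finding of ruleResult.results) {
            const path = finding.filePath ?? '<no path>';
            const certainty = finding.certainty ?? 'unknown';
            lines.push(`${rule}: ${path} at ${formatRange(finding.range)} (${certainty})`);
        }
    }
    return lines;
}

const summary = summarizeFindings(sample);
console.log(summary.join('\n'));
```

Skipping rules with an empty `results` array keeps the summary focused on actual findings; the `.meta` timing fields remain available if you also want to report performance.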
- program slicing
Given a point of interest like the visualization of a plot, flowR reduces the program to just the parts which are relevant for the computation of the point of interest.

Example: Slicing with flowR
The simplest way to retrieve slices is with flowR's Visual Studio Code extension. However, you can slice using the REPL as well. This can help you if you want to reuse specific parts of an existing analysis within another context or if you want to understand what is happening in the code.
For this, let's have a look at the example file, located at test/testfiles/example.R:

    sum <- 0
    product <- 1
    w <- 7
    N <- 10

    for (i in 1:(N-1)) {
      sum <- sum + i + w
      product <- product * i
    }

    cat("Sum:", sum, "\n")
    cat("Product:", product, "\n")

Let's suppose we are interested only in the `sum` which is printed in line 11. To get a slice for this, you can use the following command:

    $ docker run -it --rm eagleoutice/flowr # or npm run flowr
    flowR repl using flowR v2.6.1, R v4.5.0 (r-shell engine)
    R> :slicer test/testfiles/example.R --criterion "11@sum"

Output

    sum <- 0
    w <- 7
    N <- 10
    for(i in 1:(N-1)) sum <- sum + i + w
    sum

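The criterion `11@sum` used above combines a line number with a variable name. As a minimal sketch of how such a string can be taken apart, the following covers only the `<line>@<name>` form that appears in this README; flowR may accept other criterion forms that this sketch does not handle. The names `LineNameCriterion`, `parseLineNameCriterion`, and `lineMentions` are hypothetical helpers and not flowR's API. The blank lines in `exampleScript` are assumed so that the `cat` calls land on lines 11 and 12, as the text states.

```typescript
// Hypothetical representation of a "<line>@<name>" slicing criterion.
interface LineNameCriterion {
    line: number;
    name: string;
}

// Parse the "<line>@<name>" form; anything else is rejected.
function parseLineNameCriterion(criterion: string): LineNameCriterion {
    const match = /^(\d+)@(.+)$/.exec(criterion);
    if (match === null) {
        throw new Error(`not of the form <line>@<name>: ${criterion}`);
    }
    return { line: Number(match[1]), name: match[2] };
}

// The example file from above, one array entry per line (line numbers are 1-based).
const exampleScript: string[] = [
    'sum <- 0',
    'product <- 1',
    'w <- 7',
    'N <- 10',
    '',
    'for (i in 1:(N-1)) {',
    '  sum <- sum + i + w',
    '  product <- product * i',
    '}',
    '',
    'cat("Sum:", sum, "\\n")',
    'cat("Product:", product, "\\n")'
];

// Sanity check: does the referenced line mention the name at all?
// This is a plain substring test, not an analysis of the R code.
function lineMentions(script: string[], criterion: LineNameCriterion): boolean {
    const text = script[criterion.line - 1] ?? '';
    return text.includes(criterion.name);
}

const parsed = parseLineNameCriterion('11@sum');
console.log(parsed, lineMentions(exampleScript, parsed));
```

A check like `lineMentions` can catch off-by-one line numbers before a criterion is handed to the slicer, which is an easy mistake when a script contains blank lines.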
- dependency analysis
Given your analysis project, flowR offers a plethora of so-called queries to get more information about your code. An important query is the dependencies query, which shows you the libraries your project needs, the data files it reads, the scripts it sources, and the data it outputs.

Example: Dependency Analysis with flowR
The following showcases the dependency view of the Visual Studio Code extension:
- fast data- and control-flow graphs
Within just 137.6 ms (as of Oct 21, 2025), flowR can analyze the data- and control-flow of the average real-world R script. See the benchmarks for more information, and consult the wiki pages for more details on the dataflow graph.

Example: Generating a dataflow graph with flowR
You can investigate flowR's analyses using the REPL. Commands like `:dataflow*` allow you to view a dataflow graph for a given R script.

Let's have a look at the following example:

    sum <- 0
    product <- 1
    w <- 7
    N <- 10

    for (i in 1:(N-1)) {
      sum <- sum + i + w
      product <- product * i
    }

    cat("Sum:", sum, "\n")
    cat("Product:", product, "\n")

To get the dataflow graph for this script, you can use the following command:

    $ docker run -it --rm eagleoutice/flowr # or npm run flowr
    flowR repl using flowR v2.6.1, R v4.5.0 (r-shell engine)
    R> :dataflow* test/testfiles/example.R

Output
https://mermaid.live/view#base64:eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgMChbXCJgIzkxO1JTeW1ib2wjOTM7IHRlc3RcbiAgICAgICgwKVxuICAgICAgKjEuMS00KmBcIl0pXG4gICAgMShbXCJgIzkxO1JTeW1ib2wjOTM7IHRlc3RmaWxlc1xuICAgICAgKDEpXG4gICAgICAqMS42LTE0KmBcIl0pXG4gICAgMltbXCJgIzkxO1JCaW5hcnlPcCM5MzsgL1xuICAgICAgKDIpXG4gICAgICAqMS4xLTE0KlxuICAgICgwLCAxKWBcIl1dXG4gICAgYnVpbHQtaW46X1tcImBCdWlsdC1Jbjpcbi9gXCJdXG4gICAgc3R5bGUgYnVpbHQtaW46XyBzdHJva2U6Z3JheSxmaWxsOmxpZ2h0Z3JheSxzdHJva2Utd2lkdGg6MnB4LG9wYWNpdHk6Ljg7XG4gICAgMyhbXCJgIzkxO1JTeW1ib2wjOTM7IGV4YW1wbGUuUlxuICAgICAgKDMpXG4gICAgICAqMS4xNi0yNCpgXCJdKVxuICAgIDRbW1wiYCM5MTtSQmluYXJ5T3AjOTM7IC9cbiAgICAgICg0KVxuICAgICAgKjEuMS0yNCpcbiAgICAoMiwgMylgXCJdXVxuICAgIDIgLS0+fFwicmVhZHMsIGFyZ3VtZW50XCJ8IDBcbiAgICAyIC0tPnxcInJlYWRzLCBhcmd1bWVudFwifCAxXG4gICAgMiAtLi0+fFwicmVhZHMsIGNhbGxzXCJ8IGJ1aWx0LWluOl9cbiAgICBsaW5rU3R5bGUgMiBzdHJva2U6Z3JheTtcbiAgICA0IC0tPnxcInJlYWRzLCBhcmd1bWVudFwifCAyXG4gICAgNCAtLT58XCJyZWFkcywgYXJndW1lbnRcInwgM1xuICAgIDQgLS4tPnxcInJlYWRzLCBjYWxsc1wifCBidWlsdC1pbjpfXG4gICAgbGlua1N0eWxlIDUgc3Ryb2tlOmdyYXk7IiwibWVybWFpZCI6eyJhdXRvU3luYyI6dHJ1ZX19

Following the link output should show the following:

    flowchart LR
        1{{"`#91;RNumber#93; 0 (1) *1.8*`"}}
        0["`#91;RSymbol#93; sum (0) *1.1-3*`"]
        2[["`#91;RBinaryOp#93; #60;#45; (2) *1.1-8* (0, 1)`"]]
        built-in:_-["`Built-In: #60;#45;`"]
        style built-in:_- stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        4{{"`#91;RNumber#93; 1 (4) *2.12*`"}}
        3["`#91;RSymbol#93; product (3) *2.1-7*`"]
        5[["`#91;RBinaryOp#93; #60;#45; (5) *2.1-12* (3, 4)`"]]
        7{{"`#91;RNumber#93; 7 (7) *3.6*`"}}
        6["`#91;RSymbol#93; w (6) *3.1*`"]
        8[["`#91;RBinaryOp#93; #60;#45; (8) *3.1-6* (6, 7)`"]]
        10{{"`#91;RNumber#93; 10 (10) *4.6-7*`"}}
        9["`#91;RSymbol#93; N (9) *4.1*`"]
        11[["`#91;RBinaryOp#93; #60;#45; (11) *4.1-7* (9, 10)`"]]
        12["`#91;RSymbol#93; i (12) *6.6*`"]
        13{{"`#91;RNumber#93; 1 (13) *6.11*`"}}
        16(["`#91;RSymbol#93; N (16) *6.14*`"])
        17{{"`#91;RNumber#93; 1 (17) *6.16*`"}}
        18[["`#91;RBinaryOp#93; #45; (18) *6.14-16* (16, 17)`"]]
        built-in:-["`Built-In: #45;`"]
        style built-in:- stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        19[["`#91;RExpressionList#93; ( (19) *6.13* (18)`"]]
        built-in:_["`Built-In: (`"]
        style built-in:_ stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        20[["`#91;RBinaryOp#93; #58; (20) *6.11-17* (13, 19)`"]]
        built-in::["`Built-In: #58;`"]
        style built-in:: stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        24(["`#91;RSymbol#93; sum (24, :may:36+) *7.10-12*`"])
        25(["`#91;RSymbol#93; i (25, :may:36+) *7.16*`"])
        26[["`#91;RBinaryOp#93; #43; (26, :may:36+) *7.10-16* (24, 25)`"]]
        27(["`#91;RSymbol#93; w (27, :may:36+) *7.20*`"])
        28[["`#91;RBinaryOp#93; #43; (28, :may:36+) *7.10-20* (26, 27)`"]]
        23["`#91;RSymbol#93; sum (23, :may:36+) *7.3-5*`"]
        29[["`#91;RBinaryOp#93; #60;#45; (29, :may:36+) *7.3-20* (23, 28)`"]]
        31(["`#91;RSymbol#93; product (31, :may:36+) *8.14-20*`"])
        32(["`#91;RSymbol#93; i (32, :may:36+) *8.24*`"])
        33[["`#91;RBinaryOp#93; #42; (33, :may:36+) *8.14-24* (31, 32)`"]]
        30["`#91;RSymbol#93; product (30, :may:36+) *8.3-9*`"]
        34[["`#91;RBinaryOp#93; #60;#45; (34, :may:36+) *8.3-24* (30, 33)`"]]
        35[["`#91;RExpressionList#93; #123; (35, :may:36+) *6.20* (29, 34)`"]]
        36[["`#91;RForLoop#93; for (36) *6.1-9.1* (12, 20, 35)`"]]
        built-in:for["`Built-In: for`"]
        style built-in:for stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        38{{"`#91;RString#93; #34;Sum#58;#34; (38) *11.5-10*`"}}
        40(["`#91;RSymbol#93; sum (40) *11.13-15*`"])
        built-in:sum["`Built-In: sum`"]
        style built-in:sum stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        42{{"`#91;RString#93; #34; #34; (42) *11.18-21*`"}}
        44[["`#91;RFunctionCall#93; cat (44) *11.1-22* (38, 40, 42)`"]]
        built-in:cat["`Built-In: cat`"]
        style built-in:cat stroke:gray,fill:lightgray,stroke-width:2px,opacity:.8;
        46{{"`#91;RString#93; #34;Product#58;#34; (46) *12.5-14*`"}}
        48(["`#91;RSymbol#93; product (48) *12.17-23*`"])
        50{{"`#91;RString#93; #34; #34; (50) *12.26-29*`"}}
        52[["`#91;RFunctionCall#93; cat (52) *12.1-30* (46, 48, 50)`"]]
        0 -->|"defined-by"| 1
        0 -->|"defined-by"| 2
        2 -->|"argument"| 1
        2 -->|"returns, argument"| 0
        2 -.->|"reads, calls"| built-in:_-
        linkStyle 4 stroke:gray;
        3 -->|"defined-by"| 4
        3 -->|"defined-by"| 5
        5 -->|"argument"| 4
        5 -->|"returns, argument"| 3
        5 -.->|"reads, calls"| built-in:_-
        linkStyle 9 stroke:gray;
        6 -->|"defined-by"| 7
        6 -->|"defined-by"| 8
        8 -->|"argument"| 7
        8 -->|"returns, argument"| 6
        8 -.->|"reads, calls"| built-in:_-
        linkStyle 14 stroke:gray;
        9 -->|"defined-by"| 10
        9 -->|"defined-by"| 11
        11 -->|"argument"| 10
        11 -->|"returns, argument"| 9
        11 -.->|"reads, calls"| built-in:_-
        linkStyle 19 stroke:gray;
        12 -->|"defined-by"| 20
        16 -->|"reads"| 9
        18 -->|"reads, argument"| 16
        18 -->|"reads, argument"| 17
        18 -.->|"reads, calls"| built-in:-
        linkStyle 24 stroke:gray;
        19 -->|"returns, argument"| 18
        19 -.->|"reads"| built-in:_
        linkStyle 26 stroke:gray;
        20 -->|"reads, argument"| 13
        20 -->|"reads, argument"| 19
        20 -.->|"reads, calls"| built-in::
        linkStyle 29 stroke:gray;
        24 -->|"reads"| 0
        24 -->|"reads"| 23
        24 -->|"CD-True"| 36
        linkStyle 32 stroke:gray,color:gray;
        25 -->|"reads"| 12
        25 -->|"CD-True"| 36
        linkStyle 34 stroke:gray,color:gray;
        26 -->|"reads, argument"| 24
        26 -->|"reads, argument"| 25
        26 -.->|"reads, calls"| built-in:_
        linkStyle 37 stroke:gray;
        26 -->|"CD-True"| 36
        linkStyle 38 stroke:gray,color:gray;
        27 -->|"reads"| 6
        27 -->|"CD-True"| 36
        linkStyle 40 stroke:gray,color:gray;
        28 -->|"reads, argument"| 26
        28 -->|"reads, argument"| 27
        28 -.->|"reads, calls"| built-in:_
        linkStyle 43 stroke:gray;
        28 -->|"CD-True"| 36
        linkStyle 44 stroke:gray,color:gray;
        23 -->|"defined-by"| 28
        23 -->|"defined-by"| 29
        23 -->|"CD-True"| 36
        linkStyle 47 stroke:gray,color:gray;
        29 -->|"argument"| 28
        29 -->|"returns, argument"| 23
        29 -.->|"reads, calls"| built-in:_-
        linkStyle 50 stroke:gray;
        29 -->|"CD-True"| 36
        linkStyle 51 stroke:gray,color:gray;
        31 -->|"reads"| 3
        31 -->|"reads"| 30
        31 -->|"CD-True"| 36
        linkStyle 54 stroke:gray,color:gray;
        32 -->|"reads"| 12
        32 -->|"CD-True"| 36
        linkStyle 56 stroke:gray,color:gray;
        33 -->|"reads, argument"| 31
        33 -->|"reads, argument"| 32
        33 -.->|"reads, calls"| built-in:_
        linkStyle 59 stroke:gray;
        33 -->|"CD-True"| 36
        linkStyle 60 stroke:gray,color:gray;
        30 -->|"defined-by"| 33
        30 -->|"defined-by"| 34
        30 -->|"CD-True"| 36
        linkStyle 63 stroke:gray,color:gray;
        34 -->|"argument"| 33
        34 -->|"returns, argument"| 30
        34 -.->|"reads, calls"| built-in:_-
        linkStyle 66 stroke:gray;
        34 -->|"CD-True"| 36
        linkStyle 67 stroke:gray,color:gray;
        35 -->|"argument"| 29
        35 -->|"returns, argument"| 34
        35 -.->|"reads, calls"| built-in:_
        linkStyle 70 stroke:gray;
        35 -->|"CD-True"| 36
        linkStyle 71 stroke:gray,color:gray;
        36 -->|"argument"| 12
        36 -->|"reads, argument"| 20
        36 -->|"argument, non-standard-evaluation"| 35
        36 -.->|"reads, calls"| built-in:for
        linkStyle 75 stroke:gray;
        40 -->|"reads"| 0
        40 -->|"reads"| 23
        40 -.->|"reads"| built-in:sum
        linkStyle 78 stroke:gray;
        44 -->|"argument"| 38
        44 -->|"reads, argument"| 40
        44 -->|"argument"| 42
        44 -.->|"reads, calls"| built-in:cat
        linkStyle 82 stroke:gray;
        48 -->|"reads"| 3
        48 -->|"reads"| 30
        52 -->|"argument"| 46
        52 -->|"reads, argument"| 48
        52 -->|"argument"| 50
        52 -.->|"reads, calls"| built-in:cat
        linkStyle 88 stroke:gray;

(The analysis required 16.4 ms (including parse and normalize, using the r-shell engine) within the generation environment.)
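The link printed by `:dataflow*` above carries the Mermaid diagram inside its URL fragment: the part after `#base64:` is base64-encoded JSON whose `code` field holds the diagram source. The following minimal sketch recovers that source so it can be inspected or rendered offline. It assumes only the `#base64:` form seen in this README and relies on the standard `atob`, `btoa`, `TextEncoder`, and `TextDecoder` globals (available in browsers and recent Node.js versions). `decodeMermaidLink` and `encodeMermaidLink` are hypothetical helper names, not part of flowR.

```typescript
// Assumption: the link looks like "<prefix>#base64:<payload>", where <payload>
// is base64-encoded JSON with at least a string field named "code".
function decodeMermaidLink(link: string): string {
    const marker = '#base64:';
    const start = link.indexOf(marker);
    if (start < 0) {
        throw new Error('no "#base64:" payload found in link');
    }
    const payload = link.slice(start + marker.length);
    // atob yields one character per byte; convert to bytes to decode UTF-8 properly.
    const bytes = Uint8Array.from(atob(payload), ch => ch.charCodeAt(0));
    const json = JSON.parse(new TextDecoder().decode(bytes)) as { code: string };
    return json.code;
}

// Inverse helper, used here only to build a small link for demonstration.
function encodeMermaidLink(code: string): string {
    const json = JSON.stringify({ code, mermaid: { autoSync: true } });
    let binary = '';
    new TextEncoder().encode(json).forEach(byte => {
        binary += String.fromCharCode(byte);
    });
    return `https://mermaid.live/view#base64:${btoa(binary)}`;
}

const diagram = 'flowchart LR\n    a -->|"reads"| b';
const demoLink = encodeMermaidLink(diagram);
const decoded = decodeMermaidLink(demoLink);
console.log(decoded);
```

Decoding the link locally is handy when the diagram is too large to view comfortably in the browser, or when you want to diff the graphs of two versions of a script.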
If you want to use flowR and the features it provides, feel free to check out the:
- Visual Studio Code/Positron: provides access to flowR directly in VS Code and Positron (or vscode.dev)
- RStudio Addin: integrates flowR into RStudio
- R package: use flowR in your R scripts
- Docker image: run flowR in a container; this also includes flowR's server
- NPM package: include flowR in your TypeScript and JavaScript projects
If you are already using flowR and want to give feedback, please consider filling out our feedback form.
To get started with flowR and its features, please check out the Overview wiki page. The Setup wiki page explains how you can download and set up flowR on your system. With docker 🐳, the following line should be enough (and drop you directly into the read-eval-print loop):

    docker run -it --rm eagleoutice/flowr

You can enter `:help` to gain more information on its capabilities.
Example REPL session
If you want to use the same commands:
- First, this runs `docker run -it --rm eagleoutice/flowr` in a terminal to start the REPL.
- In the REPL, it runs `:slicer -c '11@prod' demo.R --diff` to slice the example file `demo.R` for the print statement in line 11. Please note that the `11` refers to line 11, the line to slice for!
For more details on how to use flowR please refer to the wiki pages, as well as the deployed code documentation.
If you are interested in the theoretical background of flowR, please check out the following publications (if you find that a paper is missing here, please open a new issue):
- Statically Analyzing the Dataflow of R Programs (OOPSLA '25)
  Please cite this paper if you are using flowR in your research.
  BibTeX

    @article{10.1145/3763087,
      author     = {Sihler, Florian and Tichy, Matthias},
      title      = {Statically Analyzing the Dataflow of R Programs},
      year       = {2025},
      issue_date = {October 2025},
      publisher  = {Association for Computing Machinery},
      address    = {New York, NY, USA},
      volume     = {9},
      number     = {OOPSLA2},
      url        = {https://doi.org/10.1145/3763087},
      doi        = {10.1145/3763087},
      abstract   = {The R programming language is primarily designed for statistical computing and mostly used by researchers without a background in computer science. R provides a wide range of dynamic features and peculiarities that are difficult to analyze statically like dynamic scoping and lazy evaluation with dynamic side effects. At the same time, the R ecosystem lacks sophisticated analysis tools that support researchers in understanding and improving their code. In this paper, we present a novel static dataflow analysis framework for the R programming language that is capable of handling the dynamic nature of R programs and produces the dataflow graph of given R programs. This graph can be essential in a range of analyses, including program slicing, which we implement as a proof of concept. The core analysis works as a stateful fold over a normalized version of the abstract syntax tree of the R program, which tracks (re-)definitions, values, function calls, side effects, external files, and a dynamic control flow to produce one dataflow graph per program. We evaluate the correctness of our analysis using output equivalence testing on a manually curated dataset of 779 sensible slicing points from executable real-world R scripts. Additionally, we use a set of systematic test cases based on the capabilities of the R language and the implementation of the R interpreter and measure the runtimes well as the memory consumption on a set of 4,230 real-world R scripts and 20,815 packages available on R’s package manager CRAN. Furthermore, we evaluate the recall of our program slicer, its accuracy using shrinking, and its improvement over the state of the art. We correctly analyze almost all programs in our equivalence test suite, preserving the identical output for 99.7\% of the manually curated slicing points. On average, we require 576ms to analyze the dataflow and around 213kB to store the graph of a research script. This shows that our analysis is capable of analyzing real-world sources quickly and correctly. Our slicer achieves an average reduction of 84.8\% of tokens indicating its potential to improve program comprehension.},
      journal    = {Proc. ACM Program. Lang.},
      month      = oct,
      articleno  = {309},
      numpages   = {29},
      keywords   = {Dataflow Analysis, R Programming Language, Static Analysis}
    }

- flowR: A Static Program Slicer for R (ASE '24, Tool)
  This refers to the tool demonstration of the VS Code extension.
  BibTeX

    @inproceedings{DBLP:conf/kbse/SihlerT24,
      author    = {Florian Sihler and Matthias Tichy},
      editor    = {Vladimir Filkov and Baishakhi Ray and Minghui Zhou},
      title     = {flowR: {A} Static Program Slicer for {R}},
      booktitle = {Proceedings of the 39th {IEEE/ACM} International Conference on Automated Software Engineering, {ASE} 2024, Sacramento, CA, USA, October 27 - November 1, 2024},
      pages     = {2390--2393},
      publisher = {{ACM}},
      year      = {2024},
      url       = {https://doi.org/10.1145/3691620.3695359},
      doi       = {10.1145/3691620.3695359},
      timestamp = {Mon, 03 Mar 2025 21:16:51 +0100},
      biburl    = {https://dblp.org/rec/conf/kbse/SihlerT24.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }

- On the Anatomy of Real-World R Code for Static Analysis (MSR '24)
  This paper lays the foundation for flowR by analyzing the characteristics of real-world R code.
  BibTeX

    @inproceedings{DBLP:conf/msr/SihlerPSTDD24,
      author    = {Florian Sihler and Lukas Pietzschmann and Raphael Straub and Matthias Tichy and Andor Diera and Abdelhalim Hafedh Dahou},
      editor    = {Diomidis Spinellis and Alberto Bacchelli and Eleni Constantinou},
      title     = {On the Anatomy of Real-World {R} Code for Static Analysis},
      booktitle = {21st {IEEE/ACM} International Conference on Mining Software Repositories, {MSR} 2024, Lisbon, Portugal, April 15-16, 2024},
      pages     = {619--630},
      publisher = {{ACM}},
      year      = {2024},
      url       = {https://doi.org/10.1145/3643991.3644911},
      doi       = {10.1145/3643991.3644911},
      timestamp = {Sun, 19 Jan 2025 13:31:27 +0100},
      biburl    = {https://dblp.org/rec/conf/msr/SihlerPSTDD24.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }

Works using flowR include: Computational Reproducibility of R Code Supplements on OSF and Multi-View Structural Graph Summaries.
We welcome every contribution! Please check out the developer onboarding section in the wiki for all the information you will need.
flowR is actively developed by Florian Sihler and (since October 1st 2025) Oliver Gerstl under the GPLv3 License.
It is partially supported by the German Research Foundation (DFG) under the grant 504226141 ("CodeInspector").
Please note that this file was generated automatically using the file src/documentation/print-readme.ts as a source.
If you want to make changes please edit the source file (the CI will take care of the rest).
In fact, many files in the wiki are generated, so make sure to check for the source file if you want to make changes.


