Commit 11f5f6a

Tweaks
1 parent 6046c8f commit 11f5f6a

1 file changed

+5
-7
lines changed

vignettes/duckplyr.Rmd

Lines changed: 5 additions & 7 deletions
@@ -42,7 +42,7 @@ If the duckplyr data.frame is accessed by...
 Therefore, duckplyr can be **both lazy** (within itself) **and not lazy** (for the outside world).
 
 Now, the default materialization can be problematic if dealing with large data: what if the materialization eats up all RAM?
-Therefore, the duckplyr package has a **safeguard called funneling** (in the current development version of the package).
+Therefore, the duckplyr package has a **safeguard called funneling**.
 A funneled data.frame cannot be materialized by default, it needs a call to a `compute()` function.
 By default, duckplyr frames are _unfunneled_, but duckplyr frames created from Parquet data (presumedly large) are _funneled_.
 
@@ -62,8 +62,7 @@ conflict_prefer("filter", "dplyr", quiet = TRUE)
 
 - convert individual data.frames to duck frames which allows you to control their automatic materialization parameters. To do that, you use `duckdb_tibble()`, `as_duckdb_tibble()` or read data using `read_*()` functions like `read_csv_duckdb()`.
 
-In both cases, if an operation cannot be performed
-by duckplyr (see `vignettes("limits")`), it will be outsourced to dplyr.
+In both cases, if an operation cannot be performed by duckplyr (see `vignettes("limits")`), it will be outsourced to dplyr.
 You can choose to be informed about fallbacks to dplyr, see `?fallback_config`.
 You can disable fallbacks by turning off automatic materialization.
 In that case, if an operation cannot be performed by duckplyr, your code will error.
@@ -75,13 +74,12 @@ With large datasets, you want:
 - input data in an efficient format, like Parquet files. Therefore you might input data using `read_parquet_duckdb()`.
 - efficient computation, which duckplyr provides via DuckDB's holistic optimization, without your having to use another syntax than dplyr.
 - the output to not clutter all the memory. Therefore you can make use of these features:
-- funneling see vignette TODO ADD CURRENT NAME to disable automatic materialization completely or to disable automatic materialization up to a certain output size.
+- funneling (see `vignette("funnel")`) to disable automatic materialization completely or to disable automatic materialization up to a certain output size.
 - computation to files using `compute_parquet()` or `compute_csv()`.
-
 
 
-A drawback of analyzing large data with duckplyr is that the limitations of duckplyr
-(unsupported verbs or data types, see `vignette("limits")`) won't be compensated by fallbacks since fallbacks to dplyr necessitate putting data into memory.
+
+A drawback of analyzing large data with duckplyr is that the limitations of duckplyr (unsupported verbs or data types, see `vignette("limits")`) won't be compensated by fallbacks since fallbacks to dplyr necessitate putting data into memory.
 
 ## How to improve duckplyr
 