futureverse/globals#97
Description
We've gotten a few reports of performance issues appearing in furrr over the years (#267, #268).
Nothing has changed in furrr, so I suspect that something has changed in future or globals.
Here's a furrr example of a suspected regression:
With CRAN future
library(future)
library(furrr)
# 50,000 instances of `mtcars`
xs <- rep(list(mtcars), 50000)
# Two workers
plan(multisession, workers = 2)
# Should send 25,000 to each worker
# CRAN future 1.69.0
system.time({
future_map(xs, identity)
})
#> user system elapsed
#> 5.420 0.336 5.985
# Old future 1.33.2
system.time({
future_map(xs, identity)
})
#> user system elapsed
#> 2.153 0.237 2.739
Yes, there's a lot of data shuffling going on here, but users do this quite a bit anyway (despite our warnings against it), and it has generally been much faster than this.
It also gets much worse if you add tibbles into the mix.
library(future)
library(furrr)
library(tibble)
# 50,000 instances of `mtcars` but as a tibble
xs <- rep(list(as_tibble(mtcars)), 50000)
# Two workers
plan(multisession, workers = 2)
# Should send 25,000 to each worker
# CRAN future 1.69.0
system.time({
future_map(xs, identity)
})
#> user system elapsed
#> 10.451 0.523 11.175
# Old future 1.33.2
system.time({
future_map(xs, identity)
})
#> user system elapsed
#> 3.761 0.302 4.297
Ironically, if we naively try to strip out furrr, then CRAN future is actually faster.
So this is probably going to require some investigation to see if we can create a pure future call that shows the slowdown.
library(future)
x1 <- rep(list(mtcars), 25000)
x2 <- rep(list(mtcars), 25000)
# Two workers
plan(multisession, workers = 2)
# CRAN future 1.69.0
system.time({
  f1 <- future(
    lapply(x, FUN = identity),
    globals = as.FutureGlobals(list(x = x1))
  )
  f2 <- future(
    lapply(x, FUN = identity),
    globals = as.FutureGlobals(list(x = x2))
  )
  value(list(f1, f2))
})
#> user system elapsed
#> 0.547 0.216 0.999
# Old future 1.33.2
system.time({
  f1 <- future(
    lapply(x, FUN = identity),
    globals = as.FutureGlobals(list(x = x1))
  )
  f2 <- future(
    lapply(x, FUN = identity),
    globals = as.FutureGlobals(list(x = x2))
  )
  value(list(f1, f2))
})
#> user system elapsed
#> 2.485 0.245 3.009
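As a possible starting point for that investigation, here is a rough sketch that times each phase of a single pure future call separately. It may help show whether the time goes into exporting the globals or into collecting the result. The payload size `n` and the `t_*` names are placeholders of mine, not something taken from the reports. It also assumes the default eager evaluation, so `future()` itself includes the export to the worker.

```r
library(future)

# Two workers, as in the examples above
plan(multisession, workers = 2)

# Placeholder size; increase to 25000 to match the examples above
n <- 1000
x1 <- rep(list(mtcars), n)

# Baseline: serialize the payload once, with no future involved
t_serialize <- system.time({
  raw <- serialize(x1, connection = NULL)
})

# Phase 1: create the future (globals handling and export to the worker)
t_create <- system.time({
  f <- future(
    lapply(x, FUN = identity),
    globals = list(x = x1)
  )
})

# Phase 2: wait for the worker and pull the result back
t_collect <- system.time({
  v <- value(f)
})

# One row per phase: user, system, elapsed
print(rbind(serialize = t_serialize, create = t_create, collect = t_collect)[, 1:3])

# Shut the workers down again
plan(sequential)
```

Running this once under each future version, in a fresh session each time, should show which phase differs. If neither phase does, the slowdown is more likely in how furrr builds its futures than in the transfer itself.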