Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base R generic 'sort_by' for data.tables #6662

Closed
rikivillalba opened this issue Dec 13, 2024 · 2 comments
Closed

Base R generic 'sort_by' for data.tables #6662

rikivillalba opened this issue Dec 13, 2024 · 2 comments
Milestone

Comments

@rikivillalba
Copy link
Contributor

I wonder whether it is worth to implement sort_by.data.table (R 4.4.0) with internal forder instead of base order, and possibly an 'in_place' argument

Some improvement can be seen in my little test:

library(data.table)
set.seed(1234)
DT <- data.table(x=runif(1e7),y=runif(1e7))

# sort_by.data.frame with "forder" instead of "order"
sort_by2 <- function (x, y, ...) 
{
  if (inherits(y, "formula")) 
    y <- .formula2varlist(y, x)
  if (!is.list(y)) 
    y <- list(y)
  o <- do.call(data.table:::forder, c(unname(y), list(...)))
  x[o, , drop = FALSE]
}

microbenchmark::microbenchmark(times = 1, 
  sort_by2(DT, ~ x + y),
  sort_by(DT, ~ x + y)
)
#> Unit: seconds
#>                  expr      min       lq     mean   median       uq      max
#>  sort_by2(DT, ~x + y) 1.068101 1.068101 1.068101 1.068101 1.068101 1.068101
#>   sort_by(DT, ~x + y) 2.317193 2.317193 2.317193 2.317193 2.317193 2.317193

Created on 2024-12-12 with reprex v2.1.1

@rikivillalba rikivillalba changed the title base R generic sort_by for data.tables base R generic 'sort_by' for data.tables Dec 13, 2024
@rikivillalba rikivillalba changed the title base R generic 'sort_by' for data.tables Base R generic 'sort_by' for data.tables Dec 13, 2024
@MichaelChirico
Copy link
Member

Good idea. I wouldn't offer in_place (you can use setorder() for that, with a slightly different API).

It also looks like .formula2varlist() is key for allowing much more complicated formulas to WAI:

mtcars |> sort_by(~I(am + mpg))

It will be a lot harder to try and translate generic formulas into, e.g., a setorderv() call.

@MichaelChirico MichaelChirico added this to the 1.18.0 milestone Dec 13, 2024
@aitap
Copy link
Contributor

aitap commented Feb 26, 2025

Can this be fixed by now-closed #6679?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants