Using case_when() inside a mutate() call on a grouped data frame is much slower in dplyr v1.1.0 than in v1.0.10: in the reprexes below, the median time goes from about 124 ms to about 2.6 s, roughly a 20x slowdown. The code still works, but it is causing a tremendous slowdown in many of the packages I maintain (see here; many examples have elapsed times >5s).
Here's a reprex for v1.1.0.
library(dplyr, warn.conflicts = FALSE)
library(microbenchmark)

# One row per group, so case_when() is evaluated n times
n <- 1000
dat <- data.frame(
  x = seq_len(n),
  y = rnorm(n)
)

microbenchmark(
  dat %>%
    group_by(x) %>%
    mutate(
      z = case_when(
        y < 0 ~ '-',
        T ~ '+',
      )
    ),
  times = 100
)
#> Unit: seconds
#>                                                                    expr
#>  dat %>% group_by(x) %>% mutate(z = case_when(y < 0 ~ "-", T ~ "+", ))
#>       min       lq     mean   median       uq      max neval
#>  2.376748 2.537896 2.650869 2.625663 2.723655 3.170204   100

Created on 2023-02-01 with reprex v2.0.2
Session info
sessioninfo::session_info()
#> - Session info --------------------------------------------------------------
#> hash: person in steamy room: medium-dark skin tone, goat, black small square
#>
#> setting value
#> version R version 4.1.3 (2022-03-10)
#> os Windows 10 x64 (build 22000)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United States.1252
#> ctype English_United States.1252
#> tz America/New_York
#> date 2023-02-01
#> pandoc 2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#> package * version date (UTC) lib source
#> cli 3.6.0 2023-01-09 [1] CRAN (R 4.1.3)
#> digest 0.6.31 2022-12-11 [1] CRAN (R 4.1.3)
#> dplyr * 1.1.0 2023-01-29 [1] CRAN (R 4.1.3)
#> evaluate 0.20 2023-01-17 [1] CRAN (R 4.1.3)
#> fansi 1.0.4 2023-01-22 [1] CRAN (R 4.1.3)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.2)
#> fs 1.6.0 2023-01-23 [1] CRAN (R 4.1.3)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.1.3)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.3)
#> htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.1.3)
#> knitr 1.42 2023-01-25 [1] CRAN (R 4.1.3)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.1.3)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.1.3)
#> microbenchmark * 1.4.9 2021-11-09 [1] CRAN (R 4.1.3)
#> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.1.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.2)
#> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.3)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.3)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.1.1)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.1.1)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.3)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.1.3)
#> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.1.3)
#> rmarkdown 2.20 2023-01-19 [1] CRAN (R 4.1.3)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.2)
#> sessioninfo 1.2.1 2021-11-02 [1] CRAN (R 4.1.2)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.1.3)
#> tibble 3.1.8 2022-07-22 [1] CRAN (R 4.1.3)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.1.3)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.2)
#> vctrs 0.5.2 2023-01-23 [1] CRAN (R 4.1.3)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.3)
#> xfun 0.36 2022-12-21 [1] CRAN (R 4.1.3)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.1.3)
#>
#> [1] C:/Users/mbeck/R/win-library
#> [2] C:/Program Files/R/R-4.1.3/library
#>
#> ------------------------------------------------------------------------------

And here's a reprex for v1.0.10 (note that the times for this one are in milliseconds, whereas the times above are in seconds).
library(dplyr, warn.conflicts = FALSE)
library(microbenchmark)

# Same data and call as above, run against dplyr v1.0.10
n <- 1000
dat <- data.frame(
  x = seq_len(n),
  y = rnorm(n)
)

microbenchmark(
  dat %>%
    group_by(x) %>%
    mutate(
      z = case_when(
        y < 0 ~ '-',
        T ~ '+',
      )
    ),
  times = 100
)
#> Unit: milliseconds
#>                                                                    expr
#>  dat %>% group_by(x) %>% mutate(z = case_when(y < 0 ~ "-", T ~ "+", ))
#>       min       lq     mean  median       uq      max neval
#>  114.9103 120.9102 126.9423 123.889 128.7439 167.7735   100

Created on 2023-02-01 with reprex v2.0.2
Session info
sessioninfo::session_info()
#> - Session info --------------------------------------------------------------
#> hash: open mailbox with raised flag, love-you gesture: medium skin tone, snowboarder: light skin tone
#>
#> setting value
#> version R version 4.1.3 (2022-03-10)
#> os Windows 10 x64 (build 22000)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United States.1252
#> ctype English_United States.1252
#> tz America/New_York
#> date 2023-02-01
#> pandoc 2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.2)
#> cli 3.6.0 2023-01-09 [1] CRAN (R 4.1.3)
#> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.1.3)
#> digest 0.6.31 2022-12-11 [1] CRAN (R 4.1.3)
#> dplyr * 1.0.10 2022-09-01 [1] CRAN (R 4.1.3)
#> evaluate 0.20 2023-01-17 [1] CRAN (R 4.1.3)
#> fansi 1.0.4 2023-01-22 [1] CRAN (R 4.1.3)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.2)
#> fs 1.6.0 2023-01-23 [1] CRAN (R 4.1.3)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.1.3)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.3)
#> htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.1.3)
#> knitr 1.42 2023-01-25 [1] CRAN (R 4.1.3)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.1.3)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.1.3)
#> microbenchmark * 1.4.9 2021-11-09 [1] CRAN (R 4.1.3)
#> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.1.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.2)
#> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.3)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.3)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.1.1)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.1.1)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.3)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.1.3)
#> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.1.3)
#> rmarkdown 2.20 2023-01-19 [1] CRAN (R 4.1.3)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.2)
#> sessioninfo 1.2.1 2021-11-02 [1] CRAN (R 4.1.2)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.1.3)
#> tibble 3.1.8 2022-07-22 [1] CRAN (R 4.1.3)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.1.3)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.2)
#> vctrs 0.5.2 2023-01-23 [1] CRAN (R 4.1.3)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.3)
#> xfun 0.36 2022-12-21 [1] CRAN (R 4.1.3)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.1.3)
#>
#> [1] C:/Users/mbeck/R/win-library
#> [2] C:/Program Files/R/R-4.1.3/library
#>
#> ------------------------------------------------------------------------------
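
A possible workaround sketch, not a fix for the underlying slowdown: it assumes the case_when() conditions only use row-level values, as in the reprex above, so the result does not depend on the grouping. Under that assumption the column can be computed once on the ungrouped data and the grouping applied afterwards, which avoids one case_when() call per group. The object names `grouped` and `ungrouped_first` are made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(1)
n <- 1000
dat <- data.frame(
  x = seq_len(n),
  y = rnorm(n)
)

# Grouped version from the reprex: case_when() is evaluated once per group
grouped <- dat %>%
  group_by(x) %>%
  mutate(z = case_when(y < 0 ~ "-", TRUE ~ "+"))

# Assumed workaround: the conditions are row-wise, so compute z on the
# ungrouped data (a single vectorised call) and group afterwards
ungrouped_first <- dat %>%
  mutate(z = case_when(y < 0 ~ "-", TRUE ~ "+")) %>%
  group_by(x)

# Both approaches should produce the same column
stopifnot(identical(grouped$z, ungrouped_first$z))
```

This only applies when the expression does not use per-group summaries such as mean(y) or n(); in that case the grouped call is still needed.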