-
Couldn't load subscription status.
- Fork 1k
Description
Submitted by: Matt Weller; Assigned to: Nobody; R-Forge link
When using .SDcols (for the purpose of applying a function to multiple columns) I cannot reference other columns in the original table (v1) using the following syntax:
dt = data.table(grp=c(2,3,3,1,1,2,3), v1=1:7, v2=7:1, v3=10:16)
dt.out = dt[, c(v1 = sum(v1), lapply(.SD,mean)), by = grp, .SDcols = v2:v3]
# Error in `[.data.table`(dt, , list(v1 = sum(v1), lapply(.SD, mean)), by = grp, :
# object 'v1' not foundA similar error happens when I use c instead of list, clearly the column v1 cannot be accessed within the j clause.
I resorted to the following code which includes column v1, even though I do not want that to be included in the lapply portion, having to drop it after computation.
sd.cols = c("v1","v2", "v3")
dt.out = dt[, c(sum.v1 = sum(v1), lapply(.SD,mean)), by = grp, .SDcols = sd.cols]According to eddi on Stackoverflow this is a bug and he has asked me to report it. I cannot provide much more detail as I'm not exactly sure which part he thinks was a bug, looking at the accepted answer by Arun and their ensuing discussion will highlight where but the problem lies.
Here is the relevant SO post.