
[Request] subset DT on integer vector stored as matrix the same way as data.frame #826

@jangorecki

A fairly minor FR.
This is quite a common use case when using the caret package for machine learning.
The command below (random partition indices for training and testing a machine learning model):
idx_training <- createDataPartition(dt$response, p = 0.75, list = FALSE)
produces an integer matrix with a single column, i.e. an integer vector with a dim attribute whose second dimension has length 1.

> str(idx_training)
 int [1:2188, 1] 6 12 15 22 25 33 45 46 50 62 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "Resample1"

When this is used to subset a data.frame as follows:
as.data.frame(dt)[idx_training,]
it works just fine.

But when trying to subset on data.table:

dt[idx_training]
dt[idx_training,] # even tried this way on data.table

it throws the following error:

Error in `[.data.table`(dt, idx_training) : 
  i is invalid type (matrix). Perhaps in future a 2 column matrix could return a list of elements of DT (in the spirit of A[B] in FAQ 2.14). Please let datatable-help know if you'd like this, or add your comments to FR #1611.

So, finally, the FR is:

  • support subsetting by an integer vector stored as a matrix when dim(int_vector)[2]==1
  • behave the same as data.frame when subsetting in that case

I think it is not a big deal and better integration with data.frame is always good.

Of course, this can currently be done with dt[idx_training[,1]], which is why it's minor.
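For anyone hitting the same thing, here is a minimal sketch of that workaround. It does not use caret: the one-column matrix is built by hand as a stand-in for the createDataPartition() result, and the toy table, its columns, and the names idx_vec, train and test are made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for the createDataPartition(..., list = FALSE) result:
# an integer matrix with one column and a "Resample1" column name.
idx_training <- matrix(c(1L, 3L, 4L), ncol = 1L,
                       dimnames = list(NULL, "Resample1"))

# Toy table; the columns are assumptions, not the original data.
dt <- data.table(id = 1:5, response = c(0.1, 0.4, 0.2, 0.9, 0.7))

# Drop the dim attribute so that i is a plain integer vector.
# as.vector(idx_training) gives the same values as idx_training[, 1].
idx_vec <- as.vector(idx_training)

train <- dt[idx_vec]    # rows selected for training
test  <- dt[-idx_vec]   # remaining rows, via negative indices
```

Converting once and reusing idx_vec keeps the later subsetting calls free of the [, 1] suffix, and the negative index gives the complementary test set.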
