Description
Submitted by: Matt Dowle; Assigned to: Nobody; R-Forge link
CJ() could return a data.table with irregular column lengths (its inputs left unchanged), marked in some way so that the binary search knows to index into the CJ columns with %% (modulo), rather than allocating long vectors and populating them as CJ() currently does. This is purely for speed, with no change for users, except perhaps where CJ() has been used outside [.data.table.
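To make the indexing idea concrete, here is a minimal base R sketch of how the value of a cross-join column at a given row could be computed from the unexpanded inputs. The helper name cj_value is hypothetical and not part of data.table; the sketch assumes the inputs are already sorted and that the first column varies slowest, which is the row order CJ() produces.

```r
# Hypothetical helper (not part of data.table): value of cross-join column `j`
# at rows `i` (1-based), computed from the unexpanded inputs.
cj_value <- function(inputs, j, i) {
  lens <- unname(lengths(inputs))
  # Each value of column j is repeated for this many consecutive rows:
  # the product of the lengths of all later columns.
  each <- if (j < length(lens)) prod(lens[(j + 1L):length(lens)]) else 1
  inputs[[j]][((i - 1) %/% each) %% lens[j] + 1]
}

inputs <- list(x = 1:3, y = c("a", "b"))
n <- prod(lengths(inputs))   # 6 rows in the full cross join

# Materialised reference columns, built the expensive way with rep().
x_full <- rep(inputs$x, each = 2L)
y_full <- rep(inputs$y, times = 3L)

# The modulo-indexed lookups agree with the materialised columns.
stopifnot(identical(cj_value(inputs, 1, seq_len(n)), x_full))
stopifnot(identical(cj_value(inputs, 2, seq_len(n)), y_full))
```

Only the short input vectors are stored; the cost moves from allocating the long columns up front to a little integer arithmetic per lookup.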
Or, more generally, data.table columns could be marked so that code indexing into them knows they should be recycled, without actually allocating and recycling the columns. This needs discussion on datatable-help, because under-the-hood code such as DT$foo[903] might not work if the column were internally a short vector. In reality it is probably only useful for CJ(): regular data.tables tend to hold recycled data only when first created (say, with an NA column) and are then quickly populated. If a large table has a column with the same value (say NA or 0) in every row, the column is perhaps redundant. It is hard to imagine that recycling more than one item is often useful, other than in CJ().
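The concern about DT$foo[903] can be illustrated with a small base R sketch. The recycled class below is purely hypothetical, an S3 wrapper invented for illustration and not anything data.table provides; it stores a short vector together with the length it pretends to have, and recycles on lookup.

```r
# Hypothetical S3 wrapper (not part of data.table): a short vector plus the
# nominal length `n` it should behave as if it had.
recycled <- function(x, n) structure(list(values = x, n = n), class = "recycled")

# Indexing recycles the stored values with %% instead of materialising them.
`[.recycled` <- function(x, i) {
  stopifnot(all(i >= 1), all(i <= x$n))
  x$values[(i - 1) %% length(x$values) + 1]
}

short <- 0L                    # what would actually be stored
col   <- recycled(short, 1e6)  # behaves like a column of one million zeros

# Plain indexing into the short vector runs off the end and returns NA,
# which is why code unaware of the marking could break.
stopifnot(is.na(short[903]))

# Indexing through the wrapper returns the recycled value.
stopifnot(identical(col[903], 0L))
```

Any internal code that bypasses the wrapper and touches the stored vector directly would see the short vector, which is the compatibility risk raised above.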