Inconsistent data.table assignment by reference behaviour #759

@mchen402

Original SO post: http://stackoverflow.com/questions/25145112/inconsistent-data-table-assignment-by-reference-behaviour


When assigning by reference in a data.table join, using a column from a second data.table, the results are inconsistent. When no rows match on the key columns of the two data.tables, the assignment expression y := y appears to be ignored entirely: the y column is not created at all, not even filled with NAs.

library(data.table)
dt1 <- data.table(id = 1:2, x = 3:4, key = "id")
dt2 <- data.table(id = 3:4, y = 5:6, key = "id")
print(dt1[dt2, y := y])
##    id x     # Would have also expected column:   y
## 1:  1 3     #                                   NA
## 2:  2 4     #                                   NA

However, when there is a partial match, the y column is created and the non-matching rows get a placeholder NA.

dt2[, id := 2:3]
print(dt1[dt2, y := y])
##    id x  y
## 1:  1 3 NA    # <-- placeholder NA here
## 2:  2 4  5

This wreaks havoc on later code that assumes a y column exists in all cases. Without that guarantee, I have to keep writing cumbersome additional checks to handle both cases.

Is there an elegant way around this inconsistency?
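
One possible workaround, offered as a sketch rather than a confirmed fix: create the y column in dt1 before the join so that it exists whether or not any rows match. This assumes y should be an integer column, and it uses the i. prefix to refer to dt2's y explicitly, since dt1 then has a y column of its own.

```r
library(data.table)

dt1 <- data.table(id = 1:2, x = 3:4, key = "id")
dt2 <- data.table(id = 3:4, y = 5:6, key = "id")

# Pre-create the target column so it exists even when the join matches nothing.
# NA_integer_ is an assumption here: it matches the integer type of dt2$y.
dt1[, y := NA_integer_]

# i.y refers to the y column of dt2 (the i table); matched rows are overwritten,
# unmatched rows keep the NA placeholder.
dt1[dt2, y := i.y]

print(dt1)
```

With the no-match data above, dt1 ends up with a y column that is NA in both rows, which is the shape the partial-match case already produces.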
