Open
Labels
breaking-change, enhancement, joins, question
Milestone
Description
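Currently, data.table joins are consistent with base R.
This makes some queries awkward: in the example below, the join column a in the result holds the values of y$b, even for rows that have no match in x.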
library(data.table)
x = data.table(a=1:3, w=letters[1:3])
y = data.table(b=3:5, z=6:4)
x[y, on=c(a="b")]
# a w z
#1: 3 c 6
#2: 4 NA 5
#3: 5 NA 4
x[y, .(a, b), on=c(a="b")]
# a b
#1: 3 3
#2: 4 4
#3: 5 5
Consistency with base R could be kept in the merge.data.table method for the base R merge generic, while joins within [.data.table could be made consistent with SQL, which does not impose the same limitation as base R. [.data.frame does not support joins, so this would not break consistency there.
The change would break existing code that relies on the invalid base R join behavior.
For reference, the SQL output from PostgreSQL:
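As a rough illustration of the base R-consistent behavior that would stay in merge, here is a minimal sketch using the same x and y as above. The variable name m is only for illustration. The result keeps a single join column named after x's column, filled with the values of y$b.

```r
library(data.table)

x = data.table(a = 1:3, w = letters[1:3])
y = data.table(b = 3:5, z = 6:4)

# Right outer join through the merge generic (dispatches to merge.data.table).
# all.y = TRUE keeps every row of y; the join column is named "a" (from by.x).
m = merge(x, y, by.x = "a", by.y = "b", all.y = TRUE)
print(m)
```

Under this proposal, this call would be left untouched, so code that wants base R semantics could keep using it.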
#$`SELECT * FROM x RIGHT OUTER JOIN y ON x.a = y.b;`
# a w b z
#1: 3 c 3 6
#2: NA NA 4 5
#3: NA NA 5 4
#
#$`SELECT a, b FROM x RIGHT OUTER JOIN y ON x.a = y.b;`
# a b
#1: 3 3
#2: NA 4
#3: NA 5
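For comparison, the SQL-style result can already be approximated in current data.table by naming the columns explicitly in j. This is a sketch of a workaround, not the proposed new default: it relies on the x. and i. prefixes, which refer to the columns of x and of i (here y) inside a join. The variable name res is only for illustration.

```r
library(data.table)

x = data.table(a = 1:3, w = letters[1:3])
y = data.table(b = 3:5, z = 6:4)

# x.a is the join column as stored in x (NA where y has no match in x);
# i.b is the join column as stored in y.
res = x[y, .(a = x.a, w, b = i.b, z), on = c(a = "b")]
print(res)
```

The drawback is that every column has to be listed by hand, which is what a SQL-consistent default in [.data.table would avoid.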