filter by variable bug #278

@thomasjwood

Filtering a tbl against a column of another data frame produces unexpected behavior. Take this tbl:

d1 <- tbl_df(data.frame(num1 = as.character(sample(1:10, 1000, replace = TRUE)),
                        var1 = runif(1000)))
d1$num1 <- as.character(d1$num1)

which I want to subset according to the values found in data frame d2:

d2 <- data.frame(num1 = 1:3)
d2$num1 <- as.character(d2$num1) 

If I pass the num1 values as a literal character vector, everything works. For instance, this

d1 %.%
  filter(num1 %in% c("1", "2", "3")) %.%
  group_by(num1) %.%
  summarise(mu = mean(var1))

returns

Source: local data frame [3 x 2]

  num1        mu
1    2 0.4662081
2    1 0.4810027
3    3 0.4920704

But when I filter against the column d2$num1 instead:

d1 %.%
   filter(num1 %in% d2$num1) %.%
   group_by(num1) %.%
   summarise(mu = mean(var1))

In version 0.1.1, dplyr returns an empty result:

Source: local data frame [0 x 2]

and in version 0.1.2 it raises an error:

Error in list(num1 = c("1", "2", "3"))$c(7L, 2L, 3L, 4L, 10L, 6L, 7L,  : 
  invalid subscript type 'integer'
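
For reference, here is a minimal base R sketch of the result I would expect from the failing call. It does not use dplyr at all; it only shows what filtering on d2$num1 and then averaging var1 by group should produce. The set.seed() call, the stringsAsFactors = FALSE arguments, and the names keep, sub and expected are my own additions for illustration and are not part of the original report.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible

# Same shape of data as above, built as plain data frames
d1 <- data.frame(num1 = as.character(sample(1:10, 1000, replace = TRUE)),
                 var1 = runif(1000),
                 stringsAsFactors = FALSE)
d2 <- data.frame(num1 = as.character(1:3), stringsAsFactors = FALSE)

# Copy the lookup values into a plain character vector (hypothetical name)
keep <- d2$num1

# Base R subsetting: keep rows of d1 whose num1 appears in d2$num1
sub <- d1[d1$num1 %in% keep, ]

# Group means, the base R counterpart of group_by() + summarise()
expected <- aggregate(var1 ~ num1, data = sub, FUN = mean)
names(expected)[2] <- "mu"
expected
```

This should give one row per value of num1 in d2, which is the same shape as the output of the literal-vector version above. Whether assigning d2$num1 to a local vector before calling filter() also sidesteps the bug in dplyr itself is something I have not confirmed.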
