Error in using aggregation functions (any,all etc) with nomatch = 0L #813

@nigmastar

Please consider:

dt <- data.table(id = c("a", "a", "b", "b"),
                 var = c(1.1, 2.5, 6.3, 4.5),
                 key = "id")

#1.9.3 latest revision
dt["c", list(id, check = any(var > 3)), nomatch=0L]
# Error in if (mn%%n[i] != 0) warning("Item ", i, " is of size ", n[i],  : 
#   missing value where TRUE/FALSE needed

#1.9.3 revision 1200 (last 'stable')
dt["c", list(id, check = any(var > 3)), nomatch=0L]
# Empty data.table (0 rows) of 2 cols: id,check

This happens only when a brand-new column is generated with an aggregation function, not when existing columns are simply selected or when no aggregation function is used. These

> dt["c", list(id, check = var), nomatch=0L]
Empty data.table (0 rows) of 2 cols: id,check
> dt["c", list(id, check = var > 3), nomatch=0L]
Empty data.table (0 rows) of 2 cols: id,check

evaluate as expected. The error is caused by these lines of the as.data.table.list function in data.table.R:

    n = vapply(x, length, 0L)
    mn = max(n)
    x = copy(x)
    if (any(n<mn)) 
    for (i in which(n<mn)) {
        if (!is.null(x[[i]])) {# avoids warning when a list element is NULL
            # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
            if (mn %% n[i] != 0) 
                warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
            x[[i]] = rep(x[[i]], length.out=mn)
        }
    }

I believe the following happens:

x <- as.list(dt["c", list(id, var), nomatch=0L])
x$var <- max(x$var) # to simulate the internal process
n = vapply(x, length, 0L)
mn = max(n)
if (any(n<mn)) 
  for (i in which(n<mn)) {
    if (!is.null(x[[i]])) {# avoids warning when a list element is NULL
      # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
      if (mn %% n[i] != 0) # <----------------- HERE
        warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
      x[[i]] = rep(x[[i]], length.out=mn)
    }
  }
# Error in if (mn%%n[i] != 0) warning("Item ", i, " is of size ", n[i],  : 
#   missing value where TRUE/FALSE needed
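
The failing condition can also be isolated in base R, without data.table. This is only a minimal sketch: the names n, mn, remainder, cond and msg are mine, and the lengths are assumed to be those from the simulation above (a zero-length id and a length-1 aggregated column).

```r
# Lengths as in the simulation: id has 0 rows, the aggregated column has 1.
n  <- c(id = 0L, var = 1L)
mn <- max(n)

# Integer modulo by zero is NA in R, so the comparison below is NA as well.
remainder <- mn %% n[["id"]]
cond <- remainder != 0

# if () refuses an NA condition; capture the error message instead of stopping.
msg <- tryCatch(
  if (cond) "would warn" else "no warning",
  error = function(e) conditionMessage(e)
)

print(remainder)
print(msg)
```

Because the condition is NA rather than TRUE or FALSE, neither branch of the if is taken and the error handler runs instead.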

1L %% 0L is NA, so the if condition is NA, hence the error. In fact, if there are no zero-length columns in the result, it works:

> dt["c", list(check = any(var > 3)), nomatch=0L]
   check
1: FALSE

Even so, I believe the result of the last expression should be Empty data.table (0 rows) of 1 col: check. If any vector has zero length, can't as.data.table.list just set the number of rows to zero? I don't think there is a case where a column has zero length and i matches a positive (non-zero) number of rows.
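
To make the suggestion concrete, here is a rough sketch of the guard I have in mind. recycle_columns is a hypothetical helper written for illustration, not data.table's code. It assumes a plain list as input, and it only covers the recycling step, not how j is evaluated.

```r
# Hypothetical helper, not data.table's implementation: recycle list elements
# to a common length, but make every column empty if any non-NULL one is empty.
recycle_columns <- function(x) {
  if (length(x) == 0L) return(x)
  n <- vapply(x, length, 0L)
  is_null <- vapply(x, is.null, NA)
  if (any(n == 0L & !is_null)) {
    # Suggested behaviour: one zero-length column means zero rows overall.
    x[!is_null] <- lapply(x[!is_null], function(col) col[0L])
    return(x)
  }
  mn <- max(n)
  for (i in which(n < mn & !is_null)) {
    # n[i] > 0 here, so the modulo can no longer be NA.
    if (mn %% n[i] != 0L)
      warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn)
    x[[i]] <- rep(x[[i]], length.out = mn)
  }
  x
}

res <- recycle_columns(list(id = character(0), check = FALSE))
print(vapply(res, length, 0L))
```

With this guard, the zero-length case returns before the modulo is computed, and ordinary recycling (for example a length-2 column against a length-4 one) is unchanged.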

Thanks,
Michele.
