Add `n_unique()` = `length(unique(x))` - useful while grouping #884

The [SO question](http://stackoverflow.com/a/26345246/559784) that triggered the thought:

Instead of having to do:

``` R
DT[, length(unique(.)), by=.]
```

We could write:

``` R
DT[, n_unique(.), by=.]
```

This should also be noticeably faster for data.tables, because we don't have to materialise the unique values just to count them; the group starts from the internal ordering are enough.

Here's a quick benchmark:

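``` R
library(data.table)
x = sample(1e2, 1e7, TRUE)  # 1e7 draws from 100 distinct values
# base R: build the vector of unique values, then count them
system.time(ans1 <- length(unique(x))) # 0.667 seconds
# data.table: count the group starts returned by the internal ordering routine
system.time(ans2 <- length(attr(data.table:::forderv(x, retGrp=TRUE), 'starts'))) # 0.1 seconds
identical(ans1, ans2)  # both approaches should agree
```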
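
To make the proposal concrete, here is a minimal sketch of what `n_unique()` could look like if it simply wrapped the `forderv()` call from the benchmark. The function body, the toy table `DT` and its columns `g` and `v` are assumptions made for illustration only; a real implementation would presumably live in C and not rely on `:::`.

``` R
library(data.table)

# Hypothetical helper (name taken from this proposal, body is only a sketch):
# count the groups found by data.table's internal ordering routine.
n_unique = function(x) {
  length(attr(data.table:::forderv(x, retGrp=TRUE), 'starts'))
}

# Toy data, made up for illustration
DT = data.table(g = c("a", "a", "b", "b", "b"),
                v = c(1L, 1L, 2L, 3L, 3L))

res_new = DT[, n_unique(v), by=g]        # proposed spelling
res_old = DT[, length(unique(v)), by=g]  # current spelling
identical(res_new, res_old)
```

Both calls should return one row per group with the number of distinct `v` values in it.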

We could also internally optimise `length(unique(.))` to `n_unique(.)`.
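
One way such an optimisation might work is to rewrite the `j` expression before it is evaluated. The sketch below is only an assumption about the approach, not how data.table does (or would do) it internally; `rewrite_n_unique()` is a hypothetical name, and the walker does not handle calls with empty arguments such as `x[, 1]`.

``` R
# Hypothetical rewriter: replace every length(unique(<expr>)) with n_unique(<expr>)
rewrite_n_unique = function(e) {
  if (!is.call(e)) return(e)  # symbols and constants are left untouched
  is_len_unique =
    identical(e[[1L]], quote(length)) && length(e) == 2L &&
    is.call(e[[2L]]) && identical(e[[2L]][[1L]], quote(unique)) &&
    length(e[[2L]]) == 2L     # only the one-argument form unique(x)
  if (is_len_unique) {
    inner = rewrite_n_unique(e[[2L]][[2L]])
    return(as.call(list(quote(n_unique), inner)))
  }
  # otherwise recurse into every element of the call
  as.call(lapply(as.list(e), rewrite_n_unique))
}

e1 = rewrite_n_unique(quote(length(unique(v))))
e2 = rewrite_n_unique(quote(list(n = length(unique(v)), s = sum(v))))
print(e1)
print(e2)
```

Calls to `unique()` with extra arguments are deliberately left alone, since they may not be equivalent to a plain count of distinct values.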

