xiaodaigh
Collaborator

While adding some tests, I found that tidytable works quite differently from dplyr, to the point where the code below doesn't work with {disk.frame} but the dplyr equivalent does. It seems that looking up functions or variables from the calling environment is not working:

  fn = function(a, b) {
    a+b
  }
  
  df3 = value %>%
    dt_mutate(b =  fn(num, num)) %>%
    collect

This is either a {disk.frame} issue or an implementation detail of tidytable. I need to investigate.

@markfairbanks

Interesting. Those examples work outside of {disk.frame}, so I wonder what's causing the issue. I'll take a look at {tidytable} as well and see if I can find anything.

@xiaodaigh
Collaborator Author

Actually, the code below seems to work, so we're probably not that far from an integration! Group-bys are a bit trickier, but can be done.

As you can see, integrating many functions like dt_mutate is as simple as calling the create_chunk_mapper function:

fn = function(a, b) {
  a+b
}

dt_mutate = disk.frame::create_chunk_mapper(tidytable::dt_mutate)

library(disk.frame)
value = as.disk.frame(data.frame(num = runif(1e6)))
df3 = value %>%
  dt_mutate(b =  fn(num, num)) %>%
  collect

df3
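
For anyone following along, here is a rough base-R sketch of the idea behind a chunk mapper. It is not disk.frame's actual implementation: `make_chunk_mapper` is a hypothetical helper, a "disk.frame" is modelled as a plain list of data.frame chunks, and base `transform()` stands in for `dt_mutate`. The point is that the expression is captured unevaluated and then evaluated per chunk, with `fn` looked up in the caller's environment.

```r
# Hypothetical helper for illustration only; NOT disk.frame::create_chunk_mapper.
make_chunk_mapper <- function(chunk_fn) {
  function(chunks, ...) {
    # Capture the arguments unevaluated so that column names such as `num`
    # are resolved inside each chunk rather than in the caller.
    args <- match.call(expand.dots = FALSE)$...
    # Remember where the call came from so helpers like `fn` can be found.
    caller_env <- parent.frame()
    lapply(chunks, function(chunk) {
      do.call(chunk_fn, c(list(chunk), args), envir = caller_env)
    })
  }
}

fn <- function(a, b) a + b

# Stand-in for a disk.frame: a list of two data.frame chunks.
chunks <- split(data.frame(num = c(1, 2, 3, 4)), c(1, 1, 2, 2))

mapped_transform <- make_chunk_mapper(transform)
res <- do.call(rbind, mapped_transform(chunks, b = fn(num, num)))
```

In this single-process sketch `fn` is always reachable through `caller_env`; whether the same holds when chunks are processed by separate workers is a different question.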

@markfairbanks

Nice! Glad it's looking like it will work.

@xiaodaigh
Collaborator Author

This doesn't work once setup_disk.frame() is called:

fn = function(a, b) {
  a+b
}

dt_mutate = disk.frame::create_chunk_mapper(tidytable::dt_mutate)

library(disk.frame)
setup_disk.frame() # this makes it fail
value = as.disk.frame(data.frame(num = runif(1e6)))
df3 = value %>%
  dt_mutate(b =  fn(num, num)) %>%
  collect

df3

@xiaodaigh
Collaborator Author

In the latest branch of disk.frame, this works flawlessly:

library(tidytable)
library(disk.frame)
setup_disk.frame(2)

fn <- function(a,b) a+b

mutate..disk.frame = create_chunk_mapper(tidytable::mutate.)

value = as.disk.frame(data.frame(num = runif(1e6)))

df3 = value %>%
  mutate.(b =  fn(num, num)) %>% 
  collect

So this should be doable soon.

@bnicenboim

Hi, any updates regarding this branch?

@xiaodaigh
Collaborator Author

This is pending updates to the disk.frame NSE (non-standard evaluation) mechanism.
