Various enhancements to print.data.table #1523

@MichaelChirico

Current task list:


Some Notes

3 (tabled pending clarification)

As I understand it, this issue is a request to prevent the console output from wrapping around (i.e., to force all columns to appear parallel, regardless of how wide the table is).

If that's the case, this is (AFAICT) impossible, since that's something done by RStudio/R itself. I for one certainly don't know of any way to alter this behavior.

If someone does know of a way to affect this, or if they think I'm mis-interpreting, please pipe up and we can have this taken care of.

7

As I see it there are two options here. One is to treat all key columns the same; the other is to treat secondary, tertiary, etc. keys separately.

Example output:

library(data.table)
set.seed(01394)
DT <- data.table(key1 = rep(c("A","B"), each = 4),
                 key2 = rep(c("a","b"), 4),
                 V1 = rnorm(8), key = c("key1","key2"))

# Only demarcate key columns
DT
#   | key1 | | key2 |         V1
#1: |    A | |    a |  0.5994579
#2: |    A | |    a | -1.0898775
#3: |    A | |    b | -0.2285326
#4: |    A | |    b | -1.7858472
#5: |    B | |    a | -0.6269875
#6: |    B | |    a | -0.6633084
#7: |    B | |    b |  1.0367084
#8: |    B | |    b |  0.7364276

# Separately "emboss" keys based on key order
DT
#   | key1 | || key2 ||         V1
#1: |    A | ||    a ||  0.5994579
#2: |    A | ||    a || -1.0898775
#3: |    A | ||    b || -0.2285326
#4: |    A | ||    b || -1.7858472
#5: |    B | ||    a || -0.6269875
#6: |    B | ||    a || -0.6633084
#7: |    B | ||    b ||  1.0367084
#8: |    B | ||    b ||  0.7364276

And of course, add an option for deciding whether to demarcate with | or some other user's-choice character (*, +, etc.)
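To make the first option concrete, here is a rough sketch of how key columns could be wrapped in a user-chosen marker before printing. `demarcate_keys` and its `mark` argument are hypothetical names made up for illustration; nothing like this exists in data.table, and a real implementation would live inside `print.data.table` rather than build a modified copy.

```r
library(data.table)

# Hypothetical helper, not part of data.table: returns a character copy of `x`
# in which every key column (name and values) is wrapped in `mark`.
demarcate_keys <- function(x, mark = "|") {
  # format() each column to character; the copy carries no key
  out <- as.data.table(lapply(x, format))
  for (k in key(x)) {
    set(out, j = k, value = paste(mark, out[[k]], mark))
    setnames(out, k, paste(mark, k, mark))
  }
  out
}

set.seed(1394)
DT <- data.table(key1 = rep(c("A", "B"), each = 4),
                 key2 = rep(c("a", "b"), 4),
                 V1 = rnorm(8), key = c("key1", "key2"))

print(demarcate_keys(DT))        # default "|" marker
print(demarcate_keys(DT, "*"))   # user's-choice marker
```

The second option (a different marker per key position) would only need `mark` to vary with the position of `k` in `key(x)`.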

9 [DONE]

Some feedback from a closed PR that was a first stab at solving this:

From Arun regarding preferred options:

col.names = c("auto", "top", "none")

"auto": current behaviour

"top": only on top, data.frame-like

"none": no column names -- exclude rows in which column names would have been printed.
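Since this item is marked done, the three values can be tried directly through the `col.names` argument of `print`. A small sketch, assuming a data.table version that includes the argument; `class = FALSE` is passed only so that header placement is the sole difference between the calls.

```r
library(data.table)

set.seed(1)
# More than 20 rows, so "auto" repeats the column names at the bottom
DT <- data.table(x = 1:30, y = rnorm(30))

print(DT, col.names = "auto", class = FALSE)  # names at top and bottom
print(DT, col.names = "top",  class = FALSE)  # names at top only
print(DT, col.names = "none", class = FALSE)  # no names at all
```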

10 [DONE]

It would be nice to have an option to print a row under the row of column names which gives each column's stored type, as is currently (I understand) the default for the output of dplyr operations.

Example from dplyr:

library(dplyr)
DF <- data.frame(n = numeric(1), c1 = complex(1), i = integer(1),
                 f = factor(1), D = as.Date("2016-02-06"), c2 = character(1),
                 stringsAsFactors = FALSE)
tbl_df(DF)
# Source: local data frame [1 x 6]
#
#       n     c1     i      f          D    c2
#   (dbl) (cmpl) (int) (fctr)     (date) (chr) # <- this row
#1     0   0+0i     0      1 2016-02-06      

The current best alternative is sapply(DF, class), but it's nice to have a preview of the data with this extra information.
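With this item done, the type row is available through the `class` argument of `print`. A minimal sketch, assuming a data.table version that includes the argument; the `sapply` call is shown alongside for comparison.

```r
library(data.table)

DT <- data.table(n = numeric(1), i = integer(1), c2 = character(1))

# Prints an extra row of abbreviated column types under the column names
print(DT, class = TRUE)

# The older workaround: full class names, but no preview of the data
sapply(DT, class)
```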

11

This seems closely related to 3. Current plan is to implement this as an alternative to 3 since it seems more tangible/doable.

Via @nverno:

Would it be useful for head.data.table to have an option to print only the head of columns that fit the screen width, and summarise the rest? I was imagining something like the printed output from the head of a tbl_df in dplyr. I think it is nice for tables with many columns.

and the guiding example from Arun:

require(data.table)
dt = setDT(lapply(1:100, function(x) 1:3))
dt
dplyr::tbl_dt(dt)
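One way this could work, sketched as a standalone function: estimate the width each column needs, keep the leading columns that fit, and list the rest by name. `print_fitting` is a hypothetical helper written for this illustration, not an existing data.table function, and its width estimate is approximate (it ignores the type row, which is why `class = FALSE` is passed).

```r
library(data.table)

# Hypothetical helper, not part of data.table: print only the leading columns
# of head(x, n) that fit within `width`, then name the columns left out.
print_fitting <- function(x, n = 6L, width = getOption("width")) {
  h <- head(x, n)
  # Width a column needs: the wider of its name and its formatted values
  col_w <- vapply(names(h), function(nm) {
    max(nchar(nm), nchar(format(h[[nm]])))
  }, integer(1))
  row_w <- nchar(nrow(h)) + 1L          # room for the "1:" style row labels
  used  <- row_w + cumsum(col_w + 1L)   # +1 for the space before each column
  keep  <- names(h)[used <= width]
  print(h[, keep, with = FALSE], class = FALSE)
  dropped <- setdiff(names(h), keep)
  if (length(dropped))
    cat(length(dropped), "more columns:", paste(dropped, collapse = ", "), "\n")
  invisible(keep)
}

dt <- setDT(lapply(1:100, function(x) 1:3))
print_fitting(dt, width = 40)
```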

12

Currently covered by @jangorecki's PR #1448; Jan, assuming #1529 is merged first, could you edit the print.data.table man page for your PR?
