fread fails with SIGABRT when printing "Expected sep (',') but new line or EOF ends field 14 on line 33 when reading data" error #802

@vlsi

fread appears to crash when the error message it is printing includes a long line of input.
Once the quoted line is long enough, R aborts with SIGABRT instead of reporting the error.

The sample data can be found in this gist: https://gist.github.com/vlsi/3b9e9e986bf952360397

The input CSV is not well formed, but I would expect fread to pinpoint the malformed part.
Judging by comma_sequence_per_line.csv, there is an unterminated quoted field at line 9.
Ultimately I would like fread to report exactly that: "possible missing quote for the field started at line 9".
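
To make the requested diagnostic concrete, here is a minimal sketch of one way a reader could flag a possibly unterminated quote. It is a heuristic, not fread's actual logic: it reports the first line containing an odd number of double-quote characters. The function name `first_odd_quote_line` is hypothetical, and a file that legitimately embeds newlines inside quoted fields would produce a false positive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of data.table.
   Returns the 1-based number of the first line that contains an odd number
   of '"' characters, or 0 if every line has balanced quotes.
   An escaped quote written as "" adds two to the count, so it does not
   change the parity. */
static size_t first_odd_quote_line(const char *text)
{
    assert(text != NULL);

    size_t line = 1;      /* current line number */
    unsigned quotes = 0;  /* quote characters seen on the current line */

    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '"') {
            quotes++;
        } else if (*p == '\n') {
            if (quotes % 2 != 0)
                return line;
            quotes = 0;
            line++;
        }
    }
    /* The last line may lack a trailing newline. */
    return (quotes % 2 != 0) ? line : 0;
}
```

A caller could pass the file contents as a NUL-terminated string and, on a nonzero result, print something like "possible missing quote for the field started at line N".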

Here's the proper error message (at least it does not crash R). I shortened all the words in the file, which was enough for fread to report the error instead of aborting:

R version 3.1.0 (2014-04-10) -- "Spring Dance"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.2.0 (64-bit)

> library(data.table)
data.table 1.9.3  For help type: help("data.table")
> fread('fails_with_proper_error.csv')
Error in fread("fails_with_proper_error.csv") : 
  Expected sep (',') but new line or EOF ends field 14 on line 33 when reading data: 6,3,6,,3,2,7,W,J,5,2,6,"X  #, ,D,B,A,P,,,,,2,,.
0,8,,,,,,F,Z,6,2,1,,,,,,,,,,5,,.
8,2,,,,,,I,M,0,1,2,,,,,,,,,,5,,.
8,2,,,,,,A,W,6,8,3,,,,,,,,,,8,,#,I,N,L,C,D,K,L,Q,R,J,L,V,E,F,O,N,E,B,Q,Z,S,Y,J
8,3,3,8,2,1,3,Y,S,2,5,4,H,,K,,L,,,,,4,,.
8,7,7,,6,7,0,L,B,1,0,8,K,Q,A,L,Q,,,,,7,,.
8,8,3,7,4,2,5,M,N,3,1,6,I,K,S,L,Q,,,,,5,,.
7,7,0,,6,1,4,V,K,7,6,2,W,S,S,J,P,,,,,1,Y,.
2,3,6,5,8,7,1,Q,H,8,1,4,F,X,V,O,M,,,,,8,A,.
6,8,5,8,4,6,7,S,J,8,7,4,R,B,Y,X,I,,,,,3,Y,.
2,2,0,8,6,4,2,Q,O,6,8,2,I,N,S,M,C,,,,,3,Z,.
6,8,1,3,4,0,1,P,V,6,7,4,J,F,Q,L,E,,,,,1,K,.
6,3,7,0,3,4,7,E,B,5,4,3,D,V,N,L,O,,,,,8,"P",.

Here's the abort case:

> fread('fails_with_abort.csv')
Abort trap: 6
bash-3.2$

Here's the lldb backtrace. Sorry, I don't know how to enable debug symbols so that local variables are visible to lldb.

(lldb) bt
* thread #1: tid = 0x40ea86, 0x00007fff82b8d866 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff82b8d866 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff8aa7935c libsystem_pthread.dylib`pthread_kill + 92
    frame #2: 0x00007fff8b71db1a libsystem_c.dylib`abort + 125
    frame #3: 0x00007fff8b71dc91 libsystem_c.dylib`abort_report_np + 181
    frame #4: 0x00007fff8b741860 libsystem_c.dylib`__chk_fail + 48
    frame #5: 0x00007fff8b741830 libsystem_c.dylib`__chk_fail_overflow + 16
    frame #6: 0x00007fff8b741d5a libsystem_c.dylib`__sprintf_chk + 205
    frame #7: 0x000000011177953b datatable.so`readfile + 12465
    frame #8: 0x000000010ea3256e libR.dylib`do_dotcall + 1146
    frame #9: 0x000000010ea623dc libR.dylib`bcEval + 10059
    frame #10: 0x000000010ea5f637 libR.dylib`Rf_eval + 358
    frame #11: 0x000000010ea6a9b4 libR.dylib`Rf_applyClosure + 1482
    frame #12: 0x000000010ea5fa74 libR.dylib`Rf_eval + 1443
    frame #13: 0x000000010ea6d37f libR.dylib`do_set + 245
    frame #14: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #15: 0x000000010ea6cf35 libR.dylib`do_begin + 465
    frame #16: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #17: 0x000000010ea6a9b4 libR.dylib`Rf_applyClosure + 1482
    frame #18: 0x000000010ea5fa74 libR.dylib`Rf_eval + 1443
    frame #19: 0x000000010ea6a152 libR.dylib`Rf_evalList + 326
    frame #20: 0x000000010ea5f821 libR.dylib`Rf_eval + 848
    frame #21: 0x000000010ea6d37f libR.dylib`do_set + 245
    frame #22: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #23: 0x000000010ea6cf35 libR.dylib`do_begin + 465
    frame #24: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #25: 0x000000010ea6a9b4 libR.dylib`Rf_applyClosure + 1482
    frame #26: 0x000000010ea5fa74 libR.dylib`Rf_eval + 1443
    frame #27: 0x000000010ea6d37f libR.dylib`do_set + 245
    frame #28: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #29: 0x000000010ea6cf35 libR.dylib`do_begin + 465
    frame #30: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #31: 0x000000010ea6a9b4 libR.dylib`Rf_applyClosure + 1482
    frame #32: 0x000000010ea5fa74 libR.dylib`Rf_eval + 1443
    frame #33: 0x000000010ea6d37f libR.dylib`do_set + 245
    frame #34: 0x000000010ea5fac3 libR.dylib`Rf_eval + 1522
    frame #35: 0x000000010ea8ff92 libR.dylib`Rf_ReplIteration + 1082
    frame #36: 0x000000010ea911c4 libR.dylib`R_ReplConsole + 147
    frame #37: 0x000000010ea91102 libR.dylib`run_Rmainloop + 73
    frame #38: 0x000000010e9c2f54 R`main + 27
(lldb) frame select 7
frame #7: 0x000000011177953b datatable.so`readfile + 12465
datatable.so`readfile + 12465:
-> 0x11177953b:  addq   $0x20, %rsp
   0x11177953f:  callq  0x11177643c               ; EXIT
   0x111779544:  movq   %r14, %rsi
   0x111779547:  cmpl   $0x5, %r13d
(lldb) register read
General Purpose Registers:
       rbx = 0x000000000000000a
       rbp = 0x00007fff5123a5b0
       rsp = 0x00007fff5123a2d0
       r12 = 0x0000000000000012
       r13 = 0x0000000000000004
       r14 = 0x00007fff5123a350
       r15 = 0x0000000000000012
       rip = 0x000000011177953b  datatable.so`readfile + 12465
13 registers were unavailable.
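
Frames #4 to #6 (`__sprintf_chk` calling `__chk_fail_overflow`) suggest that a fortified `sprintf` call detected output larger than its destination buffer and aborted. I have not looked at the fread source, so the following is only a sketch of the general remedy, assuming the error message is formatted into a fixed-size buffer: use `snprintf` with the buffer size and mark truncation. The helper name `format_field_error` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not data.table's actual code.
   Formats the error message, including a snippet of the offending input,
   into a fixed-size buffer without writing past its end.
   Returns 1 if the message had to be truncated, 0 otherwise. */
static int format_field_error(char *buf, size_t bufsize,
                              int field, int line, const char *snippet)
{
    assert(buf != NULL && bufsize > 0 && snippet != NULL);

    /* snprintf never writes more than bufsize bytes (including the NUL)
       and returns the length the full message would have had. */
    int needed = snprintf(buf, bufsize,
        "Expected sep (',') but new line or EOF ends field %d on line %d "
        "when reading data: %s", field, line, snippet);

    if (needed < 0) {          /* encoding error: return an empty message */
        buf[0] = '\0';
        return 0;
    }
    if ((size_t)needed >= bufsize) {
        /* Overwrite the tail with "..." so the reader can see the cut. */
        if (bufsize > 4)
            memcpy(buf + bufsize - 4, "...", 4);
        return 1;
    }
    return 0;
}
```

With this shape, an arbitrarily long input line shortens the message rather than triggering the overflow check that produces `Abort trap: 6`.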
