Description
Which part is this question about
Regarding the library API usage.
Describe your question
I am using the high-level API (FileReader and FileDecoder) to read IPC files via mmap. I have noticed that the validate_data() call in the array-building process (here) adds significant overhead.
I am targeting an ultra-low-latency scenario. Reading a 2.2GB IPC file (via mmap) takes 290ms with validate_data and 3.8ms without it, which I measured locally by commenting the call out. The 3.8ms latency is nearly identical to what I measured with the C++ Arrow implementation, so I suspect the C++ codebase does not perform this sanity check (though I am not entirely sure).
The functions for "unchecked" building exist in the codebase (here), but they are not reachable from the high-level API, so I cannot disable validation without constructing my own arrays and everything built on top of them.
Is there a better way to skip this validation when using the high-level API?
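To illustrate the kind of gap I mean, here is a std-only Rust analogy; it is not arrow-rs code. It assumes that much of the validation cost comes from per-byte checks such as UTF-8 validation, which I have not confirmed by profiling. The names `build_checked` and `build_unchecked` are made up for this sketch, and a `Vec<u8>` stands in for the memory-mapped buffer.

```rust
use std::str::Utf8Error;
use std::time::Instant;

/// Validating path: scans every byte to confirm the buffer is valid UTF-8,
/// so the cost grows with the size of the buffer.
fn build_checked(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}

/// Unchecked path: trusts the caller and performs no scan.
///
/// # Safety
/// `bytes` must be valid UTF-8; otherwise behavior is undefined.
unsafe fn build_unchecked(bytes: &[u8]) -> &str {
    unsafe { std::str::from_utf8_unchecked(bytes) }
}

fn main() {
    // Stand-in for an mmap'd region: 64 MiB of ASCII bytes.
    let buf = vec![b'a'; 64 * 1024 * 1024];

    let start = Instant::now();
    let checked = build_checked(&buf).expect("buffer is valid UTF-8");
    let checked_time = start.elapsed();

    let start = Instant::now();
    // SAFETY: `buf` holds only ASCII bytes, which are always valid UTF-8.
    let unchecked = unsafe { build_unchecked(&buf) };
    let unchecked_time = start.elapsed();

    // Both paths produce the same view; only the cost of getting it differs.
    assert_eq!(checked.len(), unchecked.len());
    println!("checked: {:?}, unchecked: {:?}", checked_time, unchecked_time);
}
```

The checked call has to touch the whole buffer, while the unchecked call only reinterprets the pointer and length. That is the trade-off I would like to be able to opt into from FileReader and FileDecoder when the file comes from a trusted source.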
Additional context
Low latency is critical in my case, so I am trying to avoid any additional overhead, ideally using the C++ implementation as the baseline.