@@ -6,7 +6,138 @@ fast-float
66[![Documentation](https://docs.rs/fast-float/badge.svg)](https://docs.rs/fast-float)
77[![Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
88[![MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
9- [![Rust 1.47+](https://img.shields.io/badge/rustc-1.47+-lightgray.svg)](https://blog.rust-lang.org/2020/10/08/Rust-1.47.html)
9+ [![Rustc 1.47+](https://img.shields.io/badge/rustc-1.47+-lightgray.svg)](https://blog.rust-lang.org/2020/10/08/Rust-1.47.html)
10+
11+ This crate provides a super-fast decimal number parser from strings into floats.
12+
13+ ```toml
14+ [dependencies]
15+ fast-float = "0.1"
16+ ```
17+
18+ There are no dependencies and the crate can be used in a no_std context by disabling the "std" feature.
19+
20+ *Compiler support: rustc 1.47+.*
21+
22+ ## Usage
23+
24+ There are two top-level functions provided:
25+ [`parse()`](https://docs.rs/fast-float/latest/fast_float/fn.parse.html) and
26+ [`parse_partial()`](https://docs.rs/fast-float/latest/fast_float/fn.parse_partial.html), both taking
27+ either a string or a byte slice and parsing the input into either `f32` or `f64`:
28+
29+ - `parse()` treats the whole string as a decimal number and returns an error if there are
30+ invalid characters or if the string is empty.
31+ - `parse_partial()` tries to find the longest substring at the beginning of the given input
32+ string that can be parsed as a decimal number and, in the case of success, returns the parsed
33+ value along with the number of characters processed; an error is returned if the string doesn't
34+ start with a decimal number or if it is empty. This function is most useful as a building
35+ block when constructing more complex parsers, or when parsing streams of data.
36+
37+ Example:
38+
39+ ```rust
40+ // Parse the entire string as a decimal number.
41+ let s = "1.23e-02";
42+ let x: f32 = fast_float::parse(s).unwrap();
43+ assert_eq!(x, 0.0123);
44+
45+ // Parse as many characters as possible as a decimal number.
46+ let s = "1.23e-02foo";
47+ let (x, n) = fast_float::parse_partial::<f32, _>(s).unwrap();
48+ assert_eq!(x, 0.0123);
49+ assert_eq!(n, 8);
50+ assert_eq!(&s[n..], "foo");
51+ ```
52+
53+ ## Details
54+
55+ This crate is a direct port of Daniel Lemire's [`fast_float`](https://github.com/fastfloat/fast_float)
56+ C++ library (valuable discussions with Daniel while porting it helped shape the crate and get it to
57+ the performance level it's at now), with some Rust-specific tweaks. Please see the original
58+ repository for many useful details regarding the algorithm and the implementation.
59+
60+ The parser is locale-independent. The result is the closest floating-point value (either
61+ `f32` or `f64`), using the "round to even" convention for inputs that fall exactly halfway between
62+ two representable values. That is, we provide exact parsing according to the IEEE standard.
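The "round to even" convention can be seen on integers just above 2^53, where consecutive `f64` values are 2 apart and every odd integer is an exact tie. The sketch below uses the standard library's `str::parse` as a stand-in, on the assumption that it and `fast_float::parse` both return the correctly rounded value; the helper names are hypothetical and not part of this crate.

```rust
/// Parses a decimal string into an `f64` using the standard library.
/// Stand-in for `fast_float::parse` (assumption: both are correctly rounded).
fn parse_f64(s: &str) -> f64 {
    s.parse::<f64>().expect("valid decimal number")
}

/// Returns true when the lowest bit of the stored mantissa is 0,
/// i.e. the value is the "even" neighbour in a tie.
fn has_even_mantissa(x: f64) -> bool {
    x.to_bits() & 1 == 0
}
```

With these helpers, `parse_f64("9007199254740993")` yields `9007199254740992.0` (the tie rounds down), while `parse_f64("9007199254740995")` yields `9007199254740996.0` (the tie rounds up); in both cases `has_even_mantissa` is true for the result.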
63+
64+ Infinity and NaN values can be parsed, along with scientific notation.
65+
66+ Both little-endian and big-endian platforms are supported, with extra optimizations enabled
67+ on little-endian architectures.
68+
69+ ## Performance
70+
71+ The presented parser seems to beat all of the existing C/C++/Rust float parsers known to us at the
72+ moment by a large margin, in all of the datasets we tested it on so far – see detailed benchmarks
73+ below (the only exception being the original fast_float C++ library, of course – performance of
74+ which is within noise bounds of this crate). On modern machines, parsing throughput can reach
75+ up to 1 GB/s.
76+
77+ In particular, it is faster than the Rust standard library's `FromStr::from_str()` by a factor of 2-8x
78+ (larger factor for longer float strings).
79+
80+ While various details regarding the algorithm can be found in the repository for the original
81+ C++ library, here are a few brief notes:
82+
83+ - The parser is specialized to work lightning-fast on inputs with at most 19 significant digits
84+ (which constitutes the so-called "fast path"). We believe that most real-life inputs should
85+ fall under this category, and we treat longer inputs as "degenerate" edge cases since they
86+ inevitably cause overflows and loss of precision.
87+ - If the significand happens to be longer than 19 digits, the parser falls back to the "slow path",
88+ in which case its performance roughly matches that of the top Rust/C++ libraries (and still
89+ beats them most of the time, although not by a lot).
90+ - On little-endian systems, there are additional optimizations for numbers with more than 8 digits
91+ after the decimal point.
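One way to read the 19-digit limit: every 19-digit decimal integer is at most 10^19 - 1, which is below `u64::MAX` (about 1.8 × 10^19), whereas a 20-digit one may not fit. The sketch below illustrates only that arithmetic fact. The function `accumulate_digits` is a hypothetical helper, and the assumption that the crate stores the significand in a `u64` for this reason is ours, not something stated above.

```rust
/// Accumulates ASCII decimal digits into a `u64`.
/// Returns `None` if a non-digit byte is seen or if the value overflows.
/// Hypothetical helper for illustration; not part of the fast-float API.
fn accumulate_digits(digits: &str) -> Option<u64> {
    let mut value: u64 = 0;
    for b in digits.bytes() {
        if !b.is_ascii_digit() {
            return None;
        }
        // `checked_*` turns an overflow into `None` instead of wrapping.
        value = value
            .checked_mul(10)?
            .checked_add(u64::from(b - b'0'))?;
    }
    Some(value)
}
```

Nineteen nines (`"9999999999999999999"`) accumulate without overflow, while twenty nines return `None`, which is the point at which a parser along these lines would need a fallback.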
92+
93+ ## Benchmarks
94+
95+ Below is the table of average timings in nanoseconds for parsing a single number
96+ into a 64-bit float.
97+
98+ ```
99+ | | `canada` | `mesh` | `uniform` | `iidi` | `iei` | `rec32` |
100+ | ---------------- | -------- | -------- | --------- | ------ | ------ | ------- |
101+ | fast-float | 22.08 | 11.10 | 20.04 | 40.77 | 26.33 | 29.84 |
102+ | lexical | 61.63 | 25.10 | 53.77 | 72.33 | 53.39 | 72.40 |
103+ | lexical/lossy | 61.51 | 25.24 | 54.00 | 71.30 | 52.87 | 71.71 |
104+ | from_str | 175.07 | 22.58 | 103.00 | 228.78 | 115.76 | 211.13 |
105+ | fast_float (C++) | 22.78 | 10.99 | 20.05 | 41.12 | 27.51 | 30.85 |
106+ | abseil (C++) | 42.66 | 32.88 | 46.01 | 50.83 | 46.33 | 49.95 |
107+ | netlib (C++) | 57.53 | 24.86 | 64.72 | 56.63 | 36.20 | 67.29 |
108+ | strtod (C) | 286.10 | 31.15 | 258.73 | 295.73 | 205.72 | 315.95 |
109+ ```
110+
111+ Parsers:
112+
113+ - `fast-float` – this very crate
114+ - `lexical` – from the `lexical_core` crate, v0.7
115+ - `lexical/lossy` – from the `lexical_core` crate, v0.7 (lossy parser)
116+ - `from_str` – Rust standard library, `FromStr` trait
117+ - `fast_float (C++)` – the original C++ implementation of the 'fast-float' method
118+ - `abseil (C++)` – Abseil C++ Common Libraries
119+ - `netlib (C++)` – David M. Gay's `strtod` implementation, distributed via Netlib
120+ - `strtod (C)` – C standard library
121+
122+ Datasets:
123+
124+ - `canada` – numbers in the `canada.txt` file
125+ - `mesh` – numbers in the `mesh.txt` file
126+ - `uniform` – uniform random numbers from 0 to 1
127+ - `iidi` – random numbers in the format `%d%d.%d`
128+ - `iei` – random numbers in the format `%de%d`
129+ - `rec32` – reciprocals of random 32-bit integers
130+
131+ Notes:
132+
133+ - Test environment: macOS 10.14.6, clang 11.0, Rust 1.49, 3.5 GHz i7-4771 Haswell.
134+ - The two test files referred to above can be found in
135+ [this](https://github.com/lemire/simple_fastfloat_benchmark) repository.
136+ - The Rust part of the table (along with a few other benchmarks) can be generated via
137+ the benchmark tool that can be found under `extras/simple-bench` of this repo.
138+ - The C/C++ part of the table (along with a few other benchmarks and parsers) can be
139+ generated via a C++ utility that can be found in [this](https://github.com/lemire/simple_fastfloat_benchmark)
140+ repository.
10141
11142<br >
12143