Introduce thread pool for writing goto binaries in parallel #4236
|
Running |
|
I also don't think it's unexpected that this introduced regressions on the @tautschnig Does it make sense to cap the number of threads used (taken from the value of |
Yes, this capping sounds like a good idea! |
afd4f52 to ba2a7e8
* thread pools are now dynamically sized
* that size is set based on a const var capped by the # of harnesses
ba2a7e8 to 14336e3
based on available parallelism, # of harnesses, and the rate the compiler can generate goto files
|
I ended up implementing the full calculation for the size of the thread pool as something along the lines of

Our compiler generates goto files at a fixed rate (currently around twice as fast as they can be exported), so adding any more than 3-4 threads to the export pool seems to have no real performance benefit, as the extra threads mostly sit idle with no work to do.

Since the internal performance of our compiler isn't known to Kani users, it feels like this is better kept as an internal calculation than taken as an input. Does that seem alright @tautschnig?
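A minimal sketch of how such a sizing calculation could look, assuming the pool is bounded by three things named in this thread: the machine's available parallelism, the number of harnesses, and a fixed cap reflecting how fast the compiler can produce goto files. The function names and the constant `MAX_USEFUL_EXPORT_THREADS` are hypothetical and are not taken from Kani's source; the value 4 only mirrors the "3-4 threads" observation above.

```rust
use std::thread;

/// Hypothetical cap: beyond this many export threads the workers mostly idle,
/// because the compiler cannot produce goto files any faster.
const MAX_USEFUL_EXPORT_THREADS: usize = 4;

/// Pure sizing rule: the smallest of the three bounds, but never zero.
pub fn export_pool_size(available: usize, num_harnesses: usize) -> usize {
    available
        .min(num_harnesses)
        .min(MAX_USEFUL_EXPORT_THREADS)
        .max(1)
}

/// Same rule, with the parallelism bound queried from the standard library.
/// Falls back to 1 if the platform cannot report it.
pub fn default_export_pool_size(num_harnesses: usize) -> usize {
    let available = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    export_pool_size(available, num_harnesses)
}
```

Under these assumptions, a 12-core machine with 100 harnesses would get 4 export threads, while a crate with only 2 harnesses would get 2. Keeping the rule a pure function of its inputs makes it easy to test without depending on the host machine.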
|
The inspiration for this change came from #2505, specifically #2505 (comment). @AlexanderPortland can you check if these changes resolve the issue? Of course, resolution here is subjective, since we have to decide what reasonable performance is, but let's take some measurements and see if this version of Kani builds in a more reasonable amount of time. (This repo may also serve as a useful benchmark for some of the other compiler perf improvements you've made, in addition to the standard library.) |
I'm happy with this, but would appreciate attention to Carolyn's comment with regard to #2505.
|
Just ran the tests and, on my local machine, this change brings the |
8adc279
from the autogenerated :

## What's Changed

* Ensure that contract closures are FnOnce by @vonaka in #4151
* Adjust sized hierarchy for Kani's memory predicates by @tautschnig in #4193
* Update to Rust edition 2024 by @tautschnig in #4197
* `ptr_offset_from`: Replace arithmetic over pointers by offset arithmetic by @tautschnig in #4180
* Automatic cargo update to 2025-07-07 by @github-actions[bot] in #4208
* Bump tests/perf/s2n-quic from `b8f8cca` to `8715fdf` by @dependabot[bot] in #4209
* Upgrade Rust toolchain to 2025-07-04 by @tautschnig in #4199
* Upgrade Rust toolchain to 2025-07-10 by @thanhnguyen-aws in #4215
* Update CBMC dependency to 6.7.1 by @tautschnig in #4178
* Split compiler flags to avoid dependency recompilation by @AlexanderPortland in #4211
* Fix the bug that assign clause cannot be inferred for the inner loop of nested loops by @thanhnguyen-aws in #4179
* Upgrade Rust toolchain to 2025-07-11 by @thanhnguyen-aws in #4219
* Automatic toolchain upgrade to nightly-2025-07-12 by @github-actions[bot] in #4222
* Fix bug: `goto-cc` crash when there are two quantifers in one proof by @thanhnguyen-aws in #4221
* Automatic toolchain upgrade to nightly-2025-07-13 by @github-actions[bot] in #4223
* Automatic cargo update to 2025-07-14 by @github-actions[bot] in #4224
* Cleanup links to issues that have been addressed by @tautschnig in #4200
* Selectively enable and fix (slow) Tokio tests by @tautschnig in #4203
* Bump tests/perf/s2n-quic from `32ba87d` to `1cbd879` by @dependabot[bot] in #4227
* Implement support for Cargo.toml's default-members by @tautschnig in #4201
* Do not invoke memset with count of zero by @tautschnig in #4205
* Support bitwuzla, cvc5, z3 as solver attribute values by @tautschnig in #4218
* Use CBMC's shuffle_vector expression by @tautschnig in #4204
* Move tests from slow/kani back to regular suite by @tautschnig in #4202
* Automatic toolchain upgrade to nightly-2025-07-14 by @github-actions[bot] in #4225
* Enable GitHub Linux/Arm runners in CI by @tautschnig in #3841
* Automatic cargo update to 2025-07-21 by @github-actions[bot] in #4231
* Skip codegen for unneeded harnesses by @AlexanderPortland in #4213
* Strongly type differing compiler args for clarity by @AlexanderPortland in #4220
* Remove StableMIR ICE workaround by @carolynzech in #4235
* Fix bug: Kani unwinds loops with contract in generic function (with -Z loop-contracts) by @thanhnguyen-aws in #4232
* Automatic cargo update to 2025-07-28 by @github-actions[bot] in #4238
* Bump tests/perf/s2n-quic from `1cbd879` to `4938450` by @dependabot[bot] in #4242
* Upgrade Rust toolchain to 2025-07-21 by @tautschnig in #4241
* Remove `pretty_ty` and use rustc_public's formatter instead by @tautschnig in #4243
* Upgrade Rust toolchain to 2025-07-24 by @tautschnig in #4244
* Documentation cleanup of UB detected by Kani by @tautschnig in #4245
* Upgrade Rust toolchain to 2025-07-29 by @tautschnig in #4247
* Automatic toolchain upgrade to nightly-2025-07-30 by @github-actions[bot] in #4253
* Add unstable option prove-safety-only by @tautschnig in #4239
* Set bits_per_byte in byte_extract expressions by @tautschnig in #4255
* `KaniAttributes` Path Resolution Refactor by @carolynzech in #4249
* Automatic toolchain upgrade to nightly-2025-07-31 by @github-actions[bot] in #4256
* Support contracts & stubs in trait implementations (partial fix) by @carolynzech in #4250
* [Breaking Changes] Remove unstable list feature and default memory checks by @carolynzech in #4258
* Upgrade Rust toolchain to 2025-08-01 by @tautschnig in #4261
* Autoharness: add support for references by @tautschnig in #4234
* Turn off debug assertions under `--prove-safety-only` by @tautschnig in #4262
* Automatic toolchain upgrade to nightly-2025-08-02 by @github-actions[bot] in #4264
* Automatic toolchain upgrade to nightly-2025-08-03 by @github-actions[bot] in #4265
* Automatic cargo update to 2025-08-04 by @github-actions[bot] in #4267
* Automatic toolchain upgrade to nightly-2025-08-04 by @github-actions[bot] in #4266
* Introduce thread pool for writing goto binaries in parallel by @AlexanderPortland in #4236
* Major-version update cargo dependencies by @tautschnig in #4240
* Bump tests/perf/s2n-quic from `4938450` to `8f510f0` by @dependabot[bot] in #4270
* Automatic toolchain upgrade to nightly-2025-08-05 by @github-actions[bot] in #4271
* Automatic toolchain upgrade to nightly-2025-08-06 by @github-actions[bot] in #4272
* Avoid updating irrelevant symbols when handling quantifiers by @AlexanderPortland in #4268
* Lazily evaluate debug info by @AlexanderPortland in #4269
* Clone a template `BodyTransformer` to avoid re-initialization by @AlexanderPortland in #4259
* Ensuring that MIR constants are marked as static consts by @vonaka in #4233
* Fix release job dependencies by @tautschnig in #4273

## New Contributors

* @vonaka made their first contribution in #4151

**Full Changelog**: kani-0.64.0...kani-0.65.0

---------

Co-authored-by: Zyad Hassan <[email protected]>
Serializing and writing goto binaries is a serious bottleneck for the Kani compiler, with profiling indicating that it takes around half of the total codegen time.
This PR introduces a variable-size thread pool specifically for serializing goto binaries and writing them to disk. Now, instead of having to do everything itself, the main compiler thread just has to collect all the data needed for serialization (requiring relatively inexpensive clones of some local state) and dispatch it to the thread pool's work queue. This allows the main thread to move on quickly while the pool's worker threads generate the binaries off the critical path of compilation.
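The dispatch pattern described above can be sketched with only the standard library. This is an illustrative, simplified pool and not the implementation in this PR: the type `ExportPool` and its methods are hypothetical names, and a job here is any closure, standing in for "serialize this harness's data and write it to disk".

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A unit of export work; the closure owns the (cloned) data it needs.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical worker pool fed through a shared work queue.
pub struct ExportPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl ExportPool {
    /// Spawns `size` workers (at least one) that pull jobs from the queue.
    pub fn new(size: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..size.max(1))
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is held only while dequeuing, not while the job runs.
                    let next = receiver.lock().unwrap().recv();
                    match next {
                        Ok(job) => job(),
                        Err(_) => break, // queue closed and drained: shut down
                    }
                })
            })
            .collect();
        ExportPool { sender: Some(sender), workers }
    }

    /// Called by the main thread: enqueue the job and return immediately.
    pub fn submit<F: FnOnce() + Send + 'static>(&self, job: F) {
        self.sender
            .as_ref()
            .expect("pool already shut down")
            .send(Box::new(job))
            .expect("all workers exited early");
    }

    /// Closes the queue and waits until every queued job has finished.
    pub fn join(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().expect("worker thread panicked");
        }
    }
}
```

In this sketch the main thread would call `submit` once per harness with a closure that owns the cloned state, then call `join` before compilation ends so that no binary is left half-written. The channel acts as the work queue, so `submit` never waits for a file write to complete.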
Results
The table below shows wall clock end-to-end compile times before and after this change. This metric corresponds to how long a user would have to wait for Kani's compilation to finish before verification can begin.
[Table of before/after compile times; surviving labels: 177d0fd), prost crate (from #2505). Measurements are from local runs on a 12 core M3 Mac.]
Resolves #2505.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.