tests: use admission.io.overload in admission-control/elastic-io #155413
The previously used sub-level metric was flawed, in that the IO overload score could stay low even at higher sub-level counts if L0 had very few bytes (which is a deliberate choice in admission control). So admission control would not throttle elastic work as aggressively as the test expected it to.

Running with this change, we don't exceed a score of 0.15, while the previously used sub-level count metric spikes higher. For example:

```
2025/10/14 22:13:15 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.100000
2025/10/14 22:13:25 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.100000
I251014 22:13:27.665296 868 util/admission/io_load_listener.go:780 ⋮ [T1,Vsystem,n1,s1] 2918 IO overload: compaction score 0.150 (131 ssts, 9 sub-levels), L0 growth 551 MiB (write 551 MiB (ignored 0 B) ingest 0 B (ignored 0 B)): requests 15985 (0 bypassed) with 505 MiB acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + 551 MiB adjusted-LSM-writes + 4.2 GiB adjusted-disk-writes + write-model 1.09x+1 B (smoothed 1.08x+1 B) + l0-ingest-model 0.00x+0 B (smoothed 0.75x+1 B) + ingest-model 0.00x+0 B (smoothed 1.00x+1 B) + write-amp-model 7.87x+1 B (smoothed 8.01x+1 B) + at-admission-tokens 126 B, compacted 550 MiB [≈545 MiB], flushed 799 MiB [≈838 MiB] (mult 0.77); admitting 649 MiB (rate 43 MiB/s) (elastic 519 MiB rate 35 MiB/s) due to memtable flush (multiplier 0.775) (used total: 543 MiB elastic 541 MiB); write stalls 0; diskBandwidthLimiter (unlimited) (tokenUtilization 0.00, tokensUsed (elastic 0 B, snapshot 0 B, regular 0 B) tokens (write 0 B (prev 0 B), read 0 B (prev 0 B)), writeBW 0 B/s, readBW 0 B/s, provisioned 0 B/s)
2025/10/14 22:13:35 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.050000
I251014 22:13:42.666326 868 util/admission/io_load_listener.go:780 ⋮ [T1,Vsystem,n1,s1] 2926 IO overload: compaction score 0.050 (70 ssts, 5 sub-levels), L0 growth 498 MiB (write 498 MiB (ignored 0 B) ingest 0 B (ignored 0 B)): requests 15228 (0 bypassed) with 480 MiB acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + 498 MiB adjusted-LSM-writes + 4.2 GiB adjusted-disk-writes + write-model 1.04x+1 B (smoothed 1.06x+1 B) + l0-ingest-model 0.00x+0 B (smoothed 0.75x+1 B) + ingest-model 0.00x+0 B (smoothed 1.00x+1 B) + write-amp-model 8.57x+1 B (smoothed 8.29x+1 B) + at-admission-tokens 153 B, compacted 498 MiB [≈522 MiB], flushed 883 MiB [≈860 MiB] (mult 0.77); admitting 667 MiB (rate 44 MiB/s) (elastic 533 MiB rate 36 MiB/s) due to memtable flush (multiplier 0.775) (used total: 519 MiB elastic 517 MiB); write stalls 0; diskBandwidthLimiter (unlimited) (tokenUtilization 0.00, tokensUsed (elastic 0 B, snapshot 0 B, regular 0 B) tokens (write 0 B (prev 0 B), read 0 B (prev 0 B)), writeBW 0 B/s, readBW 0 B/s, provisioned 0 B/s)
```

Fixes cockroachdb#148786

Epic: none

Release note: None
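To compare the score against the raw sub-level count in output like the excerpt in this description, both values can be pulled out of the `IO overload` log lines. The following is a minimal Go sketch, assuming the log line format shown in the excerpt stays as is; `sample` and `parseSamples` are hypothetical names used for illustration and are not part of the roachtest or of the admission package.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sample is one observation extracted from an "IO overload" log line.
// Hypothetical type, for illustration only.
type sample struct {
	score     float64 // the reported compaction score
	ssts      int     // number of L0 sstables
	subLevels int     // number of L0 sub-levels
}

// scoreRE matches the "compaction score X (N ssts, M sub-levels)" fragment
// as it appears in the log excerpt in this PR description.
var scoreRE = regexp.MustCompile(`compaction score ([0-9.]+) \((\d+) ssts, (\d+) sub-levels\)`)

// parseSamples returns every (score, ssts, sub-levels) triple found in log.
func parseSamples(log string) ([]sample, error) {
	var out []sample
	for _, m := range scoreRE.FindAllStringSubmatch(log, -1) {
		score, err := strconv.ParseFloat(m[1], 64)
		if err != nil {
			return nil, err
		}
		ssts, err := strconv.Atoi(m[2])
		if err != nil {
			return nil, err
		}
		subLevels, err := strconv.Atoi(m[3])
		if err != nil {
			return nil, err
		}
		out = append(out, sample{score: score, ssts: ssts, subLevels: subLevels})
	}
	return out, nil
}

func main() {
	// Fragments copied from the two io_load_listener lines in the excerpt.
	const log = `IO overload: compaction score 0.150 (131 ssts, 9 sub-levels), L0 growth 551 MiB
IO overload: compaction score 0.050 (70 ssts, 5 sub-levels), L0 growth 498 MiB`

	samples, err := parseSamples(log)
	if err != nil {
		panic(err)
	}
	for _, s := range samples {
		fmt.Printf("score=%.3f ssts=%d sub-levels=%d\n", s.score, s.ssts, s.subLevels)
	}
}
```

Run against the two log lines in the description, this pairs 9 sub-levels with a score of 0.150 and 5 sub-levels with a score of 0.050, which is the gap between the two metrics that the description points out.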
tbg left a comment
Thank you!
bors r+
155413: tests: use admission.io.overload in admission-control/elastic-io r=tbg a=sumeerbhola

The previously used sub-level metric was flawed, in that the IO overload score could stay low even at higher sub-level counts if L0 had very few bytes (which is a deliberate choice in admission control). So admission control would not throttle elastic work as aggressively as the test expected it to.

Running with this change, we don't exceed a score of 0.15, while the previously used sub-level count metric spikes higher. For example:

```
2025/10/14 22:13:15 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.100000
2025/10/14 22:13:25 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.100000
I251014 22:13:27.665296 868 util/admission/io_load_listener.go:780 ⋮ [T1,Vsystem,n1,s1] 2918 IO overload: compaction score 0.150 (131 ssts, 9 sub-levels), L0 growth 551 MiB (write 551 MiB (ignored 0 B) ingest 0 B (ignored 0 B)): requests 15985 (0 bypassed) with 505 MiB acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + 551 MiB adjusted-LSM-writes + 4.2 GiB adjusted-disk-writes + write-model 1.09x+1 B (smoothed 1.08x+1 B) + l0-ingest-model 0.00x+0 B (smoothed 0.75x+1 B) + ingest-model 0.00x+0 B (smoothed 1.00x+1 B) + write-amp-model 7.87x+1 B (smoothed 8.01x+1 B) + at-admission-tokens 126 B, compacted 550 MiB [≈545 MiB], flushed 799 MiB [≈838 MiB] (mult 0.77); admitting 649 MiB (rate 43 MiB/s) (elastic 519 MiB rate 35 MiB/s) due to memtable flush (multiplier 0.775) (used total: 543 MiB elastic 541 MiB); write stalls 0; diskBandwidthLimiter (unlimited) (tokenUtilization 0.00, tokensUsed (elastic 0 B, snapshot 0 B, regular 0 B) tokens (write 0 B (prev 0 B), read 0 B (prev 0 B)), writeBW 0 B/s, readBW 0 B/s, provisioned 0 B/s)
2025/10/14 22:13:35 admission_control_elastic_io.go:105: admission_io_overload(store=1): 0.050000
I251014 22:13:42.666326 868 util/admission/io_load_listener.go:780 ⋮ [T1,Vsystem,n1,s1] 2926 IO overload: compaction score 0.050 (70 ssts, 5 sub-levels), L0 growth 498 MiB (write 498 MiB (ignored 0 B) ingest 0 B (ignored 0 B)): requests 15228 (0 bypassed) with 480 MiB acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + 498 MiB adjusted-LSM-writes + 4.2 GiB adjusted-disk-writes + write-model 1.04x+1 B (smoothed 1.06x+1 B) + l0-ingest-model 0.00x+0 B (smoothed 0.75x+1 B) + ingest-model 0.00x+0 B (smoothed 1.00x+1 B) + write-amp-model 8.57x+1 B (smoothed 8.29x+1 B) + at-admission-tokens 153 B, compacted 498 MiB [≈522 MiB], flushed 883 MiB [≈860 MiB] (mult 0.77); admitting 667 MiB (rate 44 MiB/s) (elastic 533 MiB rate 36 MiB/s) due to memtable flush (multiplier 0.775) (used total: 519 MiB elastic 517 MiB); write stalls 0; diskBandwidthLimiter (unlimited) (tokenUtilization 0.00, tokensUsed (elastic 0 B, snapshot 0 B, regular 0 B) tokens (write 0 B (prev 0 B), read 0 B (prev 0 B)), writeBW 0 B/s, readBW 0 B/s, provisioned 0 B/s)
```

Fixes #148786
Fixes #156168
Fixes #156215

Epic: none

Release note: None

155958: roachtestutil: use DETACHED option for INSPECT jobs r=spilchen a=spilchen

Previously, CheckInspectDatabase used a statement timeout hack to background INSPECT jobs - it set a 5-second timeout and relied on the timeout error to leave jobs running.

Now that INSPECT supports the DETACHED option, we use `INSPECT DATABASE <name> WITH OPTIONS DETACHED` to properly run jobs in the background. This provides a cleaner way to background the job.

Informs #155676

Epic: CRDB-55075

Release note: none

Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Matt Spilchen <[email protected]>
Build failed (retrying...)
Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.

Issue #156215: branch-release-25.4.

Issue #148786: branch-release-25.4.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.