-
Couldn't load subscription status.
- Fork 1.1k
Fix excessive exception nesting when cancelling ExpectMsgAsync #7747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix excessive exception nesting when cancelling ExpectMsgAsync #7747
Conversation
Fixes akkadotnet#7743 by preventing double AggregateException wrapping in cancelled async operations
…ation - Timeout scenarios return (false, null) following Try pattern - User cancellation throws OperationCanceledException as expected - Uses linked cancellation token for cleaner timeout/cancellation logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-Review
Problem Analysis
The issue (#7743) was that when ExpectMsgAsync was cancelled, it threw exceptions with excessive nesting - AggregateException containing another AggregateException containing the actual OperationCanceledException. This made exception handling unpredictable and inconsistent with standard .NET async patterns.
Root Cause
The excessive nesting occurred in InternalTryReceiveOneAsync when using Task.WhenAny() with tasks that could be cancelled. The original implementation didn't properly handle the distinction between timeout and user cancellation, leading to multiple layers of exception wrapping.
Solution Approach
Key Changes in TestKitBase_Receive.cs:
- Replaced
Task.WhenAnypattern with directWaitToReadAsync: This eliminates one source of potential exception wrapping - Used linked cancellation token: Combines the timeout and user cancellation into a single token, making the logic cleaner
- Distinguished timeout from cancellation:
- Timeout (normal case) → returns
(false, null)following the Try pattern - User cancellation (exceptional case) → throws
OperationCanceledExceptionas expected in async methods
- Timeout (normal case) → returns
The key insight is that InternalTryReceiveOneAsync follows a "Try" pattern but should still throw for cancellation (as is standard in async methods). The refined implementation:
catch (OperationCanceledException) when (\!cancellationToken.IsCancellationRequested)
{
// This was a timeout, not user cancellation - return false
take = (false, null);
}
// If cancellationToken.IsCancellationRequested is true, let the exception propagateTest Coverage
Added TestKitAsyncCancellationSpec.cs with 4 comprehensive tests that verify:
- No double-nested
AggregateExceptionwrapping - Proper exception types (
OperationCanceledExceptionor subclasses) - Consistent behavior between sync and async APIs
- The exact scenario from the original bug report now works correctly
Validation
- All 291 existing TestKit tests pass
- New tests specifically validate the fix
- The solution maintains backward compatibility while fixing the issue
This approach ensures cancellation exceptions now have minimal, predictable nesting consistent with standard .NET async patterns.
|
I assume it would fix not only my issue, but lots of subtle inconsistencies like cancellation being reported as timeout, which currently manifestins as strange-ish error messages along the lines "timout of 100 days reached after 0.03 seconds". |
The previous implementation could incorrectly trigger cancellation when using CreateLinkedTokenSource with a default (non-cancellable) token. This was causing test failures in Akka.Cluster.Metrics.Tests where test initialization would timeout incorrectly. Changes: - Create timeout CancellationTokenSource directly with the timeout value - Only create linked token source when user provides a cancellable token - Properly distinguish between timeout and user cancellation in exception handling - User cancellation exceptions are re-thrown, timeout exceptions return false This fixes the MetricValuesSpec constructor failures where CreateTestData was incorrectly timing out during metrics collection.
|
This is having some unintended consequences elsewhere in the test suite, so we might have to rethink how we're doing this. |
|
Ah, I think it's the |
- Convert test to async to avoid Thread.Sleep blocking - Increase timeout from 10s to 30s for resource-constrained CI environments - Add exception handling for MaxWorkingSet access on Linux/Mono - Remove ConfigureAwait(false) from test code
|
@Arkatufus I think this is ready to review now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary
Changes
InternalTryReceiveOneAsyncto handle cancellation properly without extra exception wrappingTest Plan
TestKitAsyncCancellationSpecwith 4 test cases covering various cancellation scenarios