@Aaronontheweb
Member

Summary

Changes

  • Modified InternalTryReceiveOneAsync to handle cancellation properly without extra exception wrapping
  • Added comprehensive test coverage for cancellation scenarios

Test Plan

  • Added TestKitAsyncCancellationSpec with 4 test cases covering various cancellation scenarios
  • All existing TestKit tests pass (291 passed, 1 skipped)
  • Verified the original bug report scenario now works correctly

Fixes akkadotnet#7743 by preventing double AggregateException wrapping in cancelled async operations
…ation

- Timeout scenarios return (false, null) following Try pattern
- User cancellation throws OperationCanceledException as expected
- Uses linked cancellation token for cleaner timeout/cancellation logic
Member Author

@Aaronontheweb Aaronontheweb left a comment

Self-Review

Problem Analysis

The issue (#7743) was that when ExpectMsgAsync was cancelled, it threw exceptions with excessive nesting - AggregateException containing another AggregateException containing the actual OperationCanceledException. This made exception handling unpredictable and inconsistent with standard .NET async patterns.
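For context, the difference between the "standard" shape and a wrapped one can be reproduced in plain .NET, without Akka.NET. The sketch below is illustrative only: `CancellationShapes` and its members are hypothetical names, and `Task.Delay` stands in for any cancellable async operation. It shows that awaiting a cancelled task surfaces an `OperationCanceledException` subclass directly, while blocking on the same task with `Wait()` adds an `AggregateException` layer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationShapes
{
    // Hypothetical stand-in for any cancellable async operation.
    public static Task WaitForeverAsync(CancellationToken token) =>
        Task.Delay(Timeout.Infinite, token);

    // Awaiting a cancelled task rethrows the cancellation exception itself.
    public static async Task<Exception> CaptureAwaitedAsync()
    {
        using var cts = new CancellationTokenSource();
        cts.Cancel();
        try
        {
            await WaitForeverAsync(cts.Token);
        }
        catch (Exception ex)
        {
            return ex;
        }
        throw new InvalidOperationException("Expected the task to be cancelled.");
    }

    // Blocking on the same task wraps the cancellation in an AggregateException.
    public static Exception CaptureBlocked()
    {
        using var cts = new CancellationTokenSource();
        cts.Cancel();
        try
        {
            WaitForeverAsync(cts.Token).Wait();
        }
        catch (Exception ex)
        {
            return ex;
        }
        throw new InvalidOperationException("Expected the task to be cancelled.");
    }
}
```

Each additional blocking or wrapping step between the cancelled task and the caller can add another layer, which is the kind of nesting the issue describes.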

Root Cause

The excessive nesting occurred in InternalTryReceiveOneAsync when using Task.WhenAny() with tasks that could be cancelled. The original implementation didn't properly handle the distinction between timeout and user cancellation, leading to multiple layers of exception wrapping.

Solution Approach

Key Changes in TestKitBase_Receive.cs:

  1. Replaced Task.WhenAny pattern with direct WaitToReadAsync: This eliminates one source of potential exception wrapping
  2. Used linked cancellation token: Combines the timeout and user cancellation into a single token, making the logic cleaner
  3. Distinguished timeout from cancellation:
    • Timeout (normal case) → returns (false, null) following the Try pattern
    • User cancellation (exceptional case) → throws OperationCanceledException as expected in async methods

The key insight is that InternalTryReceiveOneAsync follows a "Try" pattern but should still throw for cancellation (as is standard in async methods). The refined implementation:

catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
{
    // This was a timeout, not user cancellation - return false
    take = (false, null);
}
// If cancellationToken.IsCancellationRequested is true, let the exception propagate
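To show how that filter fits into a whole method, here is a minimal, self-contained sketch of the pattern. It is not the actual TestKit code: `ReceiveSketch`, `TryReceiveOneAsync`, and the use of a `ChannelReader<object>` as the message queue are assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ReceiveSketch
{
    // Hypothetical helper; the name and signature are illustrative only.
    public static async Task<(bool Success, object? Message)> TryReceiveOneAsync(
        ChannelReader<object> reader,
        TimeSpan timeout,
        CancellationToken cancellationToken)
    {
        // One token that fires on either user cancellation or timeout.
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        linked.CancelAfter(timeout);

        try
        {
            // WaitToReadAsync yields false once the channel is completed and drained.
            if (await reader.WaitToReadAsync(linked.Token) && reader.TryRead(out var message))
                return (true, message);

            return (false, null);
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            // Only the timeout fired: report "nothing received" per the Try pattern.
            return (false, null);
        }
        // If the user's token was cancelled, the filter is false and the exception propagates.
    }
}
```

The exception filter inspects the caller's token rather than the linked one, since the linked token is cancelled in both cases and cannot tell them apart.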

Test Coverage

Added TestKitAsyncCancellationSpec.cs with 4 comprehensive tests that verify:

  • No double-nested AggregateException wrapping
  • Proper exception types (OperationCanceledException or subclasses)
  • Consistent behavior between sync and async APIs
  • The exact scenario from the original bug report now works correctly

Validation

  • All 291 existing TestKit tests pass
  • New tests specifically validate the fix
  • The solution maintains backward compatibility while fixing the issue

This approach ensures cancellation exceptions now have minimal, predictable nesting consistent with standard .NET async patterns.

@quixoticaxis

I assume it would fix not only my issue but also lots of subtle inconsistencies, like cancellation being reported as a timeout, which currently manifests as strange-ish error messages along the lines of "timeout of 100 days reached after 0.03 seconds".

The previous implementation could incorrectly trigger cancellation when
using CreateLinkedTokenSource with a default (non-cancellable) token.
This was causing test failures in Akka.Cluster.Metrics.Tests where
test initialization would time out incorrectly.

Changes:
- Create timeout CancellationTokenSource directly with the timeout value
- Only create linked token source when user provides a cancellable token
- Properly distinguish between timeout and user cancellation in exception handling
- User cancellation exceptions are re-thrown, timeout exceptions return false

This fixes the MetricValuesSpec constructor failures where CreateTestData
was incorrectly timing out during metrics collection.
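
One way to read the first two changes above is sketched below. This is an assumption about the shape of the fix, not the committed code: `TimeoutTokens` and `CreateReceiveSource` are hypothetical names, and the sketch folds the timeout into a single source per branch so the caller has only one object to dispose.

```csharp
using System;
using System.Threading;

public static class TimeoutTokens
{
    // Hypothetical helper: returns the source whose token the receive loop waits on.
    // The caller is responsible for disposing the returned source.
    public static CancellationTokenSource CreateReceiveSource(
        TimeSpan timeout,
        CancellationToken userToken)
    {
        // A default (non-cancellable) token has nothing to link to,
        // so a plain timeout source is enough.
        if (!userToken.CanBeCanceled)
            return new CancellationTokenSource(timeout);

        // The user supplied a real token: link to it and add the timeout on top.
        var linked = CancellationTokenSource.CreateLinkedTokenSource(userToken);
        linked.CancelAfter(timeout);
        return linked;
    }
}
```

With this split, deciding between "timeout" and "user cancellation" in the catch block still comes down to checking the user's own token, as in the earlier commit.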
@Aaronontheweb
Member Author

This is having some unintended consequences elsewhere in the test suite, so we might have to rethink how we're doing this.

@Aaronontheweb
Member Author

Ah, I think it's the ConfigureAwait(false) - which is a major no-no during testing.

- Convert test to async to avoid Thread.Sleep blocking
- Increase timeout from 10s to 30s for resource-constrained CI environments
- Add exception handling for MaxWorkingSet access on Linux/Mono
- Remove ConfigureAwait(false) from test code
@Aaronontheweb
Member Author

@Arkatufus I think this is ready to review now

Contributor

@Arkatufus Arkatufus left a comment

LGTM

Labels

akka-testkit Akka.NET Testkit issues

Development

Successfully merging this pull request may close these issues.

Cancelling ExpectMsgAsync (Xunit) throws excessively nested exception type

3 participants