Description
subscribe: true flag in PUT operations doesn't work correctly for initiating peers
Problem Description
When a client performs a PUT operation with subscribe: true, the operation times out because the subscription logic blocks the PUT response from being sent. This prevents applications like River from successfully creating contracts and causes the entire operation to fail.
Root Cause: Blocking Subscription Logic
The fundamental issue: In /crates/core/src/operations/put.rs, when handling subscribe: true, the code calls start_subscription_request(), which can initiate a GET operation that blocks, preventing the PUT operation from completing and sending its response to the client.
Detailed Analysis
- Blocking Call Chain:
  // In put.rs line ~490
  if subscribe {
      super::start_subscription_request(op_manager, key, false, HashSet::new()).await;
  }
  new_state = Some(PutState::Finished { key }); // This happens AFTER subscription
- The subscription request can trigger a GET:
  - start_subscription_request() may need to fetch the contract if it is not locally available
  - This GET operation can time out (especially in distributed scenarios)
  - While waiting for the GET, the PUT operation cannot transition to the Finished state
  - Without the Finished state, no response is sent to the client
  - The client times out waiting for the PUT response
- Architectural Constraint:
  - OpManager has lifetime constraints that prevent spawning the subscription as a truly independent async task
  - Cannot use tokio::spawn due to 'static lifetime requirements
  - The subscription logic is inherently tied to the PUT operation's lifetime
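To illustrate the 'static constraint in isolation, here is a minimal sketch using std::thread::spawn, which places the same Send + 'static bound on its closure that tokio::spawn places on its future. Manager and spawn_subscription are hypothetical stand-ins, not the real OpManager API. The sketch only shows that a detached task needs an owned, Send handle (here an Arc) rather than a borrow; whether OpManager can be shared this way is the open question in this issue.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an operation manager; not the real OpManager.
struct Manager {
    name: String,
}

impl Manager {
    /// Placeholder for the subscription work a detached task would perform.
    fn subscribe(&self, key: &str) -> String {
        format!("{} subscribed to {}", self.name, key)
    }
}

/// Spawns the subscription as a detached task.
/// The task must own everything it touches, so it takes an Arc, not a reference.
fn spawn_subscription(manager: Arc<Manager>, key: String) -> thread::JoinHandle<String> {
    // Capturing a borrowed `&Manager` here would be rejected by the compiler:
    // the spawned closure may outlive the borrow, violating the 'static bound.
    thread::spawn(move || manager.subscribe(&key))
}
```

A caller would clone the Arc before spawning, for example spawn_subscription(Arc::clone(&manager), key), and keep using its own handle afterwards. This only works if the shared type is Send and Sync, which the issue reports is not currently the case for OpManager.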
Current Behavior (Causes Timeout)
When a peer initiates a PUT with subscribe: true:
- Client sends PUT request with subscribe: true to local peer
- Local peer processes PUT and stores contract
- BLOCKS: Local peer attempts subscription, which may trigger a GET operation
- GET operation times out or takes too long
- PUT never transitions to the Finished state
- Client never receives PUT response
- Client times out (e.g., "Timeout waiting for PUT response after 10 seconds")
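The sequence above can be modelled with a small, self-contained sketch. It uses std threads and channels in place of the real async operation machinery, and put_then_respond and client_wait are hypothetical names chosen for illustration. The sleep stands in for the time spent inside the subscription step (for example, waiting on a slow GET); the point is only that the response is sent after that step returns.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical model of the PUT handler. `subscribe_delay` is the time
/// the subscription step takes before control returns to the PUT logic.
fn put_then_respond(subscribe: bool, subscribe_delay: Duration) -> mpsc::Receiver<&'static str> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Storing the contract is assumed to have succeeded (not modelled).
        if subscribe {
            // Models awaiting the subscription, which may wait on a slow GET.
            thread::sleep(subscribe_delay);
        }
        // The response is sent only after the subscription step returns.
        let _ = tx.send("PutResponse");
    });
    rx
}

/// Hypothetical model of the client: wait for the PUT response up to `timeout`.
fn client_wait(
    rx: mpsc::Receiver<&'static str>,
    timeout: Duration,
) -> Result<&'static str, mpsc::RecvTimeoutError> {
    rx.recv_timeout(timeout)
}
```

With a subscription delay longer than the client timeout, client_wait returns a timeout error even though the contract was stored; with subscribe set to false the response arrives promptly.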
Evidence from Testing
Gateway Test Framework Results
Starting gateway test framework with local build
Testing River multi-user chat:
- Room creation (PUT with subscribe:true): TIMEOUT
- After changing to subscribe:false: Still TIMEOUT (reveals deeper issue)
Direct Testing
Modified /crates/core/src/operations/put.rs to move the state transition before the subscription:
// Mark operation as finished BEFORE subscription
new_state = Some(PutState::Finished { key });
// Start subscription if requested - do this AFTER marking as finished
if subscribe {
    // Subscription logic here (still partially blocking but PUT completes)
}
Result: PUT response is sent, but subscription may not complete properly.
Why This Wasn't Caught Earlier
- Test Mode Differences:
  - Unit tests may not experience the same network delays
  - freenet local mode has different code paths than freenet network
  - Integration tests might use different timeout values
- Subscription Complexity:
  - The subscription logic involves multiple async operations
  - Race conditions and timing issues are environment-dependent
  - Gateway setups have additional network latency
Attempted Fixes
Fix 1: Reorder Operations (Partial Success)
Move state transition before subscription:
- ✅ PUT response is sent immediately
- ⚠️ Subscription may not complete properly
- ⚠️ Still architecturally problematic
Fix 2: Async Subscription (Failed)
Attempt to spawn subscription as independent task:
- ❌ Cannot use tokio::spawn due to OpManager lifetime
- ❌ OpManager is not 'static and contains non-Send types
Fix 3: Local-Only Subscription (Insufficient)
Only register local subscription without network request:
- ✅ No blocking
- ❌ Peer not registered in remote subscription tree
- ❌ Won't receive updates from other peers
Recommended Solution
Short-term Workaround
Applications should avoid subscribe: true and instead:
// 1. PUT without subscribe
let put_request = ContractRequest::Put {
    contract: contract_container,
    state: wrapped_state,
    subscribe: false, // Avoid blocking issue
};
// 2. After successful PUT, explicitly SUBSCRIBE
let subscribe_request = ContractRequest::Subscribe {
    key: contract_key,
    summary: None,
};
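The ordering matters as much as the two requests themselves: the SUBSCRIBE should only be sent once the PUT has been confirmed. A minimal sketch of that sequencing follows. Request, Response, Client, and RecordingClient are simplified, hypothetical types invented for this example and are not the freenet-stdlib definitions; a real application would use its actual client API and error types.

```rust
/// Simplified, hypothetical request type; not the freenet-stdlib ContractRequest.
#[derive(Debug, PartialEq)]
enum Request {
    Put { key: String, subscribe: bool },
    Subscribe { key: String },
}

/// Simplified, hypothetical response type.
#[derive(Debug, PartialEq)]
enum Response {
    PutOk { key: String },
    SubscribeOk { key: String },
}

/// Hypothetical client abstraction: one request in, one response out.
trait Client {
    fn call(&mut self, request: Request) -> Result<Response, String>;
}

/// PUT without subscribing, then SUBSCRIBE only after the PUT is confirmed.
fn put_then_subscribe<C: Client>(client: &mut C, key: &str) -> Result<(), String> {
    match client.call(Request::Put { key: key.to_string(), subscribe: false })? {
        Response::PutOk { .. } => {}
        other => return Err(format!("unexpected response to PUT: {:?}", other)),
    }
    match client.call(Request::Subscribe { key: key.to_string() })? {
        Response::SubscribeOk { .. } => Ok(()),
        other => Err(format!("unexpected response to SUBSCRIBE: {:?}", other)),
    }
}

/// In-memory client used only to exercise the sketch; it records what it was sent.
struct RecordingClient {
    sent: Vec<Request>,
}

impl Client for RecordingClient {
    fn call(&mut self, request: Request) -> Result<Response, String> {
        let response = match &request {
            Request::Put { key, .. } => Response::PutOk { key: key.clone() },
            Request::Subscribe { key } => Response::SubscribeOk { key: key.clone() },
        };
        self.sent.push(request);
        Ok(response)
    }
}
```

Because the PUT error is propagated before the SUBSCRIBE is attempted, a failed PUT never leaves a dangling subscription request, and a failed SUBSCRIBE can be retried without repeating the PUT.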
Long-term Fix Options
- Redesign Subscription Architecture:
  - Decouple subscription from PUT operation lifecycle
  - Use message passing to trigger subscription after PUT completes
  - Requires significant architectural changes
- Queue-Based Approach:
  - Queue subscription requests to be processed after PUT completes
  - Add a subscription queue to OpManager
  - Process queue periodically or after operation completion
- Two-Phase PUT:
  - Phase 1: Store contract and send response
  - Phase 2: Background subscription (fire-and-forget)
  - Accept that subscription might fail silently
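As a rough illustration of the queue-based and two-phase ideas, the sketch below has the PUT handler return its response immediately and only enqueue the subscription, while a separate worker drains the queue. It uses std::sync::mpsc and a thread as stand-ins for whatever channel and task types the core would actually use; SubscriptionRequest, handle_put, and spawn_subscription_worker are hypothetical names, and the worker records keys instead of performing a real network subscription.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Hypothetical queue entry: a contract key that still needs a subscription.
#[derive(Debug, PartialEq)]
struct SubscriptionRequest {
    key: String,
}

/// Phase 1: complete the PUT and return its response.
/// The subscription is only queued, so this never waits on the network.
fn handle_put(key: &str, subscribe: bool, queue: &Sender<SubscriptionRequest>) -> String {
    if subscribe {
        // A send error means the worker has stopped; it is ignored here,
        // which is the "subscription might fail silently" trade-off.
        let _ = queue.send(SubscriptionRequest { key: key.to_string() });
    }
    format!("PutResponse({})", key)
}

/// Phase 2: a worker drains the queue until every sender has been dropped,
/// then returns the keys it processed.
fn spawn_subscription_worker(
    queue: Receiver<SubscriptionRequest>,
) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        let mut subscribed = Vec::new();
        for request in queue {
            // A real worker would run the network subscription here.
            subscribed.push(request.key);
        }
        subscribed
    })
}
```

The design choice this highlights is that the PUT response no longer depends on subscription latency, at the cost of needing a separate way to report or retry subscriptions that fail after the client has already been told the PUT succeeded.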
Impact
- River chat: Room creation times out, messages cannot be sent
- Any app using subscribe: true: PUT operations time out
- Performance: 10+ second timeouts degrade user experience
- Reliability: Subscription state inconsistent between peers
Related Code Locations
- /crates/core/src/operations/put.rs - Lines 490-551 (blocking subscription)
- /crates/core/src/operations/subscribe.rs - Line 70 (requires local contract)
- /crates/core/src/op_storage/mod.rs - OpManager lifetime constraints
- /crates/core/src/client_events/mod.rs - Client response handling
Test Case
Added test in /crates/core/tests/operations.rs:
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn test_put_subscribe_enables_update() -> TestResult {
    // Test verifies PUT with subscribe:true enables UPDATE
    // Currently passes in network mode but River still fails
}
Questions for Core Team
- Is the blocking behavior of subscribe: true intentional or a bug?
- Can OpManager be refactored to support spawning detached tasks?
- Should subscribe: true be deprecated in favor of explicit SUBSCRIBE?
- Is there a way to make subscription truly non-blocking without architectural changes?
Current Status
- Immediate issue identified: Subscription blocks PUT response
- Workaround implemented in River (uses subscribe:false + explicit SUBSCRIBE)
- Core issue requires architectural decision from team
- Tests added but full River functionality still failing