This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Description
Around March 31st there was an uptick in "message stuck as latest message" reports against the Riot clients (riot-web's upstream issue for this and other cases is element-hq/element-web#10032).
The cause appears to be that Synapse sometimes does not send events down /sync for events we've sent. There are also some reports of events simply not being received on the other end, which could be related. Most reporters were on matrix.org, and the reports seemed to stop once matrix.org rolled back.
Reproduction steps appear to be time-dependent, and possibly only useful for matrix.org-sized homeservers:
- Send an event
- Start a /sync request
- Send another event
- Receive event ID for first event (/send/m.room.message completes)
- Receive event down /sync
- Send yet another event
- Start a /sync request
- Receive event ID for second and third events
- Receive the third event down /sync, but not the second (sometimes the other way around)
- Never receive the second event through /sync, making it stuck on Riot
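
To make the symptom in the steps above concrete, here is a minimal sketch of how a client-side test could flag a stuck event: compare the event IDs returned by the send requests against the event IDs that eventually arrive down /sync. The helper name `find_stuck_events` and the sample event IDs are hypothetical, and no homeserver is contacted; this only illustrates the comparison.

```python
def find_stuck_events(sent_event_ids, synced_event_ids):
    """Return the sent event IDs that never appeared in /sync, in send order.

    Hypothetical helper for illustration; not part of any Matrix SDK.
    """
    seen = set(synced_event_ids)
    return [event_id for event_id in sent_event_ids if event_id not in seen]


# Event IDs as returned by three send requests (made-up values).
sent = ["$first", "$second", "$third"]
# Event IDs actually received down /sync in the failing case described above.
synced = ["$first", "$third"]

stuck = find_stuck_events(sent, synced)
print(stuck)
```

In the failing case this reports the second event as the one that would remain stuck as the latest message.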
Most easily reproduced during high traffic volumes on matrix.org, though one case was also reproduced at ~22:00 UTC yesterday (but never again that night).
The theory, at least from my side, is that the request hits a busy synchrotron which advances the stream token past the second event to the third event, invoking amnesia.
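
The theory can be illustrated with a toy model. This is only a sketch under assumptions, not Synapse's actual code: it assumes each event has a numeric stream position, that a worker computes the sync token as the highest position it has seen, and that the third event becomes visible to that worker before the second. `ToyStream` and all of its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStream:
    """Hypothetical model of an event stream as seen by one sync worker."""

    persisted: dict = field(default_factory=dict)  # stream position -> event ID

    def persist(self, position, event_id):
        self.persisted[position] = event_id

    def current_token(self):
        # Assumed behaviour: the token is the highest position seen so far,
        # even if a lower position has not become visible yet.
        return max(self.persisted, default=0)

    def sync(self, since):
        """Return (events with since < position <= token, next token)."""
        upto = self.current_token()
        events = [
            self.persisted[p] for p in sorted(self.persisted) if since < p <= upto
        ]
        return events, upto


stream = ToyStream()
stream.persist(1, "$first")
received, token = stream.sync(0)

stream.persist(3, "$third")        # third event is visible before the second
batch, token = stream.sync(token)  # token jumps from 1 to 3
received += batch

stream.persist(2, "$second")       # shows up late, below the client's token
batch, token = stream.sync(token)  # nothing is returned for position 2
received += batch

print(received)  # ['$first', '$third']
```

Under these assumptions the client's token has already moved past position 2 by the time the second event becomes visible, so no later /sync returns it.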
@erikjohnston's conclusion appears to be that Synapse might be sending the event down /sync before the /send/m.room.message request completes?