MSC4140: finalised delayed events, and more #19038

AndrewFerr · 2025-10-10T07:46:59Z


- Store sent/cancelled/failed delayed events, i.e. finalised delayed events, and support looking them up to inspect whether a delayed event was sent or not. Set limits on how many finalised events to store.
- Support looking up delayed events by ID
- Return 200 when retrying the same action (send/cancel) on an already-finalised delayed event, or 409 for a conflicting action
- Limit how many delayed events a user may have scheduled at a time
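
The 200/409 rule for repeated actions can be sketched as follows. This is a minimal illustration, not Synapse's handler code: the function name `status_for_repeat_action` and the string action names are hypothetical, and only the send/cancel pair named in the description is modelled.

```python
# Hypothetical helper for illustration only; not part of Synapse.
FINALISING_ACTIONS = {"send", "cancel"}


def status_for_repeat_action(requested: str, finalised_by: str) -> int:
    """Return the HTTP status for an action on an already-finalised delayed event.

    `finalised_by` is the action that originally finalised the event.
    """
    if requested not in FINALISING_ACTIONS or finalised_by not in FINALISING_ACTIONS:
        raise ValueError("this sketch only models 'send' and 'cancel'")
    # Retrying the same action is treated as idempotent (200);
    # asking for the other action conflicts with what already happened (409).
    return 200 if requested == finalised_by else 409
```

For example, re-sending an event that was already sent would yield 200, while trying to cancel it would yield 409.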

Dev notes

Delayed events were initially introduced in #17326.

Pull Request Checklist

- Pull request is based on the develop branch
- Pull request includes a changelog file. The entry should:
  - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
  - Use markdown where necessary, mostly for code blocks.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
  - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
- Code style is correct (run the linters)


MadLittleMods · 2025-10-22T17:23:24Z

changelog.d/19038.feature

@@ -0,0 +1 @@
+Add more support for MSC4140, namely the ability to inspect sent, cancelled, or failed delayed events, aka "finalised" delayed events.


For my own reference, what are delayed events being used for?

I thought this was related to VoIP stuff (calls) and the new meta is with sticky events.

Are delayed events going to be deprecated in favor of sticky events?

Sticky events are not replacing delayed events; they are replacing "owned state", i.e. MSC3757 / MSC3779.

Delayed events are still going to be used by MatrixRTC for scheduling cancellable "leave" events for disconnected clients.

https://github.com/matrix-org/matrix-spec-proposals/blob/toger5/matrixRTC/proposals/4143-matrix-rtc.md#dependencies

MadLittleMods · 2025-10-24T16:15:45Z

changelog.d/19038.feature

@@ -0,0 +1 @@
+Add more support for MSC4140, namely the ability to inspect sent, cancelled, or failed delayed events, aka "finalised" delayed events.


Are there Complement changes to go along with this?

I see some TestDelayedEvents Complement tests that are failing: https://github.com/element-hq/synapse/actions/runs/18411628889/job/52465384686?pr=19038

> Are there Complement changes to go along with this?

No, not yet. I'll add some soon.

> I see some TestDelayedEvents Complement tests that are failing

That's an unrelated failure, which has been flaky for a frustratingly long time. Maybe now is a good time to try to tackle it again.

It doesn't seem flaky. It has failed all 3 times for both SQLite and Postgres, and it's only TestDelayedEvents.

synapse/config/experimental.py

MadLittleMods · 2025-10-24T16:33:49Z

synapse/storage/schema/main/delta/93/01_add_finalised_delayed_events.sql

+-- See the GNU Affero General Public License for more details:
+-- <https://www.gnu.org/licenses/agpl-3.0.html>.
+
+CREATE TABLE finalised_delayed_events (


What's the difference between delayed_events.is_processed and the state in finalised_delayed_events?

It looks like finalised_delayed_events holds more info but I'm not immediately seeing the difference.

is_processed is for delayed events that have just timed out and are in the process of being sent / persisted to their target room's DAG. Only once a delayed event is successfully sent does it get finalised. (Prior to this PR, it would instead get deleted from the DB.)

The purpose of tracking is_processed is to handle the edge case of the server going down after a delayed event times out, but before it gets sent. Upon server restart, the sending of any is_processed delayed events will be retried.

I'm admittedly not a big fan of this, but I couldn't find a way to make an atomic action out of a delayed event timing out & being sent. I also didn't want to fiddle with that in this PR, even if finalised events now have some redundancy with is_processed events.
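
The restart behaviour described above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the Synapse implementation: the `DelayedEvent` class, the `retry_interrupted_sends` function, and the `send` callback are hypothetical, and `finalised_ts` stands in for "has been finalised".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class DelayedEvent:
    delay_id: str
    is_processed: bool = False          # timed out; send is (or was) in progress
    finalised_ts: Optional[int] = None  # set only once the send has succeeded


def retry_interrupted_sends(
    events: Iterable[DelayedEvent],
    send: Callable[[DelayedEvent], None],
    now_ms: int,
) -> list[str]:
    """On startup, re-send events that timed out but were never finalised."""
    retried = []
    for ev in events:
        # is_processed without finalised_ts means the server went down
        # between the timeout and the successful send.
        if ev.is_processed and ev.finalised_ts is None:
            send(ev)
            ev.finalised_ts = now_ms
            retried.append(ev.delay_id)
    return retried
```

Events that are still scheduled (not yet timed out) and events that were already finalised are left alone; only the in-between state is retried.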

MadLittleMods · 2025-10-24T16:34:23Z

synapse/storage/schema/main/delta/93/01_add_finalised_delayed_events.sql

+    error bytea,
+    event_id TEXT,
+    finalised_ts BIGINT NOT NULL,


Why not just put this in delayed_events?

The intent was to reduce the number of columns in the delayed_events table, as finalised_ts is relevant only to finalised events. In general, I wanted to keep any finalised-only columns out of the non-finalised delayed_events table, especially if more finalised-only properties get added later (which would allow a schema update to leave the non-finalised delayed_events table alone).

Never mind: 2ec54e4 (meant to post this on Friday)

synapse/storage/schema/main/delta/93/01_add_finalised_delayed_events.sql

MadLittleMods · 2025-10-24T17:00:43Z

synapse/storage/databases/main/delayed_events.py

+            for user_localpart in self.db_pool.simple_select_onecol_txn(
+                txn,
+                "finalised_delayed_events",
+                keyvalues={},
+                retcol="DISTINCT(user_localpart)",
+            ):
+                self._prune_excess_finalised_delayed_events_for_user(
+                    txn, user_localpart, retention_limit
+                )


This is an N+1 query problem.

Can we do this in batches in one big query?

Not sure, because each prune is a 2-step process:

1. Find how many finalised delayed events (FDEs) the target user has.
2. Delete enough of that user's FDEs to stay within the per-user retention limit.

I'm not sure of a better way to do it that would involve only a single query.

What should help is the fact that this pruning is always done in a transaction, together with whatever other queries need to happen before/after it.
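
One possible way to fold the per-user count and delete into a single statement is a window function that ranks each user's rows by recency. The sketch below is an assumption-laden illustration, not what the PR does: it uses a simplified, hypothetical schema (a unique `delay_id` plus `user_localpart` and `finalised_ts`), a hypothetical function name `prune_excess_finalised`, and Python's built-in `sqlite3` rather than Synapse's `db_pool`. It also requires a database with window-function support (SQLite 3.25 or later).

```python
import sqlite3

# Simplified, hypothetical schema for illustration; the real table has more columns.
SCHEMA = """
CREATE TABLE finalised_delayed_events (
    delay_id TEXT PRIMARY KEY,
    user_localpart TEXT NOT NULL,
    finalised_ts BIGINT NOT NULL
)
"""


def prune_excess_finalised(conn: sqlite3.Connection, retention_limit: int) -> int:
    """Delete all but the newest `retention_limit` rows per user in one statement.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM finalised_delayed_events
        WHERE delay_id IN (
            SELECT delay_id FROM (
                SELECT delay_id,
                       ROW_NUMBER() OVER (
                           PARTITION BY user_localpart
                           ORDER BY finalised_ts DESC
                       ) AS rn
                FROM finalised_delayed_events
            ) AS ranked
            WHERE rn > ?
        )
        """,
        (retention_limit,),
    )
    return cur.rowcount
```

Whether this is actually faster than the per-user loop would depend on table size and indexes, so it would need measuring before adopting.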

MadLittleMods · 2025-10-24T17:01:33Z

synapse/storage/schema/main/delta/93/01_add_finalised_delayed_events.sql

+-- See the GNU Affero General Public License for more details:
+-- <https://www.gnu.org/licenses/agpl-3.0.html>.
+
+CREATE TABLE finalised_delayed_events (


It looks like this separate table is creating the need for a lot of sub-queries (SELECT in SELECT statements) which seems like a smell.

The alternative would be to add finalised-related columns to the existing delayed_events table.

I went with the current approach under the belief that it's better to avoid adding many columns to a single table. If that is an ill-founded belief, I will move these columns into the other table.

I haven't been stuck in the details as you have but I do know that this current solution doesn't spark joy. Your current approach could be the way to solve it although my gut feeling makes me seek some better alternative.

From my eye, it seems like a delayed event should have a status enum all in one schema.

> The alternative would be to add finalised-related columns to the existing delayed_events table.

Optimizing for number of columns doesn't seem like something to worry about. We should more worry about tables growing beyond their concern (trying to handle too many things).

Given that moving the columns to delayed_events also has a benefit in how we can interact with the data, it seems like this is another good reason to go that route.

Giving better advice probably means I'd have to try solving it myself, which I don't want to do. I'll leave the problem-solving on this one in your capable hands.

Thanks for the advice on this. I gave it a try with 2ec54e4, and it does end up being much cleaner.

What would have made separate tables more appropriate is if there were attributes relevant only to scheduled delayed events but not finalised ones, and vice-versa.
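
To illustrate the "status in one schema" idea discussed in this thread, a single row with nullable finalised-only columns can be mapped to a status. This is a sketch only: the column names `finalised_ts`, `event_id`, and `error` come from the diff excerpts above, but the `DelayedEventStatus` enum, the `DelayedEventRow` class, and the exact mapping rules are assumptions made here for illustration and are not confirmed by the PR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DelayedEventStatus(Enum):
    SCHEDULED = "scheduled"
    SENT = "sent"
    CANCELLED = "cancelled"
    FAILED = "failed"


@dataclass
class DelayedEventRow:
    delay_id: str
    finalised_ts: Optional[int] = None  # NULL while the event is still scheduled
    event_id: Optional[str] = None      # assumed to be set when the event was sent
    error: Optional[bytes] = None       # assumed to be set when sending failed


def status_of(row: DelayedEventRow) -> DelayedEventStatus:
    """Derive a status from the nullable columns (assumed mapping)."""
    if row.finalised_ts is None:
        return DelayedEventStatus.SCHEDULED
    if row.error is not None:
        return DelayedEventStatus.FAILED
    if row.event_id is not None:
        return DelayedEventStatus.SENT
    return DelayedEventStatus.CANCELLED
```

With everything in one table, a lookup by delay ID needs no join or sub-query to tell a scheduled event from a finalised one.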

Instead of storing details on finalised delayed events in a new, standalone table that must be joined on the base table of delayed events, add columns to the base table to track those details.

This satisfies the check-schema-delta script. The index is present for speeding up ORDER BY on finalised_ts.

This should also fix the portdb script

AndrewFerr · 2025-11-03T19:30:20Z

synapse/storage/schema/main/delta/93/01_add_finalised_delayed_events.sql

-CREATE INDEX delayed_events_finalised_ts ON delayed_events (finalised_ts);
+INSERT INTO background_updates (update_name, progress_json) VALUES
+  ('delayed_events_finalised_ts', '{}');


This had to be added to pass the check-schema-delta job.

Given that this is an index on a new, null-by-default column, is there really a need to add it in the background?

AndrewFerr requested a review from a team as a code owner October 10, 2025 07:46

AndrewFerr added 2 commits October 10, 2025 04:03

Add changelog

bc0e1a3

Work around Postgres-only error

4a7e78c

AndrewFerr mentioned this pull request Oct 20, 2025

MSC4140: Cancellable delayed events matrix-org/matrix-spec-proposals#4140

Open

4 tasks

MadLittleMods reviewed Oct 22, 2025


MadLittleMods requested a review from a team October 22, 2025 17:23

MadLittleMods reviewed Oct 24, 2025


AndrewFerr added 4 commits October 31, 2025 10:56

Rename experimental config per suggestion

99f8501

Add description comment to new table

96ed87a

Merge finalised_delayed_events into delayed_events

2ec54e4

Instead of storing details on finalised delayed events in a new, standalone table that must be joined on the base table of delayed events, add columns to the base table to track those details.

Merge with 'develop': PEP585 and always RETURNING

048508c

AndrewFerr force-pushed the msc4140-finalised-and-filters branch from ea6b3ca to 048508c Compare November 1, 2025 04:03

AndrewFerr added 3 commits November 3, 2025 09:23

Merge with 'develop': Python 3.14 support

095eda4

Add finalised_ts index in background

5cd80ed

This satisfies the check-schema-delta script. The index is present for speeding up ORDER BY on finalised_ts.

Include DelayedEventsStore in migrated DB stores

11c79c2

This should also fix the portdb script

AndrewFerr commented Nov 3, 2025


AndrewFerr added 2 commits November 3, 2025 15:46

Fix error code string

d2aba30

Run UPDATE in transaction, and check rowcount

e5bef01
