SerializationFailure on appservice outgoing transaction (updating last_txn) #11620
Description
We are seeing application services (ASes) go into recoverer mode after a serialization failure when updating application_services_state.last_txn (https://github.com/matrix-org/synapse/blob/v1.48.0/synapse/storage/databases/main/appservice.py#L272). We recently fixed a similar issue in #11195, which targets the same database row, so I suspect the issue now is that this last_txn update occasionally clashes with the stream position update we added the linearizer for.
I see that the last_txn implementation was added a few years back (0a60bbf) and hasn't changed since. As far as I can tell, the only use of this column is to check that the transaction ID has not advanced by more than one since the previous value. That check does still appear to get triggered from time to time: I can see 12 log entries in the last 7 days in our deployment.
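For reference, the check as I understand it boils down to something like the sketch below. This is my reading of the behaviour, not the Synapse code itself; the function name `is_next_txn` is made up for illustration.

```python
def is_next_txn(last_txn: int, txn_id: int) -> bool:
    """Return True if txn_id directly follows the stored last_txn.

    Hypothetical helper: assumes transaction IDs for one AS are
    expected to advance by exactly one per completed transaction.
    """
    return txn_id == last_txn + 1


# A gap (here 5 -> 7) is the case that would be logged.
in_order = is_next_txn(last_txn=5, txn_id=6)
out_of_order = is_next_txn(last_txn=5, txn_id=7)
```

If that is all the column is used for, the check is purely diagnostic and nothing else depends on last_txn being written.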
Possible Solutions
- move this column to its own table; not great, but it would get around the immediate issue
- fix the ordering issue (or work around it) and remove the column/check entirely
I notice the recoverer just pulls the lowest transaction and sends that. Could the entire sending mechanism work like that instead? Simply have the _TransactionController write transactions out to the database in batches, and have a process (currently _Recoverer) that constantly pulls from these. This would remove any need to track the last transaction ID for a given AS.
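To make the idea concrete, here is a rough sketch of what I mean, using an in-memory SQLite database so it runs on its own. Everything in it is an assumption for illustration: the `txns` table, the function names, and the single global auto-incrementing txn_id are not Synapse's schema or API.

```python
import sqlite3
from typing import Callable, Optional, Tuple


def make_db() -> sqlite3.Connection:
    # Hypothetical queue table: one row per pending AS transaction.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE txns ("
        " txn_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " as_id TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def enqueue_txn(conn: sqlite3.Connection, as_id: str, payload: str) -> int:
    # Writer side: only ever inserts new rows, never updates shared state.
    cur = conn.execute(
        "INSERT INTO txns (as_id, payload) VALUES (?, ?)", (as_id, payload)
    )
    conn.commit()
    return cur.lastrowid


def oldest_txn(conn: sqlite3.Connection, as_id: str) -> Optional[Tuple[int, str]]:
    # Sender side: always pick the lowest pending transaction for this AS.
    return conn.execute(
        "SELECT txn_id, payload FROM txns"
        " WHERE as_id = ? ORDER BY txn_id ASC LIMIT 1",
        (as_id,),
    ).fetchone()


def drain(
    conn: sqlite3.Connection, as_id: str, send: Callable[[int, str], bool]
) -> int:
    # Send pending transactions in order; stop at the first failure so the
    # failed transaction stays at the head of the queue for a later retry.
    sent = 0
    while True:
        row = oldest_txn(conn, as_id)
        if row is None:
            return sent
        txn_id, payload = row
        if not send(txn_id, payload):
            return sent
        conn.execute("DELETE FROM txns WHERE txn_id = ?", (txn_id,))
        conn.commit()
        sent += 1
```

With this shape, ordering comes from the queue itself (lowest txn_id first, delete on success), so there is no per-AS "last transaction" row for two writers to contend on.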
Version information
- v1.48
- worker based deployment with dedicated appservice pusher worker