-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
We detected this issue accidentally when a small gh-ost
migration got interrupted by a daily test database failover. Since the current writer became read-only, we saw lots of FATAL Error 1290 (HY000): The MySQL server is running with the --read-only option so it cannot execute this statement
logs and had to step in and kill the process.
Failures like this while copying data in the ApplyIterationInsertQuery
loop will be retried until MaxRetries
and then the process will terminate through a PanicAbort
error.
However, in the case described here, we were past the copy stage and observed gh-ost
getting stuck in throttled
state with the heartbeat lag growing indefinitely.
Repro steps
Partial output printed by an integration test where I issued a SET GLOBAL super_read_only=1
after the data is copied, but before cutover (super
just because of how my test db was set up).
First, copying completes.
I, [2025-07-10T11:57:02.661777 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000001:139473498; Lag: 0.01s, HeartbeatLag: 0.07s, State: migrating; ETA: due []
I, [2025-07-10T11:57:02.662144 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Grabbing voluntary lock: gh-ost.2831.lock
I, [2025-07-10T11:57:02.662327 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Setting LOCK timeout as 20 seconds
I, [2025-07-10T11:57:02.662529 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Looking for magic cut-over table
I, [2025-07-10T11:57:02.663413 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Creating magic cut-over table `agent_test`.`_gh_ost_users_20250710115701_del`
I, [2025-07-10T11:57:02.667448 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Magic cut-over table created
I, [2025-07-10T11:57:02.667498 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Setting cut-over idle timeout as 30 seconds
I, [2025-07-10T11:57:02.667709 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Locking `agent_test`.`users`, `agent_test`.`_gh_ost_users_20250710115701_del`
I, [2025-07-10T11:57:02.668176 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Tables locked
I, [2025-07-10T11:57:02.668196 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Session locking original & magic tables is 2831
I, [2025-07-10T11:57:02.668225 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Writing changelog state: AllEventsUpToLockProcessed:1752141422668177000
I, [2025-07-10T11:57:02.670031 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Waiting for events up to lock
I, [2025-07-10T11:57:02.672942 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Intercepted changelog state AllEventsUpToLockProcessed
I, [2025-07-10T11:57:02.672997 #11938] INFO -- : [gh-ost]: 2025-07-10 11:57:02 INFO Handled changelog state AllEventsUpToLockProcessed
gh-ost
starts the idle timeout and we can simulate failover now.
I, [2025-07-10T11:36:08.696813 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:08 INFO Setting cut-over idle timeout as 30 seconds
Once queries start to fail and process hangs in State: throttled
, printing a batches of errors and the progress message every second, infinitely.
I, [2025-07-10T11:36:39.258420 #10808] INFO -- : [gh-ost]: Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 32s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 30.11s, HeartbeatLag: 30.21s, State: migrating; ETA: due
I, [2025-07-10T11:36:39.258710 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 INFO Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 32s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 30.11s, HeartbeatLag: 30.21s, State: migrating; ETA: due []
I, [2025-07-10T11:36:39.351713 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.352618 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.450953 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.451353 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.551662 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.553035 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.651587 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.652518 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.715076 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 INFO Checking session lock: gh-ost.2817.lock
I, [2025-07-10T11:36:39.717221 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Session lock gh-ost.2817.lock expected to be found but wasn't
I, [2025-07-10T11:36:39.751343 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.752083 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.851623 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.852618 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.951796 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:39.952941 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:39 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.052204 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.053756 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.152068 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.153595 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.253441 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.253506 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.253548 #10808] INFO -- : [gh-ost]: Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 33s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 31.11s, HeartbeatLag: 31.21s, State: throttled, lag=31.010552s; ETA: due
I, [2025-07-10T11:36:40.253640 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 INFO Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 33s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 31.11s, HeartbeatLag: 31.21s, State: throttled, lag=31.010552s; ETA: due []
I, [2025-07-10T11:36:40.254304 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.351491 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.352606 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.450674 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.450972 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.550188 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.550489 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.650515 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.650860 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.750394 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.750620 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.850404 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.850777 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.950297 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:40.950692 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:40 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.050396 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.050685 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.150416 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.150718 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.250454 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.250707 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 ERROR Error 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
I, [2025-07-10T11:36:41.250730 #10808] INFO -- : [gh-ost]: Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 34s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 32.11s, HeartbeatLag: 32.21s, State: throttled, lag=32.008558s; ETA: due
I, [2025-07-10T11:36:41.250872 #10808] INFO -- : [gh-ost]: 2025-07-10 11:36:41 INFO Copy: 2000/2000 100.0%; Applied: 0; Backlog: 0/1000; Time: 34s(total), 1s(copy); streamer: mysql-bin.000001:135927553; Lag: 32.11s, HeartbeatLag: 32.21s, State: throttled, lag=32.008558s; ETA: due []
Solution
I used the repro test case to find out which functions fail to connect to the writer and how they are retried. All of the failures point to things trying to write to the changelog table, which makes sense.
- the first failures logged come from
Applier.InitiateHeartbeat
, which retries up toMaxRetries
and then returns on error and silently stops sending new heartbeats. It doesn't seem there's any logic to re-initiate the heartbeat goroutine. throttlerFunction
fails toWriteAndLogChangelog
every 100ms, which is what gets logged infinitely
I think it would be reasonable to exit with error after a retry period in injectHeartbeat
, assuming the heartbeat failure is non-recoverable. Perhaps it makes sense to control this with a new CLI flag.
It could look like this: Shopify#13. Would love your feedback before I open an upstream PR here.