-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
We have a pretty large and busy (at some times) table where we try to add column xxx int null
, everything seems fine until the cut-over phase, it gets an error and then retries for some time but never succeeds. I have checked other issues here which also have this error but haven't found anything really.
The binlog is not stuck as it keeps applying and rotating.
This is on RDS Aurora, normal setup, we haven't changed any parameters but starting to think that some parameter might need to be changed in the DB? But not sure which in that case.
We are running latest master and have tried some different command flags, like for ex --cut-over-exponential-backoff
and increasing --cut-over-lock-timeout-seconds
and --default-retries
.
This is the full command right now:
gh-ost \ --user=$user \ --password=$password \ --host=$host \ --database=$database \ --table=$table \ --alter="$alter" \ --max-load=Threads_running=25 \ --critical-load=Threads_running=1000 \ --chunk-size=1000 \ --max-lag-millis=1500 \ --verbose \ --alter="engine=innodb" \ --allow-master-master \ --exact-rowcount \ --concurrent-rowcount \ --default-retries=120 \ --allow-on-master \ --initially-drop-ghost-table \ --initially-drop-old-table \ --cut-over-lock-timeout-seconds=10 \ --assume-rbr \ --cut-over-exponential-backoff \ --execute \
Copy: 79876601/79876601 100.0%; Applied: 16344; Backlog: 0/1000; Time: 3h42m10s(total), 2h20m24s(copy); streamer: mysql-bin-changelog.013410:1363370; Lag: 0.01s, State: migrating; ETA: due
2019-11-06 14:04:27 ERROR Timeout while waiting for events up to lock
2019-11-06 14:04:27 ERROR 2019-11-06 14:04:27 ERROR Timeout while waiting for events up to lock
2019-11-06 14:04:27 INFO Looking for magic cut-over table
2019-11-06 14:04:27 INFO Will now proceed to drop magic table and unlock tables
2019-11-06 14:04:27 INFO Dropping magic cut-over table
2019-11-06 14:04:27 INFO Dropping magic cut-over table
2019-11-06 14:04:27 INFO Dropping table `xxx`.`_xxx_del`
2019-11-06 14:04:27 INFO Table dropped
2019-11-06 14:04:27 INFO Releasing lock from `xxx`.`xxx`, `xxx`.`_xxx_del`
2019-11-06 14:04:27 INFO Tables unlocked
2019-11-06 14:04:27 INFO Grabbing voluntary lock: gh-ost.428716346.lock
2019-11-06 14:04:27 INFO Setting LOCK timeout as 20 seconds
2019-11-06 14:04:27 INFO Looking for magic cut-over table
2019-11-06 14:04:27 INFO Creating magic cut-over table `xxx`.`_xxx_del`
2019-11-06 14:04:27 INFO Magic cut-over table created
2019-11-06 14:04:27 INFO Locking `xxx`.`xxx`, `xxx`.`_xxx_del`
2019-11-06 14:04:27 INFO Tables locked
2019-11-06 14:04:27 INFO Session locking original & magic tables is 428716346
2019-11-06 14:04:27 INFO Writing changelog state: AllEventsUpToLockProcessed:1573049067764852856
2019-11-06 14:04:27 INFO Waiting for events up to lock
Copy: 79876601/79876601 100.0%; Applied: 16344; Backlog: 0/1000; Time: 3h42m15s(total), 2h20m24s(copy); streamer: mysql-bin-changelog.013410:16394106; Lag: 0.01s, State: migrating; ETA: due
Copy: 79876601/79876601 100.0%; Applied: 16344; Backlog: 0/1000; Time: 3h42m20s(total), 2h20m24s(copy); streamer: mysql-bin-changelog.013410:32408062; Lag: 0.01s, State: migrating; ETA: due
2019-11-06 14:04:37 ERROR Timeout while waiting for events up to lock
2019-11-06 14:04:37 ERROR 2019-11-06 14:04:37 ERROR Timeout while waiting for events up to lock
2019-11-06 14:04:37 INFO Looking for magic cut-over table
2019-11-06 14:04:37 INFO Will now proceed to drop magic table and unlock tables
2019-11-06 14:04:37 INFO Dropping magic cut-over table
2019-11-06 14:04:37 INFO Dropping magic cut-over table
2019-11-06 14:04:37 INFO Dropping table `xxx`.`_xxx_del`
2019-11-06 14:04:37 INFO Table dropped
2019-11-06 14:04:37 INFO Removing socket file: /tmp/gh-ost.xxx.xxx.sock
2019-11-06 14:04:37 INFO Releasing lock from `xxx`.`xxx`, `xxx`.`_xxx_del`
2019-11-06 14:04:37 FATAL 2019-11-06 14:04:37 ERROR Timeout while waiting for events up to lock