-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Closed
Description
We encountered a case of the gh-ost lose data. The details of this case is as follows:
-
start the gh-ost to alter online table t1.
-
when the gh-ost is finish, we find there is lose a data in new table t1, and the lose data is in the old table _t1_del.
-
we analyze the binlog and gh-ost log is as follows
- lose data id is 17, binlog position is (mysql-bin.000009, 98240).
- the gh-ost begin to sync binlog from position (mysql-bin.000009, 1780).
- the gh-ost migration min and max value is (1, 16).
- the gh-ost first status log is
Copy: 0/16 0.0%; Applied: 0; Backlog: 0/1000; Time: 33s(total), 0s(copy); streamer: mysql-bin.000009:101907; Lag: 15.18s, HeartbeatLag: 15.08s, State: migrating; ETA: N/A
We suspect the case is because of MySQL two-phase commit.
The gh-ost command is
gh-ost \
--max-load=Threads_running=32 \
--critical-load=Threads_running=48 \
--chunk-size=500 \
--max-lag-millis=1000000 \
--initially-drop-old-table \
--initially-drop-socket-file \
--user="xx" \
--password="xxxx" \
--host="xxx.xxx.xxx.xxx" \
--port=xxxxx \
--database="shaohk" \
--table="t1" \
--alter="ADD INDEX idx_t1 (name);" \
--switch-to-rbr=false \
--cut-over=default \
--assume-rbr=true \
--exact-rowcount=false \
--default-retries=3 \
--critical-load-interval-millis=300000 \
--serve-socket-file=/tmp/ghost.socket.t1.sock \
--panic-flag-file=/tmp/ghost.panic.flag.shaohk.t1 \
--throttle-flag-file=/tmp/ghost.throttle.flag.shaohk.t1 \
--force-table-names=t1 \
--verbose \
--execute \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag.shaohk.t1 \
--allow-on-master
The insert transaction is
mysql> begin;
Query OK, 0 rows affected (0.03 sec)
mysql> insert into t1 values (17,'test');
Query OK, 1 row affected (0.04 sec)
mysql> commit;
Query OK, 0 rows affected (30.3 sec)
The gh-ost log is
2021-09-26 23:11:50 INFO starting gh-ost
2021-09-26 23:11:50 INFO Migrating `shaohk`.`t1`
2021-09-26 23:11:50 INFO inspector connection validated on 127.0.0.1:3306
2021-09-26 23:11:50 INFO User has ALL privileges
2021-09-26 23:11:50 INFO binary logs validated on 127.0.0.1:3306
2021-09-26 23:11:50 INFO Inspector initiated on 127.0.0.1:3306, version 5.7.26-301001001-log
2021-09-26 23:11:50 INFO Table found. Engine=InnoDB
2021-09-26 23:11:50 INFO Estimated number of rows via EXPLAIN: 16
2021-09-26 23:11:51 INFO Recursively searching for replication master
2021-09-26 23:11:51 INFO Master found to be 127.0.0.1:3306
2021-09-26 23:11:51 INFO log_slave_updates validated on 127.0.0.1:3306
2021-09-26 23:11:51 INFO streamer connection validated on 127.0.0.1:3306
[2021/09/26 23:11:51] [info] binlogsyncer.go:148 create BinlogSyncer with config {99999 mysql 127.0.0.1 3306 xxx false false <nil> false UTC true 0 0s 0s 0 false false 0 <nil>}
2021-09-26 23:11:51 INFO Connecting binlog streamer at mysql-bin.000009:1780
[2021/09/26 23:11:51] [info] binlogsyncer.go:374 begin to sync binlog from position (mysql-bin.000009, 1780)
[2021/09/26 23:11:51] [info] binlogsyncer.go:791 rotate to (mysql-bin.000009, 1780)
2021-09-26 23:11:51 INFO rotate to next log from mysql-bin.000009:0 to mysql-bin.000009
2021-09-26 23:11:51 INFO applier connection validated on 127.0.0.1:3306
2021-09-26 23:11:51 INFO applier connection validated on 127.0.0.1:3306
2021-09-26 23:11:52 INFO will use time_zone='SYSTEM' on applier
2021-09-26 23:11:52 INFO Examining table structure on applier
2021-09-26 23:11:52 INFO Applier initiated on 127.0.0.1:3306, version 5.7.26-301001001-log
2021-09-26 23:11:52 INFO Dropping table `shaohk`.`_t1_del`
2021-09-26 23:11:52 INFO Table dropped
2021-09-26 23:11:52 INFO Dropping table `shaohk`.`_t1_ghc`
2021-09-26 23:11:52 INFO Table dropped
2021-09-26 23:11:52 INFO Creating changelog table `shaohk`.`_t1_ghc`
2021-09-26 23:11:52 INFO Changelog table created
2021-09-26 23:11:52 INFO Creating ghost table `shaohk`.`_t1_gho`
2021-09-26 23:11:52 INFO Ghost table created
2021-09-26 23:11:52 INFO Altering ghost table `shaohk`.`_t1_gho`
2021-09-26 23:11:52 INFO Ghost table altered
2021-09-26 23:11:52 INFO Intercepted changelog state GhostTableMigrated
2021-09-26 23:11:52 INFO Waiting for ghost table to be migrated. Current lag is 0s
2021-09-26 23:11:52 INFO Handled changelog state GhostTableMigrated
2021-09-26 23:11:52 INFO Chosen shared unique key is PRIMARY
2021-09-26 23:11:52 INFO Shared columns are id,name
2021-09-26 23:11:52 INFO Listening on unix socket file: /tmp/ghost.socket.146155549.sock
2021-09-26 23:12:12 INFO Migration min values: [1]
2021-09-26 23:12:12 INFO Migration max values: [16]
2021-09-26 23:12:22 INFO Waiting for first throttle metrics to be collected
2021-09-26 23:12:23 INFO First throttle metrics collected
# Migrating `shaohk`.`t1`; Ghost table is `shaohk`.`_t1_gho`
# Migrating 127.0.0.1:3306; inspecting 127.0.0.1:3306; executing on shaohk.local
# Migration started at Sun Sep 26 23:11:50 +0800 2021
# chunk-size: 500; max-lag-millis: 1000000ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=48; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag.shaohk.t1.146155549
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag.shaohk.t1.146155549 [set]
# panic-flag-file: /tmp/ghost.panic.flag.shaohk.t1.146155549
# Serving on unix socket: /tmp/ghost.socket.146155549.sock
Copy: 0/16 0.0%; Applied: 0; Backlog: 0/1000; Time: 33s(total), 0s(copy); streamer: mysql-bin.000009:101907; Lag: 15.18s, HeartbeatLag: 15.08s, State: migrating; ETA: N/A
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 49s(total), 16s(copy); streamer: mysql-bin.000009:103548; Lag: 31.11s, HeartbeatLag: 31.13s, State: migrating; ETA: due
2021-09-26 23:12:40 INFO Row copy complete
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 50s(total), 17s(copy); streamer: mysql-bin.000009:106332; Lag: 0.38s, HeartbeatLag: 0.38s, State: migrating; ETA: due
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 50s(total), 17s(copy); streamer: mysql-bin.000009:106332; Lag: 0.34s, HeartbeatLag: 0.34s, State: migrating; ETA: due
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 51s(total), 17s(copy); streamer: mysql-bin.000009:114627; Lag: 0.08s, HeartbeatLag: 0.13s, State: postponing cut-over; ETA: due
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 52s(total), 17s(copy); streamer: mysql-bin.000009:121824; Lag: 0.08s, HeartbeatLag: 0.13s, State: postponing cut-over; ETA: due
2021-09-26 23:13:01 INFO Grabbing voluntary lock: gh-ost.66.lock
2021-09-26 23:13:01 INFO Setting LOCK timeout as 6 seconds
2021-09-26 23:13:01 INFO Looking for magic cut-over table
2021-09-26 23:13:01 INFO Creating magic cut-over table `shaohk`.`_t1_del`
2021-09-26 23:13:01 INFO Magic cut-over table created
2021-09-26 23:13:01 INFO Locking `shaohk`.`t1`, `shaohk`.`_t1_del`
2021-09-26 23:13:01 INFO Tables locked
2021-09-26 23:13:01 INFO Session locking original & magic tables is 66
2021-09-26 23:13:01 INFO Writing changelog state: AllEventsUpToLockProcessed:1632669181273438000
2021-09-26 23:13:01 INFO Intercepted changelog state AllEventsUpToLockProcessed
2021-09-26 23:13:01 INFO Handled changelog state AllEventsUpToLockProcessed
2021-09-26 23:13:01 INFO Waiting for events up to lock
2021-09-26 23:13:01 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1632669181273438000
2021-09-26 23:13:01 INFO Done waiting for events up to lock; duration=269.19469ms
# Migrating `shaohk`.`t1`; Ghost table is `shaohk`.`_t1_gho`
# Migrating 127.0.0.1:3306; inspecting 127.0.0.1:3306; executing on shaohk.local
# Migration started at Sun Sep 26 23:11:50 +0800 2021
# chunk-size: 500; max-lag-millis: 1000000ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=48; nice-ratio: 0.000000
# throttle-flag-file: /tmp/ghost.throttle.flag.shaohk.t1.146155549
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag.shaohk.t1.146155549
# panic-flag-file: /tmp/ghost.panic.flag.shaohk.t1.146155549
# Serving on unix socket: /tmp/ghost.socket.146155549.sock
Copy: 16/16 100.0%; Applied: 0; Backlog: 0/1000; Time: 1m11s(total), 17s(copy); streamer: mysql-bin.000009:249730; Lag: 0.07s, HeartbeatLag: 0.11s, State: migrating; ETA: due
2021-09-26 23:13:01 INFO Setting RENAME timeout as 3 seconds
2021-09-26 23:13:01 INFO Session renaming tables is 72
2021-09-26 23:13:01 INFO Issuing and expecting this to block: rename /* gh-ost */ table `shaohk`.`t1` to `shaohk`.`_t1_del`, `shaohk`.`_t1_gho` to `shaohk`.`t1`
2021-09-26 23:13:01 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2021-09-26 23:13:01 INFO Checking session lock: gh-ost.66.lock
2021-09-26 23:13:01 INFO Connection holding lock on original table still exists
2021-09-26 23:13:01 INFO Will now proceed to drop magic table and unlock tables
2021-09-26 23:13:01 INFO Dropping magic cut-over table
2021-09-26 23:13:01 INFO Releasing lock from `shaohk`.`t1`, `shaohk`.`_t1_del`
2021-09-26 23:13:01 INFO Tables renamed
2021-09-26 23:13:01 INFO Tables unlocked
2021-09-26 23:13:01 INFO Lock & rename duration: 741.223847ms. During this time, queries on `t1` were blocked
[2021/09/26 23:13:01] [info] binlogsyncer.go:180 syncer is closing...
[2021/09/26 23:13:02] [info] binlogsyncer.go:864 kill last connection id 68
[2021/09/26 23:13:02] [info] binlogsyncer.go:210 syncer is closed
2021-09-26 23:13:02 INFO Closed streamer connection. err=<nil>
2021-09-26 23:13:02 INFO Dropping table `shaohk`.`_t1_ghc`
2021-09-26 23:13:02 INFO Table dropped
2021-09-26 23:13:02 INFO Am not dropping old table because I want this operation to be as live as possible. If you insist I should do it, please add `--ok-to-drop-table` next time. But I prefer you do not. To drop the old table, issue:
2021-09-26 23:13:02 INFO -- drop table `shaohk`.`_t1_del`
2021-09-26 23:13:02 INFO Done migrating `shaohk`.`t1`
2021-09-26 23:13:02 INFO Removing socket file: /tmp/ghost.socket.146155549.sock
2021-09-26 23:13:02 INFO Tearing down inspector
2021-09-26 23:13:02 INFO Tearing down applier
2021-09-26 23:13:02 INFO Tearing down streamer
2021-09-26 23:13:02 INFO Tearing down throttler
The alter result is
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | test |
| 2 | test |
| 3 | test |
| 4 | test |
| 5 | test |
| 6 | test |
| 7 | test |
| 8 | test |
| 9 | test |
| 10 | test |
| 11 | test |
| 12 | test |
| 13 | test |
| 14 | test |
| 15 | test |
| 16 | test |
+----+------+
mysql> select * from _t1_del;
+----+------+
| id | name |
+----+------+
| 1 | test |
| 2 | test |
| 3 | test |
| 4 | test |
| 5 | test |
| 6 | test |
| 7 | test |
| 8 | test |
| 9 | test |
| 10 | test |
| 11 | test |
| 12 | test |
| 13 | test |
| 14 | test |
| 15 | test |
| 16 | test |
| 17 | test |
+----+------+
17 rows in set (0.03 sec)
Metadata
Metadata
Assignees
Labels
No labels