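When rows are removed with DELETE, MySQL does not physically remove them from the data file. It only marks the records as deleted and does not reorganize the file, so the tablespace is not actually released. In other words, every delete leaves a hole where the data used to be, and if the table goes through a period of heavy DELETE activity, the total space held by these holes keeps growing. MySQL does reuse the freed areas when new data is written, but it cannot always fill them completely, which leaves the table fragmented.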
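One way to see how much space these holes occupy is the DATA_FREE column of information_schema.TABLES. The sketch below is a hypothetical helper (the name fragmentation_sql is not from the original) that prints such a query for one schema so it can be piped to the mysql client. For InnoDB the figures are estimates, and for a table stored in a shared tablespace DATA_FREE describes the whole tablespace rather than the single table.

```shell
# Hypothetical helper: print a query that lists the tables of one schema,
# ordered by DATA_FREE (allocated but unused bytes), largest first.
# $1 = schema name (assumed to be a plain, trusted identifier).
fragmentation_sql() {
  cat <<SQL
SELECT table_name,
       ROUND(data_length / 1024 / 1024, 1)  AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb,
       ROUND(data_free / 1024 / 1024, 1)    AS free_mb
FROM information_schema.tables
WHERE table_schema = '$1'
ORDER BY data_free DESC
LIMIT 10;
SQL
}

# Example: fragmentation_sql passport | mysql -u root -p
```

Tables with a large free_mb relative to data_mb are the likely candidates for a rebuild.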
# Install dependencies
yum install -y perl-DBI perl-DBD-mysql perl-Time-HiRes perl-IO-Socket-SSL perl-TermReadKey perl-Digest-MD5
# Download
wget https://downloads.percona.com/downloads/percona-toolkit/3.5.2/binary/redhat/7/x86_64/percona-toolkit-3.5.2-2.el7.x86_64.rpm
# Install
rpm -ivh percona-toolkit-3.5.2-2.el7.x86_64.rpm
Environment checks performed by the tool:
Note: the tool also runs its own checks when it executes and reports an error if it cannot proceed. Running with --dry-run before --execute is recommended.
Test before executing (--dry-run):
pt-online-schema-change h=localhost,u=root,p=11111,P=3306,D=userblink,t=copy1 --alter "ENGINE=InnoDB" --recursion-method=none --no-check-replication-filters --alter-foreign-keys-method auto --print --dry-run
Run the actual change (--execute):
pt-online-schema-change h=localhost,u=root,p=11111,P=3306,D=userblink,t=copy1 --alter "ENGINE=InnoDB" --recursion-method=none --no-check-replication-filters --alter-foreign-keys-method auto --print --execute
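The dry-run-then-execute sequence can be wrapped so that the real change only runs after a clean dry run. This is a minimal sketch, not part of the toolkit: osc_rebuild is a hypothetical function name, and it assumes the DSN passed in carries whatever credentials are needed (or that they come from ~/.my.cnf).

```shell
# Hypothetical wrapper: rebuild one table, but only after a clean dry run.
# $1 = DSN such as "h=localhost,u=root,P=3306,D=userblink,t=copy1"
osc_rebuild() {
  dsn="$1"
  # Same option set as the commands above; only the final flag differs.
  set -- --alter "ENGINE=InnoDB" --recursion-method=none \
         --no-check-replication-filters --alter-foreign-keys-method auto --print
  pt-online-schema-change "$dsn" "$@" --dry-run || {
    echo "dry run failed, not executing" >&2
    return 1
  }
  pt-online-schema-change "$dsn" "$@" --execute
}

# Example: osc_rebuild "h=localhost,u=root,P=3306,D=userblink,t=copy1"
```

Keeping the options in one place avoids the dry run and the real run drifting apart when flags are edited by hand.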
Defragmenting the table and altering it online (output from a run against `passport`.`userinfo`):
No slaves found. See --recursion-method if host host-192-168-25-212 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
No foreign keys reference `passport`.`userinfo`; ignoring --alter-foreign-keys-method.
Altering `passport`.`userinfo`...
Creating new table...
CREATE TABLE `passport`.`_userinfo_new` (
`UserId` int(11) NOT NULL DEFAULT '0',
`Industry` varchar(64) DEFAULT NULL,
`City` varchar(64) DEFAULT NULL,
`Job` varchar(64) DEFAULT NULL,
`WorkYears` varchar(64) DEFAULT NULL,
`Gender` tinyint(2) DEFAULT '1',
`NickName` varchar(64) DEFAULT NULL,
`Website` varchar(128) DEFAULT NULL,
`Description` varchar(512) DEFAULT NULL,
`QQ` varchar(32) DEFAULT NULL,
`MSN` varchar(64) DEFAULT NULL,
`Birthday` datetime DEFAULT NULL,
`Mobile` varchar(64) DEFAULT NULL,
`RealName` varchar(64) DEFAULT NULL,
`IsLocked` bit(1) DEFAULT b'0',
PRIMARY KEY (`UserId`),
KEY `ix_NickName` (`NickName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Created new table passport._userinfo_new OK.
Altering new table...
ALTER TABLE `passport`.`_userinfo_new` ENGINE=InnoDB
Altered `passport`.`_userinfo_new` OK.
2023-11-08T11:00:30 Creating triggers...
-----------------------------------------------------------
Event : DELETE
Name : pt_osc_passport_userinfo_del
SQL : CREATE TRIGGER `pt_osc_passport_userinfo_del` AFTER DELETE ON `passport`.`userinfo` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `passport`.`_userinfo_new` WHERE `passport`.`_userinfo_new`.`userid` <=> OLD.`userid`; END
Suffix: del
Time : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event : UPDATE
Name : pt_osc_passport_userinfo_upd
SQL : CREATE TRIGGER `pt_osc_passport_userinfo_upd` AFTER UPDATE ON `passport`.`userinfo` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `passport`.`_userinfo_new` WHERE !(OLD.`userid` <=> NEW.`userid`) AND `passport`.`_userinfo_new`.`userid` <=> OLD.`userid`; REPLACE INTO `passport`.`_userinfo_new` (`userid`, `industry`, `city`, `job`, `workyears`, `gender`, `nickname`, `website`, `description`, `qq`, `msn`, `birthday`, `mobile`, `realname`, `islocked`) VALUES (NEW.`userid`, NEW.`industry`, NEW.`city`, NEW.`job`, NEW.`workyears`, NEW.`gender`, NEW.`nickname`, NEW.`website`, NEW.`description`, NEW.`qq`, NEW.`msn`, NEW.`birthday`, NEW.`mobile`, NEW.`realname`, NEW.`islocked`); END
Suffix: upd
Time : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event : INSERT
Name : pt_osc_passport_userinfo_ins
SQL : CREATE TRIGGER `pt_osc_passport_userinfo_ins` AFTER INSERT ON `passport`.`userinfo` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; REPLACE INTO `passport`.`_userinfo_new` (`userid`, `industry`, `city`, `job`, `workyears`, `gender`, `nickname`, `website`, `description`, `qq`, `msn`, `birthday`, `mobile`, `realname`, `islocked`) VALUES (NEW.`userid`, NEW.`industry`, NEW.`city`, NEW.`job`, NEW.`workyears`, NEW.`gender`, NEW.`nickname`, NEW.`website`, NEW.`description`, NEW.`qq`, NEW.`msn`, NEW.`birthday`, NEW.`mobile`, NEW.`realname`, NEW.`islocked`);END
Suffix: ins
Time : AFTER
-----------------------------------------------------------
2023-11-08T11:00:30 Created triggers OK.
2023-11-08T11:00:30 Copying approximately 30978638 rows...
INSERT LOW_PRIORITY IGNORE INTO `passport`.`_userinfo_new` (`userid`, `industry`, `city`, `job`, `workyears`, `gender`, `nickname`, `website`, `description`, `qq`, `msn`, `birthday`, `mobile`, `realname`, `islocked`) SELECT `userid`, `industry`, `city`, `job`, `workyears`, `gender`, `nickname`, `website`, `description`, `qq`, `msn`, `birthday`, `mobile`, `realname`, `islocked` FROM `passport`.`userinfo` FORCE INDEX(`PRIMARY`) WHERE ((`userid` >= ?)) AND ((`userid` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 29249 copy nibble*/
SELECT /*!40001 SQL_NO_CACHE */ `userid` FROM `passport`.`userinfo` FORCE INDEX(`PRIMARY`) WHERE ((`userid` >= ?)) ORDER BY `userid` LIMIT ?, 2 /*next chunk boundary*/
Copying `passport`.`userinfo`: 9% 04:35 remain
Copying `passport`.`userinfo`: 20% 03:58 remain
Copying `passport`.`userinfo`: 30% 03:27 remain
Copying `passport`.`userinfo`: 40% 02:56 remain
Copying `passport`.`userinfo`: 51% 02:21 remain
2023-11-08T11:03:00 Copied rows OK.
2023-11-08T11:03:00 Analyzing new table...
2023-11-08T11:03:00 Swapping tables...
RENAME TABLE `passport`.`userinfo` TO `passport`.`_userinfo_old`, `passport`.`_userinfo_new` TO `passport`.`userinfo`
2023-11-08T11:03:00 Swapped original and new tables OK.
2023-11-08T11:03:00 Dropping old table...
DROP TABLE IF EXISTS `passport`.`_userinfo_old`
2023-11-08T11:03:00 Dropped old table `passport`.`_userinfo_old` OK.
2023-11-08T11:03:00 Dropping triggers...
DROP TRIGGER IF EXISTS `passport`.`pt_osc_passport_userinfo_del`
DROP TRIGGER IF EXISTS `passport`.`pt_osc_passport_userinfo_upd`
DROP TRIGGER IF EXISTS `passport`.`pt_osc_passport_userinfo_ins`
2023-11-08T11:03:00 Dropped triggers OK.
Successfully altered `passport`.`userinfo`.
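--print: print the SQL statements the tool executes to STDOUT. Can be combined with --dry-run.
--progress: print progress reports to STDERR while copying rows. The value is a comma-separated list with two parts. The first part is one of percentage, time, or iterations; the second part is a number that sets how often a report is printed, in percent, seconds, or iterations respectively.
--statistics: print statistics.
--alter-foreign-keys-method: how to modify foreign keys so that they reference the new table. When the tool renames the original table to let the new table take its place, foreign keys follow the renamed table, so they must be changed to reference the new table. The supported methods are auto, rebuild_constraints, drop_swap, and none; auto, used in the commands above, lets the tool choose between rebuild_constraints and drop_swap.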
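In the run above the tool reported that no foreign keys reference `passport`.`userinfo`, so --alter-foreign-keys-method was ignored. Whether that holds for another table can be checked beforehand through information_schema.KEY_COLUMN_USAGE. The helper below is a hypothetical sketch (child_fk_sql is not a toolkit command) that prints the query for a given parent schema and table.

```shell
# Hypothetical helper: print a query listing foreign keys that point AT a table.
# $1 = parent schema, $2 = parent table (assumed to be plain, trusted identifiers).
child_fk_sql() {
  cat <<SQL
SELECT table_schema, table_name, constraint_name, column_name
FROM information_schema.key_column_usage
WHERE referenced_table_schema = '$1'
  AND referenced_table_name = '$2';
SQL
}

# Example: child_fk_sql passport userinfo | mysql -u root -p
```

An empty result means no child tables need their constraints rebuilt after the rename.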
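Because pt-osc relies on triggers to keep the new table in sync with changes to the original, it comes with some limitations:
Altering a table that is referenced by foreign keys makes the operation more complex. pt-osc currently offers three ways to handle the foreign keys through --alter-foreign-keys-method: auto, rebuild_constraints, and drop_swap.
Triggers are created on the table, so if the table is being updated frequently, lock contention is likely. I ran into this when adding an index to a production table: the application reported frequent deadlock errors, which went away after stopping pt-osc and dropping the triggers.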
-- List existing triggers
SHOW TRIGGERS FROM passport;
-- Drop a trigger
DROP TRIGGER [IF EXISTS] [schema_name.]trigger_name;
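The triggers pt-osc creates are named with a pt_osc_ prefix, as the log above shows (pt_osc_passport_userinfo_del and so on), so leftovers from an interrupted run can be located by that prefix. The helper below is a hypothetical sketch that prints a query against information_schema.TRIGGERS; it only lists the triggers and leaves dropping them to the operator, which should happen only after pt-osc has been stopped.

```shell
# Hypothetical helper: print a query listing pt-osc triggers left in a schema.
# $1 = schema name (assumed to be a plain, trusted identifier).
leftover_triggers_sql() {
  cat <<SQL
SELECT trigger_schema, trigger_name, event_manipulation, event_object_table
FROM information_schema.triggers
WHERE trigger_schema = '$1'
  AND LEFT(trigger_name, 7) = 'pt_osc_';
SQL
}

# Example: leftover_triggers_sql passport | mysql -u root -p
```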
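pt-osc copies rows from the original table to the shadow table with INSERT LOW_PRIORITY IGNORE, so if the column that gets a new unique index contains duplicate values, the duplicate rows are silently dropped and data is lost. Before adding a unique index or primary key, confirm whether the data contains duplicates and whether losing rows is acceptable. The tool then has to be told to skip its safety check: the original note names --no-check-alter, while in Percona Toolkit 3.x the check for added unique keys is --check-unique-key-change, disabled with --no-check-unique-key-change.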
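Checking for duplicates before adding a unique index comes down to a GROUP BY ... HAVING query on the candidate column. The helper below is a hypothetical sketch (duplicate_check_sql is not a toolkit command, and Mobile in the example is only an illustrative column taken from the table definition above). NULLs are excluded because a MySQL unique index allows multiple NULL values.

```shell
# Hypothetical helper: print a query that finds values which would violate
# a new unique index. Identifiers are assumed to be plain and trusted.
# $1 = schema, $2 = table, $3 = column planned for the unique index
duplicate_check_sql() {
  cat <<SQL
SELECT $3 AS dup_value, COUNT(*) AS copies
FROM $1.$2
WHERE $3 IS NOT NULL
GROUP BY $3
HAVING COUNT(*) > 1
LIMIT 20;
SQL
}

# Example: duplicate_check_sql passport userinfo Mobile | mysql -u root -p
```

Any row returned here identifies a value for which all but one of the matching rows would be lost during the copy.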