MySQL NULL vs. empty-string queries; comparing like, % and = for fuzzy matching (with rough test data)

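Recommended reading (referred to later in this post):

MySQL usage: EXPLAIN

MySQL execution plans explained in detail

The difference between "Using index" and "Using where; Using index" in the Extra column of a MySQL execution plan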

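Overview

Today I happened to be designing a database for a project and got hung up on a question I had never looked at closely: when a field may have no value, should the column allow NULL, or should it be declared NOT NULL DEFAULT '' and store an empty string? Out of curiosity I tried it with roughly 30,000 rows. (That is a small data set, but my machine is slow, and it still took about 3000 s to run.)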

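Table definition

Without further ado, here is the table used for the test. It has four columns: the auto-increment primary key id, the title title, the description profile, and the password password. The password may not exist at all (for example, a room that can be entered without a password), whereas the profile always has to be displayed, as an empty value at the very least, whether or not it has content (that is my common-sense reading, anyway).

Apart from id, which is already the primary key, title, profile and password each get a NORMAL index using BTREE.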

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for room
-- ----------------------------
DROP TABLE IF EXISTS `room`;
CREATE TABLE `room`  (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT 'title',
  `profile` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT 'may be an empty string',
  `password` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NULL DEFAULT NULL COMMENT 'may have no password at all',
  PRIMARY KEY (`id`) USING BTREE,
  INDEX `title`(`title`) USING BTREE,
  INDEX `profile`(`profile`) USING BTREE,
  INDEX `password`(`password`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;

SET FOREIGN_KEY_CHECKS = 1;

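Inserting the test data

When I ran this myself, I had already inserted 5-6 or so junk rows beforehand, which makes no real difference. (On my poor machine the load took close to 3000 s; if your machine is not great either, there is no need to go out of your way to reproduce it.)

The three statements that matter are: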

insert into room values (null,CONCAT("title",i),"",NULL);

insert into room values (null,CONCAT("title_",i),CONCAT("",i),NULL);

insert into room values (null,CONCAT("title__",i),"",CONCAT("",i));

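drop procedure if exists test;          # drop the test procedure if it already exists
delimiter //                            # use // as the statement delimiter
create procedure test()                 # create a procedure named test with no parameters
begin
    declare i int;                      # declare the loop counter
    set i = 0;                          # initialize it
    lp: loop                            # lp is an arbitrary label; loop is the keyword
        insert into room values (null,CONCAT("title",i),"",NULL);             # empty profile, NULL password
        insert into room values (null,CONCAT("title_",i),CONCAT("",i),NULL);  # non-empty profile, NULL password
        insert into room values (null,CONCAT("title__",i),"",CONCAT("",i));   # empty profile, non-NULL password
        set i = i + 1;                  # increment i once per iteration
        if i > 10000 then               # exit condition: leave the loop once i exceeds 10000
            leave lp;
        end if;
    end loop;
    select * from room;                 # show the contents of the room table
end
//                                      # end of the procedure definition
delimiter ;                             # restore the default delimiter
call test();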
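The long load time is plausibly dominated by per-statement commits rather than by the inserts themselves: with autocommit on (the MySQL default), each INSERT inside the procedure is committed individually. A minimal sketch, assuming the room table and test procedure from this post and default InnoDB settings, that wraps the whole load in one transaction:

```sql
-- Sketch only: assumes autocommit is ON (the MySQL default) and that
-- the `test` procedure above has already been created.
START TRANSACTION;   -- open one explicit transaction
CALL test();         -- the INSERTs inside the procedure join this transaction
COMMIT;              -- commit once at the end instead of once per INSERT
```

Whether this shortens the load on a given machine depends on disk and log-flush settings, so treat it as something to try rather than a guaranteed fix.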

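Tests and results

First, from the stored procedure above, of the roughly 30,000 rows about 20,000 should have an empty profile and about 20,000 should have a NULL password. All of the titles start with the string "title". If the EXPLAIN output below is unfamiliar, see the articles recommended above; they cover it in detail.

The conclusions below are fairly subjective. If you really want to know the difference, try it yourself with data modelled on your own design. After all, I only ran single-table queries here, with no complex conditions.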
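For readers who do rerun the comparison, wall-clock times from a client fluctuate between runs. One option, assuming MySQL 8.0.18 or later (where EXPLAIN ANALYZE is available), is to let the server execute the query and report the actual time and row count of each step in the plan:

```sql
-- Assumes MySQL 8.0.18+ and the `room` table from this post.
-- EXPLAIN ANALYZE executes the statement and prints measured timings
-- alongside the optimizer's estimates.
EXPLAIN ANALYZE SELECT * FROM room WHERE `password` IS NULL;
EXPLAIN ANALYZE SELECT * FROM room WHERE `profile` = '';
```

The output format differs from the tabular EXPLAIN shown in this post, so the two are best read side by side rather than compared column by column.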

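  1. Can a NULL check on password use the index, and how efficient is it?

    As shown below, the is NULL query does use the index.

    As I understand the articles recommended above, the index is used to locate the matching entries (type is ref), and Using index condition means the condition on the indexed column is checked at the index level before the full row is read. Because the query selects *, every match still needs a lookup back into the table. (After all, is NULL is enough of a filter here; it is already a very specific search condition.)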

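EXPLAIN SELECT * FROM room WHERE `password` is NULL;
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ref password password 1023 const 15176 100.00 Using index condition

Query time (close to 20,000 rows returned), by run: 0.224s -> 0.122s -> after that it mostly fluctuated between 0.1s and 0.17s. After dropping the index and testing again, it settled at an average of 0.075s. So here the index wastes space and buys nothing in time.

SELECT * FROM room WHERE `password` is NULL;

For comparison

is NOT NULL does not use the index and goes straight to a full table scan (mainly because the select list is *, i.e. all columns).

EXPLAIN SELECT * FROM room WHERE `password` is NOT NULL;
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL password (Null) (Null) (Null) 30352 50.00 Using where

Query time (close to 10,000 rows returned), by run (the first few runs were not recorded): 0.68s -> 0.57s -> after that it mostly fluctuated between 0.05s and 0.06s. Although this is a full table scan, it takes about half as long as is NULL, and it also returns exactly half as many rows. My rough reading is that is NULL gets no real benefit from the index, so that query (returning twice the data) takes twice as long as the is NOT NULL query that does not use the index. I then dropped the index and tried again: the time still hovered around 0.058s, so the index really made no difference, but it had no obvious "side effect" either.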

SELECT * FROM	room WHERE `password` is NOT NULL;
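  2. Using an empty string for profile instead of NULL: how efficient is that? I tried quite a few cases here, so I won't explain each one; the results speak for themselves.
  • = "": all columns are selected and the index does not contain all of them, so fetching every column requires a lookup back into the table; that is why Extra does not show Using index.
EXPLAIN SELECT * FROM room WHERE `profile` = "";
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ref profile profile 1022 const 15176 100.00 (Null)

I tried this both with and without the profile index: about 0.125s on average with the index, about 0.081s without. Yes, it really was faster without the index; that is not a typo.

SELECT * FROM room WHERE `profile` = "";
  • = '': again all columns are selected, so the index cannot cover the query (no Using index), although it is still used for the lookup (type ref, key profile).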
EXPLAIN SELECT * FROM	room WHERE `profile` = '';
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ref profile profile 1022 const 15152 100.00 (Null)
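The lookup back into the table described above disappears when the index itself contains every selected column. A hedged sketch, using a hypothetical composite index named profile_title that is not part of the original schema, to see whether a query for title filtered on profile becomes covered:

```sql
-- profile_title is a hypothetical index added only for this experiment.
ALTER TABLE room ADD INDEX profile_title (`profile`, `title`);

-- If the optimizer picks profile_title, Extra would be expected to include "Using index".
EXPLAIN SELECT `title` FROM room WHERE `profile` = '';

-- Drop it again to restore the schema used for the measurements in this post.
ALTER TABLE room DROP INDEX profile_title;
```

InnoDB secondary indexes also store the primary key, which is consistent with the id-only queries in this post showing Using index without any extra index.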

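… There are honestly too many cases; I am tempted to give up on writing them out one by one.

The walkthrough above is incomplete, mainly because I don't think I explained it very clearly; listing every case I tested should be more direct. Anyone who has read this far probably gets the idea, so I will simply paste the results for each case I could think of.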
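The "with index" and "without index" variants in the results can be reproduced by dropping and re-creating the index between runs. A sketch for the password column, assuming the room table defined earlier; the same pattern applies to profile and title:

```sql
-- Measure without the index
ALTER TABLE room DROP INDEX `password`;
EXPLAIN SELECT * FROM room WHERE `password` IS NULL;
SELECT * FROM room WHERE `password` IS NULL;

-- Put the index back, matching the original table definition
ALTER TABLE room ADD INDEX `password` (`password`) USING BTREE;
```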


Results for each case I tested (all query times are in seconds, taken once repeated runs had stabilized)

  1. password is NULL
  • With a NORMAL index on password (BTREE)
EXPLAIN SELECT * FROM	room WHERE `password` is NULL; -- Using index condition
EXPLAIN SELECT `id` FROM	room WHERE `password` is NULL; -- Using where; Using index
EXPLAIN SELECT `title` FROM	room WHERE `password` is NULL; -- Using index condition
EXPLAIN SELECT `profile` FROM	room WHERE `password` is NULL; -- Using index condition
EXPLAIN SELECT `password` FROM	room WHERE `password` is NULL; -- Using where; Using index
EXPLAIN SELECT COUNT(*) FROM	room WHERE `password` is NULL; -- Using where; Using index
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ref password password 1023 const 15152 100.00 Using index condition
SELECT * FROM	room WHERE `password` is NULL; -- 0.144 
SELECT `id` FROM	room WHERE `password` is NULL; -- 0.036
SELECT `title` FROM	room WHERE `password` is NULL; -- 0.124
SELECT `profile` FROM	room WHERE `password` is NULL; -- 0.101
SELECT `password` FROM	room WHERE `password` is NULL; -- 0.025
SELECT COUNT(*) FROM	room WHERE `password` is NULL; -- 0.016
  • Without an index on password
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL password (Null) (Null) (Null) 30304 10.00 Using where
SELECT * FROM	room WHERE `password` is NULL; -- 0.125s
SELECT `id` FROM	room WHERE `password` is NULL; -- 0.066s
SELECT `title` FROM	room WHERE `password` is NULL; -- 0.061s 
SELECT `profile` FROM	room WHERE `password` is NULL; -- 0.065s
SELECT `password` FROM	room WHERE `password` is NULL; -- 0.034s
SELECT COUNT(*) FROM	room WHERE `password` is NULL; -- 0.024s

Rough ranking, fastest first: Using where; Using index > Using index > Using where > Using index condition
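In this ranking, the slow Using index condition cases are also the ones that read columns outside the index, so the label itself may not be the cause. One way to probe that, assuming a MySQL version with the index_condition_pushdown optimizer switch (on by default), is to turn the optimization off for the session and compare the plan and timing:

```sql
-- Disable index condition pushdown for this session only
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT `title` FROM room WHERE `password` IS NULL;
SELECT `title` FROM room WHERE `password` IS NULL;

-- Restore the default
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
```

If the timing barely changes, the cost is more likely the per-row lookup into the table than the pushed-down condition.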

  2. password is NOT NULL
  • With a NORMAL index on password (BTREE)
EXPLAIN SELECT * FROM	room WHERE `password` is NOT NULL; 
EXPLAIN SELECT `id` FROM	room WHERE `password` is NOT NULL; 
EXPLAIN SELECT `title` FROM	room WHERE `password` is NOT NULL; 
EXPLAIN SELECT `profile` FROM	room WHERE `password` is NOT NULL; 
EXPLAIN SELECT `password` FROM	room WHERE `password` is NOT NULL; 
EXPLAIN SELECT COUNT(*) FROM	room WHERE `password` is NOT NULL; 
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL password (Null) (Null) (Null) 30304 50.00 Using where
type=index, key=password, key_len=1023, Extra=Using where; Using index
type=ALL, key=(Null), key_len=(Null), Extra=Using where
type=index, key=password, key_len=1023, Extra=Using where; Using index
SELECT * FROM	room WHERE `password` is NOT NULL; -- 0.046
SELECT `id` FROM	room WHERE `password` is NOT NULL; -- 0.023
SELECT `title` FROM	room WHERE `password` is NOT NULL; -- 0.026
SELECT `profile` FROM	room WHERE `password` is NOT NULL; -- 0.024
SELECT `password` FROM	room WHERE `password` is NOT NULL; -- 0.018
SELECT COUNT(*) FROM	room WHERE `password` is NOT NULL; -- 0.015
  • Without an index on password
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 90.00 Using where
SELECT * FROM	room WHERE `password` is NOT NULL; -- 0.053
SELECT `id` FROM	room WHERE `password` is NOT NULL; -- 0.036
SELECT `title` FROM	room WHERE `password` is NOT NULL; -- 0.036
SELECT `profile` FROM	room WHERE `password` is NOT NULL; -- 0.032
SELECT `password` FROM	room WHERE `password` is NOT NULL; -- 0.029
SELECT COUNT(*) FROM	room WHERE `password` is NOT NULL; -- 0.039
  3. profile = ''
  • With a NORMAL index on profile (BTREE) (where a row is not annotated, it is the same as the row above, or all rows are the same)
EXPLAIN SELECT * FROM	room WHERE `profile` =''; -- (Null)
EXPLAIN SELECT `id` FROM	room WHERE `profile` =''; -- Using index 
EXPLAIN SELECT `title` FROM	room WHERE `profile` =''; -- (Null) 
EXPLAIN SELECT `profile` FROM	room WHERE `profile` =''; -- Using index 
EXPLAIN SELECT `password` FROM	room WHERE `profile` =''; -- (Null) 
EXPLAIN SELECT COUNT(*) FROM	room WHERE `profile` =''; -- Using index 
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ref profile profile 1022 const 15152 100.00 (Null)
SELECT * FROM	room WHERE `profile` =''; -- 0.164
SELECT `id` FROM	room WHERE `profile` =''; -- 0.053
SELECT `title` FROM	room WHERE `profile` =''; -- 0.105
SELECT `profile` FROM	room WHERE `profile` =''; -- 0.029
SELECT `password` FROM	room WHERE `profile` =''; -- 0.093
SELECT COUNT(*) FROM	room WHERE `profile` =''; -- 0.015

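Judging by these numbers alone, across the cases listed above, looking up empty values with = '' comes out a little faster on average than looking them up with is NULL.

  • Without an index on profile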
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 10.00 Using where
SELECT * FROM	room WHERE `profile` =''; -- 0.09
SELECT `id` FROM	room WHERE `profile` =''; -- 0.114
SELECT `title` FROM	room WHERE `profile` =''; -- 0.081
SELECT `profile` FROM	room WHERE `profile` =''; -- 0.049
SELECT `password` FROM	room WHERE `profile` =''; -- 0.069
SELECT COUNT(*) FROM	room WHERE `profile` =''; -- 0.003
  4. profile != ''
  • With a NORMAL index on profile (BTREE) (where a row is not annotated, it is the same as the row above, or all rows are the same)
EXPLAIN SELECT * FROM	room WHERE `profile` !=''; -- ALL
EXPLAIN SELECT `id` FROM	room WHERE `profile` !=''; -- range
EXPLAIN SELECT `title` FROM	room WHERE `profile` !=''; -- ALL
EXPLAIN SELECT `profile` FROM	room WHERE `profile` !=''; -- range
EXPLAIN SELECT `password` FROM	room WHERE `profile` !=''; -- ALL
EXPLAIN SELECT COUNT(*) FROM	room WHERE `profile` !=''; -- range
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL profile (Null) (Null) (Null) 30304 33.06 Using where
For the range queries: type=range, key=profile, key_len=1022, rows=10017, filtered=100.00, Extra=Using where; Using index
SELECT * FROM	room WHERE `profile` !=''; -- 0.055
SELECT `id` FROM	room WHERE `profile` !=''; -- 0.017
SELECT `title` FROM	room WHERE `profile` !=''; -- 0.039
SELECT `profile` FROM	room WHERE `profile` !=''; -- 0.015
SELECT `password` FROM	room WHERE `profile` !=''; -- 0.034
SELECT COUNT(*) FROM	room WHERE `profile` !=''; -- 0.009
  • Without an index on profile
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 90.00 Using where
SELECT * FROM	room WHERE `profile` !=''; -- 0.041
SELECT `id` FROM	room WHERE `profile` !=''; -- 0.023
SELECT `title` FROM	room WHERE `profile` !=''; -- 0.037
SELECT `profile` FROM	room WHERE `profile` !=''; -- 0.026
SELECT `password` FROM	room WHERE `profile` !=''; -- 0.025
SELECT COUNT(*) FROM	room WHERE `profile` !=''; -- 0.017

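Perhaps because my data set is so small, != '' did not look any better than is NOT NULL, though I would still guess that != '' performs better than is NOT NULL once the data grows.

At least up to this point, based on the numbers above plus some subjective guesswork, I think that if index efficiency on a single table is the concern, empty strings are more efficient than NULL.

  5. profile = ""
  • With a NORMAL index on profile (BTREE)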
EXPLAIN SELECT * FROM	room WHERE `profile` =""; 
EXPLAIN SELECT `id` FROM	room WHERE `profile` =""; 
EXPLAIN SELECT `title` FROM	room WHERE `profile` =""; 
EXPLAIN SELECT `profile` FROM	room WHERE `profile` =""; 
EXPLAIN SELECT `password` FROM	room WHERE `profile` =""; 
EXPLAIN SELECT COUNT(*) FROM	room WHERE `profile` =""; 
-- Same results as for = '', so not repeated here
SELECT * FROM	room WHERE `profile` =""; -- 0.122
SELECT `id` FROM	room WHERE `profile` =""; -- 0.037
SELECT `title` FROM	room WHERE `profile` =""; -- 0.093
SELECT `profile` FROM	room WHERE `profile` =""; -- 0.026
SELECT `password` FROM	room WHERE `profile` =""; -- 0.089
SELECT COUNT(*) FROM	room WHERE `profile` =""; -- 0.016
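-- Roughly the same as = ''. I won't expand on it; everyone says "" and '' are equivalent anyway, and I only tried it out of curiosity to see whether there was any "magic"
  6. profile like ''
  • With a NORMAL index on profile (BTREE)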
EXPLAIN SELECT * FROM	room WHERE `profile` like ''; -- ALL
EXPLAIN SELECT `id` FROM	room WHERE `profile` like ''; -- index
EXPLAIN SELECT `title` FROM	room WHERE `profile` like ''; -- ALL
EXPLAIN SELECT `profile` FROM	room WHERE `profile` like ''; -- index
EXPLAIN SELECT `password` FROM	room WHERE `profile` like ''; -- ALL
EXPLAIN SELECT COUNT(*) FROM	room WHERE `profile` like ''; -- index
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL profile (Null) (Null) (Null) 30304 50.00 Using where
For the index queries: type=index, key=profile, key_len=1022, rows=30304, filtered=50.00, Extra=Using where; Using index
SELECT * FROM	room WHERE `profile` like ''; -- 0.071
SELECT `id` FROM	room WHERE `profile` like ''; -- 0.036
SELECT `title` FROM	room WHERE `profile` like ''; -- 0.048
SELECT `profile` FROM	room WHERE `profile` like ''; -- 0.032
SELECT `password` FROM	room WHERE `profile` like ''; -- 0.044
SELECT COUNT(*) FROM	room WHERE `profile` like ''; -- 0.022
  • Without an index on profile
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 11.11 Using where
SELECT * FROM	room WHERE `profile` like ''; -- 0.094
SELECT `id` FROM	room WHERE `profile` like ''; -- 0.047
SELECT `title` FROM	room WHERE `profile` like ''; -- 0.044
SELECT `profile` FROM	room WHERE `profile` like ''; -- 0.04
SELECT `password` FROM	room WHERE `profile` like ''; -- 0.042
SELECT COUNT(*) FROM	room WHERE `profile` like ''; -- 0.02

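Comparing the cases above, and given that like is normally used for fuzzy matching, the EXPLAIN output and query times suggest this: if you are unsure whether a column should hold NULL or the empty string '', and index speed matters above all, go with the empty string ''. (My data set is small and my queries are not complex enough, so this may not be very convincing.) Unless NULL gets some special optimization (I have not looked into that), I personally lean towards empty strings over NULL, because with NULL any later change to the column definition or the Java classes means more coordination, whereas an empty string avoids a lot of that.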
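Switching an existing nullable column over to the empty-string convention could look roughly like the following. This is a sketch against the room table from this post, not something the measurements above depend on, and the COMMENT wording is a placeholder:

```sql
-- 1. Backfill: replace existing NULLs with the empty string
UPDATE room SET `password` = '' WHERE `password` IS NULL;

-- 2. Forbid NULL from now on and default to the empty string.
--    MODIFY restates the whole column definition, so charset and collation are repeated.
ALTER TABLE room
  MODIFY `password` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci
  NOT NULL DEFAULT '' COMMENT 'empty string means no password';
```

One consequence of this design is that "no password" and "empty password" can no longer be told apart, which is acceptable only if the application never needs that distinction.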

  7. title like 'title%'
  • With a NORMAL index on title (BTREE)
EXPLAIN SELECT * FROM	room WHERE `title` like 'title%'; -- ALL
EXPLAIN SELECT `id` FROM	room WHERE `title` like 'title%'; -- index
EXPLAIN SELECT `title` FROM	room WHERE `title` like 'title%'; -- index
EXPLAIN SELECT `profile` FROM	room WHERE `title` like 'title%'; -- ALL
EXPLAIN SELECT `password` FROM	room WHERE `title` like 'title%'; -- ALL
EXPLAIN SELECT COUNT(*) FROM	room WHERE `title` like 'title%'; -- index
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL title (Null) (Null) (Null) 30304 50.00 Using where
For the index queries: type=index, possible_keys=title, key=title, key_len=1022, filtered=50.00, Extra=Using where; Using index
SELECT * FROM	room WHERE `title` like 'title%'; -- 0.103
SELECT `id` FROM	room WHERE `title` like 'title%'; -- 0.087
SELECT `title` FROM	room WHERE `title` like 'title%'; -- 0.08
SELECT `profile` FROM	room WHERE `title` like 'title%'; -- 0.046
SELECT `password` FROM	room WHERE `title` like 'title%'; -- 0.048
SELECT COUNT(*) FROM	room WHERE `title` like 'title%'; -- 0.042
  • Without an index on title
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 11.11 Using where
SELECT * FROM	room WHERE `title` like 'title%'; -- 0.089
SELECT `id` FROM	room WHERE `title` like 'title%'; -- 0.049
SELECT `title` FROM	room WHERE `title` like 'title%'; -- 0.053
SELECT `profile` FROM	room WHERE `title` like 'title%'; -- 0.049
SELECT `password` FROM	room WHERE `title` like 'title%'; -- 0.045
SELECT COUNT(*) FROM	room WHERE `title` like 'title%'; -- 0.027
  8. title like '%title%'
  • With a NORMAL index on title (BTREE)
EXPLAIN SELECT * FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT `id` FROM	room WHERE `title` like '%title%'; -- index
EXPLAIN SELECT `title` FROM	room WHERE `title` like '%title%'; -- index
EXPLAIN SELECT `profile` FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT `password` FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT COUNT(*) FROM	room WHERE `title` like '%title%'; -- index
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE room (Null) ALL (Null) (Null) (Null) (Null) 30304 11.11 Using where
For the index queries: type=index, key=title, key_len=1022, rows=30304, Extra=Using where; Using index
SELECT * FROM	room WHERE `title` like '%title%'; -- 0.045
SELECT `id` FROM	room WHERE `title` like '%title%'; --  0.025
SELECT `title` FROM	room WHERE `title` like '%title%'; -- 0.024
SELECT `profile` FROM	room WHERE `title` like '%title%'; -- 0.031
SELECT `password` FROM	room WHERE `title` like '%title%'; -- 0.03
SELECT COUNT(*) FROM	room WHERE `title` like '%title%'; -- 0.012
  • Without an index on title
EXPLAIN SELECT * FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT `id` FROM	room WHERE `title` like '%title%'; -- index
EXPLAIN SELECT `title` FROM	room WHERE `title` like '%title%'; -- index
EXPLAIN SELECT `profile` FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT `password` FROM	room WHERE `title` like '%title%'; -- ALL
EXPLAIN SELECT COUNT(*) FROM	room WHERE `title` like '%title%'; -- index
-- Same as for `title` like 'title%'; no difference
SELECT * FROM	room WHERE `title` like '%title%'; -- 0.118
SELECT `id` FROM	room WHERE `title` like '%title%'; -- 0.056
SELECT `title` FROM	room WHERE `title` like '%title%'; -- 0.076
SELECT `profile` FROM	room WHERE `title` like '%title%'; -- 0.042
SELECT `password` FROM	room WHERE `title` like '%title%'; -- 0.044
SELECT COUNT(*) FROM	room WHERE `title` like '%title%'; -- 0.023
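Every title in the test data starts with "title", so both 'title%' and '%title%' match all rows, which may be why the two patterns look so similar in the measurements above. A sketch with more selective patterns, assuming the room table and the title index from this post (the value 42 is an arbitrary illustration):

```sql
-- Exact match: the index on title can be used for an equality lookup
EXPLAIN SELECT `id` FROM room WHERE `title` = 'title42';

-- Selective prefix: a B-tree index can serve this as a range scan
EXPLAIN SELECT `id` FROM room WHERE `title` LIKE 'title42%';

-- Leading wildcard: there is no fixed prefix, so every entry has to be examined
EXPLAIN SELECT `id` FROM room WHERE `title` LIKE '%42%';
```

Note that _ is itself a single-character wildcard in LIKE, so a pattern meant to match the literal titles beginning with title_ would need the underscore escaped as \_.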

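Summary


Don't scatter % around in like fuzzy queries; use only as much as you need. In practice, though, people usually reach for %string% out of convenience, and that scenario is fairly common.

For comparing against a known exact string, use = rather than a fuzzy match.

As for choosing between NULL and the empty string: on my very small data set the empty string looked slightly faster, and it avoids some tedious coordination problems. NULL does have its own semantics, though; for example, COUNT(column) does not count rows where that column is NULL. Still, for ease of later changes, I personally find empty strings more convenient.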
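The COUNT behaviour mentioned above, together with the related pitfall of comparing against NULL with =, can be checked directly on the room table from this post:

```sql
-- COUNT(*) counts every row; COUNT(column) skips rows where the column is NULL
SELECT COUNT(*)          AS all_rows,
       COUNT(`password`) AS rows_with_password
FROM room;

-- A comparison with = NULL is never true, so this returns 0
SELECT COUNT(*) FROM room WHERE `password` = NULL;

-- IS NULL is the correct test
SELECT COUNT(*) FROM room WHERE `password` IS NULL;
```

With empty strings instead of NULL, an ordinary = '' comparison works and COUNT(column) counts every row, which is part of what makes that convention simpler to handle.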
