找回密码
立即注册
搜索
热搜: Java Python Linux Go
发回帖 发新帖

759

积分

0

好友

99

主题
发表于 5 小时前 | 查看: 0| 回复: 0

数据库设计思考笔记本配图

来源: cnblogs.com/aaron8219/p/9259379.html

我们常听到一种说法:“使用了NULL值的列将会使索引失效”。但如果你实际测试一下就会发现,IS NULL这样的查询其实可以使用索引。可见,这个广为流传的说法存在漏洞。

如果你想直接看结论,可以跳到文章最后。

前言

NULL值是对列的一种特殊约束。当我们创建新列时,如果没有明确使用NOT NULL关键字进行声明,MySQL 就会默认为该列添加NULL约束。

有些开发者出于便利(减少判空代码)的考虑,在创建数据表时直接采用MySQL的默认设置,即允许字段为NULL值。然而,这一习惯很容易在使用NULL的场景中导致不确定的查询结果,甚至引发数据库性能下降。

什么是NULL?

NULL并不意味着什么都没有,我们必须注意 NULL''(空字符串)是两个完全不同的值。

MySQL中主要使用三个操作符来处理NULL值:

  • IS NULL
  • IS NOT NULL
  • <=> (太空船操作符)。这个操作符很像=,例如SELECT NULL<=>NULL会返回true,但SELECT NULL=NULL会返回false

此外还有一个IFNULL()函数,具体用法请自行查阅。

示例

NULL通过任何比较操作符(除<=>外)与其他值比较,结果都会得到NULL

1 (root@localhost mysql3306.sock)[zlm]> create table test_null(
2     -> id int not null,
3     -> name varchar(10)
4     -> );
5 Query OK, 0 rows affected (0.02 sec)
6
7 (root@localhost mysql3306.sock)[zlm]> insert into test_null values(1,'zlm');
8 Query OK, 1 row affected (0.00 sec)
9
10 (root@localhost mysql3306.sock)[zlm]> insert into test_null values(2,null);
11 Query OK, 1 row affected (0.00 sec)
12
13 (root@localhost mysql3306.sock)[zlm]> select * from test_null;
14 +----+------+
15 | id | name |
16 +----+------+
17 |  1 | zlm  |
18 |  2 | NULL |
19 +----+------+
20 2 rows in set (0.00 sec)
21 // ---------------------->这个很有代表性<----------------------
22 (root@localhost mysql3306.sock)[zlm]> select * from test_null where name=null;
23 Empty set (0.00 sec)
24
25 (root@localhost mysql3306.sock)[zlm]> select * from test_null where name is null;
26 +----+------+
27 | id | name |
28 +----+------+
29 |  2 | NULL |
30 +----+------+
31 1 row in set (0.00 sec)
32
33 (root@localhost mysql3306.sock)[zlm]> select * from test_null where name is not null;
34 +----+------+
35 | id | name |
36 +----+------+
37 |  1 | zlm  |
38 +----+------+
39 1 row in set (0.00 sec)
40
41 (root@localhost mysql3306.sock)[zlm]> select * from test_null where null=null;
42 Empty set (0.00 sec)
43
44 (root@localhost mysql3306.sock)[zlm]> select * from test_null where null<>null;
45 Empty set (0.00 sec)
46
47 (root@localhost mysql3306.sock)[zlm]> select * from test_null where null<=>null;
48 +----+------+
49 | id | name |
50 +----+------+
51 |  1 | zlm  |
52 |  2 | NULL |
53 +----+------+
54 2 rows in set (0.00 sec)
55 // null<=>null always return true,it‘s equal to ”where 1=1“.

NULL代表一个“缺失且未知”的值。这意味着,即使是两个NULL,它们也不一定相等(类似于C语言中未初始化的局部变量)。

1 (root@localhost mysql3306.sock)[zlm]> SELECT 0 IS NULL, 0 IS NOT NULL, ‘‘ IS NULL, ‘‘ IS NOT NULL;
2 +-----------+---------------+------------+----------------+
3 | 0 IS NULL | 0 IS NOT NULL | ‘‘ IS NULL | ‘‘ IS NOT NULL |
4 +-----------+---------------+------------+----------------+
5 |         0 |             1 |          0 |              1 |
6 +-----------+---------------+------------+----------------+
7 1 row in set (0.00 sec)
8
9 //It‘s not equal to zero number or vacant string.
10 //In MySQL,0 means fasle,1 means true.
11
12 (root@localhost mysql3306.sock)[zlm]>SELECT 1 = NULL, 1 <> NULL, 1 < NULL, 1 > NULL;
13 +----------+-----------+----------+----------+
14 | 1 = NULL | 1 <> NULL | 1 < NULL | 1 > NULL |
15 +----------+-----------+----------+----------+
16 |     NULL |      NULL |     NULL |     NULL |
17 +----------+-----------+----------+----------+
18 1 row in set (0.00 sec)
19
20 //It cannot be compared with number.
21 //In MySQL,null means false,too.

任何包含NULL值的表达式,其结果都会是NULL

1 (root@localhost mysql3306.sock)[zlm]> select ifnull(null, ’First is null‘),ifnull(null+10, ’First is null‘),ifnull(concat(’abc‘,null), ’First is null‘);
2 +------------------------------+---------------------------------+--------------------------------------------+
3 | ifnull(null,‘First is null‘) | ifnull(null+10,‘First is null‘) | ifnull(concat(’abc‘,null),‘First is null‘) |
4 +------------------------------+---------------------------------+--------------------------------------------+
5 | First is null                | First is null                   | First is null                              |
6 +------------------------------+---------------------------------+--------------------------------------------+
7 1 row in set (0.00 sec)
8
9 //null value needs to be disposed with ifnull() function,what usually causes sql statement more complex.
10 //As we all know,MySQL does not support funcion index.Therefore,indexes on the column may not be used.That‘s really worse.

NULL带来的具体问题

影响统计结果

使用count(*)count(null column)的结果不同,count(null column)的结果会小于或等于count(*)

1 (root@localhost mysql3306.sock)[zlm]> select count(*),count(name) from test_null;
2 +----------+-------------+
3 | count(*) | count(name) |
4 +----------+-------------+
5 |        2 |           1 |
6 +----------+-------------+
7 1 row in set (0.00 sec)
8
9 //count(*) returns all rows ignore the null while count(name) returns the non-null rows in column ”name“.
10 //This will also leads to uncertainty if someone is unaware of the details above.

如果使用者对NULL属性不熟悉,很容易统计出错误的结果。

干扰分组、排序和去重

虽然SELECT NULL=NULL的结果为false,但在使用distinctgroup byorder by时,所有NULL值会被认为是相同的。

1 (root@localhost mysql3306.sock)[zlm]> insert into test_null values(3,null);
2 Query OK, 1 row affected (0.00 sec)
3
4 (root@localhost mysql3306.sock)[zlm]> select distinct name from test_null;
5 +------+
6 | name |
7 +------+
8 | zlm  |
9 | NULL |
10 +------+
11 2 rows in set (0.00 sec)
12
13 //Two rows of null value returned one and the result became two.
14
15 (root@localhost mysql3306.sock)[zlm]> select name from test_null group by name;
16 +------+
17 | name |
18 +------+
19 | NULL |
20 | zlm  |
21 +------+
22 2 rows in set (0.00 sec)
23
24 //Two rows of null value were put into the same group.
25 //By default,group by will also sort the result(null row showed first).
26
27 (root@localhost mysql3306.sock)[zlm]> select id,name from test_null order by name;
28 +----+------+
29 | id | name |
30 +----+------+
31 |  2 | NULL |
32 |  3 | NULL |
33 |  1 | zlm  |
34 +----+------+
35 3 rows in set (0.00 sec)
36
37 //Three rows were sorted(two null rows showed first).

关于索引的真相

与Oracle不同,MySQL支持在包含NULL值的列上使用索引。我们常说的“列上含有NULL会使索引失效”对MySQL而言并不完全准确。

1 (root@localhost mysql3306.sock)[sysbench]> show tables;
2 +--------------------+
3 | Tables_in_sysbench |
4 +--------------------+
5 | sbtest1            |
6 | sbtest10           |
7 | sbtest2            |
8 | sbtest3            |
9 | sbtest4            |
10 | sbtest5            |
11 | sbtest6            |
12 | sbtest7            |
13 | sbtest8            |
14 | sbtest9            |
15 +--------------------+
16 10 rows in set (0.00 sec)
17
18 (root@localhost mysql3306.sock)[sysbench]> show create table sbtest1\G
19 *************************** 1. row ***************************
20 Table: sbtest1
21 Create Table: CREATE TABLE `sbtest1` (
22 `id` int(11) NOT NULL AUTO_INCREMENT,
23 `k` int(11) NOT NULL DEFAULT ‘0‘,
24 `c` char(120) NOT NULL DEFAULT ‘‘,
25 `pad` char(60) NOT NULL DEFAULT ‘‘,
26   PRIMARY KEY (`id`),
27   KEY `k_1` (`k`)
28 ) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8
29 1 row in set (0.00 sec)
30
31 (root@localhost mysql3306.sock)[sysbench]> alter table sbtest1 modify k int null,modify c char(120) null,modify pad char(60) null;
32 Query OK, 0 rows affected (4.14 sec)
33 Records: 0  Duplicates: 0  Warnings: 0
34
35 (root@localhost mysql3306.sock)[sysbench]> insert into sbtest1 values(100001,null,null,null);
36 Query OK, 1 row affected (0.00 sec)
37
38 (root@localhost mysql3306.sock)[sysbench]> explain select id,k from sbtest1 where id=100001;
39 +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
40 | id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
41 +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
42 |  1 | SIMPLE      | sbtest1 | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
43 +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
44 1 row in set, 1 warning (0.00 sec)
45
46 (root@localhost mysql3306.sock)[sysbench]> explain select id,k from sbtest1 where k is null;
47 +----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+--------------------------+
48 | id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra                    |
49 +----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+--------------------------+
50 |  1 | SIMPLE      | sbtest1 | NULL       | ref  | k_1           | k_1  | 5       | const |    1 |   100.00 | Using where; Using index |
51 +----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+--------------------------+
52 1 row in set, 1 warning (0.00 sec)
53
54 //In the first query,the newly added row is retrieved(检索) by primary key.
55 //In the second query,the newly added row is retrieved by secondary key ”k_1“
56 //It has been proved that indexes can be used on the columns which contain null value.

通过explain可以看到MySQL确实支持在含有NULL值的列上使用索引。
需要注意的是,列“k”是int类型,占用4字节,但执行计划中key_len却显示为5。这是因为存储NULL值需要额外的1个字节来标记该行为NULL

下面是一个补充的测试例子:

mysql> select * from test_1;
+-----------+------+------+
| name      | code | id   |
+-----------+------+------+
| gaoyi     | wo   |    1 |
| gaoyi     | w    |    2 |
| chuzhong  | wo   |    3 |
| chuzhong  | w    |    4 |
| xiaoxue   | dd   |    5 |
| xiaoxue   | dfdf |    6 |
| sujianhui | su   |   99 |
| sujianhui | NULL |   99 |
+-----------+------+------+
8 rows in set (0.00 sec)

mysql> explain select * from test_1 where code is NULL;
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | test_1 | NULL       | ref  | index_code    | index_code | 161     | const |    1 |   100.00 | Using index condition |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test_1 where code is not NULL;
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys | key        | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | test_1 | NULL       | range | index_code    | index_code | 161     | NULL |    7 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test_1 where code=‘dd‘;
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | test_1 | NULL       | ref  | index_code    | index_code | 161     | const |    1 |   100.00 | Using index condition |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test_1 where code like ”dd%“;
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys | key        | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | test_1 | NULL       | range | index_code    | index_code | 161     | NULL |    1 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

总结

在列中使用NULL值容易引发不可控的问题,有时还会严重拖慢系统性能。

例如:

  • 影响聚合函数结果:对含有NULL值的列进行统计计算(如count()max()min()),结果可能不符合预期。
  • 干扰数据操作:影响distinctgroup byorder by的行为,导致错误的排序或分组结果。
  • 增加代码复杂度:为了处理NULL带来的不确定性,往往需要在SQL中使用IFNULL()等函数来确保结果可控,这增加了程序的复杂性。
  • 额外存储开销NULL值并非占用原有字段的空间存储,而是需要额外申请一个字节来标记该字段为NULL(就像一个额外的标志位)。

鉴于以上种种弊端,我们不推荐在列中设置NULL作为默认值。最佳实践是在所有列上明确使用NOT NULL约束,并使用0、空字符串‘’或其他的业务默认值来替代NULL,从而确保数据的确定性和查询的高效性。

如果你对数据库设计规范、SQL优化等话题感兴趣,欢迎到 云栈社区 与更多开发者一起交流探讨。




上一篇:iPhone 18存储成本上涨分析:大容量版本为何面临涨价压力?
下一篇:vivo大数据平台:基于湖仓一体与StarRocks实现分钟级数据时效性实战
您需要登录后才可以回帖 登录 | 立即注册

手机版|小黑屋|网站地图|云栈社区 ( 苏ICP备2022046150号-2 )

GMT+8, 2026-1-25 18:07 , Processed in 0.293049 second(s), 42 queries , Gzip On.

Powered by Discuz! X3.5

© 2025-2026 云栈社区.

快速回复 返回顶部 返回列表