索引及执行计划管理
1. 索引的作用
类似于一本书的目录,起到优化查询的功能。
2. 索引类型(笔试)
BTREE(树)索引.
RTREE(树)索引
HASH索引
全文索引
3. BTREE的细分类(算法)
B-TREE
B+TREE
B*TREE
新新
4. Btree索引的功能分类
辅助索引(二级索引)
聚集索引(集群索引) Cluster Index
覆盖索引
前缀索引
5. B树是如何构建的
辅助索引 alter table t1 add index idx_name(name);
- 将name列的所有值取出,进行自动排序
- 将拍完序的值均匀的落到16KB叶子节点页中,并将索引键值所对应的数据行的聚集索引列值
- 向上生成枝节点和根节点
聚集索引(IOT)
- 默认安装主键生成聚集索引,没有主键,存储引擎会使用唯一键,如果都没有,会自动生成一个隐式的聚集索引。
- 数据在存储时,就会按照聚集索引的顺序存储到磁盘的数据页
- 由于本身数据时有序的,所以在聚集索引构建时,不需要进行排序。
- 聚集索引直接将磁盘的数据页作为叶子节点。
- 枝节点和根节点只会调取下层节点主键的最小值
辅助索引和聚集索引的区别:
(1).辅助索引
- 叶子节点只保存主键值+索引键值的有序存储
- 对索引键值会自动排序
- 需要手工创建
- 辅助索引可以有多个
- 任何列都可以创建
(2).聚集索引
- 只能在主键列生成,唯一且非空
- 数据存储时,就是按照聚集索引顺序进行有序存储
叶子节点不需要单独生成
新新
6.辅助索引细分
(1) 单列辅助索引
select * from t1 where name=''
(2) 联合索引
select * from t1 where a and b and c
(3) 唯一索引
(4) 前缀索引
7. 索引树的高度(越低越好)
(1) 表的数据量级大
分区表(比较旧的方案)
分库分表(分布式架构)
(2) 索引键值的长度
① 尽可能选择列值短的的创建索引
② 采用前缀索引
(3) 数据类型的选择
varchar和char的选择
enum
(4) 聚集索引列的长度
8. 索引的管理
① 压力测试准备
create database test charset utf8mb4;
use test;
create table t100w (id int,num int,k1 char(2),k2 char(4),dt timestamp);
delimiter //
create procedure rand_data(in num int)
begin
declare str char(62) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
declare str2 char(2);
declare str4 char(4);
declare i int default 0;
while i<num do
set str2=concat(substring(str,1+floor(rand()*61),1),substring(str,1+floor(rand()*61),1));
set str4=concat(substring(str,1+floor(rand()*61),2),substring(str,1+floor(rand()*61),2));
set i=i+1;
insert into t100w values (i,floor(rand()*num),str2,str4,now());
end while;
end;
//
delimiter ;
>>插入100w条数据:
call rand_data(1000000);
commit;
oldguo[test]>select count(*) from test.t100w; (查看完成率)
--- 压力测试命令
mysqlslap --defaults-file=/etc/my.cnf --concurrency=100 --iterations=1 --create-schema='test' --query="select * from test.t100w where k2='VWtu'" engine=innodb --number-of-queries=2000 -uroot -pwangxin -verbose
<查看进程列表>
show processlist;
② 索引命令操作
--- 查询索引
oldguo[school]>use school
oldguo[school]>desc student;
Key:PRI(主键),UNI(唯一索引),MUL(辅助索引)
show index from student\G
--- 创建索引
<创建单个索引>
alter table student add index idx_name(sname);
<创建联合索引>
alter table student add index idx_sname_sage_ssex(sname,sage,ssex);
idx_a_b_c ------> idx_a + idx_a_b + idx_a_b_c
<创建前缀索引(防止索引过长,取前五个字符创建索引)>
alter table student add index idx(sname(5));
<创建唯一索引>
alter table student add telnum char(11) not null;
alter table student add unique index idx_tell(telnum);
<删除索引>
alter table student drop index idx;
alter table student drop index idx_name;
alter table student add unique index idx_tell(telnum);
--- 查询索引
show index from student;
9. explain(desc)索引查询
explain select * from test.t100w where k2='VWtu';
>>或者
desc select * from test.t100w where k2='VWtu';
作用:抓取优化器优化过的执行计划
(1). 执行计划的分析
mysql[test]>explain select * from test.t100w where k2='VWtu';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t100w | NULL | ref | idx_k2 | idx_k2 | 17 | const | 1116 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
table : 以上SQL语句涉及到的表 ***
type : 查询的类型(全表扫描,索引扫描,查不到数据) *****
possible_keys: 可能会用到的索引 ***
key : 使用到的索引 ****
key_len : 索引的覆盖长度 *****
extra : 额外的信息 ****
--- 注:五星重点关注,四星熟悉,三星要清楚
新新
(2). type 详细说明
ALL :全表扫描,不会走任何索引
- ① 查询条件,没建索引
mysql[test]>explain select * from test.t100w where k2='VWtu';
mysql[test]>desc t100w;
+-------+-----------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------+------+-----+-------------------+-----------------------------+
| id | int(11) | YES | | NULL | |
| num | int(11) | YES | | NULL | |
| k1 | char(2) | YES | | NULL | |
| k2 | char(4) | YES | MUL | NULL | |
| dt | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------+-----------+------+-----+-------------------+-----------------------------+
5 rows in set (0.01 sec)
- ② 有索引不走
mysql[test]>desc select * from t100w where k2 !='asdf';
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t100w | NULL | ALL | idx_k2 | NULL | NULL | NULL | 997537 | 88.45 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
>>避免出现的语句(有索引不走):
desc select * from t100w where k2 like 'aa%';
desc select * from t100w where k2 not in ('asda','asas')
desc select * from t100w;
<注意:!=和not in如果是主键,则走range>
index:全索引扫描
mysql[test]>desc select k2 from t100w;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t100w | NULL | index | NULL | idx_k2 | 17 | NULL | 997537 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
从range开始,索引才是有价值的
range:索引范围查询
>>所有索引:
>>辅助索引
> , < , >= , <= , like , between and
--- 示例:
mysql[world]>desc select * from city where id<10;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | city | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
--- 示例:
mysql[world]>desc select * from city where countrycode like 'CH%';
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | city | NULL | range | CountryCode | CountryCode | 3 | NULL | 397 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.34 sec)
in() , or
--- 示例:
mysql[world]>desc select * from city where countrycode in ('CHN','USA');
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | city | NULL | range | CountryCode | CountryCode | 3 | NULL | 637 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
<说明:
B+tree 索引能额外优化到: > , < , >= , <= , like , between and
in 和 or 享受不到b+tree额外的优化效果的,所以我一般情况会将in , or 进行改性
>
>>聚集索引:
!= , not in
--- 示例:
mysql[world]>desc select * from world.city where id != 10;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | city | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 2103 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.34 sec)
--- 示例:
mysql[world]>desc select * from world.city where id not in (10,20);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | city | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 2112 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
ref:辅助索引等值查询
desc select * from city where countrycode = 'CHN';
eq_ref:多表连接查询中,非驱动表on的列条件列是主键或者唯一键
--- 示例:
mysql[world]>mysql[world]>desc select a.name,b.name from city as a join country as b on a.countrycode=b.code where a.population<100;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------------+------+----------+-------------+
| 1 | SIMPLE | a | NULL | ALL | CountryCode | NULL | NULL | NULL | 4188 | 33.33 | Using where |
| 1 | SIMPLE | b | NULL | eq_ref | PRIMARY | PRIMARY | 3 | world.a.CountryCode | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
const(system):主键或唯一键的等值查询
mysql[world]>desc select * from city where id=10;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | city | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
NULL:获取不到数据(效果最好)
mysql[world]>desc select * from city where id=1000000000000000;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
1 row in set, 1 warning (0.00 sec)
(3). possible_keys:可能会用到的索引
NULL:没有和查询条件匹配的索引条目
有值: 有查询条件匹配的索引条目,但是没走,大部分原因是语句查询方式不合符索引应用条件
(4). key:使用的索引
最终使用的索引,可以帮助我们判断是否走了合适的索引
(5). key_len:索引的覆盖长度
在联合索引应用的判断时,会经常看
对于单列索引:
字符类型 | 长度 | not null | 没有指定not null |
---|---|---|---|
int | 4个字节 | 4 | 4+1 |
tinyint | 1个字节 | 1 | 2 |
utf8mb4 | 一个字符最大是4个字节 | <== | <== |
char(2) | ^ | 2*4 | 2*4+1 |
varchar(2) | ^ | 2*4+2 | 2*4+2+1 |
说明:
- 有非空约束时,key_length就是最大字节长度
- 在没有非空约束时:字符最大长度+1
- varchar类型,需要额外在最大字符长度+2(存储字符长度的长度占位)
(1)最完美的查询情况
结论:当查询条件中,包含了索引列中的所有条件,并且都是等值的查询,name无关排列顺序,都可以走全联合索引优化。
(原因:优化器会自动调整顺序,来达到最佳的优化效果)
所以我们重点关注的是联合索引的建立顺序,从左到右,唯一值越多的列放在最左边。
(2)查询条件中,哪些因素会key_len长度
--- 按照索引的建立顺序,在查询条件中,少了任意一个中间列,后续的都无法走索引。
--- 在条件查询中间,出现不等值查询时,从不等值列开始后续列都无法使用联合索引。
优化方法:将不等值列放在最后。
(3)如果有多字句的条件查询(必须是联合索引)
按照子句的执行顺序,建立索引。
(6). Extra:额外的信息
Using filesort 额外的文件排序 在group by,order by,distinct等。
一般优化的方法,和where条件的列建立联合索引。
小结补充
如何判断某一列中是否有重复值
select name,count(id) from city group by name having count(id)>1 order by count(id) desc;
如果统计除了重复的行数
select count(distinct name) from city;