MySQL索引优化

2021-03-16 本文已影响0人传达室马大爷

trace工具

MySQL执行的跟踪工具，通过trace可以查询sql语句具体使用中使用了哪些索引以及索引的性能比较等，方便定位慢查询和优化sql

trace使用

-- 开启trace，json格式输出
set session optimizer_trace="enabled=on",end_markers_in_json=on;
-- 执行慢查询语句
SELECT * from employees where name > 'a';
-- 查询trace结果
SELECT * FROM information_schema.OPTIMIZER_TRACE;
-- 关闭trace
set session optimizer_trace=“enabled=off”;

结果分析

{
    "steps": [
        /* 第一阶段：SQL准备阶段 */
        {
            "join_preparation": {
                "select#": 1,
                "steps": [{
                    "expanded_query": "/* select#1 */ select `employees`.`id` AS `id`,`employees`.`name` AS `name`,`employees`.`age` AS `age`,`employees`.`position` AS `position`,`employees`.`hire_time` AS `hire_time` from `employees` where (`employees`.`name` > 'a')"
                }] /* steps */
            } /* join_preparation */
        },
        /* 第二阶段：SQL优化阶段 */
        {
            "join_optimization": {
                "select#": 1,
                "steps": [{
                        /* 条件处理 */
                        "condition_processing": {
                            "condition": "WHERE",
                            "original_condition": "(`employees`.`name` > 'a')",
                            "steps": [{
                                    "transformation": "equality_propagation",
                                    "resulting_condition": "(`employees`.`name` > 'a')"
                                },
                                {
                                    "transformation": "constant_propagation",
                                    "resulting_condition": "(`employees`.`name` > 'a')"
                                },
                                {
                                    "transformation": "trivial_condition_removal",
                                    "resulting_condition": "(`employees`.`name` > 'a')"
                                }
                            ] /* steps */
                        } /* condition_processing */
                    },
                    {
                        "substitute_generated_columns": {} /* substitute_generated_columns */
                    },
                    /* 表依赖详情 */
                    {
                        "table_dependencies": [{
                            "table": "`employees`",
                            "row_may_be_null": false,
                            "map_bit": 0,
                            "depends_on_map_bits": [] /* depends_on_map_bits */
                        }] /* table_dependencies */
                    },
                    {
                        "ref_optimizer_key_uses": [] /* ref_optimizer_key_uses */
                    },
                    /* 预估表的访问成本 */
                    {
                        "rows_estimation": [{
                            "table": "`employees`",
                            "range_analysis": {
                                /* 全表扫描 */
                                "table_scan": {
                                    /* 扫描行数 */
                                    "rows": 3, 
                                    /* 查询成本 */
                                    "cost": 3.7 
                                } /* table_scan */ ,
                                /* 查询可能使用的索引 */
                                "potential_range_indexes": [
                                    /* 主键索引 */
                                    {
                                        "index": "PRIMARY",
                                        "usable": false,
                                        "cause": "not_applicable"
                                    },
                                    /* 辅助索引 */
                                    {
                                        "index": "idx_name_age_position",
                                        "usable": true,
                                        "key_parts": [
                                            "name",
                                            "age",
                                            "position",
                                            "id"
                                        ] /* key_parts */
                                    }
                                ] /* potential_range_indexes */ ,
                                "setup_range_conditions": [] /* setup_range_conditions */ ,
                                "group_index_range": {
                                    "chosen": false,
                                    "cause": "not_group_by_or_distinct"
                                } /* group_index_range */ ,
                                /* 分析各个索引使用成本 */
                                "analyzing_range_alternatives": {
                                    "range_scan_alternatives": [{
                                        "index": "idx_name_age_position",
                                        /* 索引使用范围 */
                                        "ranges": [
                                            "a < name"
                                        ] /* ranges */ ,
                                        "index_dives_for_eq_ranges": true,
                                        /* 使用该索引获取的记录是否按照主键排序 */
                                        "rowid_ordered": false,
                                        "using_mrr": false,
                                        /* 是否使用覆盖索引 */
                                        "index_only": false,
                                        /* 索引扫描行数 */
                                        "rows": 3,
                                        /* 索引使用成本 */
                                        "cost": 4.61,
                                        /* 可能选择该索引 */
                                        "chosen": false,
                                        "cause": "cost"
                                    }] /* range_scan_alternatives */ ,
                                    "analyzing_roworder_intersect": {
                                        "usable": false,
                                        "cause": "too_few_roworder_scans"
                                    } /* analyzing_roworder_intersect */
                                } /* analyzing_range_alternatives */
                            } /* range_analysis */
                        }] /* rows_estimation */
                    },
                    {
                        "considered_execution_plans": [{
                            "plan_prefix": [] /* plan_prefix */ ,
                            "table": "`employees`",
                            /* 最优访问路径 */
                            "best_access_path": {
                                /* 决定的访问路径 */
                                "considered_access_paths": [{
                                    "rows_to_scan": 3,
                                    /* 访问类型 scan：全表扫描 */
                                    "access_type": "scan",
                                    "resulting_rows": 3,
                                    "cost": 1.6,
                                    /* 确定选择 */
                                    "chosen": true
                                }] /* considered_access_paths */
                            } /* best_access_path */ ,
                            "condition_filtering_pct": 100,
                            "rows_for_plan": 3,
                            "cost_for_plan": 1.6,
                            "chosen": true
                        }] /* considered_execution_plans */
                    },
                    {
                        "attaching_conditions_to_tables": {
                            "original_condition": "(`employees`.`name` > 'a')",
                            "attached_conditions_computation": [] /* attached_conditions_computation */ ,
                            "attached_conditions_summary": [{
                                "table": "`employees`",
                                "attached": "(`employees`.`name` > 'a')"
                            }] /* attached_conditions_summary */
                        } /* attaching_conditions_to_tables */
                    },
                    {
                        "refine_plan": [{
                            "table": "`employees`"
                        }] /* refine_plan */
                    }
                ] /* steps */
            } /* join_optimization */
        },
        /* 第三阶段：SQL执行阶段 */
        {
            "join_execution": {
                "select#": 1,
                "steps": [] /* steps */
            } /* join_execution */
        }
    ] /* steps */
}

结论：全表扫描的成本低于索引扫描，所以mysql最终选择全表扫描

MySQL排序

Using index

是MySQL扫描索引本身就可以完成排序

Order by使用Using index索引排序需要满足以下两个条件：

排序字段使用索引最左前缀规则
where子句和order by子句字段条件组合满足索引最左前缀规则

Using filesort

Using filesort是文件排序，并不是在磁盘中进行排序，而是告诉我们是一种排序算法，文件排序通过相应排序算法将数据读取到内存中，MySQL在内存中对数据进行排序。通过sort_buffer_size字段排序区大小，排序区线程独占，同一时间MySQL中有多个sort_buffer_size排序区同时存在

单路排序

一次性取出满足条件的所有行记录的所有字段，然后在sort_buffer中进行排序，用trace工具可
以看到sort_mode信息里显示< sort_key, additional_fields >或者< sort_key, packed_additional_fields >
双路排序（回表排序）

首先根据查询条件取出相应的排序字段和能直接定位行记录的行id（rowId），然后在sort_buffer中进行排序，排序完成后在根据行id取出其他需要的字段，用trace工具可以看到sort_mode信息里显示< sort_key, rowid >

MySQL 通过比较系统变量 max_length_for_sort_data(默认1024字节) 的大小和需要查询的字段总大小来判断使用哪种排序模式。

如果 max_length_for_sort_data 比查询字段的总长度大，那么使用单路排序模式
如果 max_length_for_sort_data 比查询字段的总长度小，那么使用双路排序模式

单路排序执行流程

1、根据索引查找出满足条件的第一行记录的主键id
2、根据主键id取出整行数据，取出需要查询的字段列表，放入sort buffer中
3、根据索引查找出满足条件的下一行记录的主键id
4、重复2、3步操作，直到不满足筛选条件
5、对sort_buffer中的数据按排序字段进行排序
6、返回客户端

双路排序执行流程

1、根据索引查找出满足条件的第一行记录的主键id
2、根据主键id取出整行数据，取出排序列和主键id值，放入sort buffer中
3、根据索引查找出满足条件的下一行记录的主键id
4、重复2、3步操作，直到不满足筛选条件
5、对sort_buffer中的数据按排序字段进行排序
6、对sort_buffer中排序好的数据，根据主键id回表查询需要查询的其他字段取出后返回客户端

表关联查询

驱动表：表关联查询数据量少的表为驱动表

被驱动表：表关联查询数据量大的表为被驱动表

MySQL在关联字段有索引时使用NLJ算法，没有索引时使用BNL算法，有索引的情况下NLJ算法比BNL算法效率更高

嵌套循环连接 Nested-Loop Join(NLJ) 算法

一行一行的从循环从驱动表中读取行数据，在这行数据中取关联字段，根据关联字段去被驱动表中查询满足条件的行，然后取出两张表的合集，小表驱动大表

-- t1为大表，t2为小表
-- t1表上a字段创建idx_a索引
EXPLAIN select*from t1 inner join t2 on t1.a= t2.a;

image-20210315233330104

    由执行计划可知，t2为驱动表，t1为被驱动表，优化器一般会选择小表作为驱动表。索引 inner join时排在前面的表不一样是驱动表

    如果Extra中未出现Using join buffer，则表示join算法为NLJ算法

sql执行流程

读取t2表的第一行数据
从第一步中取出关联字段a，到t1表中查找
取出t1表满足条件的行，和t2表合并
重复1-3步骤，直到t2表取完
合并后返回客户端

基于块的嵌套循环连接 Block Nested-Loop Join(BNL)算法

把驱动表的所有数据一次性读取到join_buffer中，然后在扫描被驱动表，把被驱动表每一行数据取出来和join_buffer中的数据比较

-- t1为大表，t2为小表
-- t1的b字段没有创建索引
EXPLAIN select * from t1 inner join t2 on t1.b= t2.b;

image-20210315234226821

    Extra 中 的Using join buffer (Block Nested Loop)说明该关联查询使用的是 BNL 算法。

sql执行流程

将t2中所有数据存入join buffer中
取出t1表的第一行数据，与join buffer中的数据做对比
重复1-2操作，知道t1表全部取完
返回客户端

in和exists优化

原则：小表驱大表，即小的数据集驱动大的数据集

in

当IN中子查询（t2表）的数据集小于t1表时，in优于exists

SELECT * FROM t1 where id IN (SELECT userId FROM t2)
-- 等价于
for (userId in t2) {
    SELECT * FROM t1 where t1.id = t2.userId
}

exists

当EXISTS中子查询（t2表）的数据集大于t1表时，exists优于in

-- 将主表（t1）的数据放到子查询（t2表）中进行条件校验，根据校验结果（true或false）来决定主表数据是否保留
SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t1.id = t2.userId)

-- 等价于
for (t1.id in t1) {
    SELECT * FROM t2 where t2.userId = t1.id
}