SQL查询技巧分析

2022-09-26 本文已影响0人右耳菌

1. Like 会不会走索引？

在‘%’在后边的情况下会走索引，即前面非通配符的情况下是会走索引的，此时的本质是一个范围查询，即使用了前面的内容进行了字符大小对比。
在‘%’在前边的情况下不会走索引。此时无法使用字符进行大小对比。

-- 走索引
explain select * from myshop.ecs_users where email like 'onlyoneemail.com'; -- 本质是一个范围查询
-- 走索引
explain select * from myshop.ecs_users where email like 'onlyoneemail.com%'; -- 本质是一个范围查询
-- 不走索引
explain select * from myshop.ecs_users where email like '%onlyoneemail.com';
-- 不走索引
explain select * from myshop.ecs_users where email like '%onlyoneemail.com%';

2. 索引列能不能为空？

IS NULL 会走索引，IS NOT NULL 不会走索引，但是索引列是可以为空的，且使用is null 查询的时候，是走索引的。

-- 走索引
explain select * from myshop.ecs_users where email is null;
-- 不走索引
explain select * from myshop.ecs_users where email is not null;

3. 函数计算会不会走索引？

此情况以执行计划为准，与sql查询语句的查询情况为准。

-- 走索引，count(*) 实际是count了index的内容
explain select count(*) from myshop.ecs_order_info where from_unixtime(add_time, '%Y-%m-%d') = '2022-09-27';
-- 不走索引，此时index无法满足要求
explain select * from myshop.ecs_order_info where from_unixtime(add_time, '%Y-%m-%d') = '2022-09-27';
-- 走索引
explain select count(*) from myshop.ecs_order_info where abs(add_time) > 159025358;

4. 类型不一致会不会走索引？

-- 走索引，这种情况会走
explain select * from myshop.ecs_order_info where add_time >= 1591025358;

使用下面的内容来分析

select * from myshop.ecs_order_info where add_time >= 1591025358;

set optimizer_trace = "enabled=on";
select trace from information_schema.optimizer_trace;
set optimizer_trace = "enabled=off";

5. where 条件顺序怎么写？

多列索引，顺序反了会不会走索引？

-- 案例5–多列索引，顺序反了会不会走索引？索引idx_ship_pay(pay_id , shipping_id，pay_time)
-- 记住最左前缀的概念
-- 走索引
explain SELECT* FROM myshop.ecs_order_info where pay_time = '1591025358' and shipping_id = 4 and pay_id = 2 ;
-- 不走索引
explain SELECT * FROM myshop.ecs_order_info where shipping_id = 2 and pay_time >= '1591025358';
-- 不走索引
explain SELECT * FROM myshop.ecs_order_info where shipping_id = 2;
-- 走索引
explain SELECT *FROM myshop.ecs_order_info where pay_id = 2 and pay_time >= '1591025358';

-- 不用组合索引，查询多个单列索引会不会只有一个索引生效？–索引合并
-- 走索引，此时会索引合并，即两个索引各走一边，然后合并
explain select * from myshop.ecs_users where email = 'onlyoneemail.com' or user_name = 'edu_159100060138810';

6. 要不要用UNION替代OR

如果不嫌麻烦可以考虑替换，如果比较麻烦也可以不替换。

-- query_cost: 约等于 5.02
select*from myshop.ecs_users where user_name = 'edu_159100060138810' or email = 'onlyoneemail.com';

-- query_cost: 约等于 2.4
select * from myshop.ecs_users where email = 'onlyoneemail.com'
union
select * from myshop.ecs_users where user_name = 'edu_159100060138810';

7. EXISTS vs IN

-- 案例7 - EXISTS VS IN - 是否需要用EXISTS替代IN、用NOT EXISTS替代NOT IN？

select * from myshop.ecs_users where user_id 
in ( SELECT user_id FROM myshop.ecs_order_info where add_time >= 1590076800 ) limit 1;

select * from myshop.ecs_users u where 
EXISTS (SELECT user_id FROM myshop.ecs_order_info o where add_time >= 1590076800 and u.user_id = o.user_id)  limit 1;

-- 常见情况下，子查询结果少，用in ，子查询结果多，用exists
-- 根据实际运行情况进行分析

-- 快
select * from myshop.ecs_users where user_id in ( 1,2,4);
-- 慢
select * from myshop.ecs_users u where EXISTS 
( select * from (
select 1 user_id union select 2 union select 4
) u1 where u.user_id = u1.user_id );

8. 非等于会不会走索引？

-- 案例8 - != 和 <> 会不会走索引？
-- 走索引，因为是主键
explain select * from myshop.ecs_users where user_id != 999;
-- 不走索引，基于表数据的分布情况，极端情况下，如果只有一条是等于的，那么基本等于全表扫描。
explain select * from myshop.ecs_users where user_name != 'edu_159100060138810';
-- 走索引，count基本上都会走索引
explain select count(*) from myshop.ecs_users where user_name != 'edu_159100060138810';

这里可以使用show index from tableName 来查看索引的情况，如下：

其中Cardinality越大，表示数据越分散。

9. 索引覆盖

-- 还需要再查询一次具体的数据

SELECT * FROM myshop.ecs_users WHERE last_login_time >= 1591025358 limit 10000, 1000;

-- 这里可以直接从索引中拿到数据，就不用再去拿全部的数据了
-- 所以查询的列如果在索引中的话，建议使用索引覆盖的方式

SELECT user_id FROM myshop.ecs_users WHERE last_login_time >= 1591025358 limit 10000, 1000;

10. 用子查询还是表关联？

-- 多表关联、子查询------------------------------------------
-- 需求：假设今天6.1日，查询最近20天总消费金额高于 3W 的且近7天登录过用户信息
-- 注意：第一次查询会非常慢，你的buffer_pool调大一点

SELECT count(*) FROM myshop.ecs_order_info where add_time >= 1590076800; -- 100W
SELECT count(*) FROM myshop.ecs_users where last_login_time >= 1590076800; -- 33W

表关联的方式

SELECT 
    u.user_id, SUM(o.money_paid) pay
FROM
    myshop.ecs_order_info o, myshop.ecs_users u
WHERE u.last_login_time >= 1590076800
AND o.add_time >= 1590076800 
AND o.user_id = u.user_id
GROUP BY o.user_id
HAVING pay > 30000;

子查询的方式

SELECT u.user_id, o.pay FROM
myshop.ecs_users u, 
(
    select user_id,SUM(money_paid) pay from myshop.ecs_order_info where add_time >= 1590076800 
    GROUP BY user_id HAVING pay > 30000
)  o
where o.user_id = u.user_id
and u.last_login_time >= 1590076800;

总结：建议使用多表的情况进行查询。

11. 表关联之大表小表

-- 大表关联小表，还是小表关联大表？ -- 统计某个地区的订单

explain select r.region_name,count(o.order_id) from myshop.ecs_region r , myshop.ecs_order_info o
where r.region_id = o.province and o.add_time >= 1591025358  group by r.region_name;

explain select r.region_name,count(o.order_id) from myshop.ecs_order_info o , myshop.ecs_region r 
where r.region_id = o.province and o.add_time >= 1591025358  group by r.region_name;

explain select r.region_name,count(o.order_id) from myshop.ecs_region r left join myshop.ecs_order_info o
on r.region_id = o.province where o.add_time >= 1591025358  group by r.region_name;

explain select r.region_name,count(o.order_id) from myshop.ecs_order_info o left join myshop.ecs_region r
on r.region_id = o.province where o.add_time >= 1591025358  group by r.region_name;

经过检验，其实无论是放大表在前还是小表在前，是没有关系的，本质上来说，其实mysql的其实会自己分析进行优化，所以无需关心。

12. 分页的玩法

-- 分页案例 -----------------------------------------
-- 1. count(*) 解疑

select count(email) from myshop.ecs_users;  -- 不统计null
select count(1) from myshop.ecs_users; -- 不解析内容
select count(*) from myshop.ecs_users; -- 标准SQL

-- 2. 分页（查你需要的字段，不要*）

select * from myshop.ecs_order_info order by order_id limit 1,100;
-- 下面这个会变得很慢
select * from myshop.ecs_order_info order by order_id limit 4000000,100;

递增ID，连续不中断

select * from myshop.ecs_order_info o where o.order_id between 4000000 and 4000100;

递增ID，不连续

select * from myshop.ecs_order_info o where o.order_id 
>=
(select order_id from myshop.ecs_order_info order by order_id limit 4000000,1) limit 100;

无序读取

SELECT * FROM myshop.ecs_users u where u.last_login_time >= 1590076800 order by u.last_login_time,u.user_id limit 1, 10;

SELECT * FROM myshop.ecs_users u where u.last_login_time >= 1590076800 order by u.last_login_time,u.user_id limit 200000, 10;

SELECT * FROM myshop.ecs_users u , 
( 
    SELECT user_id FROM myshop.ecs_users u where u.last_login_time >= 1590076800 order by u.last_login_time,u.user_id limit 200000, 10 
) u1 where u1.user_id = u.user_id order by u.user_id;

如果觉得有收获，欢迎点赞和评论，更多知识，请点击关注查看我的主页信息哦~