MySQL子查询性能

2020-08-25  本文已影响0人  欧阳_z

1、语法
子查询有INEXISTS两种,哪种速度更快呢?主要取决于两张表的大小关系。
SELECT * FROM A WHERE cc IN (SELECT cc FROM B);
SELECT * FROM A WHERE EXISTS (SELECT cc FROM B WHERE B.cc=A.cc);

IN会先查询B,然后把查询B的结果,做为输入,再查询A,当 A 记录多时,可以很好的利用 A 的索引,时间复杂度是O(b*log a)
EXISTS会先查询A,然后把查询A的结果,做为输入,再查询B,当 B 记录多时,可以很好的利用 B 的索引,时间复杂度是O(a*log b)

2、下面来做个实验

(1)先执行 source first.sql; 来创建 2 张数据表

DROP TABLE IF EXISTS `t_big`;

CREATE TABLE `t_big` ( 
`id` int(11) NOT NULL AUTO_INCREMENT, 
`name` varchar(30) NOT NULL, 
`phone` varchar(30) NOT NULL, 
PRIMARY KEY (`id`),
KEY idx_name(`name`),
KEY idx_phone(`phone`)
) ENGINE = InnoDB;

DROP TABLE IF EXISTS `t_small`;

CREATE TABLE `t_small` ( 
`id` int(11) NOT NULL AUTO_INCREMENT, 
`name` varchar(30) NOT NULL, 
`nickname` varchar(30) NOT NULL, 
PRIMARY KEY (`id`),
KEY idx_name(`name`),
KEY idx_nickname(`nickname`)
) ENGINE = InnoDB;

(2)执行如下 shell 脚本bash rand.sh来生成随机数据的 sql 插入语句的 second.sql 文件,再执行 source second.sql; 来插入数据:

#!/bin/bash
small=50
big=500000
rm -f second.sql

for((i=0;i<${big};i++));  
do
printf "INSERT INTO \`t_big\` (\`name\`, \`phone\`) VALUES ('zhang%d', '%08d');\n" $[$RANDOM%1000000] $[$RANDOM%100000000] >> second.sql
done

for((i=0;i<${small};i++));  
do
printf "INSERT INTO \`t_small\` (\`name\`, \`nickname\`) VALUES ('zhang%d', '%s');\n" $[$RANDOM%1000000] $(tr -dc "a-z"</dev/urandom|head -c 8) >> second.sql
done
echo "create second.sql success!"

(3)此时大表 t_big 的记录数量多,小表 t_small 的记录少,可以看到使用 EXISTSIN 慢很多。

mysql> set profiling = 1;
SELECT * FROM t_big WHERE `name` IN (SELECT `name` FROM t_small);
mysql> show profiles;
+----------+------------+------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                            |
+----------+------------+------------------------------------------------------------------+
|        1 | 0.01406125 | SELECT * FROM t_big WHERE `name` IN (SELECT `name` FROM t_small) |
+----------+------------+------------------------------------------------------------------+
+----------+------------+---------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                       |
+----------+------------+---------------------------------------------------------------------------------------------+
|        1 | 1.65336600 | SELECT * FROM t_big WHERE EXISTS (SELECT `name` FROM t_small WHERE t_big.name=t_small.name) |
+----------+------------+---------------------------------------------------------------------------------------------+

(4)再做一个相反的实验让两个表的位置互换一下,可以看到 EXISTSIN 快。

+----------+------------+------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                            |
+----------+------------+------------------------------------------------------------------+
|        1 | 0.00277950 | SELECT * FROM t_small WHERE `name` IN (SELECT `name` FROM t_big) |
+----------+------------+------------------------------------------------------------------+
+----------+------------+---------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                       |
+----------+------------+---------------------------------------------------------------------------------------------+
|        1 | 0.00050700 | SELECT * FROM t_small WHERE EXISTS (SELECT `name` FROM t_big WHERE t_small.name=t_big.name) |
+----------+------------+---------------------------------------------------------------------------------------------+

3、结论
当 A > B,用 IN:
SELECT * FROM A WHERE cc IN (SELECT cc FROM B);

当 A < B,用 EXISTS:
SELECT * FROM A WHERE EXISTS (SELECT cc FROM B WHERE B.cc=A.cc);

上一篇 下一篇

猜你喜欢

热点阅读