Database 数据库mysql程序员

关于MySQL前置索引

2015-07-21  本文已影响1025人  AQ王浩

一、前置索引的好处

使用前置索引的好处

Normally, the entire value of a column is used to build the index - this is fine for short data types (integers and the like) but can result in a lot of data in the index for longer data types (CHAR and VARCHAR, for example). Using an index prefix allows you to make a trade off between the space required for the index and the cardinality of the index.

二、前置索引的坑

第一种情况

| areas | CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select id, name from areas order by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | areas | index | NULL          | idx_of_name | 152     | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra                 |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
|  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 152     | const |    1 | Using index condition |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)

mysql>  explain select id, name from areas group by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | areas | index | idx_of_name   | idx_of_name | 152     | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.01 sec)

第二种情况:

| areas | CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`(6))
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select id, name from areas order by name;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.01 sec)

mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
|  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 20      | const |    1 | Using where |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
1 row in set (0.01 sec)

mysql> explain select id, name from areas group by name;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
1 row in set (0.01 sec)

虽然在查询上用到了索引,但是在排序时,索引不生效。

使用索引就一定能加快查询效率么?不一定。上面的例子就告诉我们,前置索引
并不是一个万能药,它的确可以帮助我们在一个过长的字段中建立索引。但同时也会导致排序(order by, group by)查询都无法使用前置索引。

三、如何计算出前置索引的最佳长度

关于如何计算出,最恰当的设置索引的长度,总体思想:

# 全列选择性
SELECT COUNT(DISTINCT column_name) / COUNT(*) FROM table_name;

# 测试某一长度前缀的选择性
SELECT COUNT(DISTINCT LEFT(column_name, prefix_length)) / COUNT(*) FROM table_name;

当前置的选择性越接近全列的选择性的时候,索引效果越好。

通常可以索引开始的几个字符,而不是全部值,以节约空间并取得好的性能。这使索引需要的空间变小,但是也会降低选择性。索引选择性不重复的索引值和表中所有行的比值。高选择性的索引有好处,因为它使MySQL在查找匹配的时候可以过滤更多的行。唯一索引的选择率为1,为最佳值

具体操作如下:

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(distinct(name))/count(id) from areas;
+---------------------------------+
| count(distinct(name))/count(id) |
+---------------------------------+
|                          0.8954 |
+---------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 2))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 2))/count(id) |
+-----------------------------------------+
|                                  0.8648 |
+-----------------------------------------+
1 row in set (0.00 sec)

mysql> select count(distinct left(name, 3))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 3))/count(id) |
+-----------------------------------------+
|                                  0.8909 |
+-----------------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 5))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 5))/count(id) |
+-----------------------------------------+
|                                  0.8941 |
+-----------------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 6))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 6))/count(id) |
+-----------------------------------------+
|                                  0.8954 |
+-----------------------------------------+
1 row in set (0.00 sec)

前置索引为6时,选择性和列值选择性相同。那么就设定该索引的长度为6位。

mysql> select max(length(name)) from areas;
+-------------------+
| max(length(name)) |
+-------------------+
|                45 |
+-------------------+
1 row in set (0.11 sec)

mysql> select * from areas where length(name) = 45;
+------+--------+-----------------------------------------------+--------+
| id   | areaid | name                                          | cityid |
+------+--------+-----------------------------------------------+--------+
| 2624 | 530925 | 双江拉祜族佤族布朗族傣族自治县                | 530900 |
| 2965 | 622927 | 积石山保安族东乡族撒拉族自治县                | 622900 |
+------+--------+-----------------------------------------------+--------+
2 rows in set (0.01 sec)


四、索引的最大长度

mysql> select 255 * 3 from dual;
+---------+
| 255 * 3 |
+---------+
|     765 |
+---------+
1 row in set (0.00 sec)

mysql> show create table areas;

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |

mysql> alter table areas change `name` `name` varchar(256) not null;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
mysql> alter table areas drop index `idx_of_name`;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table areas change `name` `name` varchar(256) not null;
Query OK, 3144 rows affected (0.06 sec)
Records: 3144  Duplicates: 0  Warnings: 0

mysql> show create table areas;

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(256) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8

mysql> alter table areas add index `idx_of_name` (name);
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes


添加索引时,如果不指定索引的长度,MySQL默认会选择该字段的长度作为索引长度。其实这里

alter table areas add index `idx_of_name` (name)
等同于
alter table areas add index `idx_of_name` (name(256))

MySQL 索引最大 bytes 为 767, 255 * 3 < 767 但是 256 * 3 > 767 所以,varchar(255)时候,创建默认索引可以成功,但是varchar(256)时,创建默认索引就不成功。

并且由上述列子可知,索引和字段相互影响,当索引设置为 idx_of_name (name(255))时,要改变字段的长度,也会提示 Specified key was too long 错误。

总结

根据具体的业务需求来选择索引,并不是索引都会加快查询速度。

参考

错误使用MySQL前缀索引导致的慢查询
MySQL 前缀索引
mysql前缀索引
mysql省市区邮政编码和区号级联sql文件

上一篇下一篇

猜你喜欢

热点阅读