MySQL 相关规范

2021-12-08 本文已影响0人齐舞647

流程规范

流程图.jpg

服务正式上线前，一定要check下线上数据库改动已经上线。不然会引起线上事故！
测试环境也要有，不然测试环境后面会越来越难建设。

设计规范

规范共分为【强制】、【推荐】、【参考】三个等级，优先级从高至低，如违反【强制】规范Reviewer应当打回。

命名规范

【强制】表的命名最好是“业务名称_表的作用” ，无需额外的数据库名开头。

对于bits数据库来说，正例：app_build_config，workflow_job；反例：bits_app_info，bits_xxx；

【强制】表名、字段名必须使用小写字母或****数字，禁止出现数字开头，禁止两个下划线中间只出现数字。正例:app_user，app_config，level3_name 反例:AppUser，rdcConfig，level_3_name

【强制】表名使用单数。说明:表名应该仅仅表示表里面的实体内容，不应该表示实体数量，符合表达习惯。

【强制】索引名字的前缀必须为 idx_，唯一约束/索引的前缀必须为 uk_

【强制】所有命名必须使用全名，有默认约定的除外，如果超过 30 个字符，使用缩写

【强制】每个字段和表必须提供清晰的注释，现在没有注释 rds 不让建表。

【推荐】库的命名最好是“系统名称_子系统名”，正例 bits_pipeline，bits_ttp。

字段规范

【强制】必加的三个字段

id(unsigned bigint)，单表时自增。防止枚举时用id generator生成
create_time(timestamp)，用于记录主动创建时间，默认值：current_timestamp
update_time(timestamp)，用于记录最后更新时间，默认值： current_timestamp

【强制】每个表不超过30个字段

【强制】如果存储的字符串长度固定，使用 char 定长字符串类型

【强制】文本数据尽量用varchar ，变长存储，更省存储空间，长度不要超过 2700，如果存储长度大于此值，定义字段类型为 text，独立出来一张表，用主键来对应，避免影响其它字段索引效率。

【强制】存储字段如果为非负数，默认加上UNSIGNED。

【强制】小数类型为 decimal，禁止使用 float 和 double。decimal不存在精度损失，数据类型decimal(p,s) 需要分别指定小数的最大位数（p）和小数位的数量（s）。

【强制】如果修改字段含义或对字段表示的状态追加时，需要及时更新字段注释。

【强制】业务中选择性很少的状态、****类型 等字段推荐使用 TINYINT。从1开始枚举，并且将具体含义写在字段备注里。禁止使用 bool 来存状态。

【强制】索引字段不要设置为null

【强制】优先选择utf8mb4字符集，它的兼容性最好，而且还支持emoji字符。

【推荐】json 的数据使用 json 类型存储而不是 text。MySQL 会帮我们做 json 格式校验。注意长度限制。（看业务场景）

【推荐】字段尽量设置为 NOT NULL，为字段提供默认值。如字符型的默认值为一个空字符串值；数值型默认值为数值 0；逻辑型的默认值为数值 0;

【推荐】考虑业务场景，选择软硬删除，必要的话，新增 delete_at，软删除。

ToDo：补 GORM 和 build Tools @刘思齐 @刘涛

注意点：

count(*) 会统计NULL的行， count(列名）不会统计此列为NULL值的行
count(distinct col) 计算该列除 NULL 之外的不重复行数，注意 count(distinct col1, col2) 如果其中一列全为 NULL，那么即使另一列有不同的值，也返回为 0。

【推荐】遇到BLOB、TEXT字段，则尽量拆出去，再用主键做关联。超过4k大值拆出，考虑换存储方案，主从延迟风险。

【推荐】反范式设计：字段允许适当冗余，以提高查询性能。如app表中的应用名称/应用id等唯一键，可以在别的业务表中均存储一份，避免使用 join。

~~【推荐】时间类型采用为 timestamp 数据类型，因为 datetime 占用 8 字节，timestamp 仅占用4字节，范围为 1970-01-01 00:00:01 到2038-01-01 00:00:00~~

【参考】合适的字符存储长度，不但节约数据库表空间、节约索引存储，更重要的是提升检索速度。

类型（同义词）	存储长度(BYTES)	最小值(SIGNED/UNSIGNED)	最大值(SIGNED/UNSIGNED)
整型数字
TINYINT	1	-128/0	127/255
SMALLINT	2	-32,768/0	32767/65,535
MEDIUMINT	3	-8,388,608/0	8388607/16,777,215/
INT(INTEGER)	4	-2,14,7483,648/0	2147483647/4,294,967,295/
BIGINT	8	-2^63/0	2^63-1/264-1
时间类型
DATETIME	8	1001-01-01 00:00:00	9999-12-31 23:59:59
DATE	3	1001-01-01	9999-12-31
TIME	3	00:00:00	23:59:59
YEAR	1	1001	9999
TIMESTAMP	4	1970-01-01 00:00:00	2038-01-01 00:00:00

索引规范

【强制】表写多读少建议索引数不超过5个。表读多写少索引数不超过10个。

【强制】业务上具有唯一特性的字段，即使是多个字段的组合，也必须建成唯一索引。

【强制】建议不使用 join。如果一定要使用，最多两个表 join。需要 join 的字段，数据类型必须绝对一致;多表关联查询时，保证被关联的字段需要有索引。

【强制】在 varchar 字段上建立索引时，必须指定索引长度，没必要对全字段建立索引，根据实际文本区分度决定索引长度即可。说明:索引的长度与区分度是一对矛盾体，一般对字符串类型数据，长度为 20 的索引，区分度会高达 90%以上。

【强制】在建立索引时，多考虑建立联合索引，并把区分度最高的字段放在最前面。

【强制】模糊查询仅允许右模糊 即“xxx%”，如果需要请走ES来解决。

【强制】不要用外键，一切外键概念在应用层解决。

【推荐】利用覆盖索引来进行查询操作，避免回表。即 select b from tableA where a = 1

【推荐】如果选择性超过 20%(区分度低于20%)，那么全表扫描比使用索引性能更优，即没有设置索引的必要。

【推荐】防止因字段类型不同造成的隐式转换，导致索引失效。

【推荐】如果有 order by 的场景，请注意利用索引的有序性。order by 最后的字段是组合索引的一部分，并且放在索引组合顺序的最后，避免出现 file_sort 的情况，影响查询性能。

【推荐】不要对频繁更新字段加索引

【参考】创建索引时避免有如下极端误解

错误一：宁滥勿缺。认为一个查询就需要建一个索引。
错误二：宁缺勿滥。认为索引会消耗空间、严重拖慢更新和新增速度。
错误三：抵制惟一索引。认为业务的惟一性一律需要在应用层通过“先查后插”方式解决。

使用规范

【强制】用Explain来做SQL 性能优化，type 至少要达到 range 级别，要求是 ref 级别，如果可以是 consts 最好

值	描述
system	The table has only one row (= system table). This is a special case of the const join type.
const	The table has at most one matching row, which is read at the start of the query. Because there is only one row, values from the column in this row can be regarded as constants by the rest of the optimizer. const tables are very fast because they are read only once.
eq_ref	One row is read from this table for each combination of rows from the previous tables. Other than the system and const types, this is the best possible join type. It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE NOT NULL index.
ref	All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type.
fulltext	The join is performed using a FULLTEXT index.
ref_or_null	This join type is like ref, but with the addition that MySQL does an extra search for rows that contain NULL values. This join type optimization is used most often in resolving subqueries. In the following examples, MySQL can use a ref_or_null join to process ref_table:
index_merge	This join type indicates that the Index Merge optimization is used. In this case, the key column in the output row contains a list of indexes used, and key_len contains a list of the longest key parts for the indexes used
unique_subquery	This type replaces eq_ref for some IN subqueries of the following form:
index_subquery	This join type is similar to unique_subquery. It replaces IN subqueries, but it works for nonunique indexes in subqueries of the following form
range	Only rows that are in a given range are retrieved, using an index to select the rows. The key column in the output row indicates which index is used. The key_len contains the longest key part that was used. The ref column is NULL for this type.
index	The index join type is the same as ALL, except that the index tree is scanned. This occurs two ways: - If the index is a covering index for the queries and can be used to satisfy all data required from the table, only the index tree is scanned. In this case, the Extra column says Using index. An index-only scan usually is faster than ALL because the size of the index usually is smaller than the table data. -A full table scan is performed using reads from the index to look up data rows in index order. Uses index does not appear in the Extra column. -MySQL can use this join type when the query uses only columns that are part of a single index.
ALL	A full table scan is done for each combination of rows from the previous tables. This is normally not good if the table is the first table not marked const, and usually very bad in all other cases. Normally, you can avoid ALL by adding indexes that enable row retrieval from the table based on constant values or column values from earlier tables.

【强制】不要使用 count(列名)或 count(常量)来替代 count()，count()是 SQL92 定义的标准统计行数的语法，跟数据库无关，跟 NULL 和非 NULL 无关。

count(*)会统计值为 NULL 的行，而 count(列名)不会统计此列为 NULL 值的行。

【强制】连表查询的情况下，要确保关联条件的数据类型一致，避免嵌套

【强制】使用 ISNULL()来判断是否为 NULL 值。说明:NULL 与任何值的直接比较都为 NULL。

NULL<>NULL的返回结果是NULL，而不是false。
NULL=NULL的返回结果是NULL，而不是true。
NULL<>1的返回结果是NULL，而不是true。

【强制】查询时不要对索引列做计算/使用函数。避免索引列的类型转换以及字符串编码转换。

【强制】更新数据超过1000条的update，delete和insert语句，需要改成分批量多次更新来进行

【强制】禁止使用存储过程，视图，触发器，event等高级特性

【强制】in子句中的值不要超过500个

【强制】提RDS DML工单时，update or delete 的长度确定时，加上 limit <行数>。这样就算提的工单 DML SQL写错了，错误造成的影响也可控。（降低工单造成的事故风险）

【强制】提工单的时候备注背景写一下。

【推荐】不建议使用子查询，建议将子查询 SQL 拆开结合程序多次查询。

【推荐】不要对索引字段使用负向查询，比如not in,not like