mysql中统计常用聚合函数
每一次做统计,涉及到稍微复杂点的sql去问老大,老大都会笑着说,这不都是mysql最基本的东西,你该回去给msyql补课了~~~趁清明节,搞搞!!!
mysql中什么是聚合函数?其实聚合函数还有一个定义:组函数!
在每一行集合上进行操作,对每一个组给出一个结果!
一、常用的组函数:
avg:求平均值
count:统计行的数量
max:求最大值
min:求最小值
sum:求和
注意:
1、每组函数接收一个参数
2、默认情况下,组函数忽略值为null的行,不参与计算
3、使用关键字distinct剔除字段值重复的条数
一、count函数使用demo数据
mysql> select * from salary_tab;
+--------+---------+
| userid | salary |
+--------+---------+
| 1 | 1000.00 |
| 2 | 2000.00 |
| 3 | 3000.00 |
| 4 | NULL |
| 5 | 1000.00 |
+--------+---------+
rows in set (0.00 sec)
mysql> use TENNIS
mysql> show tables;
+-------------------+
| Tables_in_TENNIS |
+-------------------+
| COMMITTEE_MEMBERS |
| MATCHES |
| PENALTIES |
| PLAYERS |
| TEAMS |
+-------------------+
rows in set (0.00 sec)
(1)、count(*):返回表中满足where条件的行的数量
mysql> select count(*) from salary_tab where salary='1000';
+----------+
| count(*) |
+----------+
| 2 |
+----------+
mysql> select count(*) from salary_tab; #没有条件,默认统计表数据行数
+----------+
| count(*) |
+----------+
| 5 |
+----------+
(2)、count(列):返回列值非空的行的数量
mysql> select count(salary) from salary_tab;
+---------------+
| count(salary) |
+---------------+
| 4 |
+---------------+
(3)、count(distinct 列):返回列值非空的、并且列值不重复的行的数量
mysql> select count(distinct salary) from salary_tab;
+------------------------+
| count(distinct salary) |
+------------------------+
| 3 |
+------------------------+
(4)、count(expr):根据表达式统计数据(to_days返回的是天数,不懂可以查下)
mysql> select * from TT;
+------+------------+
| UNIT | DATE |
+------+------------+
| a | 2018-04-03 |
| a | 2017-12-12 |
| b | 2018-01-01 |
| b | 2018-04-03 |
| c | 2016-06-06 |
| d | 2018-03-03 |
+------+------------+
rows in set (0.00 sec)
mysql> select UNIT as '单位',
-> COUNT(TO_DAYS(DATE)=TO_DAYS(NOW()) or null) as '今日统计',
-> COUNT(YEAR(DATE)=YEAR(NOW()) or null) as '今年统计'
-> from TT
-> group by UNIT;
+------+----------+----------+
| 单位 | 今日统计 | 今年统计 |
+------+----------+----------+
| a | 1 | 1 |
| b | 1 | 2 |
| c | 0 | 0 |
| d | 0 | 1 |
+------+----------+----------+
rows in set (0.00 sec)
二、max和min使用demo
mysql> select max(salary) from salary_tab;
+-------------+
| max(salary) |
+-------------+
| 3000.00 |
+-------------+
mysql> select min(salary) from salary_tab;
+-------------+
| min(salary) |
+-------------+
| 1000.00 |
+-------------+
如果统计的列中只有NULL值,那么MAX和MIN就返回NULL
三、sum和avg函数---求和与求平均
!!注意:
表中列值为null的行不参与计算
mysql> select sum(salary) from salary_tab;
+-------------+
| sum(salary) |
+-------------+
| 7000.00 |
+-------------+
mysql> select avg(salary) from salary_tab;
+-------------+
| avg(salary) |
+-------------+
| 1750.000000 |
+-------------+
mysql> select avg(ifnull(salary,0)) from salary_tab;
+-----------------------+
| avg(ifnull(salary,0)) |
+-----------------------+
| 1400.000000 |
+-----------------------+
要想列值为NULL的行也参与组函数的计算,必须使用IFNULL函数对NULL值做转换
二、SELECT分组:
1、SELECT分组的基本格式
select [聚合函数] 字段名 from 表名 [where 查询条件] [group by 字段名] [having 过滤条件]
(1)、group by子句:
根据给定列或者表达式的每一个不同的值将表中的行分成不同的组,使用组函数返回每一组的统计信息
规则:
①出现在SELECT子句中的单独的列,必须出现在GROUP BY子句中作为分组列
②分组列可以不出现在SELECT子句中
③分组列可出现在SELECT子句中的一个复合表达式中
④如果GROUP BY后面是一个复合表达式,那么在SELECT子句中,它必须整体作为一个表达式的一部分才能使用。
1)指定一个列进行分组
mysql> select salary,count(*) from salary_tab
-> where salary>=2000
-> group by salary;
+---------+----------+
| salary | count(*) |
+---------+----------+
| 2000.00 | 1 |
| 3000.00 | 1 |
+---------+----------+
2)指定多个分组列,‘大组中再分小组’
mysql> select userid,count(salary) from salary_tab
-> where salary>=2000
-> group by salary,userid;
+--------+---------------+
| userid | count(salary) |
+--------+---------------+
| 2 | 1 |
| 3 | 1 |
+--------+---------------+
3)根据表达式分组
mysql> select year(payment_date),count(*)
-> from PENALTIES
-> group by year(payment_date);
+--------------------+----------+
| year(payment_date) | count(*) |
+--------------------+----------+
| 1980 | 3 |
| 1981 | 1 |
| 1982 | 1 |
| 1983 | 1 |
| 1984 | 2 |
+--------------------+----------+
rows in set (0.00 sec)
4)带有排序的分组:如果分组列和排序列相同,则可以合并group by和order by子句
mysql> select teamno,count(*)
-> from MATCHES
-> group by teamno
-> order by teamno desc;
+--------+----------+
| teamno | count(*) |
+--------+----------+
| 2 | 5 |
| 1 | 8 |
+--------+----------+
rows in set (0.00 sec)
mysql> select teamno,count(*)
-> from MATCHES
-> group by teamno desc; #可以把desc(或者asc)包含到group by子句中简化
+--------+----------+
| teamno | count(*) |
+--------+----------+
| 2 | 5 |
| 1 | 8 |
+--------+----------+
rows in set (0.00 sec)
2、GROUP_CONCAT()函数
函数的值等于属于一个组的指定列的所有值,以逗号隔开,并且以字符串表示。
例1:对于每个球队,得到其编号和所有球员的编号
mysql> select teamno,group_concat(playerno)
-> from MATCHES
-> group by teamno;
+--------+------------------------+
| teamno | group_concat(playerno) |
+--------+------------------------+
| 1 | 6,6,6,44,83,2,57,8 |
| 2 | 27,104,112,112,8 |
+--------+------------------------+
rows in set (0.01 sec)
如果没有group by子句,group_concat返回一列的所有值
例2:得到所有的罚款编号列表
mysql> select group_concat(paymentno)
-> from PENALTIES;
+-------------------------+
| group_concat(paymentno) |
+-------------------------+
| 1,2,3,4,5,6,7,8 |
+-------------------------+
row in set (0.00 sec)
4、HAVING子句:对分组结果进行过滤
注意:
不能使用WHERE子句对分组后的结果进行过滤
不能在WHERE子句中使用组函数,仅用于过滤行
因为WHERE子句比GROUP BY先执行,而组函数必须在分完组之后才执行,且分完组后必须使用having子句进行结果集的过滤。
SELECT select_expr [, select_expr ...]
FROM table_name
[WHERE where_condition]
[GROUP BY {col_name | expr} [ASC | DESC], ... [WITH ROLLUP]]
[HAVING where_condition]
!!!having子语句与where子语句区别:
where子句在分组前对记录进行过滤;
having子句在分组后对记录进行过滤
mysql> select salary,count(*) from salary_tab
-> where salary>=2000
-> group by salary
-> having count(*)>=0;
+---------+----------+
| salary | count(*) |
+---------+----------+
| 2000.00 | 1 |
| 3000.00 | 1 |
+---------+----------+
1)HAVING可以单独使用而不和GROUP BY配合,如果只有HAVING子句而没有GROUP BY,表中所有的行分为一组
2)HAVING子句中可以使用组函数
3)HAVING子句中的列,要么出现在一个组函数中,要么出现在GROUP BY子句中(否则出错)
mysql> select town,count(*)
-> from PLAYERS
-> group by town
-> having birth_date>'1970-01-01';
ERROR 1054 (42S22): Unknown column 'birth_date' in 'having clause'
mysql> select town,count(*)
-> from PLAYERS
-> group by town
-> having town in ('Eltham','Midhurst');
+----------+----------+
| town | count(*) |
+----------+----------+
| Eltham | 2 |
| Midhurst | 1 |
+----------+----------+
rows in set (0.00 sec)
基本就这么多了、、、以后再补充