Sorting Queries, Sorting ENUM Values, and Generating Summaries

mysql cookbook0007

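Sorting Query Results
7.1 Using ORDER BY to sort query results
1. You can sort by a single column or by multiple columns (with multiple columns, write desc/asc after each column it applies to)
2. You can sort in ascending order (the default) or descending order
3. You can sort by a column alias
7.2 Sorting by an expression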
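Example: from a table of mail messages (the table name mail is assumed here), select the messages larger than 5000 bytes, sort them by size, and display the size in kilobytes
select sender,receiver,floor((size+1023)/1024) as `kilo bytes` from mail where size > 5000 order by `kilo bytes` desc;
(The alias is quoted with backticks: in order by, a single-quoted 'kilo bytes' would be treated as a string literal rather than as the alias.)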

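7.3 Displaying one set of values while sorting by another
7.4 Controlling case sensitivity of string sorts
An expression built on the dayofweek function can make any day of the week sort first:
First day    Sort expression
Sunday       dayofweek(date)
Monday       MOD(dayofweek(date) + 5, 7)
Tuesday      MOD(dayofweek(date) + 4, 7)
....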
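To check the rotation expressions, here is a small Python sketch (Python is used only for illustration, and the helper names are hypothetical). It assumes MySQL's dayofweek() numbering, 1 for Sunday through 7 for Saturday, and applies the same MOD arithmetic as a sort key.

```python
# MySQL's dayofweek() numbers days 1 (Sunday) through 7 (Saturday).
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def dayofweek(name):
    """Mimic MySQL dayofweek(): Sunday -> 1 ... Saturday -> 7."""
    return DAYS.index(name) + 1


def rotated_key(name, shift):
    """Mimic MOD(dayofweek(date) + shift, 7)."""
    return (dayofweek(name) + shift) % 7


# shift = 5 makes Monday sort first; shift = 4 makes Tuesday sort first.
monday_first = sorted(DAYS, key=lambda d: rotated_key(d, 5))
tuesday_first = sorted(DAYS, key=lambda d: rotated_key(d, 4))
print(monday_first)
print(tuesday_first)
```

The chosen first day always maps to key 0, and the day before it maps to key 6.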
7.5 Sorting by fixed-length substrings
Use functions such as left, right, and mid to extract the substrings that serve as sort keys
select * from tab1 order by left(column1,4) desc,right(column2,3) desc;
7.6 Sorting by variable-length substrings
mysql> select * from housewares;
+------------+--------------+
| id | description |
+------------+--------------+
| ada2342as | dinner table |
| asf342rd | dispoal |
| skd3324es | bedsize pamp |
| spo38392es | lavatory |
+------------+--------------+
Example: sort this table by the number embedded in id
mysql> select left(substring(id,4),char_length(substring(id,4))-2)+0 as maths from housewares order by maths;
+-------+
| maths |
+-------+
| 342 |
| 2342 |
| 3324 |
| 38392 |
+-------+
mysql> select substring(id,4,char_length(id)-5)+0 as maths from housewares order by maths;
+-------+
| maths |
+-------+
| 342 |
| 2342 |
| 3324 |
| 38392 |
+-------+
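Neither of the solutions above is ideal, and there is a simpler one. It works because string-to-number conversion ignores any non-numeric suffix that follows the digits, which yields exactly the value needed
to sort on the variable-length digit portion of the identifier. This also means that the third argument to substring is not actually needed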
mysql> select substring(id,4)+0 as maths from housewares order by maths;
+-------+
| maths |
+-------+
| 342 |
| 2342 |
| 3324 |
| 38392 |
+-------+
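The substring_index(str,c,n) function:
It searches the string for the n-th occurrence of the given character c and returns everything to the left of that occurrence
substring_index('11-12-13','-',2) -> 11-12
If n is negative, it counts occurrences from the right and returns everything to the right of the n-th one
substring_index('11-12-13-14','-',-3) -> 12-13-14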
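The behaviour of substring_index can be sketched in a few lines of Python. This is a rough emulation for illustration only, not MySQL's implementation; it assumes a non-empty delimiter, and the function here is a hypothetical stand-in for the SQL one.

```python
def substring_index(s, delim, n):
    """Rough Python equivalent of MySQL's SUBSTRING_INDEX(s, delim, n).

    Assumes delim is a non-empty string.
    """
    if n == 0:
        return ""
    parts = s.split(delim)
    if n > 0:
        # Everything to the left of the n-th delimiter, counted from the left.
        return delim.join(parts[:n])
    # Everything to the right of the |n|-th delimiter, counted from the right.
    return delim.join(parts[n:])


left_part = substring_index("11-12-13", "-", 2)
right_part = substring_index("11-12-13-14", "-", -3)
# Nesting the two forms isolates a single segment, here the second one.
second_segment = substring_index(substring_index("22-34-2-3", "-", 2), "-", -1)
print(left_part, right_part, second_segment)
```

When n exceeds the number of delimiters, the slices cover the whole list, so the entire string comes back unchanged.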
mysql> select * from housewares2 ;
+--------------+-------------+
| id | description |
+--------------+-------------+
| 22-34-2-3 | tables |
| 23-52-1-3-5 | cups |
| 234-21-2-1 | stall |
| 234-22-53-76 | oven |
+--------------+-------------+
Sort this table by the numbers in id
mysql> select * ,id+0 as maths from housewares2 order by maths;
+--------------+-------------+-------+
| id | description | maths |
+--------------+-------------+-------+
| 22-34-2-3 | tables | 22 |
| 23-52-1-3-5 | cups | 23 |
| 234-21-2-1 | stall | 234 |
| 234-22-53-76 | oven | 234 |
+--------------+-------------+-------+
4 rows in set, 8 warnings (0.00 sec)
mysql> select *,substring_index(substring_index(id,'-',2),'-',-1) as maths from housewares2 order by maths;
+--------------+-------------+-------+
| id | description | maths |
+--------------+-------------+-------+
| 234-21-2-1 | stall | 21 |
| 234-22-53-76 | oven | 22 |
| 22-34-2-3 | tables | 34 |
| 23-52-1-3-5 | cups | 52 |
+--------------+-------------+-------+
7.12 Sorting hostnames in domain order
Example:
mysql> select * from realname;
+--------------+
| name |
+--------------+
| qwe.e.com |
| weq.org |
| ada.grege.cn |
| mysql.com |
+--------------+
4 rows in set (0.00 sec)

mysql> select * , substring_index(name,'.',-1) as first_name,
-> substring_index(substring_index( concat('.',name),'.',-2),'.',1) as second_name,
-> substring_index( substring_index(concat('.',name),'.',-3),'.',1)as third_name
-> from realname order by first_name,second_name,third_name;
+--------------+------------+-------------+------------+
| name | first_name | second_name | third_name |
+--------------+------------+-------------+------------+
| ada.grege.cn | cn | grege | ada |
| qwe.e.com | com | e | qwe |
| mysql.com | com | mysql | |
| weq.org | org | weq | |
+--------------+------------+-------------+------------+
4 rows in set (0.00 sec)

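7.13 Sorting dotted-quad IP addresses in numeric order
1. Split the address into segments, as with domain names, and add 0 to each segment so that it sorts as a number:
select ip from ips order by substring_index(ip,'.',1)+0,
substring_index(substring_index(ip,'.',-3),'.',1)+0,
substring_index(substring_index(ip,'.',-2),'.',1)+0,
substring_index(ip,'.',-1)+0;
2. MySQL provides a function specifically for IP addresses: select * from ips order by inet_aton(ip);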
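To see why a numeric key matters, the following Python sketch compares plain string sorting with two numeric keys. The sample addresses are made up for illustration. int(ipaddress.IPv4Address(ip)) gives the same 32-bit number that inet_aton() returns for a full dotted-quad address.

```python
import ipaddress

# Hypothetical sample data.
ips = ["192.168.1.10", "21.0.37.1", "127.0.0.1", "192.168.1.2"]

# Plain string sorting compares character by character.
as_strings = sorted(ips)

# Single numeric key, comparable to order by inet_aton(ip).
as_numbers = sorted(ips, key=lambda ip: int(ipaddress.IPv4Address(ip)))

# Segment-by-segment key, like the four "+0" sort expressions.
as_segments = sorted(ips, key=lambda ip: tuple(int(p) for p in ip.split(".")))

print(as_strings)
print(as_numbers)
```

The string sort puts 192.168.1.10 ahead of 192.168.1.2 and 21.0.37.1 last; both numeric keys give the intended order.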

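7.14 Floating values to the head or tail of the sort order
1. MySQL groups NULL values together in sorted output (at the beginning in ascending order, at the end in descending order)
2. Sorting special values: add an extra sort column that separates out the few rows you want handled differently
mysql> select * from date_val order by if(name='zhangsan',1,0),d;// if() creates a new column whose value serves as the primary sort key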
+------------+----------+
| d | name |
+------------+----------+
| 1900-01-15 | lisi |
| 1987-03-05 | wangwu |
| 1999-12-31 | mazi |
| 2000-06-04 | xiaosan |
| 1864-02-28 | zhangsan |
+------------+----------+
5 rows in set (0.00 sec)
mysql> select * from date_val order by d;
+------------+----------+
| d | name |
+------------+----------+
| 1864-02-28 | zhangsan |
| 1900-01-15 | lisi |
| 1987-03-05 | wangwu |
| 1999-12-31 | mazi |
| 2000-06-04 | xiaosan |
+------------+----------+
5 rows in set (0.00 sec)
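The extra sort column works like a compound sort key. A minimal Python sketch of the same idea, using the rows from the output above (Python is only a stand-in for the SQL here):

```python
rows = [
    ("1864-02-28", "zhangsan"),
    ("1900-01-15", "lisi"),
    ("1987-03-05", "wangwu"),
    ("1999-12-31", "mazi"),
    ("2000-06-04", "xiaosan"),
]

# Equivalent of: order by if(name='zhangsan',1,0), d
# The first key element is 1 only for the special row, so it sorts last;
# the ISO-formatted date string orders the remaining rows.
floated = sorted(rows, key=lambda r: (1 if r[1] == "zhangsan" else 0, r[0]))
print([name for _, name in floated])
```

Swapping the 1 and 0 would float the special row to the head instead.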

7.15 Sorting in a user-defined order
Use the field() function to define a custom sort order:
mysql> select * from date_val order by field(name,'zhangsan','xiaosan');
+------------+----------+
| d | name |
+------------+----------+
| 1900-01-15 | lisi |
| 1987-03-05 | wangwu |
| 1999-12-31 | mazi |
| 1864-02-28 | zhangsan |
| 2000-06-04 | xiaosan |
+------------+----------+
5 rows in set (0.00 sec)
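field() returns the 1-based position of its first argument in the list that follows, or 0 when it is not listed, which is why the unlisted names appear first in the output above. A hedged Python sketch of that rule (the field helper below is a hypothetical stand-in for the SQL function):

```python
names = ["zhangsan", "lisi", "wangwu", "mazi", "xiaosan"]


def field(value, *candidates):
    """Like MySQL FIELD(): 1-based position of value, or 0 if absent."""
    return candidates.index(value) + 1 if value in candidates else 0


# Equivalent of: order by field(name,'zhangsan','xiaosan')
# Unlisted names get 0 and sort first. Python's sort is stable, so they keep
# their input order here; MySQL makes no such promise for ties.
custom = sorted(names, key=lambda n: field(n, "zhangsan", "xiaosan"))
print(custom)
```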
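7.16 Sorting ENUM values
ENUM is regarded as a string data type, but ENUM values are actually stored as numbers (so an ENUM column also sorts numerically), ordered as the values are listed in the table definition
For a column defined with the enumeration values Sunday through Saturday, MySQL internally assigns them the numbers 1 through 7 in that order.
mysql> select day, day + 0 from weekday order by day;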
+-----------+---------+
| day | day + 0 |
+-----------+---------+
| sunday | 1 |
| monday | 2 |
| tuesday | 3 |
| wednesday | 4 |
| thurday | 5 |
| friday | 6 |
| saturday | 7 |
+-----------+---------+
7 rows in set (0.00 sec)
mysql> select day,day + 0 from weekday order by cast(day as char);
+-----------+---------+
| day | day + 0 |
+-----------+---------+
| friday | 6 |
| monday | 2 |
| saturday | 7 |
| sunday | 1 |
| thurday | 5 |
| tuesday | 3 |
| wednesday | 4 |
+-----------+---------+
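The difference between the two orderings can be sketched in Python: sorting by position in the definition list versus sorting by the text itself. The definition list copies the spellings from the sample output; the stored values are a made-up subset.

```python
# ENUM members in definition order; MySQL numbers them starting at 1.
definition = ["sunday", "monday", "tuesday", "wednesday",
              "thurday", "friday", "saturday"]
index_of = {name: i for i, name in enumerate(definition, start=1)}

# Hypothetical column contents.
stored = ["tuesday", "sunday", "friday", "monday"]

by_enum = sorted(stored, key=index_of.get)  # like: order by day
by_text = sorted(stored)                    # like: order by cast(day as char)
print(by_enum)
print(by_text)
```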

mysql cookbook0008

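Generating Summaries
8.1 Generating summaries with the count function
Notes:
1. count(*) without a where clause is very fast for MyISAM tables. For BDB or InnoDB tables, however,
you may want to avoid it, because the statement requires a full table scan. To avoid the scan, you can query information_schema for an approximate row count:
select table_rows from information_schema.tables where table_schema = 'database_name' and table_name = 'table_name';

2. The fact that count() does not count NULL values is very useful for producing multiple counts from the same set of rows.
For example: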
mysql> select count(*),count(if(day in (7,1),1,null)) from weekday;
+----------+--------------------------------+
| count(*) | count(if(day in (7,1),1,null)) |
+----------+--------------------------------+
| 7 | 2 |
+----------+--------------------------------+
1 row in set (0.04 sec)
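The if() expression decides whether each row should be counted: where it evaluates to 1, count() counts the row; where it evaluates to NULL, the row is skipped.
Where possible, you can create a view to simplify use of the summary:
mysql> create view leap_view as select count(if(year(d) % 4 = 0 and
(year(d) % 400 = 0 or year(d) % 100 != 0), 1, null)) as leap, count(*) from date_val;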
Query OK, 0 rows affected (0.16 sec)
mysql> select * from leap_view;
+------+----------+
| leap | count(*) |
+------+----------+
| 2 | 6 |
+------+----------+
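The same conditional count can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in, which is an assumption of this example: SQLite's CASE expression without ELSE plays the role of MySQL's if(cond,1,null), and the year is taken from the first four characters of the date string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table date_val (d text, name text)")
conn.executemany(
    "insert into date_val values (?, ?)",
    [("1864-02-28", "zhangsan"), ("1900-01-15", "lisi"),
     ("1987-03-05", "wangwu"), ("1999-12-31", "mazi"),
     ("2000-06-04", "xiaosan"), ("2001-01-01", "xiaosan")],
)

# CASE without ELSE yields NULL, which count() skips, so only leap years
# contribute to the first column; count(*) counts every row.
leap, total = conn.execute(
    """
    select count(case when y % 4 = 0 and (y % 400 = 0 or y % 100 != 0)
                      then 1 end),
           count(*)
    from (select cast(substr(d, 1, 4) as integer) as y from date_val)
    """
).fetchone()
print(leap, total)
```

Only 1864 and 2000 are leap years in this data; 1900 is divisible by 100 but not by 400.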
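8.2 Generating summaries with the min and max functions
8.3 Generating summaries with the sum and avg functions
8.4 Using distinct to eliminate duplicates
Notes: 1. If you want NULL to count as one of the values in the set, any of these works:
count(distinct val) + if(count(if(val is null,1,null)) = 0,0,1)
count(distinct val) + if(sum(isnull(val)) = 0,0,1)
count(distinct val) + (sum(isnull(val)) != 0)
2. When applied to multiple columns, distinct identifies the distinct combinations of column values, and count counts those combinations
3. distinct is not limited to columns; expressions work too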
mysql> select count( distinct month(d)) from date_val;
+---------------------------+
| count( distinct month(d)) |
+---------------------------+
| 5 |
+---------------------------+
1 row in set (0.04 sec)
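8.5 Finding values associated with minimum and maximum values
select max(col) from tab1 where col = max(col)
// This does not work: aggregate functions cannot be used in a where clause, because where requires expressions that can be evaluated against a single row.
The statement fails because where decides which rows are selected, while the aggregate's value can only be computed from the rows that have already been selected, which is a contradiction.
select * from tab1 where col = (select max(col) from tab1)// subquery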
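The subquery form can be run as a quick experiment. This sketch uses Python's sqlite3 module in place of MySQL (an assumption made so it runs without a server), and the contents of tab1 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tab1 (name text, col integer)")
conn.executemany("insert into tab1 values (?, ?)",
                 [("a", 10), ("b", 42), ("c", 7), ("d", 42)])

# The subquery computes the aggregate first; the outer where then compares
# each row against that single value.
top_rows = conn.execute(
    "select name, col from tab1 "
    "where col = (select max(col) from tab1) order by name"
).fetchall()
print(top_rows)
```

Every row that ties for the maximum is returned, not just one.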
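8.6 Controlling string case sensitivity for the min and max functions
When you pass string values to min or max, they still return the smallest or largest value (much as sorting would)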

8.7 Dividing a summary into subgroups
mysql> select * from date_val;// test table
+------------+----------+---------+
| d | name | article |
+------------+----------+---------+
| 1864-02-28 | zhangsan | aaa |
| 1900-01-15 | lisi | aaa |
| 1987-03-05 | wangwu | ddd |
| 1999-12-31 | mazi | ccc |
| 2000-06-04 | xiaosan | ccc |
| 2001-01-01 | xiaosan | bbb |
+------------+----------+---------+
6 rows in set (0.00 sec)
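// Compare the result below with the table above and you will see that it is wrong. With a group by clause,
the only things you can meaningfully select are the grouping columns and summary values computed from each group. Other columns are not grouped, so the value displayed for them is indeterminate. (Since MySQL 5.7.5 the default sql_mode includes ONLY_FULL_GROUP_BY, which rejects such a query with an error.)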
mysql> select name, count(article) from date_val group by article;
+----------+----------------+
| name | count(article) |
+----------+----------------+
| zhangsan | 2 |
| xiaosan | 1 |
| mazi | 2 |
| wangwu | 1 |
+----------+----------------+
4 rows in set (0.00 sec)
// Solution 1: make that column part of the grouping as well.
mysql> select name,count(article) from date_val group by article,name;
+----------+----------------+
| name | count(article) |
+----------+----------------+
| lisi | 1 |
| zhangsan | 1 |
| xiaosan | 1 |
| mazi | 1 |
| xiaosan | 1 |
| wangwu | 1 |
+----------+----------------+
// If you also want information from other columns, there is solution 2:
mysql> create table name_count2 select name,count(article) as count from date_val group by name;
mysql> select d.name,d.d, n.count from date_val d inner join name_count2 n on n.name=d.name;
+----------+------------+-------+
| name | d | count |
+----------+------------+-------+
| zhangsan | 1864-02-28 | 1 |
| lisi | 1900-01-15 | 1 |
| wangwu | 1987-03-05 | 1 |
| mazi | 1999-12-31 | 1 |
| xiaosan | 2000-06-04 | 2 |
| xiaosan | 2001-01-01 | 2 |
+----------+------------+-------+
6 rows in set (0.00 sec)
8.8 Summaries and NULL values
mysql> select * from date_val;
+------------+----------+---------+
| d | name | article |
+------------+----------+---------+
| 1864-02-28 | zhangsan | aaa |
| 1900-01-15 | lisi | aaa |
| 1987-03-05 | wangwu | ddd |
| 1999-12-31 | mazi | ccc |
| 2000-06-04 | xiaosan | ccc |
| 2001-01-01 | xiaosan | bbb |
| 2011-01-01 | xiaer | NULL |
+------------+----------+---------+
7 rows in set (0.00 sec)
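// The result below shows: 1. With count, a group whose values are all NULL is still reported, with a count of 0, whereas functions such as max, min, and sum have nothing to compute for it and return NULL
2. count(*) and count(expr) also differ: count(expr) counts only non-NULL values, while count(*) counts rows regardless of their contents.
mysql> select name, count(*), count(article), max(article) from date_val group by name;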
+----------+----------+----------------+--------------+
| name | count(*) | count(article) | max(article) |
+----------+----------+----------------+--------------+
| lisi | 1 | 1 | aaa |
| mazi | 1 | 1 | ccc |
| wangwu | 1 | 1 | ddd |
| xiaer | 1 | 0 | NULL |
| xiaosan | 2 | 2 | ccc |
| zhangsan | 1 | 1 | aaa |
+----------+----------+----------------+--------------+
6 rows in set (0.00 sec)
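8.9 Selecting groups with certain characteristics
Note: as mentioned in an earlier section, aggregate functions cannot be used in a where clause. To review:
the initial constraints given in where determine which rows are selected (a row-level constraint),
but the value of an aggregate such as count can only be determined after the rows have been selected (that is, after the where clause has been applied).
Solution: put the count expression in a having clause. having is similar to where, but it applies to group characteristics rather than to individual rows.
In other words, having operates on the set of rows that has already been selected and grouped, applying additional constraints based on the results of aggregate functions.

Tip: you can still use a where clause together with having, but only to select rows, not to test summary values.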
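The division of labour between where and having can be tried out with a small sketch. It uses Python's sqlite3 module instead of MySQL (an assumption made so that it runs standalone) and the date_val rows shown earlier; the cutoff date is chosen arbitrarily.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table date_val (d text, name text, article text)")
conn.executemany(
    "insert into date_val values (?, ?, ?)",
    [("1864-02-28", "zhangsan", "aaa"), ("1900-01-15", "lisi", "aaa"),
     ("1987-03-05", "wangwu", "ddd"), ("1999-12-31", "mazi", "ccc"),
     ("2000-06-04", "xiaosan", "ccc"), ("2001-01-01", "xiaosan", "bbb")],
)

# where filters individual rows before grouping (dates from 1900 onward);
# having then filters whole groups by their aggregate value.
groups = conn.execute(
    """
    select article, count(*) as n
    from date_val
    where d >= '1900-01-01'
    group by article
    having count(*) > 1
    order by article
    """
).fetchall()
print(groups)
```

Without the where clause, article aaa would also have two rows and pass the having test; the row filter removes the 1864 row before the groups are counted.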
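8.10 Using counts to determine whether values are unique
select col1 ,count(col1) as counts from tab1 group by col1 having counts = 1;
8.11 Grouping by expression results
mysql> select name, count(*) ,count(article),max(article) from date_val group by count(article);
ERROR 1111 (HY000): Invalid use of group function
// The error has the same cause as the where/aggregate problem: group by is evaluated per row, while the aggregate is computed per group.
select * from tab1 group by (expression1), expression2 ...
8.12 Categorizing noncategorical data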
mysql> select * from date_val;
+------------+----------+---------+-------+
| d | name | article | count |
+------------+----------+---------+-------+
| 1864-02-28 | zhangsan | aaa | 1234 |
| 1900-01-15 | lisi | aaa | 6543 |
| 1987-03-05 | wangwu | ddd | 5432 |
| 1999-12-31 | mazi | ccc | 8765 |
| 2000-06-04 | xiaosan | ccc | 4321 |
| 2001-01-01 | xiaosan | bbb | 4321 |
| 2011-01-01 | xiaer | NULL | 7654 |
+------------+----------+---------+-------+
7 rows in set (0.00 sec)
mysql> select count, count(count) from date_val group by count;
+-------+--------------+
| count | count(count) |
+-------+--------------+
| 1234 | 1 |
| 4321 | 2 |
| 5432 | 1 |
| 6543 | 1 |
| 7654 | 1 |
| 8765 | 1 |
+-------+--------------+
6 rows in set (0.00 sec)
// This result is not very useful: nothing is really categorized, because nearly every value is unique
mysql> select floor(count/2000)*2000,count(*) as ranges from date_val group by floor(count/2000);
+------------------------+--------+
| floor(count/2000)*2000 | ranges |
+------------------------+--------+
| 0 | 1 |
| 4000 | 3 |
| 6000 | 2 |
| 8000 | 1 |
+------------------------+--------+
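// The labels above are the lower bound of each range, so 8765 is listed under 8000. To label each range by its upper bound instead, use floor((count+1999)/2000)*2000, which lists 8765 under 10000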
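The two labelling schemes are easy to compare in Python. This is only a sketch of the arithmetic, using the count values from the table above; integer division stands in for floor().

```python
from collections import Counter

counts = [1234, 6543, 5432, 8765, 4321, 4321, 7654]
WIDTH = 2000

# floor(count/2000)*2000 labels each bin by its lower bound.
lower = Counter((c // WIDTH) * WIDTH for c in counts)

# floor((count+1999)/2000)*2000 labels each bin by its upper bound.
upper = Counter(((c + WIDTH - 1) // WIDTH) * WIDTH for c in counts)

print(sorted(lower.items()))
print(sorted(upper.items()))
```

The group sizes are the same either way for this data; only the labels shift. The two schemes do differ for values that are exact multiples of the width.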

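Tip: how much duplication is there in a set of values?
Solution: use count(distinct col)/count(*) to measure the degree of duplication.
Use: a result close to 0 indicates heavy duplication, meaning the values divide naturally into a small number of categories; a result close to 1 indicates many unique values, so a group by clause
will not categorize them effectively. The ratio tells you how to generate the summary and which grouping is most effective.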
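As a quick check of the ratio on the count column used above, here is a sketch in Python, where set() plays the role of distinct:

```python
counts = [1234, 6543, 5432, 8765, 4321, 4321, 7654]

# count(distinct count) / count(*)
ratio = len(set(counts)) / len(counts)
print(round(ratio, 3))
```

Six distinct values out of seven gives a ratio close to 1, which agrees with the observation that grouping directly on this column does little to categorize the data.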


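8.13 Controlling summary display order
In MySQL versions before 8.0, group by sorts its output by the grouping columns on its own; to get a different order, add order by to the summary statement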
mysql> select dayofweek(d),sum(count) from date_val group by name order by dayofweek(d);
+--------------+------------+
| dayofweek(d) | sum(count) |
+--------------+------------+
| 1 | 8642 |
| 1 | 1234 |
| 2 | 6543 |
| 5 | 5432 |
| 6 | 8765 |
| 7 | 7654 |
+--------------+------------+
If you do not want group by's built-in sorting, append order by null;

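8.14 Finding the smallest or largest summary values
min and max return the endpoints of a range of values, but they cannot directly return the extremes of a set of summary values;
// find the article with the highest count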
mysql> select article,count(*) from date_val group by article having count(*) =
(select count(*) from date_val group by article order by count(*) desc limit 1);
+---------+----------+
| article | count(*) |
+---------+----------+
| aaa | 2 |
| ccc | 2 |
+---------+----------+
2 rows in set (0.02 sec)
8.15 Date-based summaries
To summarize in time order, use group by on a column that holds temporal-type data
select dayname(t) from tab1 group by dayofweek(t);

8.16 Working with per-group and overall summary values simultaneously
mysql> select article,sum(count)/(select sum(count) from date_val) from date_val group by article;
+---------+----------------------------------------------+
| article | sum(count)/(select sum(count) from date_val) |
+---------+----------------------------------------------+
| NULL | 0.2000 |
| aaa | 0.2032 |
| bbb | 0.1129 |
| ccc | 0.3419 |
| ddd | 0.1419 |
+---------+----------------------------------------------+
5 rows in set (0.00 sec)
If you just want to display summary values at different levels, simply append with rollup
mysql> select article,sum(count) from date_val group by article with rollup;
+---------+------------+
| article | sum(count) |
+---------+------------+
| aaa | 7777 |
| bbb | 4321 |
| ccc | 13086 |
| ddd | 5432 |
| NULL | 38270 |// what this value means depends on the function used above: with sum it is the grand total, with avg it is the overall average
+---------+------------+
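The per-group sums, the grand total that with rollup appends, and the per-group share of the total can all be reproduced in a few lines of Python. This is a sketch of the arithmetic only, using the article and count values of date_val.

```python
from collections import defaultdict

rows = [("aaa", 1234), ("aaa", 6543), ("ddd", 5432), ("ccc", 8765),
        ("ccc", 4321), ("bbb", 4321), (None, 7654)]

# Per-group sums, like: select article, sum(count) ... group by article
sums = defaultdict(int)
for article, count in rows:
    sums[article] += count

# The extra super-aggregate row that with rollup appends for sum().
grand_total = sum(sums.values())

# Per-group share, like dividing by (select sum(count) from date_val).
shares = {a: round(s / grand_total, 4) for a, s in sums.items()}
print(dict(sums), grand_total, shares)
```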
If more than one column is used for grouping, with rollup produces multi-level summaries.
mysql> select article,d,avg(count) from date_val group by article,d with rollup;
+---------+------------+------------+
| article | d | avg(count) |
+---------+------------+------------+
| NULL | 2011-01-01 | 7654.0000 |
| NULL | NULL | 7654.0000 |// second-level average
| aaa | 1864-02-28 | 1234.0000 |
| aaa | 1900-01-15 | 6543.0000 |
| aaa | NULL | 3888.5000 |// second-level average
| bbb | 2001-01-01 | 4321.0000 |
| bbb | NULL | 4321.0000 |// second-level average
| ccc | 1999-12-31 | 8765.0000 |
| ccc | 2000-06-04 | 4321.0000 |
| ccc | NULL | 6543.0000 |// second-level average
| ddd | 1987-03-05 | 5432.0000 |
| ddd | NULL | 5432.0000 |// second-level average
| NULL | NULL | 5467.1429 |// first-level average, which is also the overall average
+---------+------------+------------+