Using JOIN to optimize SQL in MySQL: a detailed guide



0. Prepare the tables used in the tests below

For the table-creation statements, see: https://github.com/YangBaohust/my_sql

Table user1: the pilgrimage party
+----+-----------+-----------------+---------------------------------+
| id | user_name | comment   | mobile       |
+----+-----------+-----------------+---------------------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349   |
| 2 | 孙悟空 | 斗战胜佛  | 159384292,022-483432,+86-392432 |
| 3 | 猪八戒 | 净坛使者  | 183208243,055-8234234   |
| 4 | 沙僧  | 金身罗汉  | 293842295,098-2383429   |
| 5 | NULL  | 白龙马   | 993267899      |
+----+-----------+-----------------+---------------------------------+

Table user2: Wukong's circle of friends
+----+--------------+-----------+
| id | user_name | comment |
+----+--------------+-----------+
| 1 | 孙悟空  | 美猴王 |
| 2 | 牛魔王  | 牛哥  |
| 3 | 铁扇公主  | 牛夫人 |
| 4 | 菩提老祖  | 葡萄  |
| 5 | NULL   | 晶晶  |
+----+--------------+-----------+

Table user1_kills: number of monsters killed on the journey
+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+
| 1 | 孙悟空 | 2013-01-10 00:00:00 | 10 |
| 2 | 孙悟空 | 2013-02-01 00:00:00 |  2 |
| 3 | 孙悟空 | 2013-02-05 00:00:00 | 12 |
| 4 | 孙悟空 | 2013-02-12 00:00:00 | 22 |
| 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 |
| 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 |
| 8 | 沙僧  | 2013-01-10 00:00:00 |  3 |
| 9 | 沙僧  | 2013-01-22 00:00:00 |  9 |
| 10 | 沙僧  | 2013-02-11 00:00:00 |  5 |
+----+-----------+---------------------+-------+

Table user1_equipment: the pilgrimage party's equipment
+----+-----------+--------------+-----------------+-----------------+
| id | user_name | arms   | clothing  | shoe   |
+----+-----------+--------------+-----------------+-----------------+
| 1 | 唐僧  | 九环锡杖  | 锦斓袈裟  | 僧鞋   |
| 2 | 孙悟空 | 金箍棒  | 梭子黄金甲  | 藕丝步云履  |
| 3 | 猪八戒 | 九齿钉耙  | 僧衣   | 僧鞋   |
| 4 | 沙僧  | 降妖宝杖  | 僧衣   | 僧鞋   |
+----+-----------+--------------+-----------------+-----------------+

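1. Use LEFT JOIN to optimize a NOT IN clause

Example: find the members of the pilgrimage party who are not in Wukong's circle of friends. The expected result is: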

+----+-----------+-----------------+-----------------------+
| id | user_name | comment   | mobile    |
+----+-----------+-----------------+-----------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349 |
| 3 | 猪八戒 | 净坛使者  | 183208243,055-8234234 |
| 4 | 沙僧  | 金身罗汉  | 293842295,098-2383429 |
+----+-----------+-----------------+-----------------------+

NOT IN version:

select * from user1 a where a.user_name not in (select user_name from user2 where user_name is not null);
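The `where user_name is not null` filter inside the subquery matters. The following is a minimal sketch of why, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is runnable without a server) and a reduced schema that keeps only the id and user_name columns. The variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user1 (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE user2 (id INTEGER PRIMARY KEY, user_name TEXT);
    INSERT INTO user1 VALUES (1,'唐僧'),(2,'孙悟空'),(3,'猪八戒'),(4,'沙僧'),(5,NULL);
    INSERT INTO user2 VALUES (1,'孙悟空'),(2,'牛魔王'),(3,'铁扇公主'),(4,'菩提老祖'),(5,NULL);
""")

# With the NULL filter, as in the article's query.
with_filter = [r[0] for r in conn.execute(
    "SELECT a.id FROM user1 a WHERE a.user_name NOT IN "
    "(SELECT user_name FROM user2 WHERE user_name IS NOT NULL) ORDER BY a.id")]

# Without the filter, the NULL in user2 makes each NOT IN test evaluate to
# false or NULL (unknown), never true, so no row passes the WHERE clause.
without_filter = [r[0] for r in conn.execute(
    "SELECT a.id FROM user1 a WHERE a.user_name NOT IN "
    "(SELECT user_name FROM user2) ORDER BY a.id")]

print(with_filter)     # ids of 唐僧, 猪八戒 and 沙僧
print(without_filter)  # empty list
```

The same three-valued logic also explains why the row with a NULL user_name in user1 (白龙马) is absent from the filtered result.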

LEFT JOIN version:

First, look at the outer-join result set produced by joining on user_name:

select a.*, b.* from user1 a left join user2 b on (a.user_name = b.user_name);
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+
| id | user_name | comment   | mobile       | id | user_name | comment |
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+
| 2 | 孙悟空 | 斗战胜佛  | 159384292,022-483432,+86-392432 | 1 | 孙悟空 | 美猴王 |
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349   | NULL | NULL  | NULL  |
| 3 | 猪八戒 | 净坛使者  | 183208243,055-8234234   | NULL | NULL  | NULL  |
| 4 | 沙僧  | 金身罗汉  | 293842295,098-2383429   | NULL | NULL  | NULL  |
| 5 | NULL  | 白龙马   | 993267899      | NULL | NULL  | NULL  |
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+

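Every row of table a appears. Columns from table b are shown only where b.user_name equals a.user_name; otherwise they are filled with NULL. So to find the party members who are not in Wukong's circle of friends, add the filter b.user_name is null to the WHERE clause.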

select a.* from user1 a left join user2 b on (a.user_name = b.user_name) where b.user_name is null;
+----+-----------+-----------------+-----------------------+
| id | user_name | comment   | mobile    |
+----+-----------+-----------------+-----------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349 |
| 3 | 猪八戒 | 净坛使者  | 183208243,055-8234234 |
| 4 | 沙僧  | 金身罗汉  | 293842295,098-2383429 |
| 5 | NULL  | 白龙马   | 993267899    |
+----+-----------+-----------------+-----------------------+

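The result set now contains one extra row, 白龙马. Its user_name is NULL, and NULL never compares equal to anything, so the row finds no match in b. Adding the filter a.user_name is not null removes it, and the result then matches the NOT IN version.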

select a.* from user1 a left join user2 b on (a.user_name = b.user_name) where b.user_name is null and a.user_name is not null;
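As a quick check that the two forms agree, here is a minimal sketch that runs the NOT IN query and both LEFT JOIN steps against the sample data. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a reduced schema with only the id and user_name columns; the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user1 (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE user2 (id INTEGER PRIMARY KEY, user_name TEXT);
    INSERT INTO user1 VALUES (1,'唐僧'),(2,'孙悟空'),(3,'猪八戒'),(4,'沙僧'),(5,NULL);
    INSERT INTO user2 VALUES (1,'孙悟空'),(2,'牛魔王'),(3,'铁扇公主'),(4,'菩提老祖'),(5,NULL);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

not_in = ids(
    "SELECT a.id FROM user1 a WHERE a.user_name NOT IN "
    "(SELECT user_name FROM user2 WHERE user_name IS NOT NULL) ORDER BY a.id")

# Step 1: anti-join only. The row with a NULL user_name slips through,
# because NULL = NULL is not true and the row therefore has no match in b.
anti_join = ids(
    "SELECT a.id FROM user1 a LEFT JOIN user2 b ON a.user_name = b.user_name "
    "WHERE b.user_name IS NULL ORDER BY a.id")

# Step 2: also exclude rows of a whose user_name is NULL.
anti_join_fixed = ids(
    "SELECT a.id FROM user1 a LEFT JOIN user2 b ON a.user_name = b.user_name "
    "WHERE b.user_name IS NULL AND a.user_name IS NOT NULL ORDER BY a.id")

print(not_in, anti_join, anti_join_fixed)
```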

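2. Use LEFT JOIN to optimize a scalar subquery

Example: show each party member's nickname in Wukong's circle of friends. The expected result is: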

+-----------+-----------------+-----------+
| user_name | comment   | comment2 |
+-----------+-----------------+-----------+
| 唐僧  | 旃檀功德佛  | NULL  |
| 孙悟空 | 斗战胜佛  | 美猴王 |
| 猪八戒 | 净坛使者  | NULL  |
| 沙僧  | 金身罗汉  | NULL  |
| NULL  | 白龙马   | NULL  |
+-----------+-----------------+-----------+

Subquery version:

select a.user_name, a.comment, (select comment from user2 b where b.user_name = a.user_name) comment2 from user1 a;

LEFT JOIN version:

select a.user_name, a.comment, b.comment comment2 from user1 a left join user2 b on (a.user_name = b.user_name);
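The sketch below runs both versions on the sample data and compares the rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a reduced schema without the mobile column; the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user1 (id INTEGER PRIMARY KEY, user_name TEXT, comment TEXT);
    CREATE TABLE user2 (id INTEGER PRIMARY KEY, user_name TEXT, comment TEXT);
    INSERT INTO user1 VALUES (1,'唐僧','旃檀功德佛'),(2,'孙悟空','斗战胜佛'),
        (3,'猪八戒','净坛使者'),(4,'沙僧','金身罗汉'),(5,NULL,'白龙马');
    INSERT INTO user2 VALUES (1,'孙悟空','美猴王'),(2,'牛魔王','牛哥'),
        (3,'铁扇公主','牛夫人'),(4,'菩提老祖','葡萄'),(5,NULL,'晶晶');
""")

# Scalar subquery: evaluated once per row of user1.
subquery_rows = conn.execute(
    "SELECT a.user_name, a.comment, "
    "(SELECT b.comment FROM user2 b WHERE b.user_name = a.user_name) AS comment2 "
    "FROM user1 a ORDER BY a.id").fetchall()

# LEFT JOIN: unmatched rows of user1 get NULL for comment2.
join_rows = conn.execute(
    "SELECT a.user_name, a.comment, b.comment AS comment2 "
    "FROM user1 a LEFT JOIN user2 b ON a.user_name = b.user_name "
    "ORDER BY a.id").fetchall()

for row in join_rows:
    print(row)
```

The two forms agree here because each user_name occurs at most once in user2. If user2 held duplicate names, the join would return one row per match, while in MySQL the scalar subquery would fail with a "Subquery returns more than 1 row" error, so the rewrite is only safe when the join column is unique on the right-hand side.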

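3. Use JOIN to optimize an aggregate subquery

Example: for each member of the pilgrimage party, find the date with the most kills. The expected result is: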

+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+
| 4 | 孙悟空 | 2013-02-12 00:00:00 | 22 |
| 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 |
| 9 | 沙僧  | 2013-01-22 00:00:00 |  9 |
+----+-----------+---------------------+-------+

Aggregate subquery version:

select * from user1_kills a where a.kills = (select max(b.kills) from user1_kills b where b.user_name = a.user_name);

JOIN version:

First, look at the result set of the table joined to itself. To save space, only 猪八戒's rows are shown:

select a.*, b.* from user1_kills a join user1_kills b on (a.user_name = b.user_name) order by 1;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills | id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 | 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 |
| 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 | 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 |
| 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 | 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 |
| 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 | 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 |
| 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 | 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 |
| 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 | 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 | 5 | 猪八戒 | 2013-01-11 00:00:00 | 20 |
| 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 | 6 | 猪八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 | 7 | 猪八戒 | 2013-02-08 00:00:00 | 35 |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+

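With the table self-joined on user_name, each row of a is paired with every row of the same user in b. Group by all columns of a and take max(b.kills) for each group; a row qualifies when a.kills = max(b.kills). The SQL is: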

select a.* from user1_kills a join user1_kills b on (a.user_name = b.user_name) group by a.id, a.user_name, a.timestr, a.kills having a.kills = max(b.kills);
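To confirm that the correlated subquery and the self-join return the same rows, here is a minimal sketch over the sample user1_kills data. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user1_kills (id INTEGER PRIMARY KEY, user_name TEXT,
                              timestr TEXT, kills INTEGER);
    INSERT INTO user1_kills VALUES
        (1,'孙悟空','2013-01-10 00:00:00',10),(2,'孙悟空','2013-02-01 00:00:00',2),
        (3,'孙悟空','2013-02-05 00:00:00',12),(4,'孙悟空','2013-02-12 00:00:00',22),
        (5,'猪八戒','2013-01-11 00:00:00',20),(6,'猪八戒','2013-02-07 00:00:00',17),
        (7,'猪八戒','2013-02-08 00:00:00',35),(8,'沙僧','2013-01-10 00:00:00',3),
        (9,'沙僧','2013-01-22 00:00:00',9),(10,'沙僧','2013-02-11 00:00:00',5);
""")

# Correlated aggregate subquery: recomputes max(kills) for each outer row.
subquery_rows = conn.execute(
    "SELECT a.* FROM user1_kills a WHERE a.kills = "
    "(SELECT MAX(b.kills) FROM user1_kills b WHERE b.user_name = a.user_name) "
    "ORDER BY a.id").fetchall()

# Self-join, one group per row of a, keep the row matching the group maximum.
join_rows = conn.execute(
    "SELECT a.* FROM user1_kills a JOIN user1_kills b ON a.user_name = b.user_name "
    "GROUP BY a.id, a.user_name, a.timestr, a.kills "
    "HAVING a.kills = MAX(b.kills) ORDER BY a.id").fetchall()

for row in join_rows:
    print(row)
```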

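4. Use JOIN to select the top rows per group

Example: extend example 3 and find, for each member of the pilgrimage party, the two dates with the most kills. The expected result is: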

+----+-----------+---------------------+-------+
| id | user_name | timestr       | kills |
+----+-----------+---------------------+-------+
| 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 |
| 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 |
| 5 | 猪八戒  | 2013-01-11 00:00:00 |  20 |
| 7 | 猪八戒  | 2013-02-08 00:00:00 |  35 |
| 9 | 沙僧   | 2013-01-22 00:00:00 |   9 |
| 10 | 沙僧   | 2013-02-11 00:00:00 |   5 |
+----+-----------+---------------------+-------+

In Oracle, this can be done with an analytic function:

select b.* from (select a.*, row_number() over(partition by user_name order by kills desc) cnt from user1_kills a) b where b.cnt <= 2;

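Unfortunately, in MySQL versions before 8.0 the SQL above fails with ERROR 1064 (42000): You have an error in your SQL syntax; because those versions do not support analytic (window) functions. MySQL 8.0 and later do support row_number(). On older versions the same result can be obtained as follows.

First self-join the table. To save space, only 孙悟空's rows are shown: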

select a.*, b.* from user1_kills a join user1_kills b on (a.user_name=b.user_name and a.kills<=b.kills) order by a.user_name, a.kills desc;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| id | user_name | timestr       | kills | id | user_name | timestr       | kills |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 | 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 |
| 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 | 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 |
| 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 | 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 |
| 1 | 孙悟空  | 2013-01-10 00:00:00 |  10 | 1 | 孙悟空  | 2013-01-10 00:00:00 |  10 |
| 1 | 孙悟空  | 2013-01-10 00:00:00 |  10 | 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 |
| 1 | 孙悟空  | 2013-01-10 00:00:00 |  10 | 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 |
| 2 | 孙悟空  | 2013-02-01 00:00:00 |   2 | 1 | 孙悟空  | 2013-01-10 00:00:00 |  10 |
| 2 | 孙悟空  | 2013-02-01 00:00:00 |   2 | 3 | 孙悟空  | 2013-02-05 00:00:00 |  12 |
| 2 | 孙悟空  | 2013-02-01 00:00:00 |   2 | 4 | 孙悟空  | 2013-02-12 00:00:00 |  22 |
| 2 | 孙悟空  | 2013-02-01 00:00:00 |   2 | 2 | 孙悟空  | 2013-02-01 00:00:00 |   2 |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+

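From the table above we can see that 孙悟空's top two kill counts are 22 and 12. Because the join keeps only the b rows with kills >= a.kills, the number of b rows paired with an a row is that row's rank within its user. So group by all columns of a, count b.id, and keep the groups whose count is at most 2. Note that this assumes kill counts are not tied within one user; tied rows count each other and can push a row past the limit. The rewritten SQL is: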

select a.* from user1_kills a join user1_kills b on (a.user_name=b.user_name and a.kills<=b.kills) group by a.id, a.user_name, a.timestr, a.kills having count(b.id) <= 2;
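The sketch below runs the self-join query on the sample data and cross-checks it against a top-two selection computed in plain Python. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the helper variables (by_user, expected_ids) are illustrative and not part of the original article.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user1_kills (id INTEGER PRIMARY KEY, user_name TEXT,
                              timestr TEXT, kills INTEGER);
    INSERT INTO user1_kills VALUES
        (1,'孙悟空','2013-01-10 00:00:00',10),(2,'孙悟空','2013-02-01 00:00:00',2),
        (3,'孙悟空','2013-02-05 00:00:00',12),(4,'孙悟空','2013-02-12 00:00:00',22),
        (5,'猪八戒','2013-01-11 00:00:00',20),(6,'猪八戒','2013-02-07 00:00:00',17),
        (7,'猪八戒','2013-02-08 00:00:00',35),(8,'沙僧','2013-01-10 00:00:00',3),
        (9,'沙僧','2013-01-22 00:00:00',9),(10,'沙僧','2013-02-11 00:00:00',5);
""")

# COUNT(b.id) is the number of rows of the same user with kills >= a.kills,
# which is the rank of row a within its user when there are no ties.
top2_ids = [row[0] for row in conn.execute(
    "SELECT a.id FROM user1_kills a JOIN user1_kills b "
    "ON a.user_name = b.user_name AND a.kills <= b.kills "
    "GROUP BY a.id, a.user_name, a.timestr, a.kills "
    "HAVING COUNT(b.id) <= 2 ORDER BY a.id")]

# Cross-check: collect (kills, id) per user and take the two largest in Python.
by_user = {}
for id_, name, kills in conn.execute("SELECT id, user_name, kills FROM user1_kills"):
    by_user.setdefault(name, []).append((kills, id_))
expected_ids = sorted(
    id_ for pairs in by_user.values() for _, id_ in sorted(pairs, reverse=True)[:2])

print(top2_ids)
print(expected_ids)
```

The self-join pairs each row with every higher-or-equal row of the same user, so its cost grows with the square of the group size; on MySQL 8.0 or later the row_number() form is usually the simpler choice.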

5. Use a Cartesian-product join to turn one column into multiple rows
