Using EXISTS and IN in MySQL

xiaoxiao2022-07-03 225

Contents

1. EXISTS
  1.1. Special cases
  1.2. NOT EXISTS
2. Using IN and NOT IN
  2.1. IN
  2.2. NOT IN
  2.3. Number of columns returned
3. SQL analysis
  3.1. EXISTS versus IN
  3.2. NOT EXISTS versus NOT IN
4. Efficiency
  4.1. IN and EXISTS
  4.2. NOT IN and NOT EXISTS
  4.3. Conclusions
5. IN versus OR

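1. EXISTS

EXISTS checks whether a subquery would return at least one row. The subquery's rows are never handed back to the outer query; the operator only evaluates to true or false.

Syntax: exists subquery. The subquery argument is a restricted select statement (no compute clause and no into keyword; this wording matches the SQL Server definition, and MySQL has no compute clause). The result type is Boolean: true if the subquery returns any rows.

Conceptually, an EXISTS query loops over the outer table one row at a time and evaluates the EXISTS condition for each row. If the subquery returns any rows (how many does not matter), the condition is true and the current outer row is kept. If it returns none, the current outer row is discarded.

In other words, the EXISTS condition behaves like a boolean test: true when the subquery produces a result set, false when it does not.

select * from A where exists (select * from B where A.id=B.id)

Here the outer table A drives the loop, and each iteration queries the inner table B (the table inside EXISTS).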
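The loop model above can be sketched in Python. This is only an illustration under stated assumptions: it uses SQLite (via the standard sqlite3 module) rather than MySQL, and the sample rows in A and B are made up. It runs the EXISTS query from the text and then repeats the same logic as an explicit loop that probes B once per row of A.

```python
import sqlite3

# Hypothetical sample data; tables A and B mirror the names used in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (2), (3), (3), (4);
""")

# The EXISTS query from the text.
sql_rows = conn.execute(
    "SELECT * FROM A WHERE EXISTS (SELECT * FROM B WHERE A.id = B.id) ORDER BY id"
).fetchall()

# The same logic as an explicit loop: one probe of B per row of A.
loop_rows = []
for a in conn.execute("SELECT id, name FROM A ORDER BY id").fetchall():
    probe = conn.execute(
        "SELECT 1 FROM B WHERE B.id = ? LIMIT 1", (a[0],)
    ).fetchone()
    if probe is not None:  # the subquery returned at least one row
        loop_rows.append(a)

print(sql_rows)
print(loop_rows)
```

Note that id 3 appears twice in B but only once in the result: EXISTS only asks whether a match exists, not how many there are.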

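1.1. Special cases

Consider:

select * from user where exists (select 1);

Rows are taken from user one at a time, and because the subquery select 1 always returns a row, every row of user ends up in the result. The query is therefore equivalent to select * from user.

Another example:

select * from user where exists (select * from user where userId = 0);

Each row of user is checked against the subquery (select * from user where userId = 0). Assuming no row has userId equal to 0, the subquery is always empty, the condition is always false, and every row of user is discarded.

A subquery that selects NULL still returns a row, so EXISTS is still true:

select * from TableIn where exists (select null)

is equivalent to:

select * from TableIn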
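The three special cases can be checked with a small script. As a sketch only: it uses SQLite instead of MySQL, and the contents of the user table are invented for the example (in particular, no row has userId 0).

```python
import sqlite3

# Hypothetical user table with three rows and no userId = 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userId INTEGER, name TEXT);
    INSERT INTO user VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

all_rows = count("SELECT COUNT(*) FROM user")
# A constant subquery always yields one row, so every outer row is kept.
with_select_1 = count("SELECT COUNT(*) FROM user WHERE EXISTS (SELECT 1)")
# A row containing NULL is still a row.
with_select_null = count("SELECT COUNT(*) FROM user WHERE EXISTS (SELECT NULL)")
# An empty subquery makes the condition false for every outer row.
with_empty = count(
    "SELECT COUNT(*) FROM user "
    "WHERE EXISTS (SELECT * FROM user WHERE userId = 0)"
)

print(all_rows, with_select_1, with_select_null, with_empty)
```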

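1.2. NOT EXISTS

NOT EXISTS is the opposite of EXISTS: when the subquery returns any rows, the current outer row is discarded; otherwise it is added to the result.

In short, if table A has n rows, an EXISTS query conceptually takes those n rows one at a time and evaluates the EXISTS condition n times.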

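2. Using IN and NOT IN

2.1. IN

An IN test against a list is equivalent to a chain of OR conditions, which makes it easy to reason about. For example:

select * from user where userId in (1, 2, 3);

is equivalent to

select * from user where userId = 1 or userId = 2 or userId = 3;

2.2. NOT IN

NOT IN is the opposite of IN:

select * from user where userId not in (1, 2, 3);

is equivalent to

select * from user where userId != 1 and userId != 2 and userId != 3;

In short, the conceptual model for IN with a subquery is: first run the subquery in full to get a result set B of m rows, then expand it into m values and test the outer column against each of them.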
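The equivalences in 2.1 and 2.2 are easy to confirm. The following is a minimal sketch that assumes SQLite in place of MySQL and a made-up user table holding userId values 1 through 5.

```python
import sqlite3

# Hypothetical user table with userId values 1..5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userId INTEGER);
    INSERT INTO user VALUES (1), (2), (3), (4), (5);
""")

def ids(where):
    """Return the sorted userId values matching a WHERE clause."""
    sql = "SELECT userId FROM user WHERE " + where + " ORDER BY userId"
    return [row[0] for row in conn.execute(sql)]

in_ids = ids("userId IN (1, 2, 3)")
or_ids = ids("userId = 1 OR userId = 2 OR userId = 3")
not_in_ids = ids("userId NOT IN (1, 2, 3)")
and_ids = ids("userId != 1 AND userId != 2 AND userId != 3")

print(in_ids, or_ids)
print(not_in_ids, and_ids)
```

Both pairs match here because neither the list nor the column contains NULL.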

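2.3. Number of columns returned

Note that when the left side of IN is a single column, the subquery must return exactly one column. For example,

select * from user where userId in (select id from B);

is valid, whereas

select * from user where userId in (select id, age from B); # invalid: this statement fails

is not. EXISTS has no such restriction.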
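A quick way to see the column restriction is to run both forms and catch the failure. This sketch again assumes SQLite rather than MySQL, so the exact error text differs from what MySQL reports, and the rows in user and B are invented.

```python
import sqlite3

# Hypothetical tables; B has two columns as in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userId INTEGER);
    CREATE TABLE B (id INTEGER, age INTEGER);
    INSERT INTO user VALUES (1), (2), (3);
    INSERT INTO B VALUES (1, 20), (3, 30);
""")

# One column in the subquery: accepted.
one_col = conn.execute(
    "SELECT userId FROM user WHERE userId IN (SELECT id FROM B) ORDER BY userId"
).fetchall()

# Two columns in the subquery: rejected by the engine.
try:
    conn.execute(
        "SELECT userId FROM user WHERE userId IN (SELECT id, age FROM B)"
    ).fetchall()
    two_col_error = None
except sqlite3.Error as exc:
    two_col_error = str(exc)

# EXISTS accepts any select list, because only row existence matters.
exists_rows = conn.execute(
    "SELECT userId FROM user "
    "WHERE EXISTS (SELECT id, age FROM B WHERE B.id = user.userId) "
    "ORDER BY userId"
).fetchall()

print(one_col, exists_rows)
print("error:", two_col_error)
```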

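3. SQL analysis

3.1. EXISTS versus IN

The two queries:

# Query 1:
select * from A where exists (select * from B where B.id = A.id);
# Query 2:
select * from A where A.id in (select id from B);

Query 1 can be read as the following pseudocode:

for (int i = 0; i < count(A); i++) {
    a = get_record(A, i);                # fetch rows from A one at a time
    if (has_row(B, "id", a[id])) {       # subquery finds a row in B with B.id = a[id]
        result[] = a;
    }
}
return result;

Query 1 mainly benefits from an index on B.id. Table A is read row by row either way, so how A is indexed should have little effect on it.

Query 2 can be rewritten as follows (assuming the ids in B are 1, 2 and 3):

select * from A where A.id = 1 or A.id = 2 or A.id = 3;

This form mainly uses the index on A.id; how B is indexed has little effect.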
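The two access patterns can be mimicked in plain Python. This is a conceptual sketch, not how MySQL is implemented: a set stands in for an index on B.id, a dict stands in for a unique index on A.id, and the rows are made up.

```python
# Hypothetical rows for A and ids for B.
A = [{"id": 1, "name": "a1"}, {"id": 2, "name": "a2"}, {"id": 3, "name": "a3"}]
B_ids = [2, 3, 3, 4]

# EXISTS-style: scan A, probe an "index" on B.id for each row.
b_index = set(B_ids)
exists_result = [a for a in A if a["id"] in b_index]

# IN-style: evaluate the subquery once, then look each value up
# through an "index" on A.id.
a_index = {a["id"]: a for a in A}
in_result = [a_index[v] for v in sorted(set(B_ids)) if v in a_index]

print(exists_result)
print(in_result)
```

Both strategies return the same rows; what changes is which table is scanned and which one is probed, which is the basis for the size-based advice in section 4.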

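3.2. NOT EXISTS versus NOT IN

# Query 1:
select * from A where not exists (select * from B where B.id = A.id);
# Query 2:
select * from A where A.id not in (select id from B);

Query 1 works as before and uses the index on B.id.

Query 2 can be rewritten as:

select * from A where A.id != 1 and A.id != 2 and A.id != 3;

Written this way, NOT IN is a chain of != comparisons, which an index generally cannot serve well, so in effect every row of A has to be compared against all the values from B to see whether it is present there.

On that reasoning, NOT EXISTS is more efficient than NOT IN. Newer MySQL versions may rewrite both forms into the same plan, so it is worth confirming with EXPLAIN.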
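Apart from speed, the two forms can return different rows when the subquery column contains NULL, which is worth knowing before swapping one for the other. The sketch below assumes SQLite instead of MySQL (both follow standard three-valued logic here) and uses invented data in which B.id includes a NULL.

```python
import sqlite3

# Hypothetical data: B.id contains a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1), (NULL);
""")

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

# "id != NULL" is unknown, so NOT IN can never be true for any row.
not_in = ids("SELECT id FROM A WHERE id NOT IN (SELECT id FROM B) ORDER BY id")

# NOT EXISTS only asks whether a matching row was found.
not_exists = ids(
    "SELECT id FROM A "
    "WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.id = A.id) ORDER BY id"
)

# Filtering NULLs out of the subquery makes NOT IN agree with NOT EXISTS.
not_in_filtered = ids(
    "SELECT id FROM A "
    "WHERE id NOT IN (SELECT id FROM B WHERE id IS NOT NULL) ORDER BY id"
)

print(not_in, not_exists, not_in_filtered)
```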

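4. Efficiency

4.1. IN and EXISTS

The usual description is that IN joins the outer and inner tables with a hash join, whereas EXISTS loops over the outer table and queries the inner table on each iteration. (Treat this as a conceptual model: MySQL only gained hash joins in 8.0.18, and the optimizer may choose a different plan.) It has long been said that EXISTS is more efficient than IN, but that is not accurate as a general rule; it depends on the situation.

If the two tables are of similar size, IN and EXISTS perform about the same.

If one table is small and the other is large, use EXISTS when the subquery table is the large one and IN when the subquery table is the small one.

Example:

Table A is small, table B is large.

1. Table B (large) in the subquery:

select * from A where cc in (select cc from B); # IN is slower here; it uses the index on A.cc
select * from A where exists(select cc from B where cc=A.cc); # EXISTS is faster here; it uses the index on B.cc

When the large table is in the subquery, EXISTS is more efficient, provided the column is indexed.

2. Table A (small) in the subquery:

select * from B where cc in (select cc from A); # IN is faster here; it uses the index on B.cc
select * from B where exists(select cc from A where cc=B.cc); # EXISTS is slower here; it uses the index on A.cc

When the small table is in the subquery, IN is more efficient.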

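4.2. NOT IN and NOT EXISTS

According to this analysis, a query that uses NOT IN performs a full scan of both the inner and outer tables without using an index, whereas the subquery in NOT EXISTS can still use the index on its table.

So whichever table is larger, NOT EXISTS is faster than NOT IN, because it can use the index.

4.3. Conclusions

If the main table and the subquery table are of similar size, IN and EXISTS perform about the same;

When the large table is in the subquery, EXISTS is more efficient, provided the column is indexed;

When the small table is in the subquery, IN is more efficient;

Whichever table is larger, NOT EXISTS is faster than NOT IN.

To sum up: the claim that EXISTS is always more efficient than IN is inaccurate.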

5. IN versus OR

select name from student where name in ('zhang','wang','li','zhao');

and

select name from student where name='zhang' or name='li' or name='wang' or name='zhao'

return the same result.

Reposted from: https://www.cnblogs.com/beijingstruggle/p/5885137.html
