6miu盘搜

[MySQL 优化] --order by 原理

xiaoxiao2023-08-09 154

一、简介：

大部分查询中都要使用到order by。那么处理排序的方法有两种：一种是使用索引，另外一种则是查询时候使用Filesort处理。

1. 利用索引进行排：

利用有序索引进行排序，当 Query ORDER BY 条件和 Query 的执行计划中所利用的 Index 的索引键完全一致，且索引访问方式为 rang、 ref 或者 index 的时候，MySQL 可以利用索引顺序而直接取得已经排好序的数据。因为 MySQL 不需要进行实际的排序操作

2. 利用内存或者磁盘排序算法：

1. single pass 1. two pass

优缺点对比

优点缺点使用索引执行insert，update，delete查询时，索引已经排序好，只需要依次读取即可，处理速度快1.处理insert，update，delete时需要额外添加索引，所以很慢 2.索引需要更多的磁盘空间需要合理的规划好 3.索引个数越多，越需要更多innodb缓冲池使用Filesort1.因不必创建索引，故无使用索引时那么多的缺点 2.若要排序的记录不多，在内存中进行Filesort处理，速度也非常快1.因为要在执行查询时进行排序处理，目标的数据越多，则响应的速度会越慢

不能使用索引进行排序的场景：

排序的基准太多，无法依据某个基准创建索引要对group by 的结果或者DISTINCT的结果进行排序时对临时表的结果(union union all[5.7进行优化没有临时表的出现])重新排序时

二、举例：

表结构：

5.7@3306>[employees]>show create table employees\G *************************** 1. row *************************** Table: employees Create Table: CREATE TABLE `employees` ( `emp_no` int(11) NOT NULL, `birth_date` date NOT NULL, `first_name` varchar(14) NOT NULL, `last_name` varchar(16) NOT NULL, `gender` enum('M','F') NOT NULL, `hire_date` date NOT NULL, PRIMARY KEY (`emp_no`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8

1.使用索引进行排序：

5.7@3306>[employees]>desc select emp_no,first_name, last_name from employees order by emp_no desc limit 0,10; +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------+ | 1 | SIMPLE | employees | NULL | index | NULL | PRIMARY | 4 | NULL | 10 | 100.00 | NULL | +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------+

2.使用single pass:

5.7@3306>[employees]>show status like '%sort%'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | Sort_merge_passes | 0 | | Sort_range | 0 | | Sort_rows | 0 | | Sort_scan | 0 | +-------------------+-------+ 4 rows in set (0.00 sec) 5.7@3306>[employees]>select emp_no,first_name, last_name from employees order by first_name limit 100; 分析：将select的列，order 列放入到排序缓冲，进行排序处理，排序完成后，直接将排序缓冲中的内容返回。 5.7@3306>[employees]>show status like '%sort%'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | Sort_merge_passes | 0 | | Sort_range | 0 | | Sort_rows | 100 | | Sort_scan | 1 | +-------------------+-------+ 4 rows in set (0.00 sec) 5.7@3306>[employees]>desc select emp_no,first_name, last_name from employees order by first_name limit 100; +----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+ | 1 | SIMPLE | employees | NULL | ALL | NULL | NULL | NULL | NULL | 299423 | 100.00 | Using filesort | +----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+ 1 row in set, 1 warning (0.00 sec) 那么这样会造成什么问题：IO，网络等

3.Two pass

5.7@3306>[employees]>show variables like "%max_length%"; +--------------------------+-------+ | Variable_name | Value | +--------------------------+-------+ | max_length_for_sort_data | 1024 | +--------------------------+-------+ 1 row in set (0.00 sec) 排序的值需要大于上面的值 5.7@3306>[employees]>show status like '%sort%'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | Sort_merge_passes | 0 | | Sort_range | 0 | | Sort_rows | 0 | | Sort_scan | 0 | +-------------------+-------+ 4 rows in set (0.00 sec) 5.7@3306>[employees]> select count(1) from ( select * from employees order by first_name desc limit 100000000 ) a; +----------+ | count(1) | +----------+ | 300024 | +----------+ 1 row in set (0.87 sec) 5.7@3306>[employees]>show status like '%sort%'; +-------------------+--------+ | Variable_name | Value | +-------------------+--------+ | Sort_merge_passes | 25 | | Sort_range | 0 | | Sort_rows | 300024 | | Sort_scan | 1 | +-------------------+--------+ 4 rows in set (0.00 sec) 解释：Sort_merge_passes 超出sort buffer的值，将数据写入到了tmp file中

三、优化filesort：

优先选择第一种using index 的排序方式，在第一种方式无法满足的情况下，尽可能让 MySQL 选择使用第二种单路算法来进行排序。这样可以减少大量的随机IO操作,很大幅度地提高排序工作的效率。1、去掉不必要的字段2、加大max_length_for_sort_data 参数的设置3、增大sort buffer的设置

相关资源：敏捷开发V1.0.pptx

最新回复(0)