Index Condition Pushdown Optimization

Summary:
Index Condition Pushdown (ICP) is an optimization for the case where MySQL retrieves rows from a table using an index.

Index Condition Pushdown (ICP) is an optimization for the case where MySQL retrieves rows from a table using an index. Without ICP, the storage engine traverses the index to locate rows in the base table and returns them to the MySQL server, which evaluates the WHERE condition for the rows. With ICP enabled, if parts of the WHERE condition can be evaluated using only fields from the index, the MySQL server pushes this part of the WHERE condition down to the storage engine. The storage engine then evaluates the pushed index condition against the index entry, and reads the row from the table only if the condition is satisfied. ICP can reduce the number of times the storage engine must access the base table and the number of times the MySQL server must access the storage engine.

Index Condition Pushdown optimization is used for the range, ref, eq_ref, and ref_or_null access methods when there is a need to access full table rows. MariaDB also offers a related optimization, Batched Key Access.

This strategy can be used for InnoDB and MyISAM tables. (Note that index condition pushdown is not supported with partitioned tables in MySQL 5.6; this issue is resolved in MySQL 5.7.) For InnoDB tables, however, ICP is used only for secondary indexes. The goal of ICP is to reduce the number of full-record reads and thereby reduce IO operations. For InnoDB clustered indexes, the complete record is already read into the InnoDB buffer, so using ICP in this case does not reduce IO.

The idea is to check the part of the WHERE condition that refers to index fields (we call it the Pushed Index Condition) as soon as we've accessed the index. If the Pushed Index Condition is not satisfied, we won't need to read the whole table record.

To see how this optimization works, consider first how an index scan proceeds when Index Condition Pushdown is not used:

  1. Get the next row, first by reading the index tuple, and then by using the index tuple to locate and read the full table row.

  2. Test the part of the WHERE condition that applies to this table. Accept or reject the row based on the test result.

A diagram makes this easier to follow:

[Figure 1: index scan without Index Condition Pushdown]

When Index Condition Pushdown is used, the scan proceeds like this instead:

  1. Get the next row's index tuple (but not the full table row).

  2. Test the part of the WHERE condition that applies to this table and can be checked using only index columns. If the condition is not satisfied, proceed to the index tuple for the next row.

  3. If the condition is satisfied, use the index tuple to locate and read the full table row.

  4. Test the remaining part of the WHERE condition that applies to this table. Accept or reject the row based on the test result.

As a diagram:

[Figure 2: index scan with Index Condition Pushdown]
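
The difference between the two scans can be sketched with a concrete query. The people table, its columns, and the idx_zip_name index are hypothetical names chosen for illustration, and the comments assume the optimizer picks that index:

```sql
-- Hypothetical table for illustration; not part of the original example.
CREATE TABLE people (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  zipcode   CHAR(5)      NOT NULL,
  lastname  VARCHAR(40)  NOT NULL,
  firstname VARCHAR(40)  NOT NULL,
  address   VARCHAR(200) NOT NULL,
  KEY idx_zip_name (zipcode, lastname, firstname)
) ENGINE=InnoDB;

SELECT *
FROM people
WHERE zipcode = '95054'              -- positions the scan in idx_zip_name
  AND lastname LIKE '%etrunia%'      -- index column only: can be tested on the index tuple (step 2)
  AND address LIKE '%Main Street%';  -- not in the index: tested after the full row is read (step 4)
```

Without ICP, every row with a matching zipcode is read in full before lastname is tested. With ICP, rows whose index tuple fails the lastname test are skipped without a full-row read.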

Index Condition Pushdown is enabled by default; it can be controlled with the optimizer_switch system variable by setting the index_condition_pushdown flag. See Section 8.9.2, “Controlling Switchable Optimizations”.
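
A minimal sketch of toggling the flag for the current session, for example to compare plans or timings with and without ICP:

```sql
-- Inspect the current optimizer flags (look for index_condition_pushdown=on).
SELECT @@optimizer_switch;

-- Disable ICP for this session only.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';

-- Re-enable it.
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
```

Setting the flag at session scope leaves other connections unaffected.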

When index condition pushdown is used, the Extra column of the EXPLAIN output shows "Using index condition". For example:

Mysql [test]> explain select * from tbl where key_col1 between 10 and 11 and key_col2 like '%foo%';
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows | Extra                 |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
|  1 | SIMPLE      | tbl   | range | key_col1      | key_col1 | 5       | NULL |    2 | Using index condition |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
1 row in set (0.01 sec)
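
The definition of tbl is not shown. A schema along the following lines would be consistent with the plan above. The column types and the id and other_col columns are assumptions. The index named key_col1 has to contain key_col2 as well for the LIKE predicate to be checked from the index, and a key_len of 5 matches a nullable INT as the only key part used for the range:

```sql
-- Assumed schema; the original does not show the table definition.
CREATE TABLE tbl (
  id        INT NOT NULL PRIMARY KEY,
  key_col1  INT,            -- nullable INT: 4 bytes + 1 NULL-flag byte = key_len 5
  key_col2  VARCHAR(64),
  other_col VARCHAR(255),   -- a column outside the index, so full rows must be read
  KEY key_col1 (key_col1, key_col2)
) ENGINE=InnoDB;

EXPLAIN SELECT * FROM tbl
WHERE key_col1 BETWEEN 10 AND 11   -- range condition on the first key part
  AND key_col2 LIKE '%foo%';       -- cannot bound the range, but can be pushed as an index condition
```

Whether the optimizer actually chooses this plan depends on the data and statistics, so the output may differ on a small or empty table.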

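So how much of a performance gain can it bring?

The answer depends on how many records ICP filters out and on the cost of fetching the rows. The former depends on the query and the data; the latter depends on the number of rows in the table, the kind of disk it is stored on, and whether the table has large columns (for example, BLOBs).

Consider an example: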

alter table lineitem add index s_r (l_shipdate, l_receiptdate);
select count(*) from lineitem where
  l_shipdate between '1993-01-01' and '1993-02-01' and
  datediff(l_receiptdate,l_shipdate) > 25 and
  l_quantity > 40;

Without index condition pushdown:
+----------+-------+---------------+-----+---------+------+--------+-------------+
| table    | type  | possible_keys | key | key_len | ref  | rows   | Extra       |
+----------+-------+---------------+-----+---------+------+--------+-------------+
| lineitem | range | s_r           | s_r | 4       | NULL | 152064 | Using where |
+----------+-------+---------------+-----+---------+------+--------+-------------+

With index condition pushdown:
+----------+-------+---------------+-----+---------+------+--------+------------------------------------+
| table    | type  | possible_keys | key | key_len | ref  | rows   | Extra                              |
+----------+-------+---------------+-----+---------+------+--------+------------------------------------+
| lineitem | range | s_r           | s_r | 4       | NULL | 152064 | Using index condition; Using where |
+----------+-------+---------------+-----+---------+------+--------+------------------------------------+

And the speedup?

  • Cold buffer pool: from 5 min down to 1 min

  • Hot buffer pool: from 0.19 sec down to 0.07 sec

MariaDB exposes two variables that report the status of index condition pushdown:

Status variables

There are two server status variables:

Variable name           Meaning
Handler_icp_attempts    Number of times pushed index condition was checked.
Handler_icp_match       Number of times the condition was matched.

That way, the value Handler_icp_attempts - Handler_icp_match shows the number of records that the server did not have to read because of Index Condition Pushdown.
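
On MariaDB, the two counters can be read per session roughly as follows. The query reuses the tbl example from earlier, and FLUSH STATUS may need additional privileges on some setups:

```sql
-- MariaDB only: reset the session counters, run the query, then read the ICP counters.
FLUSH STATUS;

SELECT * FROM tbl
WHERE key_col1 BETWEEN 10 AND 11
  AND key_col2 LIKE '%foo%';

SHOW SESSION STATUS LIKE 'Handler_icp%';
-- Full-row reads avoided = Handler_icp_attempts - Handler_icp_match
```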

If you are not using MariaDB, how can you tell how much index condition pushdown helped? You can look at the Handler_read_next status variable.
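
One way to make that comparison, sketched here with the lineitem query from the example above: run the statement once with ICP disabled and once with it enabled, resetting the session counters in between. The counter values you see will depend on the server version, storage engine, and data.

```sql
-- Run 1: ICP disabled.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
FLUSH STATUS;
SELECT COUNT(*) FROM lineitem
WHERE l_shipdate BETWEEN '1993-01-01' AND '1993-02-01'
  AND DATEDIFF(l_receiptdate, l_shipdate) > 25
  AND l_quantity > 40;
SHOW SESSION STATUS LIKE 'Handler_read_next';

-- Run 2: ICP enabled. Compare Handler_read_next with the value from run 1.
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
FLUSH STATUS;
SELECT COUNT(*) FROM lineitem
WHERE l_shipdate BETWEEN '1993-01-01' AND '1993-02-01'
  AND DATEDIFF(l_receiptdate, l_shipdate) > 25
  AND l_quantity > 40;
SHOW SESSION STATUS LIKE 'Handler_read_next';
```

A lower Handler_read_next in the second run suggests that rows were rejected inside the storage engine before being handed to the server.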

References:

http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html 

https://mariadb.com/kb/en/mariadb/index-condition-pushdown/ 
