执行效率高但“低效”的SQL优化

近期整理案例时发现,同事曾咨询过一个问题,数据库服务器CPU频繁报警,同事最终定位到一个TOP SQL,但是执行效率已经”算不错“,是否还有优化的余地?

SQL_TEXT如下:

SELECT DISTINCT S.*,
                BD.*,
                SD.*,
                UD.*
FROM T S
LEFT JOIN T1 BD ON S.DATA_OBJECT_ID = BD.DATA_OBJECT_ID
AND S.OBJECT_ID = BD.OBJECT_ID
LEFT JOIN T2 SD ON S.DATA_OBJECT_ID = SD.DATA_OBJECT_ID
AND S.OBJECT_ID = SD.OBJECT_ID
LEFT JOIN T3 UD ON S.DATA_OBJECT_ID = UD.DATA_OBJECT_ID
AND S.OBJECT_ID = UD.OBJECT_ID
WHERE S.OBJECT_ID = 88
  AND 1 = 1

原执行计划如下:
Plan hash value: 1035031963

----------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        | 84081 (100)|          |      1 |00:00:00.60 |     309K|
|   1 |  HASH UNIQUE          |      |      1 |      1 | 84081   (1)| 00:16:49 |      1 |00:00:00.60 |     309K|
|*  2 |   HASH JOIN OUTER     |      |      1 |      1 | 84080   (1)| 00:16:49 |      1 |00:00:00.60 |     309K|
|*  3 |    HASH JOIN OUTER    |      |      1 |      1 | 79935   (1)| 00:16:00 |      1 |00:00:00.55 |     294K|
|*  4 |     HASH JOIN OUTER   |      |      1 |      1 | 75791   (1)| 00:15:10 |      1 |00:00:00.50 |     278K|
|*  5 |      TABLE ACCESS FULL| T    |      1 |      1 | 71647   (1)| 00:14:20 |      1 |00:00:00.44 |     263K|
|*  6 |      TABLE ACCESS FULL| T3   |      1 |    168 |  4144   (1)| 00:00:50 |      1 |00:00:00.05 |   15260 |
|*  7 |     TABLE ACCESS FULL | T2   |      1 |    168 |  4144   (1)| 00:00:50 |      1 |00:00:00.05 |   15260 |
|*  8 |    TABLE ACCESS FULL  | T1   |      1 |    168 |  4145   (1)| 00:00:50 |      1 |00:00:00.05 |   15260 |
----------------------------------------------------------------------------------------------------------------

 

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("S"."OBJECT_ID"="BD"."OBJECT_ID" AND "S"."DATA_OBJECT_ID"="BD"."DATA_OBJECT_ID")
   3 - access("S"."OBJECT_ID"="SD"."OBJECT_ID" AND "S"."DATA_OBJECT_ID"="SD"."DATA_OBJECT_ID")
   4 - access("S"."OBJECT_ID"="UD"."OBJECT_ID" AND "S"."DATA_OBJECT_ID"="UD"."DATA_OBJECT_ID")
   5 - filter("S"."OBJECT_ID"=88)
   6 - filter("UD"."OBJECT_ID"=88)
   7 - filter("SD"."OBJECT_ID"=88)
   8 - filter("BD"."OBJECT_ID"=88)

SQL涉及四张表T,T1,T2,T3,数据量均在100万+。OBJECT_ID选择率非常好(相当于主键),因为缺乏多个索引,执行计划选择全表扫描+HASH_JOIN的方式针对此SQL是严重低效的,即使执行效率还能接受,平均单次执行时间需要0.237秒,但是309K的BUFFER GET。SQL执行效率高不等于一定是高效率的。

优化建议:T表谓词创建索引,返回行数少(驱动表),剩余表连接列上创建索引(被驱动表),使得SQL走Nested Loop关联方式,小结果集驱动大结果集,最终只需要4000多个BUFFER即可。

模拟案例SQL:

create index t_idx01 on t(object_id);
create index t1_idx01 on t1(data_object_id);
create index t2_idx01 on t2(data_object_id);
create index t3_idx01 on t3(data_object_id);

Plan hash value: 3722872203

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation			| Name	   | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		|	   |	  1 |	     |	 782 (100)|	     |	    1 |00:00:00.01 |	4018 |
|   1 |  HASH UNIQUE			|	   |	  1 |	   1 |	 782   (1)| 00:00:10 |	    1 |00:00:00.01 |	4018 |
|   2 |   NESTED LOOPS OUTER		|	   |	  1 |	   1 |	 781   (0)| 00:00:10 |	    1 |00:00:00.01 |	4018 |
|   3 |    NESTED LOOPS OUTER		|	   |	  1 |	   1 |	 522   (0)| 00:00:07 |	    1 |00:00:00.01 |	2680 |
|   4 |     NESTED LOOPS OUTER		|	   |	  1 |	   1 |	 263   (0)| 00:00:04 |	    1 |00:00:00.01 |	1342 |
|   5 |      TABLE ACCESS BY INDEX ROWID| T	   |	  1 |	   1 |	   4   (0)| 00:00:01 |	    1 |00:00:00.01 |	   4 |
|*  6 |       INDEX RANGE SCAN		| T_IDX01  |	  1 |	   1 |	   3   (0)| 00:00:01 |	    1 |00:00:00.01 |	   3 |
|*  7 |      TABLE ACCESS BY INDEX ROWID| T1	   |	  1 |	   1 |	 259   (0)| 00:00:04 |	    1 |00:00:00.01 |	1338 |
|*  8 |       INDEX RANGE SCAN		| T1_IDX01 |	  1 |	 258 |	   2   (0)| 00:00:01 |	 4608 |00:00:00.01 |	  12 |
|*  9 |     TABLE ACCESS BY INDEX ROWID | T2	   |	  1 |	   1 |	 259   (0)| 00:00:04 |	    1 |00:00:00.01 |	1338 |
|* 10 |      INDEX RANGE SCAN		| T2_IDX01 |	  1 |	 258 |	   2   (0)| 00:00:01 |	 4608 |00:00:00.01 |	  12 |
|* 11 |    TABLE ACCESS BY INDEX ROWID	| T3	   |	  1 |	   1 |	 259   (0)| 00:00:04 |	    1 |00:00:00.01 |	1338 |
|* 12 |     INDEX RANGE SCAN		| T3_IDX01 |	  1 |	 258 |	   2   (0)| 00:00:01 |	 4608 |00:00:00.01 |	  12 |
------------------------------------------------------------------------------------------------------------------------------

很多时候,执行效率浪费资源严重的TOP SQL往往被日益逐渐增强的系统资源所掩盖,记得一位前辈说过的话,TOP SQL不要看单纯的执行效率,要看所谓的”TOP”来综合判断是否有优化的余地,例如参考平均buffer get与平均返回行数等其他SQL性能指标,是否是合适,还是“血亏”。