高逻辑读Top Sql 优化案例一则

前不久,有个高逻辑读的sql需要进行优化处理,详情如下:

SQL_TEXT如下:
SELECT DISTINCT object_name
FROM t2
WHERE (substr(created, 0, 10) = to_char(sysdate - 1, 'yyyy-mm-dd')
	AND object_name IS NOT NULL
	AND object_type IS NULL
	AND object_name NOT IN (
		SELECT object_name
		FROM t3
		WHERE object_name IS NOT NULL
	)
	AND regexp_like(object_name, '[[:upper:]]{4}'))

SQL的统计信息如下 :
Historical Plans Summary(dba_hist_sqlstat):

  RN PLAN_HASH_VALUE AVG_ETIME_S  AVG_CPU_S    AVG_BUFFERS  AVG_READS   AVG_ROWS      EXECS FIRST_SNAP          LAST_SNAP
---- --------------- ------------ ------------ ----------- ---------- ---------- ---------- ------------------- -------------------
   1       527727976      113.451       32.239      907578     899049         18          8 2020-05-20 10:00:15 2020-05-28 10:00:51


Current Plans Summary(gv_sql):

 RN PLAN_HASH_VALUE AVG_ETIME_S  AVG_CPU_S    AVG_BUFFERS  AVG_READS   AVG_ROWS TOTAL_EXEC FIRST_LOAD_TIME      LAST_ACTIVE
--- --------------- ------------ ------------ ----------- ---------- ---------- ---------- -------------------- --------------------
  1       527727976      154.255       45.998     1131102    1130264         23          1 2020-05-28/09:00:02  2020-05-28 09:02:36


执行计划如下:
select distinct object_name   from t2  where substr(created, 0, 10) =
to_char(sysdate - 1, 'yyyy-mm-dd')    and object_name is not null and
object_type is null    and object_name not in        (select
object_name from t3 where object_name is not null)    and
regexp_like(object_name, '[[:upper:]]{4}')

Plan hash value: 4047468152

--------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |   239 (100)|          |     51 |00:00:00.03 |     863 |
|   1 |  HASH UNIQUE        |      |      1 |      1 |   239   (2)| 00:00:03 |     51 |00:00:00.03 |     863 |
|*  2 |   HASH JOIN ANTI    |      |      1 |      1 |   238   (1)| 00:00:03 |     77 |00:00:00.03 |     863 |
|*  3 |    TABLE ACCESS FULL| T2   |      1 |     70 |   153   (1)| 00:00:02 |     77 |00:00:00.01 |     557 |
|*  4 |    TABLE ACCESS FULL| T3   |      1 |  85011 |    85   (2)| 00:00:02 |  71958 |00:00:00.01 |     306 |
--------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$5DA710D3
   3 - SEL$5DA710D3 / T2@SEL$1
   4 - SEL$5DA710D3 / T3@SEL$2

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
      DB_VERSION('11.2.0.4')
      OPT_PARAM('_optimizer_extended_cursor_sharing_rel' 'none')
      OPT_PARAM('_optimizer_adaptive_cursor_sharing' 'false')
      OPT_PARAM('_optimizer_use_feedback' 'false')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "T2"@"SEL$1")
      FULL(@"SEL$5DA710D3" "T3"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "T2"@"SEL$1" "T3"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "T3"@"SEL$2")
      USE_HASH_AGGREGATION(@"SEL$5DA710D3")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME"="OBJECT_NAME")
   3 - filter(("OBJECT_TYPE" IS NULL AND  REGEXP_LIKE
              ("OBJECT_NAME",'[[:upper:]]{4}',HEXTORAW('105FF26E1D000000FCE30302000000000000000000000000703EDC060000
              0000000000000000000000000000000000003600000000000000605FF26E1D00000004000000010000000000000081000000')
               ) AND SUBSTR("CREATED",0,10)=TO_CHAR(SYSDATE@!-1,'yyyy-mm-dd') AND "OBJECT_NAME" IS NOT NULL))
   4 - filter("OBJECT_NAME" IS NOT NULL)

生产中,T2表63127459行记录(8g),T3表5703行记录(3m),join时都使用了全表扫描,其中cretad ndv很高,每天的数据量记录数并不多,可以创建如下索引:

create index t2_idx on t2(substr(created, 0, 10));
create index T3_IDX01 on t3(object_name);

执行计划如下,仍然不是最优的执行计划,更高效的应该是走NESTED LOOPS ANTI

-------------------------------------
select distinct object_name   from t2  where substr(created, 0, 10) =
to_char(sysdate - 1, 'yyyy-mm-dd')    and object_name is not null and
object_type is null    and object_name not in        (select
object_name from t3 where object_name is not null)    and
regexp_like(object_name, '[[:upper:]]{4}')

Plan hash value: 1600442906

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |      1 |        |    93 (100)|          |    247 |00:00:00.03 |     317 |
|   1 |  HASH UNIQUE                  |        |      1 |      5 |    93   (3)| 00:00:02 |    247 |00:00:00.03 |     317 |
|*  2 |   HASH JOIN ANTI              |        |      1 |      5 |    92   (2)| 00:00:02 |    404 |00:00:00.02 |     317 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T2     |      1 |    474 |     7   (0)| 00:00:01 |    404 |00:00:00.01 |      11 |
|*  4 |     INDEX RANGE SCAN          | T2_IDX |      1 |    474 |     2   (0)| 00:00:01 |    410 |00:00:00.01 |       3 |
|*  5 |    TABLE ACCESS FULL          | T3     |      1 |  85303 |    85   (2)| 00:00:02 |  72028 |00:00:00.01 |     306 |
--------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$5DA710D3
   3 - SEL$5DA710D3 / T2@SEL$1
   4 - SEL$5DA710D3 / T2@SEL$1
   5 - SEL$5DA710D3 / T3@SEL$2

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
      DB_VERSION('11.2.0.4')
      OPT_PARAM('_optimizer_extended_cursor_sharing_rel' 'none')
      OPT_PARAM('_optimizer_adaptive_cursor_sharing' 'false')
      OPT_PARAM('_optimizer_use_feedback' 'false')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      INDEX_RS_ASC(@"SEL$5DA710D3" "T2"@"SEL$1" "T2_IDX")
      FULL(@"SEL$5DA710D3" "T3"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "T2"@"SEL$1" "T3"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "T3"@"SEL$2")
      USE_HASH_AGGREGATION(@"SEL$5DA710D3")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME"="OBJECT_NAME")
   3 - filter(("OBJECT_TYPE" IS NULL AND  REGEXP_LIKE ("OBJECT_NAME",'[[:upper:]]{4}',HEXTORAW('5815F42D1D000000FC
              E30302000000000000000000000000703EDC0600000000000000000000000000000000000000003600000000000000A815F42D1D0000000400
              0000010000000000000081000000') ) AND "OBJECT_NAME" IS NOT NULL))
   4 - access("T2"."SYS_NC00005$"=TO_CHAR(SYSDATE@!-1,'yyyy-mm-dd'))
   5 - filter("OBJECT_NAME" IS NOT NULL)

我们可以添加HINT改变执行计划:

方式1:使用qb_name hint

select /*+ use_nl(t2,t3@bl) */
distinct object_name
  from t2
 where substr(created, 0, 10) = to_char(sysdate - 1, 'yyyy-mm-dd')
   and object_name is not null
   and object_type is null
   and object_name not in
       (select /*+ qb_name(bl) */
         object_name
          from t3
         where object_name is not null)
   and regexp_like(object_name, '[[:upper:]]{4}')

SQL_ID  guj3z7b3w630x, child number 0
-------------------------------------
select /*+ use_nl(t2,t3@bl) */ distinct object_name   from t2  where
substr(created, 0, 10) = to_char(sysdate - 1, 'yyyy-mm-dd')    and
object_name is not null    and object_type is null    and object_name
not in        (select /*+ qb_name(bl) */          object_name from t3
       where object_name is not null)    and regexp_like(object_name,
'[[:upper:]]{4}')

Plan hash value: 3009087575

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |   956 (100)|          |    247 |00:00:00.01 |      24 |
|   1 |  HASH UNIQUE                  |          |      1 |      5 |   956   (1)| 00:00:12 |    247 |00:00:00.01 |      24 |
|   2 |   NESTED LOOPS ANTI           |          |      1 |      5 |   955   (0)| 00:00:12 |    404 |00:00:00.01 |      24 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T2       |      1 |    474 |     7   (0)| 00:00:01 |    404 |00:00:00.01 |      11 |
|*  4 |     INDEX RANGE SCAN          | T2_IDX   |      1 |    474 |     2   (0)| 00:00:01 |    410 |00:00:00.01 |       3 |
|*  5 |    INDEX RANGE SCAN           | T3_IDX01 |    266 |  85303 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |      13 |
----------------------------------------------------------------------------------------------------------------------------

方式2:使用hint:nl_aj + qb_name方式

select /*+ nl_aj(@bl) */
distinct object_name
  from t2
 where substr(created, 0, 10) = to_char(sysdate - 1, 'yyyy-mm-dd')
   and object_name is not null
   and object_type is null
   and object_name not in
       (select /*+ qb_name(bl) */
         object_name
          from t3
         where object_name is not null)
   and regexp_like(object_name, '[[:upper:]]{4}')

Plan hash value: 3009087575

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |   956 (100)|          |    247 |00:00:00.01 |      24 |
|   1 |  HASH UNIQUE                  |          |      1 |      5 |   956   (1)| 00:00:12 |    247 |00:00:00.01 |      24 |
|   2 |   NESTED LOOPS ANTI           |          |      1 |      5 |   955   (0)| 00:00:12 |    404 |00:00:00.01 |      24 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T2       |      1 |    474 |     7   (0)| 00:00:01 |    404 |00:00:00.01 |      11 |
|*  4 |     INDEX RANGE SCAN          | T2_IDX   |      1 |    474 |     2   (0)| 00:00:01 |    410 |00:00:00.01 |       3 |
|*  5 |    INDEX RANGE SCAN           | T3_IDX01 |    266 |  85303 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |      13 |
----------------------------------------------------------------------------------------------------------------------------

方式3: nl_aj ,在子查询中加入hint,可以不用指定查询块

select distinct object_name
  from t2
 where substr(created, 0, 10) = to_char(sysdate - 1, 'yyyy-mm-dd')
   and object_name is not null
   and object_type is null
   and object_name not in
       (select /*+ nl_aj */
         object_name
          from t3
         where object_name is not null)
   and regexp_like(object_name, '[[:upper:]]{4}')


SQL_ID  7hb1ksq4s1v6p, child number 0
-------------------------------------
select distinct object_name   from t2  where substr(created, 0, 10) =
to_char(sysdate - 1, 'yyyy-mm-dd')    and object_name is not null
and object_type is null    and object_name not in        (select /*+
nl_aj */          object_name           from t3          where
object_name is not null)    and regexp_like(object_name,
'[[:upper:]]{4}')

Plan hash value: 3009087575

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |   956 (100)|          |    247 |00:00:00.01 |      24 |
|   1 |  HASH UNIQUE                  |          |      1 |      5 |   956   (1)| 00:00:12 |    247 |00:00:00.01 |      24 |
|   2 |   NESTED LOOPS ANTI           |          |      1 |      5 |   955   (0)| 00:00:12 |    404 |00:00:00.01 |      24 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T2       |      1 |    474 |     7   (0)| 00:00:01 |    404 |00:00:00.01 |      11 |
|*  4 |     INDEX RANGE SCAN          | T2_IDX   |      1 |    474 |     2   (0)| 00:00:01 |    410 |00:00:00.01 |       3 |
|*  5 |    INDEX RANGE SCAN           | T3_IDX01 |    266 |  85303 |     2   (0)| 00:00:01 |      0 |00:00:00.01 |      13 |
----------------------------------------------------------------------------------------------------------------------------