HelloDBA [中文]
Search Internet Search HelloDBA
  Oracle Technology Site. email: fuyuncat@gmail.com  MSN: fuyuncat@hotmail.com Add to circle  acoug  acoug 

A SQL Tuning case to improve performance of SQL with ROWNUM

[中文]

Author:  fuyuncat

Source:  www.HelloDBA.com

Date:  2012-08-01 09:22:02

Share to  Twitter Facebook GMail Blogger Orkut Google Bookmarks Sina Weibo QQ Renren Tieba Kaixin Douban Taobao

Oracle version: 10.2.0.3

Background of the case: We have a Java program to read and execute below SQL from a configuration file to detect if there is any record be found, then process following logic code base on the result.
The original statement is:

SQL代码
  1. Select count(1) as rowcount   
  2. from B2B_BIZ_KEY B  
  3. WHERE EXISTS  
  4. (SELECT ID FROM B2B_TRANSACTION T  
  5.   WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')));  

And the original simple judgement is "if rowcount > 0 then do these things; else do others".

Table B2B_BIZ_KEY is child table who has a foreign key & index on (TRANSACTION_ID) referring to table B2B_TRANSACTION's primary key (ID). For the sake of large volume in both tables, although there is an index on B2B_TRANSACTION(MSG_REQ_ID, ID), performance of this SQL is quite poor. Here is the performance statistics data of the original one:

SQL代码
  1. Elapsed: 00:02:12.48  
  2.   
  3. Execution Plan  
  4. ----------------------------------------------------------  
  5.    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165253 Card=1 Bytes=71)  
  6.    1    0   SORT (AGGREGATE)  
  7.    2    1     HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)  
  8.    3    2       INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX(UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)  
  9.    4    2       INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX (UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)  
  10.   
  11. Statistics  
  12. ----------------------------------------------------------  
  13.         943  recursive calls  
  14.           0  db block gets  
  15.      363128  consistent gets  
  16.      416008  physical reads  
  17.           0  redo size  
  18.         353  bytes sent via SQL*Net to client  
  19.         435  bytes received via SQL*Net from client  
  20.           2  SQL*Net roundtrips to/from client  
  21.           0  sorts (memory)  
  22.           0  sorts (disk)  
  23.           1  rows processed  

With logic analysis, obviously, the exact ROWCOUNT of the query is not necessary. Hence, we can add a ROWNUM<=1 criteria to the SQL to avoid scanning unnecessary rows.

Here is the updated one:

SQL代码
  1. Select count(1) as rowcount   
  2. from B2B_BIZ_KEY B  
  3. WHERE EXISTS  
  4. (SELECT ID FROM B2B_TRANSACTION T  
  5.   WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))  
  6. AND ROWNUM <= 1;  

However, although the statistics data reduced significant comparing to the original one, the performance of enhanced SQL is not so satifying as we expected. Here is the performance data.

SQL代码
  1. Elapsed: 00:00:19.76  
  2.   
  3. Execution Plan  
  4. ----------------------------------------------------------  
  5.    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165253 Card=1 Bytes=71)  
  6.    1    0   SORT (AGGREGATE)  
  7.    2    1     COUNT (STOPKEY)  
  8.    3    2       HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)  
  9.    4    3         INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX (UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)  
  10.    5    3         INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX (UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)  
  11.   
  12. Statistics  
  13. ----------------------------------------------------------  
  14.         249  recursive calls  
  15.           0  db block gets  
  16.       66098  consistent gets  
  17.       13694  physical reads  
  18.           0  redo size  
  19.         350  bytes sent via SQL*Net to client  
  20.         435  bytes received via SQL*Net from client  
  21.           2  SQL*Net roundtrips to/from client  
  22.           0  sorts (memory)  
  23.           0  sorts (disk)  
  24.           1  rows processed  

Added the "ROWNUM <= 1", we want to improve the SQL avoid to scan redundant rows of the tables, and Opimtizer can shoose a best plan based on the goal, which is known as "First K rows" optimziation.

However, because of COUNT(1), Optimizer knows that server will just response client once, and therefore believes it's meaningess to optimize execution plan with "First K rows". If we just simply remove the COUNT, the result will like this:

SQL代码
  1. HELLODBA.COMSelect 1 as rowcount  
  2.   2  from B2B_BIZ_KEY B  
  3.   3  WHERE EXISTS  
  4.   4  (SELECT ID FROM B2B_TRANSACTION T  
  5.   5    WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))  
  6.   6    AND ROWNUM<=1;  
  7.   
  8. Elapsed: 00:00:00.43  
  9.   
  10. Execution Plan  
  11. ----------------------------------------------------------  
  12.    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=8 Card=1 Bytes=71)  
  13.    1    0   COUNT (STOPKEY)  
  14.    2    1     NESTED LOOPS (SEMI) (Cost=8 Card=3 Bytes=213)  
  15.    3    2       TABLE ACCESS (FULLOF 'B2B_BIZ_KEY' (TABLE) (Cost=2 Card=22996310 Bytes=551911440)  
  16.    4    2       INDEX (RANGE SCAN) OF 'B2B_TRANSACTION_IX2' (INDEX) (Cost=2 Card=5456419 Bytes=256451693)  
  17.   
  18. Statistics  
  19. ----------------------------------------------------------  
  20.           1  recursive calls  
  21.           0  db block gets  
  22.           7  consistent gets  
  23.           0  physical reads  
  24.           0  redo size  
  25.         350  bytes sent via SQL*Net to client  
  26.         435  bytes received via SQL*Net from client  
  27.           2  SQL*Net roundtrips to/from client  
  28.           0  sorts (memory)  
  29.           0  sorts (disk)  
  30.           1  rows processed  

That's right. That's what we are expecting. However, if implement the SQL like this, there is will be a defect --- we did not handle the NO_DATA_FOUND exception in the Java program, we we dont want to change the Java code and redeploy the program to production. What can we do?

I was thinking that if the query result could be write into a temp table and then count rows in the temp, the problem could be resolve. With this in mind, I tried to re-write the SQL as below,

SQL代码
  1. HELLODBA.COMwith V as  
  2.   2  (Select /*+ MATERIALIZE qb_name(wv)*/1  
  3.   3  from B2B_BIZ_KEY B  
  4.   4  WHERE EXISTS  
  5.   5  (SELECT ID FROM B2B_TRANSACTION T  
  6.   6    WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))  
  7.   7    AND ROWNUM<=1)  
  8.   8  select count(1) from v;  
  9.   
  10. Elapsed: 00:00:15.84  
  11.   
  12. Execution Plan  
  13. ----------------------------------------------------------  
  14.    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165255 Card=1)  
  15.    1    0   TEMP TABLE TRANSFORMATION  
  16.    2    1     LOAD AS SELECT  
  17.    3    2       COUNT (STOPKEY)  
  18.    4    3         HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)  
  19.    5    4           INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX (UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)  
  20.    6    4           INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX(UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)  
  21.    7    1     SORT (AGGREGATE)  
  22.    8    7       VIEW (Cost=2 Card=1)  
  23.    9    8         TABLE ACCESS (FULLOF 'SYS_TEMP_0FD9FC8C0_51579C6B'(TABLE (TEMP)) (Cost=2 Card=1 Bytes=13)  
  24.   
  25. Statistics  
  26. ----------------------------------------------------------  
  27.         520  recursive calls  
  28.          10  db block gets  
  29.       66112  consistent gets  
  30.       11595  physical reads  
  31.        1460  redo size  
  32.         350  bytes sent via SQL*Net to client  
  33.         435  bytes received via SQL*Net from client  
  34.           2  SQL*Net roundtrips to/from client  
  35.           0  sorts (memory)  
  36.           0  sorts (disk)  
  37.           1  rows processed  

Just look at the execution plan, it quite similar to what we want. SQL engine will query the subquery and write the result to a system temporary table, then count the content of the temp table. However, the process is not separated, and the COUNT still exists, as a result, the cost of execution plan is close to the previous one.

OK, let's use some tricks to get rid off the COUNT.

SQL代码
  1. HELLODBA.COM> with V as  
  2.   2  (Select 1  
  3.   3  from B2B_BIZ_KEY B  
  4.   4  WHERE EXISTS  
  5.   5  (SELECT ID FROM B2B_TRANSACTION T  
  6.   6    WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))  
  7.   7    AND ROWNUM<=1)  
  8.   8  select nvl2((select 1 from v),1,0) as aaa from dual;  
  9.   
  10. Elapsed: 00:00:00.21  
  11.   
  12. Execution Plan  
  13. ----------------------------------------------------------  
  14.    0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1)  
  15.    1    0   VIEW (Cost=8 Card=1)  
  16.    2    1     COUNT (STOPKEY)  
  17.    3    2       NESTED LOOPS (SEMI) (Cost=8 Card=3 Bytes=213)  
  18.    4    3         TABLE ACCESS (FULLOF 'B2B_BIZ_KEY' (TABLE) (Cost=2Card=22996310 Bytes=551911440)  
  19.    5    3         INDEX (RANGE SCAN) OF 'B2B_TRANSACTION_IX2' (INDEX)(Cost=2 Card=5456419 Bytes=256451693)  
  20.    6    0   FAST DUAL (Cost=2 Card=1)  
  21.   
  22. Statistics  
  23. ----------------------------------------------------------  
  24.           0  recursive calls  
  25.           0  db block gets  
  26.           7  consistent gets  
  27.           0  physical reads  
  28.           0  redo size  
  29.         345  bytes sent via SQL*Net to client  
  30.         435  bytes received via SQL*Net from client  
  31.           2  SQL*Net roundtrips to/from client  
  32.           0  sorts (memory)  
  33.           0  sorts (disk)  
  34.           1  rows processed  

Ok. Performance is acceptable and there is no NO_DATA_FOUND exception to be raised.

--- Fuyuncat ---

Top

Copyright ©2005, HelloDBA.Com All reseverd.

Declaration
by fuyuncat