Monday, April 1, 2013

Why did the Optimizer not choose an index?

I'm always learning about the optimizer (CBO) and enjoy opportunities to investigate to solve real-world problems presented to me. During a recent stress test of some BI (WebFOCUS) reports, I decided to look at the top SQL statement as reported by Quest's Foglight Performance Analysis Tops Report. 



After cleaning the SQL up some, I had something that basically looked like:

SELECT * FROM ODSMGR.OPERATING_LEDGER_MV T1
WHERE (T1."ACCOUNT_LEVEL_4" = '8060') AND
(T1."FUND_LEVEL_3" = '3U0032') AND 
(T1."FISCAL_PERIOD" = '05') 
AND (T1."FISCAL_YEAR" = '2013')


-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     |     1 | 26471 |   121K  (6)| 00:12:10 |
|*  1 |  TABLE ACCESS FULL| OPERATING_LEDGER_MV |     1 | 26471 |   121K  (6)| 00:12:10 |
-----------------------------------------------------------------------------------------


This table has a index on it - but Period & Year are at the end (columns 5 & 6 of 6):


CREATE INDEX ODSMGR.OPERATING_LEDGER_MV_02 ON ODSMGR.OPERATING_LEDGER_MV
(INDEX_CODE, ORGANIZATION_LEVEL_5, ORGANIZATION_LEVEL_3, FUND, FISCAL_PERIOD, 
FISCAL_YEAR)

After working with the SQL some, I found I could get the optimizer to utilize this index (via skip scan) by passing the hint:  /*+ INDEX_SS_DESC(T1) */ - which made sense: Period and Year were the last two columns.

------------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |

------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                        |     1 | 26471 |  4837   (1)| 00:00:29 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| OPERATING_LEDGER_MV    |     1 | 26471 |  4837   (1)| 00:00:29 |
|*  2 |   INDEX SKIP SCAN DESCENDING| OPERATING_LEDGER_MV_02 |   317 |       |  4580   (1)| 00:00:28 |

------------------------------------------------------------------------------------------------------


So where to go from here? First I would like to point out that this is a static table, refreshed nightly and did not have stats on it. As a quick test to see if stats would help – it didn't change the plan to use the Index when used without the hint. It did remove the use of Dynamic Sampling (Level 2: 64 blocks) though.

Ok, what next? After reviewing our optimizer's init.ora settings:


optimizer_features_enable            string   11.2.0.2
optimizer_index_caching              integer  90
optimizer_index_cost_adj             integer  100
optimizer_mode                       string   ALL_ROWS
optimizer_secure_view_merging        boolean  FALSE
optimizer_use_invisible_indexes      boolean  FALSE
optimizer_use_pending_statistics     boolean  FALSE
optimizer_use_sql_plan_baselines     boolean  TRUE


I decided to explore the reasons why Oracle is not using the Index for Period and Year by first changing the value of OPTIMIZER_INDEX_COST_ADJ from 100 to 90. 

Initially, a FTS is still performed as stated above. Then when OPTIMIZER_INDEX_COST_ADJ is changed from 100 to 90, it still uses a FTS. It is not until statistics is gathered that the optimizer now chooses the Index (Skip Scan) without any hint used. Also note that once stats are gathered, Dynamic Sampling goes away.


OPTIMIZER_INDEX_COST_ADJ lets you tune optimizer behavior for access path selection to be more or less index friendly—that is, to make the optimizer more or less prone to selecting an index access path over a full table scan.

The default for this parameter is 100 percent, at which the optimizer evaluates index access paths at the regular cost. Any other value makes the optimizer evaluate the access path at that percentage of the regular cost. For example, a setting of 50 makes the index access path look half as expensive as normal.

When this test was repeated the following work day, the value at which the Index was used was down from 90 to 87 so I looked into how to use this parameter.

Several trusted sites recommended not setting this at the instance level from the default value (100). I did find a way to calculate the value by using:


select event, average_wait from v$system_event
where event like 'db_file s%read';

which returned:

EVENT                          AVERAGE_WAIT
------------------------------ ------------
db file sequential read                  .4
db file scattered read                 1.27

A single block I/O as performed typically by an index range scan is measured via the “db file sequential read” wait event while the multi-block I/O as typically performed during a FTS is measured via the “db file scattered read” wait event - suggesting a value of around 70 in this case. Rather than setting at the session or system level, I tried setting this for my query with the following hint:


SELECT /*+ opt_param('optimizer_index_cost_adj',70) */ count(*) 
FROM ODSMGR.OPERATING_LEDGER_MV T1  -- 84 rows
WHERE (T1."ACCOUNT_LEVEL_4" = '8060') AND
(T1."FUND_LEVEL_3" = '3U0032') AND 
(T1."FISCAL_PERIOD" = '05') 
AND (T1."FISCAL_YEAR" = '2013')

which selected the Index as expected.

At this point, I would like to add that I suspect a possible issue with System Stats. Based on the scattered and sequential read times shown above, when I query:

select * from sys.aux_stats$
where pname like '%TIM%' or pname LIKE '%IO%';














and observed values for SREADTIM = 32.497 and MREADTIM = 5.515 which would suggest the opposite - that it's 6x faster to read a multiblock than a single block?

When computed manually, these numbers should be closer to:

SREADTIM = (10 + 32768) / 4096 = 8.0ms
MREADTIM = (10 + 32 * 32768) / 4096 = 266ms

(db_block_size = 32k and db_file_multiblock_read_count = 32)

While setting a HINT will help make this particular query run much better, I suspect that the source of the problem is our System Stats. 






No comments:

Post a Comment