Friday, November 25, 2011

Why is my index not being used by Oracle?

One of the most FAQ on the internet forum is "Why isn't Oracle using my index?" among many DBAs and
developers.

Oracle Optimizer will choose an index access path only when it thinks it is appropriate. There are many
situations where it will not choose an index unique (range) scan. The following case studies will high light
some of them.

First let us create 2 tables.

CREATE TABLE T1
(
OWNER                          VARCHAR2(30),
OBJECT_NAME            VARCHAR2(128),
SUBOBJECT_NAME   VARCHAR2(30),
OBJECT_ID                   NUMBER      ,
DATA_OBJECT_ID      NUMBER,
OBJECT_TYPE            VARCHAR2(18),
CREATED                     DATE,
LAST_DDL_TIME         DATE,
TIMESTAMP                  VARCHAR2(19),
STATUS                         VARCHAR2(7),
TEMPORARY               VARCHAR2(1),
GENERATED              VARCHAR2(1),
SECONDARY             VARCHAR2(1)
)
TABLESPACE USERS ;

Then, created an index on OBJECT_ID column.

CREATE INDEX T1_IDX ON T1(OBJECT_ID)
LOGGING TABLESPACE USERS;

Table T1 has 25000 rows.
CREATE TABLE T2
(
OWNER                         VARCHAR2(30),
OBJECT_NAME                VARCHAR2(128),
SUBOBJECT_NAME          VARCHAR2(30),
OBJECT_ID                    NUMBER      ,
DATA_OBJECT_ID            NUMBER,
OBJECT_TYPE                VARCHAR2(18),
CREATED                       DATE,
LAST_DDL_TIME             DATE,
TIMESTAMP                   VARCHAR2(19),
STATUS                        VARCHAR2(7),
TEMPORARY                   VARCHAR2(1),
GENERATED               VARCHAR2(1),
SECONDARY               VARCHAR2(1)
)
TABLESPACE USERS ;

CREATE INDEX T2_IDX ON T2(OBJECT_ID)
LOGGING TABLESPACE USERS;

Table T2 has 10000 rows.
Please note columns  T1.OBJECT_ID AND T2.OBJECT_ID are NULL .

Indexes are created on both T1 and T2 tables on object_id column.

Case 1: When a column is declared as NULL
Oracle RDBMS 9.2.0.4  is used in test cases.

Case Study 1 surprises me. This discovery is accidental. I never thought NULL  or NOT NULL definition on a
column will force the optimizer to ignore / choose an index access path.

My query is:

SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;
Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=4)
1   0   SORT (AGGREGATE)
2   1     FILTER
3   2       TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=1251 Bytes=5004)
4   2       INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)

As you noticed, T1_IDX is not being used even though an index is created on OBJECT_ID column in Table T1.
Now I alter the table, T1.

SQL> ALTER TABLE T1 MODIFY OBJECT_ID NOT NULL;
Table altered.

Re-run the query again.
SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;

Execution Plan
---------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=4)
1   0   SORT (AGGREGATE)
2   1     FILTER
3   2       INDEX (FAST FULL SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=7 Card=1251 Bytes=50
4   2       INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)

!!Wow !!, the optimizer starts using the index T1_IDX and the cost has come down from 34 to 9.  No init
parameter has been changed.

Case 2:  When using CONSTANT in the predicate
The query is:

SELECT t1.object_name FROM T1
WHERE  OBJECT_ID < 1000
and EXISTS ( SELECT NULL FROM T2 WHERE T2.OBJECT_ID=T1.OBJECT_ID)

And the plan is:
Execution Plan
----------------------------------------------------------
0                    SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=300 Bytes=7200)
1                  0   NESTED LOOPS (SEMI) (Cost=34 Card=300 Bytes=7200)
2                  1     TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=605 Bytes=12100)
3                  1     INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)

With "OBJECT_ID < 1000" predicate, I am sure that the result set from the outer query is going to be 999 rows
out of 25000 rows, but still Oracle chose FULL table scan T1.

Instead of constant, if I use bind variable, then the index is used.

var b1 varchar2(30);
exec :b1 := 1000 ;
SELECT T1.OBJECT_NAME
FROM T1, T2
WHERE  T1.object_id < :B1
and    T2.OBJECT_ID=T1.OBJECT_ID

Execution Plan
---------------------------------------------------------
0                    SELECT STATEMENT Optimizer=CHOOSE (Cost=29 Card=500 Bytes=12000)
1                  0   NESTED LOOPS (Cost=29 Card=500 Bytes=12000)
2                  1     TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=28 Card=1251 Bytes=25020)
3                  2       INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=2 Card=225)
4                  1     INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)

Case Study 3: When you use NOT IN instead of NOT EXISTS.
The SQL is:
SELECT t1.object_name FROM T1
WHERE  OBJECT_ID < 1000
AND (t1.object_id) not in  (SELECT t2.object_id from T2) ;
Execution Plan
----------------------------------------------------------
0                    SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1                  0   TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2                  1     INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3                  2       TABLE ACCESS (FULL) OF 'T2' (Cost=14 Card=500 Bytes=2000)

Now I change the query to use NOT EXISTS.
SELECT t1.object_name FROM T1
WHERE  OBJECT_ID < 1000
AND NOT EXISTS ( Select NULL from T2 where t2.object_id = T1.object_id );

Execution Plan
----------------------------------------------------------
0                    SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1                  0   TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2                  1     INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3                  2       INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)

Also, Clustering Factor statistics affects the selectivity and cardinality of a table which in turn force the
optimizer to go for Full Table Scan. See  "Computing Clustering Factor" article on this web site.

Hope this helps....
Tamilselvan G

No comments:

Post a Comment