One of the most FAQ on the internet forum is "Why isn't Oracle using my index?" among many DBAs and
developers.
Oracle Optimizer will choose an index access path only when it thinks it is appropriate. There are many
situations where it will not choose an index unique (range) scan. The following case studies will high light
some of them.
First let us create 2 tables.
CREATE TABLE T1
(
OWNER VARCHAR2(30),
OBJECT_NAME VARCHAR2(128),
SUBOBJECT_NAME VARCHAR2(30),
OBJECT_ID NUMBER ,
DATA_OBJECT_ID NUMBER,
OBJECT_TYPE VARCHAR2(18),
CREATED DATE,
LAST_DDL_TIME DATE,
TIMESTAMP VARCHAR2(19),
STATUS VARCHAR2(7),
TEMPORARY VARCHAR2(1),
GENERATED VARCHAR2(1),
SECONDARY VARCHAR2(1)
)
TABLESPACE USERS ;
Then, created an index on OBJECT_ID column.
CREATE INDEX T1_IDX ON T1(OBJECT_ID)
LOGGING TABLESPACE USERS;
Table T1 has 25000 rows.
CREATE TABLE T2
(
OWNER VARCHAR2(30),
OBJECT_NAME VARCHAR2(128),
SUBOBJECT_NAME VARCHAR2(30),
OBJECT_ID NUMBER ,
DATA_OBJECT_ID NUMBER,
OBJECT_TYPE VARCHAR2(18),
CREATED DATE,
LAST_DDL_TIME DATE,
TIMESTAMP VARCHAR2(19),
STATUS VARCHAR2(7),
TEMPORARY VARCHAR2(1),
GENERATED VARCHAR2(1),
SECONDARY VARCHAR2(1)
)
TABLESPACE USERS ;
CREATE INDEX T2_IDX ON T2(OBJECT_ID)
LOGGING TABLESPACE USERS;
Table T2 has 10000 rows.
Please note columns T1.OBJECT_ID AND T2.OBJECT_ID are NULL .
Indexes are created on both T1 and T2 tables on object_id column.
Case 1: When a column is declared as NULL
Oracle RDBMS 9.2.0.4 is used in test cases.
Case Study 1 surprises me. This discovery is accidental. I never thought NULL or NOT NULL definition on a
column will force the optimizer to ignore / choose an index access path.
My query is:
SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=1251 Bytes=5004)
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
As you noticed, T1_IDX is not being used even though an index is created on OBJECT_ID column in Table T1.
Now I alter the table, T1.
SQL> ALTER TABLE T1 MODIFY OBJECT_ID NOT NULL;
Table altered.
Re-run the query again.
SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;
Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (FAST FULL SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=7 Card=1251 Bytes=50
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
!!Wow !!, the optimizer starts using the index T1_IDX and the cost has come down from 34 to 9. No init
parameter has been changed.
Case 2: When using CONSTANT in the predicate
The query is:
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
and EXISTS ( SELECT NULL FROM T2 WHERE T2.OBJECT_ID=T1.OBJECT_ID)
And the plan is:
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=300 Bytes=7200)
1 0 NESTED LOOPS (SEMI) (Cost=34 Card=300 Bytes=7200)
2 1 TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=605 Bytes=12100)
3 1 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
With "OBJECT_ID < 1000" predicate, I am sure that the result set from the outer query is going to be 999 rows
out of 25000 rows, but still Oracle chose FULL table scan T1.
Instead of constant, if I use bind variable, then the index is used.
var b1 varchar2(30);
exec :b1 := 1000 ;
SELECT T1.OBJECT_NAME
FROM T1, T2
WHERE T1.object_id < :B1
and T2.OBJECT_ID=T1.OBJECT_ID
Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=29 Card=500 Bytes=12000)
1 0 NESTED LOOPS (Cost=29 Card=500 Bytes=12000)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=28 Card=1251 Bytes=25020)
3 2 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=2 Card=225)
4 1 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
Case Study 3: When you use NOT IN instead of NOT EXISTS.
The SQL is:
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
AND (t1.object_id) not in (SELECT t2.object_id from T2) ;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2 1 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3 2 TABLE ACCESS (FULL) OF 'T2' (Cost=14 Card=500 Bytes=2000)
Now I change the query to use NOT EXISTS.
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
AND NOT EXISTS ( Select NULL from T2 where t2.object_id = T1.object_id );
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2 1 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
Also, Clustering Factor statistics affects the selectivity and cardinality of a table which in turn force the
optimizer to go for Full Table Scan. See "Computing Clustering Factor" article on this web site.
Hope this helps....
Tamilselvan G
developers.
Oracle Optimizer will choose an index access path only when it thinks it is appropriate. There are many
situations where it will not choose an index unique (range) scan. The following case studies will high light
some of them.
First let us create 2 tables.
CREATE TABLE T1
(
OWNER VARCHAR2(30),
OBJECT_NAME VARCHAR2(128),
SUBOBJECT_NAME VARCHAR2(30),
OBJECT_ID NUMBER ,
DATA_OBJECT_ID NUMBER,
OBJECT_TYPE VARCHAR2(18),
CREATED DATE,
LAST_DDL_TIME DATE,
TIMESTAMP VARCHAR2(19),
STATUS VARCHAR2(7),
TEMPORARY VARCHAR2(1),
GENERATED VARCHAR2(1),
SECONDARY VARCHAR2(1)
)
TABLESPACE USERS ;
Then, created an index on OBJECT_ID column.
CREATE INDEX T1_IDX ON T1(OBJECT_ID)
LOGGING TABLESPACE USERS;
Table T1 has 25000 rows.
CREATE TABLE T2
(
OWNER VARCHAR2(30),
OBJECT_NAME VARCHAR2(128),
SUBOBJECT_NAME VARCHAR2(30),
OBJECT_ID NUMBER ,
DATA_OBJECT_ID NUMBER,
OBJECT_TYPE VARCHAR2(18),
CREATED DATE,
LAST_DDL_TIME DATE,
TIMESTAMP VARCHAR2(19),
STATUS VARCHAR2(7),
TEMPORARY VARCHAR2(1),
GENERATED VARCHAR2(1),
SECONDARY VARCHAR2(1)
)
TABLESPACE USERS ;
CREATE INDEX T2_IDX ON T2(OBJECT_ID)
LOGGING TABLESPACE USERS;
Table T2 has 10000 rows.
Please note columns T1.OBJECT_ID AND T2.OBJECT_ID are NULL .
Indexes are created on both T1 and T2 tables on object_id column.
Case 1: When a column is declared as NULL
Oracle RDBMS 9.2.0.4 is used in test cases.
Case Study 1 surprises me. This discovery is accidental. I never thought NULL or NOT NULL definition on a
column will force the optimizer to ignore / choose an index access path.
My query is:
SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=1251 Bytes=5004)
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
As you noticed, T1_IDX is not being used even though an index is created on OBJECT_ID column in Table T1.
Now I alter the table, T1.
SQL> ALTER TABLE T1 MODIFY OBJECT_ID NOT NULL;
Table altered.
Re-run the query again.
SELECT COUNT(*) FROM T1
WHERE NOT EXISTS
( SELECT NULL FROM T2
WHERE T2.OBJECT_ID=T1.OBJECT_ID) ;
Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (FAST FULL SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=7 Card=1251 Bytes=50
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
!!Wow !!, the optimizer starts using the index T1_IDX and the cost has come down from 34 to 9. No init
parameter has been changed.
Case 2: When using CONSTANT in the predicate
The query is:
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
and EXISTS ( SELECT NULL FROM T2 WHERE T2.OBJECT_ID=T1.OBJECT_ID)
And the plan is:
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=34 Card=300 Bytes=7200)
1 0 NESTED LOOPS (SEMI) (Cost=34 Card=300 Bytes=7200)
2 1 TABLE ACCESS (FULL) OF 'T1' (Cost=32 Card=605 Bytes=12100)
3 1 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
With "OBJECT_ID < 1000" predicate, I am sure that the result set from the outer query is going to be 999 rows
out of 25000 rows, but still Oracle chose FULL table scan T1.
Instead of constant, if I use bind variable, then the index is used.
var b1 varchar2(30);
exec :b1 := 1000 ;
SELECT T1.OBJECT_NAME
FROM T1, T2
WHERE T1.object_id < :B1
and T2.OBJECT_ID=T1.OBJECT_ID
Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=29 Card=500 Bytes=12000)
1 0 NESTED LOOPS (Cost=29 Card=500 Bytes=12000)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=28 Card=1251 Bytes=25020)
3 2 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=2 Card=225)
4 1 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)
Case Study 3: When you use NOT IN instead of NOT EXISTS.
The SQL is:
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
AND (t1.object_id) not in (SELECT t2.object_id from T2) ;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2 1 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3 2 TABLE ACCESS (FULL) OF 'T2' (Cost=14 Card=500 Bytes=2000)
Now I change the query to use NOT EXISTS.
SELECT t1.object_name FROM T1
WHERE OBJECT_ID < 1000
AND NOT EXISTS ( Select NULL from T2 where t2.object_id = T1.object_id );
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=30 Bytes=600)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=6 Card=30 Bytes=600)
2 1 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=3 Card=30)
3 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card=1 Bytes=4)
Also, Clustering Factor statistics affects the selectivity and cardinality of a table which in turn force the
optimizer to go for Full Table Scan. See "Computing Clustering Factor" article on this web site.
Hope this helps....
Tamilselvan G
No comments:
Post a Comment