Saturday, November 5, 2011

Index Leaf Block Fragmentation

Index Leaf Block Fragmentation
Abstract:Many DBAs spend considerable amount of time in re-orging the database over the weekends or holidays instead of
spending their useful time with family.  The reason, they normally say, is database is fragmented. And if you ask
them what type of fragmentation they see in the DB, you get no answer from them or you may get some time
“Number of extents are huge for some tables/indexes”. Oracle database fragmentation consists of many types.
Some are more harmful, some are not and some really helps to speed up the DDL, DML, PDDL, and PDML
operations. Fragmentation, as the name indicates, refers to a component being broken up into two or many parts.
There are two major categories of fragmentation:
1.       Internal
2.       External.

Internal Category is classified into 4 levels:
1.       Index Leaf Block-level fragmentation
2.       Row-level fragmentation
3.       Block-level fragmentation
4.       Segment-level fragmentation.

External category is further classified into 2 levels:
1.       Disk-level fragmentation
2.       Tablespace-Level fragmentation


In this paper, I restricted to only Index Leaf Block Fragmentation as each level of fragmentation needs elaborate
details and write-ups.  
1. Index Leaf Block Fragmentation

What is ILBF?
Oracle’s standard index structure is B*tree structure.
A B*tree is made up of 3 blocks or nodes:
·         Root Block ( Root Node): There is only one block in the root node. All accesses to the index starts from here.
The root node has many child blocks.
·         Branch Blocks: This is the middle layer. There is no restriction in branch blocks. The number of branch blocks
increases depending upon the number of leaf blocks. And they are arranged in multiple levels usually four (Still I
have not found out the reasons why there are four levels at the branch levels). Regardless of deleted rows, branch
blocks are not removed.
·         Leaf Blocks: This is the bottom layer and is made of leaf blocks where the actual index entries along with
ROWIDs are stored.

When rows from table are deleted, all associated index entries are also deleted. This will fragment an index leaf
blocks. As more and more rows from table are deleted, fragmentation level within leaf blocks increases. Some
experts said deleted entries within a leaf block never get reused for the new index entries within a leaf block.
because the application may not insert the same key values again in the table. This is not true. Deleted space will
be reused at later stage  I will demonstrate it later.

The performance impact of Index Leaf Block Fragmentation (ILBF) only significantly affects index scans and can be
more severe than table block fragmentation. ILBF performance ramifications can make even the most seasoned
DBA sweat.

Consider this situation. The Table T1 is described below.

SQL> desc t1
Name                       Null?       Type
----------------------     --------  -----------------------
ID                                            NUMBER
SYEAR                                    NUMBER
OWNER                                   VARCHAR2(30)
OBJECT_NAME                      VARCHAR2(128)
SUBOBJECT_NAME                 VARCHAR2(30)
OBJECT_ID                             NUMBER
DATA_OBJECT_ID                   NUMBER
OBJECT_TYPE                        VARCHAR2(18)
CREATED                                DATE
LAST_DDL_TIME                    DATE
TIMESTAMP                            VARCHAR2(19)
STATUS                                   VARCHAR2(7)
TEMPORARY                           VARCHAR2(1)
GENERATED                           VARCHAR2(1)
SECONDARY                          VARCHAR2(1)

I populated around 300,000 rows in the table. Equal number of rows are populated for the years 2000,2001, and
2002.

SQL> SELECT COUNT(*) FROM TAMIL.T1 WHERE SYEAR = 2000 ;

COUNT(*)
----------
116545

Suppose an index is on columns SYEAR and ID. The ID col is populated with a sequence.
For some reason, all the rows pointing to the year 2001 are deleted. Hence, all the blocks from table data used by
the rows for the SYEAR=2001 are added into the FREELIST. However, what happens to the year 2001 leaf blocks in
the index, which are spread throughout the index structure? They remain! The year 2001 leaf blocks become empty
and remain, that is their space is not released.

Here's the performance killer. When a index range scan query is performed that must pass over the empty blocks,
the blocks must be read from disk, placed into the data block buffer cache, just like any other block.  You may be
asking yourself: Why does Oracle put empty blocks in the buffer cache?

Answer: Even though they are completely empty, Oracle can NOT skip dead index leaf blocks during the index
range scan.

This is one of the reasons for rebuilding the index after a large delete.

The following case study illustrates the above theory:

Created a unqiue index on T1 on columns SYEAR and ID.
SQL> create unique index t1_idx on t1(syear,id) tablespace users
storage (initial 1m next 1m pctincrease 0);

SQL> select segment_name , bytes/1024/1024 ,
           blocks, tablespace_name , extents
   from dba_segments
 where owner='TAMIL' and segment_name = 'T1_IDX';

SEGMENT_NA BYTES/1024/1024     BLOCKS        TABLESPACE_NAME   EXTENTS
----------              ---------------                         ----------       ----------------                ----------
T1_IDX               8.125                                      1040        USERS                           8

SQL> select object_name, object_id, data_object_id
                 from dba_objects
               where object_NAME='T1_IDX' ;

OBJECT_NAME    OJECT_ID          DATA_OBJECT_ID
---------------------     --------------          --------------
T1_IDX                    64530                  64540       ---------Used to join X$BH table

To find out the cached blocks, query from X$BH.
Note: Do not use object_id value to join with X$BH.

Only 2 BLOCKS of T1_IDX object are in the buffer cache.

SQL> select count(*) from x$bh where obj=64540 ;

COUNT(*)
----------
2

Get the explain plan for the query.
SQL> select syear, id from t1 where syear >= 2000 and syear <= 2002 ;

Execution Plan
----------------------------------------------------------
0    SELECT STATEMENT Optimizer=CHOOSE (Cost=24 Card=349635 Bytes=2447445)
1 0   INDEX (FAST FULL SCAN) OF 'T1_IDX' (UNIQUE) (Cost=24 Card=349635 Bytes=2447445)

Run the actual query:

SQL> set autot trace statis
SQL> select syear, id from t1 where syear >= 2000 and syear <= 2002 ;

349635 rows selected.

Elapsed: 00:00:02.51
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client

349635 rows processed

12:15:50 SQL> select count(*) from x$bh where obj=64540 ;

COUNT(*)
----------
912                            

This is expected. Oracle has rightly cached all the leaf blocks.

Now, I delete one third of the rows from the table T1.

12:18:09 SQL> delete from t1 where syear=2001;

116545 rows deleted.

12:24:44 SQL> commit;

Commit complete.

I shutdown DB and started up again to clean up buffer cache.

SQL> select count(*) from x$bh where obj=64540 ;

COUNT(*)
----------
0
Ensured that no index leaf block is present in the buffer cache.

Run the actual query.
SQL> select syear, id from tamil.t1 where syear >= 2000 and syear <= 2002;

233090 rows selected.

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 sorts (memory)
0 sorts (disk)
233090 rows processed

12:27:06 SQL> select count(*) from x$bh where obj=64540 ;


COUNT(*)
----------
912

I see the same number of index leaf blocks in the buffer cache even after deleting one third of the rows from the
table. Here, I expected around 600 leaf blocks in the buffer cache where as Oracle puts 912 blocks.

This proves that Oracle cannot skip dead leaf blocks during the index range scan.

To verify it, I analyzed the index and checked the DEL_LF_ROWS col value from the INDEX_STATS.

SQL> analyze index tamil.t1_idx validate structure;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows  , del_lf_rows from index_stats;

HEIGHT     BLOCKS       LF_BLKS             LF_ROWS    BR_BLKS                 BR_ROWS            DEL_LF_ROWS
----------      --------------          ----------                 ----------             ----------                 ----------                   -----------
3       1040                    908                          349635                  3                              907                         116545

In order to completely remove from the index leaf blocks, I have 2 options:
1 . Rebuild the index.  Or 2. Insert new rows.

First, I will rebuild the index.

SQL> alter index tamil.t1_idx rebuild;

Index altered.

SQL> analyze index tamil.t1_idx validate structure;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows
        , del_lf_rows from index_stats;

HEIGHT   BLOCKS    LF_BLKS    LF_ROWS    BR_BLKS    BR_ROWS    DEL_LF_ROWS
----------        ----------         ----------          ----------           ----------            ----------            -----------
3           650               600                233090            3                      599                   0

2. When does oracle reuse the dead leaf blocks?

Many DBAs believe that dead leaf blocks will never be reused by Oracle. It is NOT true.
Oracle reuses the deleted space in the index as and when new rows are inserted into the table.

Now the table T1 has data for the year 2000 and 2002 after deleting the rows for the year 2001. Now I inserted new
rows for the year 2003.

SQL> insert into t1 select t1_seq.nextval, 2003, a.* from dba_objects a;
23311 rows created.
SQL> /
23311 rows created.
SQL> /
23311 rows created.
SQL> /
23311 rows created.
SQL> /
23311 rows created.
SQL> commit;
Commit complete.

SQL> select syear , count(*) from t1 group by syear ;

SYEAR   COUNT(*)
---------- ----------
2000     116545
2002     116545
2003     116555    --------------- New rows inserted

The new rows for the year reuses the index leaf blocks previously used by the rows for the year 2001. The next
query proves that.

SQL> select segment_name , bytes/1024/1024 ,
           blocks, tablespace_name , extents
   from dba_segments
where owner='TAMIL' and segment_name = 'T1_IDX' ;

SEGMENT_NAM           BYTES/1024/1024     BLOCKS  TABLESPACE_NAME
-----------------------              ---------------                    ---------- ---------------------------
T1_IDX                          8.125                           1040                 USERS

After delete and insert, the number of blocks used by the index never went up. It remains same, 1040.

14:29:23 SQL> analyze index tamil.t1_idx validate structure ;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows,  del_lf_rows
  from index_stats ;

HEIGHT     BLOCKS    LF_BLKS    LF_ROWS    BR_BLKS    BR_ROWS   DEL_LF_ROWS
----------       ----------       ----------       ----------          ----------       ----------         --------------------
3       1040             907              360716          4                906               11071

The total number of blocks, 1040 never got increased.

Even though majority of the index dead leaf blocks are reused for the new rows (SYEAR=2003), I can see some
rows 11,071 still in the deleted index leaf blocks.  Originally the DEL_LF_ROWS has 116,545 rows.

Instead of drop and create, now I can rebuild the index online, thus avoiding maintenance window.

SQL> alter index tamil.t1_idx rebuild tablespace users nologging online;

Index altered.


SQL> analyze index tamil.t1_idx validate structure ;

Index analyzed.

SQL> select height, blocks, lf_blks, lf_rows, br_blks, br_rows, del_lf_rows
  from index_stats ;

HEIGHT     BLOCKS      LF_BLKS    LF_ROWS    BR_BLKS    BR_ROWS    DEL_LF_ROWS
----------         ----------         ----------          ----------           ----------        ----------           -----------
3           1040             908                 349645             3                 907                    0

From the above tests, it proves that deleted space is reused by Oracle.

3. Half Empty Leaf Blocks

What is half-empty leaf blocks?

A leaf block was initially filled with index keys. After some time some of the keys got deleted because of deletion
occurred at the table level. Now the block contains deleted keys as well as some undeleted keys. This block is said
to be half empty leaf block.
According to B*tree Index definition Oracle has to coalesce the half empty leaf blocks, but it does not do when keys
are deleted/added, thus leaving many blocks half filled.
To over come this fragmentation, Oracle introduced “alter index coalesce” command in 8i.
For example, SQL> alter index t1_idx coalesce;
However, index coalesce command works only with leaf blocks, it does not release branch blocks.

The next case study explains about half empty blocks:
I used the same T1 table for this case study also .
The ID col is populated with SEQ.

SQL> select syear, count(*) from t1 group by syear ;

SYEAR   COUNT(*)
---------- ----------
2000     116545
2001     116550
2002     116545
2003     116555

Create a unique index on SYEAR + ID columns
SQL> create unique index t1_idx on t1(syear,id) tablespace users
storage (initial 1m next 1m pctincrease 0);

SQL> select segment_name, blocks, bytes from dba_segments where segment_name = 'T1_IDX';

SEGMENT_NAME     BLOCKS      BYTES
------------                    ----------          ----------
T1_IDX                        1300            10649600

Out of 116545 rows I delete 50% of the rows for the year 2002.

SQL> delete t1 where syear = 2002 and mod(id,2) = 0 ;

58272 rows deleted.

SQL> commit;

Commit complete.

Get the object id from dba_objects.

SQL> select object_id from dba_objects where object_name = 'T1_IDX';

OBJECT_ID
----------
64579

To get the tree dump, run the following the command:
SQL> alter session set events 'immediate trace name treedump level 64579';

Session altered.

A trace file is generated in the udump dir. Sample output is given below:
leaf: 0xc002cc 12583628 (33: nrow: 378 rrow: 378)
leaf: 0xc002cd 12583629 (34: nrow: 378 rrow: 378)
leaf: 0xc002ce 12583630 (35: nrow: 378 rrow: 378)
leaf: 0xc002cf 12583631 (36: nrow: 378 rrow: 378)
leaf: 0xc002d0 12583632 (37: nrow: 378 rrow: 378)
leaf: 0xc002d1 12583633 (38: nrow: 378 rrow: 378)
leaf: 0xc002d2 12583634 (39: nrow: 378 rrow: 378)
leaf: 0xc002d3 12583635 (40: nrow: 378 rrow: 378)
leaf: 0xc002d4 12583636 (41: nrow: 378 rrow: 198)
leaf: 0xc002d5 12583637 (42: nrow: 378 rrow: 189)
leaf: 0xc002d6 12583638 (43: nrow: 378 rrow: 189)
leaf: 0xc002d7 12583639 (44: nrow: 378 rrow: 189)
leaf: 0xc002d8 12583640 (45: nrow: 378 rrow: 189) ----------------- Half empty blocks
leaf: 0xc002d9 12583641 (46: nrow: 378 rrow: 189)
leaf: 0xc002da 12583642 (47: nrow: 378 rrow: 189)
leaf: 0xc002db 12583643 (48: nrow: 378 rrow: 189)

Majority of the leaf blocks contain around 378 rows. The blocks that have deleted keys have 189 rows.

SQL> validate index t1_idx ;

SQL> select blocks, lf_rows, lf_blks, br_blks, del_lf_rows from index_stats;

BLOCKS    LF_ROWS    LF_BLKS    BR_BLKS    DEL_LF_ROWS
----------       ----------         ----------       ----------         -----------
1300     466195           1216            4                  58272

To merge the half empty blocks, run the coalesce command

SQL> alter index t1_idx coalesce;

Index altered.

Again run validate command.

SQL> validate index t1_idx;

Index analyzed.

SQL> select blocks, lf_rows, lf_blks, br_blks, del_lf_rows from index_stats;

BLOCKS    LF_ROWS    LF_BLKS    BR_BLKS     DEL_LF_ROWS
----------       ----------         ----------       ----------          -----------
1300     407923           1072          4                      0

After coalescing the deleted keys are gone. The number of leaf blocks is 1072, reduced from 1216.

Will the half empty blocks are reused?

The answer is YES and NO because the new keys must be placed in an appropriate blocks.
In the next test case, you will see oracle adds new leaf blocks even though some the blocks are half empty. Oracle
will not coalesce the leaf blocks when the new keys are added.

SQL> select segment_name, blocks, bytes from dba_segments where segment_name='T1_IDX';

SEGMENT_NAM     BLOCKS      BYTES
-----------                    ----------         ----------
T1_IDX                       1300          10649600

SQL> select syear, count(*) from t1 group by syear ;

SYEAR   COUNT(*)
---------- ----------
2000     116545
2001     116550
2002     116555
2003     116555

SQL> delete t1 where syear = 2002 and mod(id,2) = 0 ;

58277 rows deleted.

SQL> commit;

Commit complete.

Now inserted 40000 new rows for the year 2004.

SQL> insert into t1 select t1_seq.nextval, 2004, a.* from dba_objects a ;

23311 rows created.

SQL> /

23311 rows created.

SQL> commit;

Commit complete.

SQL> select segment_name, blocks, bytes from dba_segments where segment_name='T1_IDX';

SEGMENT_NAM     BLOCKS      BYTES
----------------------         ----------        ----------
T1_IDX                       1430        11714560

Oracle added 130 new blocks so the total went upto 1430.
SummaryIn general Oracle’s B*Tree Structure is well maintained except in few occasions. You will not notice a problem in an
OLTP application in which massive delete/update may not happen. However, in DSS or DW systems, you will see
performance degradation after massive delete/update. Some of the root causes discussed above for the index leaf
blocks fragmentation and how to detect them may help you in future.
Conclusion:
When do you rebuild/recreate index?

  • If you suspect DEL_LF_ROWS column value remains constant over a period and the number of deleted rows is more than 20 %, then you can rebuild the index. Do not take this as a rule of thumb.
  • In a batch job, incorporate the “alter index rebuild” command with in the script that does large delete/update and does not insert new rows.
  • Avoid “drop and create index “. Some times, you may forget to write correct storage parameters.
  • Use “alter index coalesce” option if you find  many leaf blocks are half-empty or near empty.  Run coalesce command when the system has low usage.
  • Starting from 9i, you can rebuild an index online with nologging and in parallel threads.  Example: alter index tamil.t1_idx rebuild nologging parallel 4 online ;
  • Index created on monotonically increasing sequence or date column needs periodic "coalescing".

Cheers, Tamil

No comments:

Post a Comment