Tamilselvan Govindasamy: parallel DML

Recently a developer asked my help to improve the performance of delete statements. The
development team wanted to delete 12 million rows in one table and 4 million rows in the
second table. He said that his 2 delete statements ran for 6 hrs in a test environment and he
did not have 6 hrs maintenance window in production environment.
The original delete statement is:
DELETE FROM T1 XA
WHERE EXISTS
      (SELECT 1
         FROM T2 TMP
        WHERE TMP.ID = XA.ID);
He verified that the delete statement would run in a maintenance window, and the DB will not be accessed by
end users.
Immediately I thought of using parallel DML operation because the table T1 has 180 Million rows, out of
which we wanted to delete 12 Million rows.
The oracle document says for parallel delete/update operation, the table has to be partitioned. I am having a
small doubt about the statement. However, there are few restrictions for parallel DML operations.
Oracle doc says:
Rules for UPDATE, MERGE, and DELETE
UPDATE, MERGE, and DELETE operations are parallelized by partition or subpartition. Update, merge, and
delete parallelism are not possible within a partition, nor on a nonpartitioned table. See "Limitation on the
Degree of Parallelism" for a possible restriction.
http://download.oracle.com/docs/cd/B19306_01/server.102/b14223/usingpe.htm
I tried the below script and the parallel delete operation completed in 26 minutes.
SET ECHO ON
SET PAGESIZE 60
SET LINESIZE 132
SPOOL pll_del.log
SELECT to_char(SYSDATE,'DD-MON-YYYY HH24:mi:ss') FROM DUAL;
alter session set db_file_multiblock_read_count=128;
alter table T1 parallel 8;
---- force parallel DML operation
alter session force parallel dml ;
DELETE /*+ PARALLEL(XA,8) */
      FROM T1 XA
WHERE EXISTS
      (SELECT 1
         FROM T2 TMP
        WHERE TMP.ID = XA.ID);
COMMIT;
alter table T1 noparallel ;
SELECT to_char(SYSDATE,'DD-MON-YYYY HH24:mi:ss') FROM DUAL;
SPOOL OFF;
Important points are:

1. Changed db_file_multiblock_read_count to 128 so that Oracle would read maximum of
1MB in a single IO.

2. Set parallel degree 8 on the table.

3. Forced parallel DML at the session level.

4. Ensured that delete statement has parallel hint.

You can monitor the progress of deletion by querying V$transaction view or
v$pq_tqstat, v$pq_sysstat and v$pq_slave.
Having tested this statement in a test environment, the developer would ask for less
maintenance window, when he run the delete statement.
Hope this helps....
Tamilselvan G

There is a myth that if the table is not partitioned, then parallel DML will not work. This is no
longer true. Several years ago I had updated 35 million rows table (size 40GB) using parallel
DML option in 8i. The parallel update ran 6X faster than procedural update using PL/SQL.

In this paper, you will learn how to update a BIG Table using parallel DML option.

All the tests given below are conducted in 10gR2.

First, I created a table, T1 using ALL_OBJECTS. The table has 10,000 rows.

One of the quickest way to find out the effectiveness (throughput) of parallelism in your system
is to run a query (SELECT) using parallel option, and measure the throughput by querying
V$PQ_TQSTAT table. (Note: V$PQ_TQSTAT will be renamed to V$PX_TQSTAT in 11g).
alter session set db_file_multiblock_read_count = 128;

select /*+ FULL(t1) parallel(t1,8) */ * from tamil.t1 ;

select process, num_rows, bytes from v$pq_tqstat order by process;

PROCESS      NUM_ROWS      BYTES
----------       ----------          ----------
P000             1300                664452
P001             1300                664551
P002             1300                664551
P003             1300                664551
P004             1200                613428
P005             1200                613428
P006             1200                613428
P007             1200                613428
QC               10000              5111817     -- Query coordinator
If you see the number of rows evenly distributed among the parallel slave processes, then it
indicates that your system is well configured for parallel processing.
Case # 1 - Simple update on NON-PARTITIONED Table
I have built a table, T1 that has 3.11 Million rows.

SQL> desc t1
Name                       Null?    Type
-------------------------- -------- ------------------
OWNER                       NOT NULL   VARCHAR2(30)
OBJECT_NAME           NOT NULL   VARCHAR2(30)
SUBOBJECT_NAME                        VARCHAR2(30)
OBJECT_ID                 NOT NULL NUMBER
DATA_OBJECT_ID                         NUMBER
OBJECT_TYPE                                VARCHAR2(19)
CREATED                    NOT NULL    DATE
LAST_DDL_TIME       NOT NULL    DATE
TIMESTAMP                                     VARCHAR2(19)
STATUS                                           VARCHAR2(7)
TEMPORARY                                   VARCHAR2(1)
GENERATED                                    VARCHAR2(1)
SECONDARY                                   VARCHAR2(1)

SQL> select count(*) from t1;

COUNT(*)
----------
3111872

SQL> select table_name, num_rows, blocks, empty_blocks,
        last_analyzed from user_tables
where table_name= 'T1' ;

TABLE_NAME     NUM_ROWS     BLOCKS    EMPTY_BLOCKS      LAST_ANALYZ
------------         ----------           ----------   ------------               -----------
T1                        3116183            43085            0                       05-MAR-2007

[I used DBMS_STATS.GATHER_TABLE_STATS procedure to gather table statistics with estimate
percent NULL(meaning 100 percent), but it reported more number of rows as you see in the
USER_TABLES].
Let me do the first test.

SQL> alter session set db_file_multiblock_read_count = 128;

Session altered.

Elapsed: 00:00:00.01

SQL> update tamil.t1 set object_name = lower(object_name);
3111872 rows updated.

Elapsed: 00:09:34.33
SQL> commit;

Commit complete.

Elapsed: 00:00:00.13

A single process ran for 9 Min and 34 seconds.
Case # 2 Parallel Update on NON-PARTITIONED TABLE
I have shutdown the instance and restared.

SQL> alter session enable parallel dml ;

Session altered.

Elapsed: 00:00:00.00

SQL> alter session force parallel dml parallel 8;

Session altered.

Elapsed: 00:00:00.01

SQL> alter session set db_file_multiblock_read_count = 128;

Session altered.

Elapsed: 00:00:00.00

SQL> update /*+ full(t1) parallel(t1,8) */
2         tamil.t1 set object_name = lower(object_name);
3111872 rows updated.

Elapsed: 00:00:28.03
SQL> commit;

Commit complete.

Elapsed: 00:00:01.12
Conclusion:
The parallel update took just 28 seconds versus 9 min 34 seconds in a single process update.
By setting highest value to DB_FILE_MULTIBLOCK_READ_COUNT parameter, "PARALLEL DML"
and "FORCE PARALLEL DML" with correct degree of parallelism at the session level, you can
update a big table in a shorter time.

Do not attempt to process in parallel on a small table. In fact, it will run longer than what a single
process will take.
Hope, this helps.
Tamilselvan G

Tamilselvan Govindasamy

Friday, November 25, 2011

Parallel Delete on a nonpartitioned Table in Oracle 10g

Parallel DML