The benefits of analytical function are many:
1. Reduced IO
2. Complex Query can be written in a simple SQL
3. Less CPU time
Let us start with a small table.
SELECT * FROM T1 ;
EMPLID PAY_DATE
--------------- -----------
A1 31-JAN-2000
A1 28-FEB-2000
A1 31-MAR-2000
A1 31-MAY-2000
B1 31-JAN-2000
B1 31-MAR-2000
B1 31-JUL-2000
C1 31-JAN-2000
C1 28-FEB-2000
C1 31-MAR-2000
C1 31-DEC-2000
D1 31-JAN-2000
D1 31-MAR-2000
D1 31-OCT-2000
14 rows selected.
Elapsed: 00:00:00.30
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
569 bytes sent via SQL*Net to client
253 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
The Table T1 has 14 rows and there are 4 distinct EMPLID.
Our requirement is: List out employee with the last pay date.
Without analytical function, the following SQL performs the task.
I have used max function in the sub query.
SELECT EMPLID, PAY_DATE FROM T1 A
WHERE A.PAY_DATE = ( SELECT MAX(PAY_DATE) FROM T1 B
WHERE B.EMPLID = A.EMPLID) ;
EMPLID PAY_DATE
--------------- -----------
A1 31-MAY-2000
B1 31-JUL-2000
C1 31-DEC-2000
D1 31-OCT-2000
Elapsed: 00:00:00.80
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
16 consistent gets
1 physical reads
0 redo size
319 bytes sent via SQL*Net to client
253 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processedThe total number of logical IOs is 16.You can see FULL table scan is happening for the sub query also.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'T1'
3 1 SORT (AGGREGATE)
4 3 TABLE ACCESS (FULL) OF 'T1'
To speed up execution time, I create an index on emplid, pay_date columns
create index t1_idx on t1(emplid, pay_date)
tablespace psindex storage(initial 32k next 32k pctincrease 0) pctfree 0;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'T1'
3 1 SORT (AGGREGATE)
4 3 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
333 bytes sent via SQL*Net to client
254 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processed
====================================================================
Still, 6 LIO are needed.
Let me change the query to use ROW_NUMBER function.
SELECT EMPLID, PAY_DATE
FROM ( SELECT EMPLID,
PAY_DATE,
(ROW_NUMBER() OVER
(PARTITION BY EMPLID ORDER BY PAY_DATE DESC)) RN
FROM T1
ORDER BY EMPLID, RN
)
WHERE RN = 1 ;
EMPLID PAY_DATE
--------------- -----------
A1 31-MAY-2000
B1 31-JUL-2000
C1 31-DEC-2000
D1 31-OCT-2000
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
320 bytes sent via SQL*Net to client
253 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
4 rows processed
====================================================================
I created an index T1_IDX on T1 table (EMPLID,PAY_DATE).
SELECT EMPLID, PAY_DATE
FROM ( SELECT EMPLID,
PAY_DATE,
(ROW_NUMBER() OVER (PARTITION BY EMPLID ORDER BY PAY_DATE DESC)) RN
FROM T1
ORDER BY EMPLID, RN
)
WHERE RN = 1;
A1 31-MAY-2000
B1 31-JUL-2000
C1 31-DEC-2000
D1 31-OCT-2000
Elapsed: 00:00:04.66
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 VIEW
2 1 WINDOW (SORT)
3 2 TABLE ACCESS (FULL) OF 'T1'
rows will be truncated
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
334 bytes sent via SQL*Net to client
254 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
4 rows processed
You will frequently see WINDOW (SORT) in the execution plan when the SQL uses analytical
function. I get consistently 3 logical IOs whether an index is present on the table or not.
And Oracle sorts the each employee rows within a WINDOW.TOP-N SQLselect * from emp;
EMPID SMONTH SAL
---------- ---------- ----------
AAA JAN 100
BBB JAN 200
CCC JAN 300
DDD JAN 400
CCC FEB 400
BBB FEB 300
DDD FEB 30
AAA FEB 80
AAA MAR 100
BBB MAR 200
CCC MAR 300
DDD MAR 20
12 rows selected.
Question: List top 3 salaried employees by month.
Answer:
SELECT EMPID, SMONTH, SAL
FROM ( SELECT EMPID, SMONTH, SAL, ROW_NUMBER()
OVER (PARTITION BY SMONTH ORDER BY SAL DESC ) RN
FROM EMP)
WHERE RN <= 3
ORDER BY 2,1;
EMPID SMONTH SAL
---------- ---------- ----------
AAA FEB 80
BBB FEB 300
CCC FEB 400
BBB JAN 200
CCC JAN 300
DDD JAN 400
AAA MAR 100
BBB MAR 200
CCC MAR 300
9 rows selected.Ratio_To_Report Analytic FunctionSQL>DESC SALES
Name Null? Type
------------ -------- ---------------------
PRODUCT VARCHAR2(20)
SALES NUMBER
SQL>SELECT * FROM SALES ;
PRODUCT SALES
-------------------- ----------
P1 280
P2 400
P3 200
P4 300
P5 190
P6 310
6 rows selected.
Elapsed: 00:00:00.50
SQL>SELECT PRODUCT, SALES ,
TO_CHAR((RATIO_TO_REPORT(SALES)
OVER())*100,999.99) AS SALES_PER
FROM SALES ;
PRODUCT SALES SALES_P
-------------------- ---------- -------
P1 280 16.67
P2 400 23.81
P3 200 11.90
P4 300 17.86
P5 190 11.31
P6 310 18.45
6 rows selected.
Oracle is continuously introducing new analytical functions in each release.
Check out the manual for the enhancements.
No comments:
Post a Comment