22 February 2007

Peoplesoft performance on Oracle 10g - our experience

This note highlights the actions we took to get acceptable performance for Peoplesoft 8.49 on Oracle 10g R2 (10.2.0.3, Solaris 64-bit). The database is 700G in size and grows by 15-20G per month.

Hopefully this note helps other DBAs working with Peoplesoft applications.

Actions taken to stabilize the performance:
Some of the actions taken to improve the SQL performance:
Other relevant actions:

Join selectivity in Oracle 10g

We have had a few Peoplesoft queries whose plans produced wrong join cardinality estimates. I dug a bit deeper and found that Oracle could not accurately estimate how many rows the joins would return.

Today I raised an enhancement request with Oracle. It will be interesting to hear what they say about it.

TEST EXAMPLE

As a test, I created 2 tables, each with 10 unique rows, built so that only 3 rows overlap. After joining the tables, Oracle kept estimating a cardinality of 10 rows (see the example below).

I moved the tables' columns around and created and removed histograms, but nothing helped.

Yesterday I found Wolfgang Breitling's paper on join selectivity, which contains a similar example. Just to be clear, I did not copy it; I came to my own conclusions about this. What I did not know was the join selectivity formula itself, which is given in the "Join selectivity formula" section at the end of this post.
Example:

create table t1 (id number);
create table t2 (id number);

insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
insert into t1 values (4);
insert into t1 values (5);
insert into t1 values (6);
insert into t1 values (7);
insert into t1 values (8);
insert into t1 values (9);
insert into t1 values (10);

insert into t2 values (0);
insert into t2 values (3);
insert into t2 values (4);
insert into t2 values (5);
insert into t2 values (11);
insert into t2 values (12);
insert into t2 values (13);
insert into t2 values (14);
insert into t2 values (15);
insert into t2 values (16);
commit;

execute dbms_stats.gather_table_stats('SYS','T1', method_opt=>'for all columns size 11');
execute dbms_stats.gather_table_stats('SYS','T2', method_opt=>'for all columns size 11');

set autotrace on explain

select * from t1,t2 where t1.id=t2.id;

        ID         ID
---------- ----------
         3          3
         4          4
         5          5


Execution Plan
----------------------------------------------------------
Plan hash value: 1819147781

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |    50 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN         |      |    10 |    50 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |    10 |    30 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T2   |    10 |    20 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T1"."ID"="T2"."ID")


As you can see, the estimated cardinality of the join is 10, computed as Card(t1) * Card(t2) * 1/max(NDV(t1.id), NDV(t2.id)) = 10 * 10 * 1/10 = 10.
Obviously, it should be 3.

Suggestion on how to fix this

If the optimizer in future versions can figure this out on its own, then great. Otherwise, Oracle should allow DBAs to set join statistics manually. For example, Oracle could create a new statistics table:
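(What such a table might look like is, of course, only my guess; the table name and columns below are invented purely to illustrate the idea.)

create table user_join_stats (
   table_name1   varchar2(30),  -- hypothetical: first table in the join
   column_name1  varchar2(30),  -- hypothetical: its join column
   table_name2   varchar2(30),  -- hypothetical: second table in the join
   column_name2  varchar2(30),  -- hypothetical: its join column
   join_sel      number         -- hypothetical: join selectivity the optimizer should use
);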


Then, a DBA could insert values like these:
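(Continuing the same invented example: 3 matching rows out of the 10 x 10 = 100-row cartesian product gives a join selectivity of 3/100.)

insert into user_join_stats
values ('T1', 'ID', 'T2', 'ID', 3/100);
commit;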


So, when the optimizer is determining the join selectivity for T1.ID = T2.ID, it can check the new statistics table and use 3/100 instead of the default 1/max(NDV) selectivity, which would in the end produce the correct cardinality of 3 rows (10 * 10 * 3/100 = 3).

UPDATE: 26 Mar 2007

I tested this further and found that if I insert non-unique rows into the T1 and T2 tables, the cardinality estimate starts working correctly. But if the ID columns contain only unique values, the cardinality estimate stays wrong.
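A quick way to reproduce this, reusing the T1/T2 tables from the example above (just a sketch; the duplicated values are arbitrary, chosen not to add any new matching rows):

insert into t1 values (10);  -- T1.ID now has 11 rows but only 10 distinct values
insert into t2 values (16);  -- same for T2.ID
commit;
execute dbms_stats.gather_table_stats('SYS','T1', method_opt=>'for all columns size 11');
execute dbms_stats.gather_table_stats('SYS','T2', method_opt=>'for all columns size 11');
set autotrace on explain
select * from t1,t2 where t1.id=t2.id;

With the columns no longer unique, the Rows estimate in the plan should track the actual overlap much more closely, per the observation above.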

I can only assume that when the optimizer looks at a unique ID column, it sees that it is unique (NDV = #rows) and then just falls back to the general join selectivity formula.

On the other hand, if ID is not a unique column, the optimizer checks the histograms and produces a pretty good cardinality estimate.

Join selectivity formula

The selectivity of a join is defined as the selectivity of the most selective join column, adjusted by the proportion of non-null values in each join column:

Sel = 1/max(NDV(t1.c1), NDV(t2.c2))
      * ((Card(t1) - #t1.c1 NULLs) / Card(t1))
      * ((Card(t2) - #t2.c2 NULLs) / Card(t2))
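Plugging the test tables into this formula as a sanity check: both ID columns have 10 distinct values and no NULLs, so

Sel = 1/max(10, 10) * ((10 - 0)/10) * ((10 - 0)/10) = 1/10

and the estimated join cardinality is Card(t1) * Card(t2) * Sel = 10 * 10 * 1/10 = 10 rows, which is exactly what the plan above shows, regardless of the actual 3-row overlap.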
