Oracle Corner: September 2006

13 September 2006

Setting up Tivoli environment for database duplication

Test case on duplicating database with RMAN and Tivoli

Database to be refreshed: SvD on testsv

Source, production database: SvP on prodsv

Check the existing set-up

Check whether Tivoli environment has already been set

Test server: testsv

Find out whether you are using 32-bit or 64-bit oracle
file $ORACLE_HOME/bin

If 32-bit => cd /opt/tivoli/tsm/client/oracle/bin
If 64-bit => cd /opt/tivoli/tsm/client/oracle/bin64

Check whether the file tdpo-SvP.opt exist in the directory

If not, ftp/scp it from the production server

tdpoconf SHOWENVironment -TDPO_OPTfile=tdpo-SvP.opt

The results should be like:

IBM Tivoli Storage Manager for Databases:
Data Protection for Oracle
Version 5, Release 2, Level 0.0
(C) Copyright IBM Corporation 1997, 2003. All rights reserved.
DATA PROTECTION FOR ORACLE INFORMATION
Version: 5
Release: 2
Level: 0
Sublevel: 0
Platform: 32bit TDP Oracle SUN
TSM SERVER INFORMATION
Server Name: prodsvO
Server Address: TSMB1
Server Type: Solaris 8/9
Server Port: 1500
Communication Method: TCP/IP
SESSION INFORMATION
Owner Name: oracle
Node Name: prodsvO
Node Type: TDP Oracle SUN
DSMI_DIR: /opt/tivoli/tsm/client/api/bin
DSMI_ORC_CONFIG: /opt/tivoli/tsm/client/oracle/bin/dsmo_temp.opt
TDPO_OPTFILE: tdpo-SvP.opt
Password Directory: /opt/tivoli/tsm/client/oracle/bin
Compression: FALSE

If the node name and server name (both in bold, red fonts) are the same as the TSM name for the production server, then the Tivoli environment is set up
If they are different or if you get an error:

Proceed to the next section
TSM sever names might be different from the real server names

If not sure, run the tdpoconf command on the production server to see

Set-up Tivoli environment

Test server: testsv

ftp or scp the server password file

ftp prodsv

> bin

> get /opt/tivoli/tsm/client/oracle/bin/TDPO.PRODSVO

Change the tdpo-SvP.opt

Change the line

DSMI_ORC_CONFIG /usr/bin/dsmo.opt

into

DSMI_ORC_CONFIG /opt/tivoli/tsm/client/oracle/bin/dsmo_temp.opt

Create a new file called /opt/tivoli/tsm/client/oracle/bin/dsmo_temp.opt with only 2 lines:

SErvername prodsvo
quiet

Make a copy of /opt/tivoli/tsm/client/api/bin/dsm.sys (or /bin64 for 64-bit)
Add the entries to the /opt/tivoli/tsm/client/api/bin/dsm.sys file

Use the backup server from the source server
If not sure, check the dsm.sys file on the production server and immitate
Example, if prodsvo is using tmsb1 on the port 1500, add these lines at the bottom of the file

SErvername prodsvo
COMMmethod TCPip
TCPPort 1500
TCPServeraddress tsmbp1
inclexcl /opt/tivoli/tsm/client/ba/bin/inclexcl
Nodename prodsvo
passwordaccess prompt
schedlogname /var/applogs/backup/dsmsched.log
errorlogname /var/applogs/backup/dsmerror.log

Key words
ANS1302E (RC2) No objects on server match query
ANS1353E (RC53) Session rejected: Unknown or incorrect ID entered
Tivoli RMAN "duplicate database" "Tivoli environment"

// posted by Tatjana (Tanja) Kane @ 9/13/2006 0 comments

RMAN-20011 After the duplicate database

Problem description

When duplicating the database, one way of handling the RMAN catalogue is:

Unregister the database to be refreshed from the catalogue
Run duplicate
Change the DBID on the new database

Use "nid" command. It requires an outage.

If you are not sure whether your duplicate will succeed, you can delay the unregister until after the duplication and:

Connect to RMAN catalogue
select * from rc_database;
Find your old database and unregister with

execute dbms_rcvcat.unregisterdatabase(db_key, db_id);

If it happened that you did not change the DBID and ran some catalogue commands, you might experience the error:
RMAN-20011: target database incarnation is not current in recovery catalogue
You might even get the empty list for the database incarnations.

Fix

To fix this, log into your RMAN catalogue database with sqlplus:

sqlplus rman@rmancat
select * from rc_database;

You will notice at least 2 DBINC_KEY values for your database
Find the key that was valid before the current one.

If you are not sure which one, check the resetlogs dates

It will probably have the status "PARENT"

If, for example, old DBINC_KEY is 18234933

rman target / catalog rman@rmancat

reset database to incarnation 18234933;

// posted by Tatjana (Tanja) Kane @ 9/13/2006 0 comments

08 September 2006

Oracle raw trace file: Waits per object

Raw trace files give you few additional information over the tkprof-ed file. The rest is the same. The most interesting information you can get from the raw trace file is how much waiting was for each object.

Oracle raw trace file general statistics

PARSE , EXEC , FETCH

mis>0 for the library cach misses
e= for the seconds (divide by million for >=9i)
c= for the CPU consumption
p= for the physical reads
cr= for the consistent requests
cu= for the current requests
r= for the number of rows
og= for th eoptimizer goal (4 is choose, 3 is rule, 2 is first_rows, 1 is all_rows)

WAIT

ela= for the E-06 seconds
p1= , p2= and p3= , with the same meaning as in the wait views

Waits per object
If you see lots of I/O waits in the raw trace file, tkprof will not divide waits per object. Tkprof gives you waits per SQl statement, but if your SQL statement is complex, you might need to know how much waits was for each of the objects.
To find that, you need to process the raw trace file.

I usually do it like this (not the super efficient way, but fast enough):

Find your query and cut out the waiting part into the new file, for example

more justwaits.trc

WAIT #45: nam='db file sequential read' ela= 10987 p1=48 p2=12266 p3=1

WAIT #45: nam='db file sequential read' ela= 5605 p1=48 p2=55970 p3=1

WAIT #45: nam='db file sequential read' ela= 17531 p1=51 p2=30972 p3=1

... and so on

Process the justwaits.trc any way you want to get just 2 columns: p1 and p2.

For example:

48 12266
48 55970
51 30972
51 98012
51 98179
30 60962
30 60963
51 98184
51 98182
...

Well, now we have the File ID and Block ID for our objects
Write a SQL script, for example called object_wait.sql :

set feedback off
set head off
set verify off
set linesi 58

select segment_name||'.'||segment_type from dba_extents

where file_id = &1 and &2 between block_id and block_id + blocks - 1;

exit

Run a shell command:

while read p1 p2; do sqlplus -s '/ as sysdba' @object_wait $p1 $p2 >>a.a ; done "<"justwaits.trc (remove the ", I had to put them there because of the HTML tags)
cat a.a | sort | uniq -c

And you have the result, for example:

20 EMS.CNS_OBL_UNQ
2 EMS.I_OBJECTS_NAME
5 EMS.I_OBJECTS_PK
3 EMS.OBJECTS

// posted by Tatjana (Tanja) Kane @ 9/08/2006 0 comments

06 September 2006

Event 10053

When tuning SQL, I mostly use the event 10046, level 12 (or dbms_suport.start_sql_trace_in_session for easier typing) and tkprof afterwards. Rarely, when I need to understand why the optimizer has made the decision to use a certain path, I turn on the event 10053, level 1. This event catches the hard parse stage, so the statement needs to be "new", which is easily achieved with a unique comment or similar. The trace file is kind of upside-down, so I mostly start from the end and dig to the beginning (unless I am looking for the specific information).

The optimizer trace file has the following sections:

Query itself
Parameters used by the CBO
Objects' statistics
Table access paths' calculations
Joins
Final plan with adjustments

Metalink has a good article called "Case Study: Analyzing 10053 Trace Files". Also, if you Google "Wolfgang Breitling", you will get heaps of presentations he wrote and they are excellent.

Cost Formulas

I figured out the simple ones myself, as they are quite logical, but others come from WB's presentation and from Metalink.

Table Access Costs:

Number of blocks/MbDivisor

MbDivisor (multi block read divisor) is multi_block_readcount for <9.2v
For Oracle >=9.2, if the system statistics are collected, Oracle will estimate it
To find out what it is estimated to, run a simple query with the single table access path and then divide "NBLKS:/"Resc:"

Index access costs:

Unique scan: blevel + 1
Fast full scan: leaf_ blocks / MbDivisor
Full scan: Num_blocks/MbDivisor
Index: blevel + FF* leaf_ blocks
Range scan: blevel + FF* leaf_ blocks + FF* clustering_ factor

FF is a Filtering Factor, which is either the density if "=" is used or a formula depending on the predicate.

Join access costs:

NL join: COST(outer) + [ CARD(outer) * COST(inner)]
Merge join: Resc(outer) + Resc(inner) + Sort_Cost(outer) + Sort_Cost(inner)
Hash join: Resp(outer) + Resp(inner) + Par_Sort_Cost(outer) + Par_Sort_Cost(inner)

Note that if the index does not have statistics, you will get the default numbers in the file. The default stats for LEAF_BLOCKS=25, BLEVEL=1, DISTINCT_KEYS=100 and CLUSTERING_FACTOR=800.

For a table, you will get a message that the table is not analyzed. Default assumptions for a table are: 100 block, 100 rows, even data distribution.

What to look for

In the example below the index access is chosen for the table OBJECT_LINKS, with the cost of 4. It initially chooses the wrong index (path 4: *FK2), but later, when optimizer considers the joins, the index with better selectivity looses to the index with less I/Os.

**************************************
SINGLE TABLE ACCESS PATH
Column: TO_OBJECT_ Col#: 3 Table: OBJECT_LINKS Alias: OBJECT_LINKS
NDV: 6543450 NULLS: 10 DENS: 1.5282e-07
HEIGHT BALANCED HISTOGRAM: #BKT: 200 #VAL: 196
Column: LINK_CODE Col#: 2 Table: OBJECT_LINKS Alias: OBJECT_LINKS
NDV: 2 NULLS: 0 DENS: 5.0000e-01
FREQUENCY HISTOGRAM: #BKT: 5144 #VAL: 2
Column: STATUS Col#: 7 Table: OBJECT_LINKS Alias: OBJECT_LINKS
NDV: 3 NULLS: 10387994 DENS: 3.3333e-01
HEIGHT BALANCED HISTOGRAM: #BKT: 3230 #VAL: 2
TABLE: OBJECT_LINKS ORIG CDN: 10800497 ROUNDED CDN: 1 CMPTD CDN: 1
Access path: tsc Resc: 6440 Resp: 6440 -- Path 1, full table scan
Skip scan: ss-sel 0 andv 9169389
ss cost 9169389
table io scan cost 6440
Access path: index (no sta/stp keys) -- Path 2, via CNS* index
Index: CNS_OBL_UNQ
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 113262
IX_SEL: 1.0000e+00 TB_SEL: 7.6412e-08
Access path: index (equal) -- Path 3, via *CB2 index
Index: I_OBJECT_LINKS_CB2
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 4
IX_SEL: 0.0000e+00 TB_SEL: 7.6412e-08
Access path: index (equal) -- Path 4, via *FK2 index
Index: I_OBJECT_LINKS_FK2
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 4
IX_SEL: 0.0000e+00 TB_SEL: 1.5282e-07
BEST_CST: 4.00 PATH: 4 Degree: 1
***************************************

Table scan has a cost of 6440.
Index scan via CNS_OBL_UNQ has a cost of 113262.
Index scan via I_OBJECT_LINKS_CB2 has a cost of 4; both index columns are used in the query, better selectivity, so it appears to be a better candidate, while optimizer chose FK2.
Index scan via I_OBJECT_LINKS_FK2 has also cost of 4, so it is possible that optimizer just picks one of the index for the best cost, not necessarily planning to use it till the end.

***************************************
NL Join
Outer table: cost: 6 cdn: 1 rcz: 30 resp: 6
Inner table: OBJECT_LINKS
Access path: tsc Resc: 6440
Join: Resc: 6446 Resp: 6446
Access path: index (iff)
Index: CNS_OBL_UNQ
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 17191
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Inner table: OBJECT_LINKS
Access path: iff Resc: 17191
Join: Resc: 17197 Resp: 17197
Access path: index (no sta/stp keys)
Index: CNS_OBL_UNQ
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 113261
IX_SEL: 1.0000e+00 TB_SEL: 1.3622e-07
Join: resc: 113267 resp: 113267
Access path: index (join index)
Index: I_OBJECT_LINKS_CB2
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 17
IX_SEL: 0.0000e+00 TB_SEL: 1.3054e-07
Join: resc: 23 resp: 23
Access path: index (join index)
Index: I_OBJECT_LINKS_FK2
TABLE: OBJECT_LINKS
RSC_CPU: 0 RSC_IO: 29
IX_SEL: 0.0000e+00 TB_SEL: 1.5282e-07
Join: resc: 35 resp: 35
Best NL cost: 23 resp: 23
Join cardinality: 1 = outer (1) * inner (9225774) * sel (1.5947e-07) [flag=0]
***************************************

Join with the serial cost of 23 is chosen and IO cost of 17, which is *_CB2 index.

Generally, it is sometimes interesting to see how optimizer does not even consider some objects. If you think they need to be considered, check whether stats or histograms are missing.

// posted by Tatjana (Tanja) Kane @ 9/06/2006 1 comments

Oracle Optimizer: Density

Problem description

I encountered a slow query with bind variables. When I ran it with the literal values, it was very fast. Obviously, an indication that histograms are not used. The explain plans were different, which was not surprising at all, as Oracle optimizer is using different cost formulas for literal and bind values, therefore the possibility of 2 different explain plans.

SELECT *
FROM EMP, ATP, CC
WHERE EMP.NAME = :B1 <-- index over NAME column chosen
AND ATP.ID = :B2
AND ... so on...

Columns NAME and ID had very similar selectivity, similar table sizes and both had single column indexes, but because NAME column had a slightly better selectivity than ID the optimizer kept using the access via EMP.NAME, instead of ATP.ID, which made the query run for 2 minutes, instead of under a second.

The problem with this query was that the NAME value was always a "popular" value when executed in the application, but because the histograms couldn't be used, the optimizer couldn't see that.

Possible fixes

In a case like this, targeted bind variable peeking or cursor_sharing=similar or outlines resolve the issue, none of which were used in the database at that time.

Long story short, the application vendor was able to delivery a code fix, but we needed a quick fix before it arrived. The only thing left to do while waiting for the code fix was tweaking the statistics. That is not my favorite way of fixing the performance and I prefer using it only as a last resort.

Code fix - Easy

As this query always selects the popular value for EMP.NAME, I suggested introducing the "like" operator to ensure the index on ATP.ID will always get used:

SELECT *
FROM EMP, ATP, CC
WHERE EMP.NAME LIKE :B1
AND ATP.ID = :B2
AND ... so on...

Optimizer is using a different selectivity formula with the "like" (25%), so this change put the explain plan and speed back to where we wanted.

With the LIKE operator, make sure the parameter "_like_with_bind_as_equality" is set to false, otherwise the optimizer will treat the selectivity of your column the same as with the "=" operator.

Density fix

A quick fix was required until the new code arrives. I decided to increase the density to be high enough not to influence the other explain plans, but low enough to lose to the ID column. By changing the density, I am essentially telling the optimizer that it will get more rows as a result, therefore increasing the cost of access over that column.

Bit of a risky thing to do, so we tested it well and it had proved to be an excellent fix until new code arrived.

Changing the density:

Export the statistics into the table
Update the value in the stats table

update stats_table set n2='1.1E-05' where c5='SCOTT' and c1='EMP' and c4='NAME'

Re-import

Update (18-Jan-2007): There is dbms_stats procedure that can be used to change the column statistics:

execute DBMS_STATS.SET_COLUMN_STATS (ownname=>'SCOTT', tabname=>'EMP', colname=>'NAME', density=>'1.1E-05' NO_INVALIDATE=>false);

Density formulas

As I understand, density shows the frequency of the distinct values, kind of like a complex selectivity. It is used when CBO estimates the rows cardinality. It is a number between 0 and 1, where 0 means no rows will be selected and 1 means for all rows will be selected after the predicate filters them.

If there are no histograms, the formula is simple:

density = 1/#num_distinct

The same formula applies for bind variables, "=" and height-balanced histograms.
If the histogram exists, then the formula becomes:

density = sum( cnt*2 )/power ( num_rows * sum( cnt ),2))
I've been trying to use this WB's formula in the database, but to no avail. Is it really correct?

For the frequency histograms:

density = 1/ (2 * num_rows)

Reference: "A look under the hood of CBO" by Wolfgang Breitling
Metalink Note 68992.1