Archive for the ‘Tuning’ Category

better update with a subquery (correlated update)

January 3, 2007

create table t1( x int constraint t1_pk primary key,y int );
create table t2( x int constraint t2_pk primary key,y int );

insert into t1 values ( 1, 1 );
insert into t1 values ( 2, 1 );
insert into t1 values ( 3, 1 );
insert into t2 values ( 3, 2 );

set autotrace on explain
update
( select /*+ USE_NL(t1) INDEX( t1 t1_pk ) */
t1.y t1_y, t2.y t2_y
from t1, t2
where t1.x = t2.x )
set t1_y = t2_y
/

The hints are used here only to demonstrate that they are not required: the run below, without hints, produces the same plan.

insert into t1 select rownum+5, 1
from all_objects;
analyze table t1 compute statistics;
analyze table t2 compute statistics;
set autotrace on explain
update
( select t1.y t1_y, t2.y t2_y
from t1, t2
where t1.x = t2.x )
set t1_y = t2_y
/
See that the plan is the same as in the first case.
set autotrace off

It should be NOTED that in order for the update of a join to work, T2 in the
above example must have a primary or unique key on it. If T2 does not,
you’ll not be able to update the join. The optimizer needs to know that T1 will
be joined to AT MOST 1 row in T2 and the way it does this is by seeing that a
primary key or unique constraint exists on the join conditions
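
As a quick illustration of that point, here is a minimal sketch using the same t1/t2 as above (the error text is ORA-01779):

alter table t2 drop constraint t2_pk;
update
( select t1.y t1_y, t2.y t2_y
from t1, t2
where t1.x = t2.x )
set t1_y = t2_y;
-- fails with ORA-01779: cannot modify a column which maps to a non key-preserved table
alter table t2 add constraint t2_pk primary key (x); -- put the key back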

Good hints to use are:

o first_rows
o all_rows
o optimizer_dynamic_sampling (new in 9iR2)
o driving_site
A real-life example of such an update of a join:

update ( select a.subscriber_id, a.app_username, a.app_pwd, b.sap_id, b.mgr_pw
from WWSSO_APPUSERINFO_T A, T B
where A.SSO_USERNAME = B.SW_ONE_ID AND A.APP_ID = B.APP_ID )
SET SUBSCRIBER_ID = 1,
APP_USERNAME=SAP_ID,
APP_PWD=MGR_PW;
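
A hedged sketch of how the hints listed above would typically be written (the table names, aliases, database link and sampling level below are purely illustrative, not from the original example):

select /*+ dynamic_sampling(t 2) */ count(*) from t where x = 4;  -- per-statement dynamic sampling
alter session set optimizer_dynamic_sampling = 2;                 -- or session-wide (9iR2+)
select /*+ driving_site(r) */ l.id, r.name
from local_tab l, remote_tab@some_dblink r
where l.id = r.id;                                                -- execute the join at the remote site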


faster update (e.g. 1 million rows out of 10 million)

January 3, 2007

Purpose
This example shows how to update a million rows out of 10 million rows.

Time Needed : 20min

Preparation

1) First create your dummy hold table: create table xyz_hold as select * from xyz where rownum < 1; then alter table xyz_hold nologging.

2) insert /*+ append parallel (xyzhold,12) */ into xyz_hold xyzhold (field1, field2, field3)
select /*+ parallel (x,12) */ x.field1, my_new_value_for_field2, x.field3
from xyz x where blah blah blah.

3) when done, either rename the table, or swap the partition if your original
table is partitioned and you only updated one partition, as we do (see the sketch
after these steps). Obviously you need to rebuild indexes, etc. as required.
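
A hedged sketch of step 3 (xyz, xyz_hold, the partition name and the index name are placeholders for this example):

-- simple case: swap the whole table by renaming
alter table xyz rename to xyz_old;
alter table xyz_hold rename to xyz;
-- partitioned case: exchange only the partition you rebuilt
alter table xyz exchange partition p_2007_01 with table xyz_hold;
-- then rebuild indexes as required, e.g.
alter index xyz_idx1 rebuild parallel 8 nologging;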

Backout Procedure
Drop table xyz_HOLD;

Problem with high HWM

January 3, 2007

Purpose :
Shows that a high HWM (high water mark) can cause problems.

Time Needed : 5Min.

Preparation
create table t pctfree 90 pctused 10
as
select * from all_objects;
delete from t where object_id <> 55;
set autotrace on

1. Test Case 1 (This can happen with high HWM)
select object_name from t;
(6328 consistent gets -- note the high number of consistent gets; it took 07.07 sec to get the answer)

2. Test Case 2 (Now with low HWM)

create table tmp as select * from t;
truncate table t;
insert into t select * from tmp;
drop table tmp;
set autotrace on
select object_name from t;
(3 consistent gets -- note the much lower number of consistent gets; it took 07.03 sec to get the answer)
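
Besides the truncate-and-reload shown above, the HWM can also be lowered in place; a hedged sketch (the move invalidates indexes, which must then be rebuilt; shrink space needs 10g or later and row movement enabled):

alter table t move;
-- or, on 10g and later:
alter table t enable row movement;
alter table t shrink space;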

Backout Procedure
drop table t;
set autotrace off

union and union all and exists performance improvement

January 3, 2007

Purpose :
This document shows that one should avoid UNION (and use UNION ALL instead) wherever possible, to avoid the performance impact of the sort.

Time Needed : 10min

Preparation

set autotrace on

1. Test Case 1 (Normally developer writes like this)

select * from dual
union
select * from dual;

Observe the sorting required.

2. Test Case 2 (This is what it should be)

select * from dual
union ALL
select * from dual;
Observe that, compared to the first case, it does not need a sort. But then you get the duplicate row back.
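
For example, on DUAL the difference in the result set is easy to see:

select * from dual union     select * from dual;  -- 1 row:  X
select * from dual union all select * from dual;  -- 2 rows: X, X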

See the Large Scale Impact and real life example
select *
from dual
where exists
(select null from all_objects
union
select null from dba_objects
union
select null from all_users);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
534050 consistent gets
0 physical reads
0 redo size
212 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

Now replace UNION with UNION ALL. Since UNION ALL starts returning rows right away, EXISTS can cut off sooner; with UNION, no row is returned until the entire query has been sorted.

select *
from dual
where exists
(select null from all_objects
union all
select null from dba_objects
union all
select null from all_users);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
212 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Look at the reduction in consistent gets and sorting. So, as far as possible, use UNION ALL and avoid the sort; especially in a subquery with EXISTS this can be really powerful.

Backout Procedure
set autotrace off

Histogram Usage Example

January 3, 2007

Purpose :
This document shows the importance of histograms. They are mainly useful with literal (non-bind) values; OLAP systems mainly use literals, whereas OLTP typically uses binds.

Time Needed : 10min

Preparation

create table t as select mod(rownum,3) X, a.* from all_objects a;
insert /*+ append */ into t select * from t;
commit;
insert /*+ append */ into t select * from t;
commit;
insert /*+ append */ into t select * from t;
commit;
select count(*) from t;
-- around 1 million rows.
update t set x = 4 where rownum <= 10;
So, I have 1 million records, 10 of which have the value 4; the rest have values
of 0, 1, or 2. Now I index that column:
create index t_idx on t(x) unrecoverable parallel 8;
analyze table t compute statistics;
set autotrace traceonly explain

1. Test Case 1 (This is what generally happens without histogram)

select * from t where x = 4;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=446 Card=276944 Bytes=24371072)
1 0 TABLE ACCESS (FULL) OF ‘T’ (Cost=446 Card=276944 Bytes=24371072) == CARD = TOTAL ROWS/DISTINCT VALUE = 1107776/4

Full tablescan, just as you would expect. Now, let's play with the histograms.

2. Test Case 2 (Now see the difference with Histogram in place)

analyze table t compute statistics for columns x size 100;
select * from t where x = 4;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=10 Bytes=880)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=4 Card=10 Bytes=880) == now the cardinality estimate reflects how many rows are actually required.
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=3 Card=10)

And the optimizer has the right info, so it does the right thing. Just to
validate, let's see what happens when we use the value 1:

select * from t where x = 1;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=446 Card=369276 Bytes=32496288)
1 0 TABLE ACCESS (FULL) OF ‘T’ (Cost=446 Card=369276 Bytes=32496288)

Back to a full scan, AS IT SHOULD be for that one.
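
The examples above use ANALYZE; a hedged sketch of the same histogram with DBMS_STATS, the preferred interface on 9i and later (the bucket count 254 is just an illustrative maximum):

begin
    dbms_stats.gather_table_stats(
        ownname    => user,
        tabname    => 'T',
        method_opt => 'for columns x size 254',
        cascade    => true );
end;
/

-- inspect the histogram buckets that were created
select column_name, endpoint_number, endpoint_value
from user_tab_histograms
where table_name = 'T' and column_name = 'X';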

Backout Procedure

set autotrace off
drop table t;

Improooooove not in query

January 3, 2007

Purpose :
Show that the HASH_AJ hint for NOT IN can improve a query a lot when the column in the NOT IN comparison is nullable. If it is NOT NULL, one can also use NOT EXISTS.

Time Needed : 10min

Many times NOT EXISTS improves over NOT IN, but beware: NOT IN is not always equivalent to NOT EXISTS (see the sketch below; the full thread is in the same folder). Still, read this; it definitely helps in any case with NOT IN.
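
A minimal sketch of why they are not equivalent (hypothetical tables a and b, where b.x contains a NULL):

select * from a where a.x not in (select b.x from b);
-- returns no rows at all if any b.x is NULL, because "x not in (..., NULL)" can never be TRUE
select * from a where not exists (select null from b where b.x = a.x);
-- still returns the unmatched rows of a; NOT EXISTS is not tripped up by the NULL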

Preparation

Change your optimizer mode if it is the RULE-based optimizer (alter session set optimizer_mode = choose).
drop table invdatew;
drop table investor;
create table invdatew
as
select 1 inv, a.*, sysdate dt from all_objects a where 1=0;
create table investor
as
select 1 inv, a.* from all_objects a where 1=0;
create index invdatew_idx on invdatew(inv);
exec gen_data( 'INVDATEW', 5000 ); -- this generates random data; code is at the end
exec gen_data( 'INVESTOR', 5000 ); -- this generates random data; code is at the end
select count(1) from invdatew; -- 5000 rows created
select count(1) from investor; -- 5000 rows created
analyze table invdatew compute statistics for
table for all indexes for all indexed columns;
analyze table investor compute statistics for
table for all indexes for all indexed columns;
alter session set sql_trace=true;

1. Test Case 1 (Normally developer writes like this)

insert into invdatew
select a.*, sysdate from investor a
where inv not in (select inv from invdatew);

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.30          0          0          0           0
Execute      1     34.68      33.91          0    1324021      11546        5000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     34.69      34.22          0    1324021      11546        5000

Rows Row Source Operation
-------  ---------------------------------------------------
5000 FILTER
5000 TABLE ACCESS FULL INVESTOR
0 TABLE ACCESS FULL INVDATEW
This query was processed like this:

for each row in investor
loop
FULL SCAN INVDATEW making sure that INV not in there
end loop

that is — 5000 full scans of INVDATEW (5000 = number of rows in investor!)

rollback

2. Test Case 2 (Use HASH_AJ to improve)

show parameter hash -- make sure hash_join_enabled = TRUE and hash_area_size is big enough (mine is 1048576; larger is better)
insert into invdatew
select a.*, sysdate from investor a
where inv is not null
and inv not in (select /*+ HASH_AJ */ inv from invdatew where inv is not null );

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          0          0           0
Execute      1      0.96       1.05        140        566      10612        5000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.97       1.07        140        566      10612        5000

Rows Row Source Operation
-------  ---------------------------------------------------
5000 HASH JOIN ANTI
5000 TABLE ACCESS FULL INVESTOR
5000 INDEX FAST FULL SCAN INVDATEW_IDX (object id 63822)

Wow — that is different — from over 34.69 cpu seconds down to .97.

This was processed sort of like this:

for every row in outer join investor to invdatew
if invdatew.inv is null then insert this record

Much more efficient…

rollback

3. Test Case 3 (the output will be the same only if the NOT IN column is also NOT NULL; else see the thread in this folder for the NOT IN / NOT EXISTS comparison).

insert into invdatew
select a.*, sysdate from investor a
where not exists ( select *
from invdatew
where invdatew.inv = a.inv );

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.02          0          0          0           0
Execute      1      1.08       1.05          0      10686      10600        5000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      1.08       1.08          0      10686      10600        5000

Rows Row Source Operation
-------  ---------------------------------------------------
5000 FILTER
5000 TABLE ACCESS FULL INVESTOR
0 INDEX RANGE SCAN INVDATEW_IDX (object id 63822)

Now, not as good as the HASH_AJ, but very close, very close. The logical IOs would make me avoid this approach, however; I would rather do 566 LIOs than 10,686.
This was processed like this:

for every record in investor
run a subquery that uses an index to find a row in invdatew
if not found
then insert
end if;
end loop

rollback;
alter session set sql_trace=false;
Run tkprof on the trace file to see the above results in your environment.

Procedure gen_data

create or replace procedure gen_data( p_tname in varchar2, p_records in number )
authid current_user
as
    -- builds an "insert /*+ append */ ... select dbms_random..." statement for the
    -- given table and executes it until p_records rows have been inserted
    l_insert long;
    l_rows   number default 0;
begin
    dbms_application_info.set_client_info( 'gen_data ' || p_tname );

    l_insert := 'insert /*+ append */ into ' || p_tname ||
                ' select ';

    -- build a random-value expression for each column of the target table
    for x in ( select data_type, data_length,
                      nvl(rpad('9',data_precision,'9')/power(10,data_scale),9999999999) maxval
                 from user_tab_columns
                where table_name = upper(p_tname)
                order by column_id )
    loop
        if ( x.data_type in ('NUMBER', 'FLOAT') )
        then
            l_insert := l_insert || 'dbms_random.value(1,' || x.maxval || '),';
        elsif ( x.data_type = 'DATE' )
        then
            l_insert := l_insert ||
                'sysdate+dbms_random.value+dbms_random.value(1,1000),';
        else
            l_insert := l_insert || 'dbms_random.string(''A'',' ||
                                     x.data_length || '),';
        end if;
    end loop;
    l_insert := rtrim(l_insert,',') ||
                ' from all_objects where rownum <= :n';

    loop
        execute immediate l_insert using p_records - l_rows;
        l_rows := l_rows + sql%rowcount;
        commit;
        dbms_application_info.set_module( l_rows || ' rows of ' || p_records, '' );
        exit when ( l_rows >= p_records );
    end loop;
end;
/

Backout Procedure

drop procedure gen_data;
drop table invdatew;
drop table investor;

How big should db_file_multiblock_read_count be?

January 3, 2007

Purpose
This example shows what the maximum effective MBRC (multiblock read count) can be. Setting it higher than this really just gives the optimizer wrong information.

Time Needed : 10min

Preparation

How to check what the OS is able to read in one fetch. Setting db_file_multiblock_read_count too high does not help; on the contrary, it tells the optimizer to choose a full scan most of the time, even though the value is not achievable.

Example
create table t storage ( initial 100m next 100m ) tablespace xxx as select * from all_objects;

Create a table with a pretty large initial extent. We will never read more
than an extent at a time (e.g., I initially did this test in a locally managed
tablespace with 512k extents -- I read 64 8k blocks at a time, then realized "oh
yeah, that's my extent size"...).
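
For reference, a hedged sketch of creating such a locally managed tablespace with large uniform extents (the datafile name and sizes are placeholders):

create tablespace xxx
datafile '/u01/oradata/xxx01.dbf' size 2g
extent management local uniform size 100m;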

insert /*+ append */ into t select * from t;
Commit;
insert /*+ append */ into t select * from t;
Commit;
insert /*+ append */ into t select * from t;
Commit;
insert /*+ append */ into t select * from t;
Commit;
Now, set this parameter to something really, really high:

alter session set db_file_multiblock_read_count = 1024;
alter session set events '10046 trace name context forever, level 12'; -- enable tracing with wait events
select /*+ FULL(t) */ count(*) from t;
COUNT(*)
———-
365728

1. Test Case 1 (Find the maximum read size the OS allows)

and full scan. Now, look at the trace file:

PARSING IN CURSOR #1 len=38 dep=0 uid=439 oct=3 lid=439 tim=3742056778
hv=194957013 ad='86989890'
select /*+ FULL(t) */ count(*) from t
END OF STMT
PARSE #1:c=0,e=1,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=3742056778
BINDS #1:
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=3742056778
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
WAIT #1: nam='file open' ela= 0 p1=0 p2=0 p3=0
WAIT #1: nam='db file scattered read' ela= 14 p1=10 p2=2514 p3=128
WAIT #1: nam='db file scattered read' ela= 12 p1=10 p2=2642 p3=128
WAIT #1: nam='db file scattered read' ela= 10 p1=10 p2=2770 p3=128
WAIT #1: nam='db file scattered read' ela= 10 p1=10 p2=2898 p3=128
WAIT #1: nam='db file scattered read' ela= 12 p1=10 p2=3026 p3=128
WAIT #1: nam='db file scattered read' ela= 12 p1=10 p2=3154 p3=128
WAIT #1: nam='db file scattered read' ela= 10 p1=10 p2=3282 p3=128
WAIT #1: nam='db file scattered read' ela= 9 p1=10 p2=3410 p3=128

db file scattered read is our wait event — p3 is the number of blocks we
tried to read. Apparently on my OS+filesystem+db version, I can read 1m at a
time (128 * 8K block)…
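
Just to spell out the arithmetic from the trace (p3 = 128 blocks per read, 8K block size):

select 128 * 8192 bytes_per_read, 128 * 8192 / 1024 / 1024 mb_per_read from dual;
-- 1048576 bytes = 1 MB per read, so the requested 1024 blocks were silently capped at 128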

Conclusion : Yes, setting db_file_multiblock_read_count to an unfeasibly high number can unduly affect the optimizer. If you can never obtain that read rate, and it
thinks you will, it could lead it down the incorrect path.

Backout Procedure

Drop table t;
Make sure you exit the current SQL*Plus session, which is already tracing the events.

Find out Temp Space requirement for big operation

January 3, 2007

Purpose :
Shows how to find temp tablespace for big operations.

Time Needed : 5Min.
Preparation
There is no preparation required. If you have a big table, just query it and order by many columns.
Or you can artificially inflate the statistics and get the information, for example:
exec dbms_stats.set_table_stats( user, ‘EMP’, numrows => 2000000, numblks => 1000000 );
exec dbms_stats.set_column_stats( user, ‘EMP’, ‘DEPTNO’, distcnt => 2000000 );

1. Test Case 1 (With an existing big table: siebel.s_dnb_org, a few million rows)

explain plan for
select ROW_ID,CREATED,CREATED_BY,LAST_UPD,LAST_UPD_BY,MODIFICATION_NUM,CONFLICT_ID,DUNS_NUM,PRMTD_TO_ACCNT_FLG,PRMTD_TO_LIST_FLG,count(*)
from siebel.s_dnb_org
group by ROW_ID,CREATED,CREATED_BY,LAST_UPD,LAST_UPD_BY,MODIFICATION_NUM,CONFLICT_ID,DUNS_NUM,PRMTD_TO_ACCNT_FLG,PRMTD_TO_LIST_FLG
/

select * from table( dbms_xplan.display );
—————————————————————————-
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
—————————————————————————-
| 0 | SELECT STATEMENT | | 18M| 771M| | 247K|
| 1 | SORT GROUP BY | | 18M| 771M| 2393M | 247K|
| 2 | TABLE ACCESS FULL | S_DNB_ORG | 18M| 771M| | 33856 |
—————————————————————————-

2. Test Case 2 (To estimate the space required to create an index on a big table)

Delete from plan_table;
explain plan for
select BUSINESS_NAME,row_id
from siebel.s_dnb_org
order by business_name;

select * from table( dbms_xplan.display );
—————————————————————————-
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
—————————————————————————-
| 0 | SELECT STATEMENT | | 21M| 724M| | 249K|
| 1 | SORT ORDER BY | | 21M| 724M| 1834M | 249K|
| 2 | TABLE ACCESS FULL | S_DNB_ORG | 21M| 724M| | 39794 |
—————————————————————————-

3. Test Case 3 (Small table inflate statistics)

For a small table whose sort does not spill out of temp, the plan will not show a TempSpc figure at all. Sometimes reducing sort_area_size can make it show up (again, you have to disable pga_aggregate_target, i.e. use manual workarea sizing, for sort_area_size to take effect).

delete from plan_table;
explain plan for select deptno, count(*) from emp group by deptno;
select * from table( dbms_xplan.display );
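
If you inflated the statistics on EMP for test case 3, a hedged sketch of cleaning up afterwards is simply to delete and re-gather them:

exec dbms_stats.delete_table_stats( user, 'EMP' );
exec dbms_stats.gather_table_stats( user, 'EMP', cascade => true );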

Backout Procedure
None.