
MORE EXAMPLES OF INTERPRETING WAIT EVENTS TO BOOST SYSTEM PERFORMANCE
Roger Schrag and Terry Sutton, Database Specialists, Inc.

INTRODUCTION

In early 2002 Roger Schrag published a paper entitled, “Interpreting Wait Events to Boost
System Performance.” That paper began with a definition of wait events and techniques for
collecting wait event information, and then went on to discuss four examples of how wait event
data had been used to solve real-life performance problems. The impetus for writing that paper
had been that, while many resources were available that listed the wait event v$ views, we were
not aware of any publication that used real-life examples to demonstrate how to use the wait
event interface to solve problems or boost system performance.

As we write this paper in early 2003, we still are not aware of any such publications. Now Roger
Schrag has teamed up with fellow Oracle DBA Terry Sutton to document more examples of how
Oracle’s wait event interface can be used to solve real-world problems. In essence, this paper is
a sequel to last year’s paper. If you haven’t read the first one, you may view it at
http://www.dbspecialists.com/presentations.html. You can read the two papers in either order.

In this paper, we will once again define what wait events are and how to collect wait event
information. If you have read the previous paper or are already familiar with the wait event
interface, we still encourage you to read this section as we’ve added some enhancements and
new details regarding Oracle 9i. In the second part of this paper we'll move on to new examples
of how wait event information paved the way to solving real-world performance problems. Our
feeling is that the more examples of wait event usage you are exposed to, the better position
you’ll be in to put the wait event interface to work for you when you most need it.

WHAT ARE WAIT EVENTS?

The wait event interface appeared in the first production release of Oracle 7. For many years it
remained undocumented, and few people outside the kernel group at Oracle Corporation knew
of its existence. By Oracle 7.3 the word was out, and the documentation that comes with Oracle
8.0 and later gives a pretty good explanation of the wait event interface. In Oracle 9i, wait event
tracing has been made much easier with the inclusion of the DBMS_SUPPORT package in the
kernel, and with the new wait events reporting feature of TKPROF.

The basic idea behind the wait event interface hasn't changed since its inception, but Oracle
continues to add new wait events as they add new functionality to the Oracle kernel. Also, newer
enhancements make the wait event interface easier to use.

In this section we will explain what wait events are and why the data provided by the wait event
interface can be helpful.

“WAIT EVENT” DEFINED

At any given moment, every Oracle process is either busy servicing a request or waiting for
something specific to happen. By “busy” we mean that the process wishes to use the CPU. For
example, a dedicated server process might be searching the buffer cache to see if a certain data
block is in memory. This process would be said to be busy and not waiting. The ARC0 process,
meanwhile, might be waiting for LGWR to signal that an online redo log needs archiving. In this
case, the ARC0 process is waiting.

The kernel developers at Oracle have defined a list of every possible event that an Oracle
process could be waiting for. In Oracle 8.0 there were 158 such wait events. In Oracle 9i Release
2 there are 363. At any moment, every Oracle process that is not busy is waiting for one of these
events to occur. Suppose an application has submitted a COMMIT statement and the server
process is waiting for the LGWR process to signal that the redo log buffer has been flushed to
disk. This wait event is called “log file sync.” Another dedicated server process might be waiting
for a row-level lock on the INVOICES table to be freed so that a SELECT FOR UPDATE statement
can continue. That wait event is called “enqueue.”

It is very beneficial that Oracle is diligent about tracking wait event information and making it
available to DBAs. We call this the “wait event interface.” By querying v$ views, we can see
what events processes are waiting on to an amazing level of detail. For example, we might learn
that a dedicated server process has been waiting 30 milliseconds for the operating system to
read eight blocks from data file 42, starting at block 18042. We can also see summary
information of how much time each Oracle process has spent waiting on each type of wait event
for the duration of the process. In addition, we can direct an Oracle process to write detailed
wait event data to a trace file for later analysis using TKPROF.

WHY WAIT EVENT INFORMATION IS USEFUL

Using the wait event interface, you can get insights into where time is being spent. If a report
takes four hours to complete, for example, the wait event interface will tell you how much of
that four hours was spent waiting for disk reads caused by full table scans, disk reads caused by
index lookups, latch contention, and so on.

The wait event interface provides much more information to work with than cache hit ratios do.
The wait event interface gives both breadth and depth in the information it provides. You get
data that can touch upon so many different areas of your database such as disk I/O, latches,
parallel processing, network traffic, checkpoints, and row-level locking. At the same time, you
get incredible detail such as the file number and block number of a block being read from disk,
or the name of a latch being waited on along with the number of retries.

The wait event interface will not always give you all of the information you need in order to
diagnose and solve a problem, but it will certainly point you in the right direction. You might
think the buffer cache is too small because the cache hit ratio is only 70%, but in fact, the
application's slow response time could be caused by latch contention in the shared pool, a
bottleneck in the log writer, or any of a number of other things.

COMMON WAIT EVENT NAMES AND WHAT THEY MEAN

Although there are many different types of wait events, the majority of them come up very
infrequently or tend not to be significant. In practice, only a few dozen wait events tend to be of
interest to most DBAs. The rest are rather obscure, pertain to Oracle features not in use, or
occur so infrequently that you don't need to worry about them. You'll see different wait events
surfacing in different environments based on which Oracle features have been implemented and
which capabilities of the database are being taxed the most.

For example, the PX wait events won't appear if you aren't using parallel query, and the virtual
circuit status wait event won't appear if you are not using the multi-threaded server architecture
(or shared server architecture as it is more recently called). Along those lines, the log file sync
and enqueue wait events probably won’t be prevalent in a primarily read-only system.

Here are some of the most common wait events and what they mean:

buffer busy waits
    The session wants to access a data block that is either 1) currently not in memory, but
    another process has already issued an I/O request to read the block into memory, or 2) in
    memory but in an incompatible mode (current versus consistent, for example).

control file parallel write
    The session has issued multiple I/O requests in parallel to write blocks to all control
    files, and is waiting for all of the writes to complete.

control file sequential read
    The session is waiting for blocks to be read from a control file.

db file parallel read
    The session has issued multiple I/O requests in parallel to read blocks from data files
    into memory and is waiting for all requests to complete. This may occur during recovery or
    during regular activity when a session batches many single block I/O requests together and
    issues them in parallel.

db file parallel write
    The process, typically DBWR, has issued multiple I/O requests in parallel to write dirty
    blocks from the buffer cache to disk and is waiting for all requests to complete.

db file scattered read
    The session has issued an I/O request to read a series of contiguous blocks from a data
    file into the buffer cache and is waiting for the operation to complete. This typically
    happens during a full table scan or fast full index scan.

db file sequential read
    The session has issued an I/O request to read one block from a data file into the buffer
    cache and is waiting for the operation to complete. This typically happens during an index
    lookup or a fetch from a table by ROWID when the required data block is not already in
    memory.

direct path read, direct path write
    The session has issued asynchronous I/O requests that bypass the buffer cache and is
    waiting for them to complete. These wait events often involve temporary segments.

enqueue
    The session is waiting on an enqueue (a lock you can see in v$lock). This commonly occurs
    when one user is trying to update a row in a table that is currently being updated by
    another user.

free buffer waits
    The session needs a free buffer so it can bring a data block into the buffer cache and is
    waiting for a buffer that is not dirty to become available. This can occur if DBWR is not
    writing dirty buffers to disk fast enough.

latch free
    The session is waiting for a latch held by another session. (This event does not apply to
    processes that are spinning while waiting for a latch; when a process is spinning, it is
    not waiting.)

library cache load lock
    The session is waiting for the opportunity to load an object or a piece of an object into
    the library cache. (Only one process can load an object or a piece of an object at a time.)

library cache pin
    The session wants to pin an object in memory in the library cache for examination,
    ensuring no other processes can update the object at the same time. This happens when you
    are compiling or parsing a PL/SQL object or a view.

log buffer space
    The session is waiting for space in the log buffer. (Space becomes available only after
    LGWR has written the current contents of the log buffer to disk.) This typically happens
    when applications generate redo faster than LGWR can write it to disk.

log file parallel write
    The session is waiting for blocks to be written to all online redo log members in one
    group. LGWR is typically the only process to see this wait event. It will wait until all
    blocks have been written to all members.

log file sequential read
    The session is waiting for blocks to be read from the online redo log into memory. This
    primarily occurs at instance startup and when the ARCH process archives filled online redo
    logs.

log file switch completion
    The session is waiting for a log file switch to complete, typically so more redo can be
    generated.

log file sync
    The session is waiting for LGWR to finish flushing the log buffer to disk. This occurs
    when a user commits a transaction. (A transaction is not considered committed until all of
    the redo to recover the transaction has been successfully written to disk.)

undo segment extension
    The session is waiting for an undo segment to be extended or shrunk.

write complete waits
    The session is waiting for a requested buffer to be written to disk; the buffer cannot be
    used while it is being written.

There are several wait events that we call “idle events” because each of these wait events
typically occurs when the Oracle process has nothing to do and is waiting for somebody to give it
a task. Idle events are usually not very interesting from a tuning standpoint, so we usually
overlook them when evaluating data extracted from the wait event interface. The common idle
events are as follows:

Idle Events

    client message
    dispatcher timer
    gcs for action
    gcs remote message
    ges remote message
    i/o slave wait
    jobq slave wait
    lock manager wait for remote message
    null event
    parallel query dequeue
    pipe get
    PL/SQL lock timer
    pmon timer
    PX Deq Credit: need buffer
    PX Deq Credit: send blkd
    PX Deq: Execute Reply
    PX Deq: Execution Msg
    PX Deq: Signal ACK
    PX Deq: Table Q Normal
    PX Deque wait
    PX Idle Wait
    queue messages
    rdbms ipc message
    slave wait
    smon timer
    SQL*Net message from client
    SQL*Net message to client
    SQL*Net more data from client
    virtual circuit status
    wakeup time manager

WHAT WAIT EVENT INFORMATION DOES NOT PROVIDE

If an Oracle process has work to do but must wait for something to happen before it can
continue, then the process will be waiting on a non-idle wait event. If a process has nothing to
do, it will be waiting on an idle wait event. So what happens if a process has work to do and is
busy doing it? When a process is busy, there will be no information in the wait event interface
since the process is not waiting.

When we look at the wait event information extracted from an Oracle instance, we see detailed
information about how many times and how much time was spent waiting for specific events to
occur. But we do not see anything about the time periods in which the process requested use of
the CPU. This means that the wait event interface is not able to provide information about the
following:

• Time spent using the CPU


• Time spent waiting for a CPU to become available

• Time spent waiting for requested memory to be swapped back in to physical memory

This is important to keep in mind because there is an easy trap to fall into. You could be
troubleshooting a very slow SELECT statement and learn from the wait event interface that the
process does not have significant wait events when running the query. This could lead you to
think that the statement is optimal, and that it just takes a long time to run. In fact, however,
the query might be performing huge numbers of logical reads and the number of buffer gets
could be reduced by rewriting the query.

When Oracle needs to access a data block and the block is already in the buffer cache, a logical
read occurs with no physical read. The process is able to read the block without the occurrence
of any wait events. Large amounts of CPU time could be consumed on significant numbers of
logical reads, and the wait event interface will have nothing to show for the elapsed time.
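
One way to spot such statements is to look at buffer gets directly. Here is a minimal sketch
against v$sql (the one-million-get threshold is an arbitrary cutoff, not a recommendation):

SELECT buffer_gets, executions,
       ROUND (buffer_gets / GREATEST (executions, 1)) gets_per_exec, sql_text
FROM v$sql
WHERE buffer_gets > 1000000
ORDER BY buffer_gets DESC;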

Statement parsing and spinning while waiting for a latch to become available are two other
examples of activities not accounted for by the wait event interface. An Oracle process uses CPU
time while parsing a statement or spinning on a busy latch; since no waiting is involved, the wait
event interface does not have anything to report.

Oracle Corporation has done something interesting with Statspack recently. Beginning in Oracle
9i, the portion of the Statspack report that lists wait event information also lists CPU usage. This
is very helpful information. It allows us to easily compare time spent waiting to time spent
processing, so we know where to focus our tuning efforts. However, it should be pointed out that
the CPU usage information in this section of the Statspack report does not come from the wait
event interface. Instead, Statspack merges data collected from the wait event interface with CPU
usage information collected from the v$sysstat dynamic performance view.
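
You can look at the same underlying figure yourself. For example, the instance-wide CPU number
comes from the v$sysstat view (despite its name, in v$sysstat this statistic accumulates CPU
usage across all sessions):

SELECT name, value
FROM v$sysstat
WHERE name = 'CPU used by this session';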

A NOTE ABOUT THE TIMED_STATISTICS PARAMETER

The Oracle kernel is capable of timing many activities including wait events. In Oracle 8i and
earlier, timed statistics can be tracked to a resolution of one centisecond (0.01 second) but are
disabled by default. Beginning in Oracle 9i Release 2 timed statistics are collected by default,
and some timings are collected in units of microseconds. The timed_statistics instance
parameter is used to enable and disable timed statistics collection. When timed_statistics is set
to FALSE, all times in the wait event interface will appear as zero. You may enable timed
statistics collection at the instance or the session level with the following commands:

• ALTER SYSTEM SET timed_statistics = TRUE;

• ALTER SESSION SET timed_statistics = TRUE;

You may enable timed statistics at the instance level on the next and all subsequent instance
restarts by adding the following line to the instance parameter file:

timed_statistics = true
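
You can verify the current setting at any time from SQL*Plus:

SHOW PARAMETER timed_statistics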

In practice, the overhead of collecting timed statistics is extremely small. In most cases, the
benefit you'll get from having timing information at your disposal will outweigh the performance
overhead. For several years, many DBAs have been running their production systems with timed
statistics enabled at all times. With Oracle 9i Release 2, this is now the default behavior.

COLLECTING WAIT EVENT INFORMATION

The wait event interface consists of four dynamic performance views (also known as “v$ views”)
and a tracing facility. Any user with the SELECT ANY TABLE system privilege or the
SELECT_CATALOG_ROLE role can query the v$ views. Only users who can connect to the
database as SYSDBA or execute the DBMS_SUPPORT or DBMS_SYSTEM packages can activate
wait event tracing in other database sessions. In a typical environment, DBAs have access to the
wait event interface but general users do not.
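
For example, a DBA might give a monitoring account read access to the v$ views with a grant
along these lines (the user name here is hypothetical):

GRANT SELECT_CATALOG_ROLE TO perf_monitor;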

The v$system_event and v$session_event views provide cumulative wait event information for
the instance as a whole and for each process, respectively. The v$session_wait view provides
detailed information about the active or most recent wait event for each process. The contents
of the v$event_name view, meanwhile, do not change. This view lists all wait events built into
the Oracle kernel and the parameters for each one.

In addition to the four dynamic performance views, Oracle provides a tracing facility where
extremely detailed wait event information is written to a trace file for later evaluation. In more
recent releases of Oracle, a PL/SQL package is provided for activating the wait event tracing
facility. In older releases of Oracle, the wait event tracing facility is activated by setting debug
event number 10046 (not to be confused with a wait event).

THE V$SYSTEM_EVENT VIEW

The v$system_event view shows one row for each wait event name, along with the total number
of times a process has waited for this event, the number of timeouts, the total amount of time
waited, and the average wait time. All of these figures are cumulative for all Oracle processes
since the instance started. Wait events that have not occurred at least once since instance
startup do not appear in this view.

The v$system_event view has the following columns:

Name                            Null?    Type
------------------------------- -------- ----
EVENT                                    VARCHAR2(64)
TOTAL_WAITS                              NUMBER
TOTAL_TIMEOUTS                           NUMBER
TIME_WAITED                              NUMBER
AVERAGE_WAIT                             NUMBER
TIME_WAITED_MICRO                        NUMBER

EVENT is the name of the wait event. You can see a list of all wait event names known to your
Oracle kernel with the following query:

SELECT name FROM v$event_name;

TOTAL_WAITS is the total number of times a process has waited for this event since instance
startup. This includes daemon processes like PMON and SMON, in addition to dedicated server
and shared server processes. It also includes processes from database sessions that have
subsequently ended.

TOTAL_TIMEOUTS is the total number of times a process encountered a timeout while waiting for
an event. When a process begins to wait for an event, it specifies a timeout period after which
the operating system should wake it up if the event has not yet transpired. For example, when
an Oracle process issues an I/O request to read a block from a data file (the db file sequential
read wait event), the process sets a timeout of one second. Usually the I/O request will complete
in less than one second and no timeout will occur. But if the read should take longer than one
second for whatever reason, a timeout will occur and the process will wake up. The process
might do some minor housekeeping, but it will likely just begin another timeout period of one
second and continue waiting for the same event.
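
To see which events time out most often relative to their total waits, you could run a query
like this sketch (the 100-wait floor is an arbitrary filter to skip rarely occurring events):

SELECT event, total_waits, total_timeouts,
       ROUND (100 * total_timeouts / total_waits, 1) timeout_pct
FROM v$system_event
WHERE total_waits >= 100
ORDER BY timeout_pct DESC;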

TIME_WAITED and AVERAGE_WAIT show the cumulative and average time spent by processes
waiting for this event, in centiseconds. Divide these figures by 100 in order to get the wait time
in seconds. These two columns will show as zero if timed statistics are not enabled.

TIME_WAITED_MICRO is the same as TIME_WAITED, except that the time is in microseconds.
Divide this figure by 1,000,000 in order to get the wait time in seconds. Note that this column
is not present in Oracle 8i and earlier.

Consider the following query from v$system_event:

SQL> SELECT event, time_waited
  2  FROM v$system_event
  3  WHERE event IN ('SQL*Net message from client', 'smon timer',
  4                  'db file sequential read', 'log file parallel write');

EVENT                                                  TIME_WAITED
------------------------------------------------------ -----------
log file parallel write                                     159692
db file sequential read                                      28657
smon timer                                                130673837
SQL*Net message from client                                16528989

SQL>

Since instance startup, processes on this system have waited a total of 286.57 seconds while
reading single data file blocks from disk, and over 1,596 seconds (26 minutes) while writing redo
to the online redo logs. A huge amount of time has been spent waiting on the smon timer and
SQL*Net message from client events, but these are both idle wait events so we don't worry
about them. (The SMON process spends a lot of time sleeping between consecutive checks of
the system, and many dedicated server processes spend a lot of their time waiting for the
application to submit a SQL statement for processing.)

THE V$SESSION_EVENT VIEW

The v$session_event view is a lot like the v$system_event view, except that it shows separate
rows of information for each Oracle process. As with v$system_event, event names do not
appear in this view if the process has not waited for them at least once. Also, when an Oracle
process terminates (as in the case of when a user logs off the database) all of the rows in
v$session_event for that process permanently disappear.

The v$session_event view has the following columns:

Name                            Null?    Type
------------------------------- -------- ----
SID                                      NUMBER
EVENT                                    VARCHAR2(64)
TOTAL_WAITS                              NUMBER
TOTAL_TIMEOUTS                           NUMBER
TIME_WAITED                              NUMBER
AVERAGE_WAIT                             NUMBER
MAX_WAIT                                 NUMBER
TIME_WAITED_MICRO                        NUMBER

SID indicates the session ID of the process waiting for the event. You can query v$session in
order to determine the SID of the session whose wait events you want to investigate. The next
five columns in the v$session_event view are the same as in the v$system_event view, except
that now they pertain to the one specific process instead of all processes.

MAX_WAIT indicates the maximum amount of time the process had to wait for the event. Like
TIME_WAITED and AVERAGE_WAIT, the unit of measure is centiseconds and will display as zero if
timed statistics are not enabled.

As in v$system_event, TIME_WAITED_MICRO is the same as TIME_WAITED except that the time is
in microseconds. Again, this column is not present in Oracle 8i and earlier.

Consider the following query which displays all wait event information for the current SQL*Plus
session:

SQL> SELECT event, total_waits, time_waited_micro, max_wait
  2  FROM v$session_event
  3  WHERE sid =
  4        (SELECT sid FROM v$session
  5         WHERE audsid = USERENV ('sessionid') );

EVENT                          TOTAL_WAITS TIME_WAITED_MICRO   MAX_WAIT
------------------------------ ----------- ----------------- ----------
log file sync                            1             19629          2
db file sequential read                  3             26128          1
db file scattered read                 678           1154632          8
SQL*Net message to client               76               352          0
SQL*Net more data to client              1                63          0
SQL*Net message from client             75        1118656279      26930
SQL*Net break/reset to client            4             22281          2

SQL>

You can see that the Oracle process serving this session has spent 1.180760 seconds waiting for
disk I/O. Although there have been 76 instances where the process waited for the networking
software to allow it to send a message to the client (the SQL*Plus program), each of these waits
was shorter than the 0.01 second resolution of the MAX_WAIT column in v$session_event. (All 76
waits combined added up to only 0.000352 seconds.) Far and away, the Oracle process has
spent the vast majority of its time (over 18 minutes) waiting for a SQL statement to be entered
at the SQL*Plus prompt.

If you run the query against v$session_event shown above and get back no rows, it is possible
that you are encountering Oracle bug number 2429929. In some releases of Oracle 9i, including
9.2.0.1, the SID column in v$session_event actually shows the SID minus one instead of the SID.
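
If you appear to be affected by this bug, a simple workaround is to subtract one from the SID
in the subquery; a sketch based on the query above:

SELECT event, total_waits, time_waited_micro, max_wait
FROM v$session_event
WHERE sid = (SELECT sid - 1 FROM v$session
             WHERE audsid = USERENV ('sessionid') );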

THE V$EVENT_NAME VIEW

The v$event_name view shows reference information rather than up-to-the-moment information
about instance performance. This view shows one row for each wait event known to the Oracle
kernel. Associated with each wait event are up to three parameters—additional pieces of
information that provide more detail about a wait situation.

The v$event_name view has the following columns:

Name                            Null?    Type
------------------------------- -------- ----
EVENT#                                   NUMBER
NAME                                     VARCHAR2(64)
PARAMETER1                               VARCHAR2(64)
PARAMETER2                               VARCHAR2(64)
PARAMETER3                               VARCHAR2(64)

EVENT# indicates a unique number assigned to each wait event name. These numbers
frequently change from one release of Oracle to the next, so it is better to reference wait events
by their names and not their numbers. Wait event names appear to stay consistent across
Oracle releases.

The v$system_event and v$session_event views show cumulative information about past waits
in summary form, leaving out parameter information from each individual wait. As we will see in
the next sections, wait event parameters come into play in the v$session_wait view and wait
event trace files.

THE V$SESSION_WAIT VIEW

The v$session_wait view shows one row for each Oracle process. The row indicates the name of
the wait event and additional parameters that provide further information about exactly what
the process is waiting for (or information about the most recent wait event if the process is not
currently waiting). While the v$system_event and v$session_event views show cumulative wait
event information, the v$session_wait view shows information as of the present moment only.

The v$session_wait view has the following columns:

Name                            Null?    Type
------------------------------- -------- ----
SID                                      NUMBER
SEQ#                                     NUMBER
EVENT                                    VARCHAR2(64)
P1TEXT                                   VARCHAR2(64)
P1                                       NUMBER
P1RAW                                    RAW(8)
P2TEXT                                   VARCHAR2(64)
P2                                       NUMBER
P2RAW                                    RAW(8)
P3TEXT                                   VARCHAR2(64)
P3                                       NUMBER
P3RAW                                    RAW(8)
WAIT_TIME                                NUMBER
SECONDS_IN_WAIT                          NUMBER
STATE                                    VARCHAR2(19)

SID indicates the process. SEQ# is a sequentially increasing number that starts at one for each
process and increments each time the process begins a new wait.

The STATE column indicates how we should interpret the data in this row of the view. If the value
in the STATE column is WAITING, then the process is currently waiting for an event. In this case,
we can see information about the event and how long the process has been waiting so far.
Otherwise, the process is currently not waiting, but we can see information about the last event
that the process waited for.

EVENT is the name of a wait event. P1TEXT is the name of a parameter for the wait event, P1 is
the value of the parameter, and P1RAW is the value in binary form. The P2 and P3 columns
provide additional parameter information.

When the value in the STATE column is WAITING, the value in the WAIT_TIME column will be zero
and SECONDS_IN_WAIT will show the number of seconds the process has been waiting for the
event thus far. Note that SECONDS_IN_WAIT shows the time in seconds, not centiseconds or
microseconds.

When the value in the STATE column is WAITED KNOWN TIME, WAIT_TIME will show the length of
the last wait (in centiseconds) and SECONDS_IN_WAIT will not be relevant. (It appears to be the
number of seconds since the last wait began, but this is not clear.) The STATE could also be
WAITED UNKNOWN TIME or WAITED SHORT TIME, the latter indicating that the last wait was less
than one centisecond in duration.
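
Putting the STATE column to work, a query along these lines lists sessions that are waiting
right now, longest waits first (the idle-event filter shown is abbreviated; extend it using the
idle events listed earlier):

SELECT sid, event, seconds_in_wait
FROM v$session_wait
WHERE state = 'WAITING'
AND event NOT LIKE 'SQL*Net%'
AND event NOT IN ('pmon timer', 'smon timer', 'rdbms ipc message')
ORDER BY seconds_in_wait DESC;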

The following query shows the parameters associated with the db file scattered read wait event:

SQL> SELECT *
2 FROM v$event_name
3 WHERE name = 'db file scattered read';

EVENT# NAME
---------- ----------------------------------------------------------------
PARAMETER1 PARAMETER2 PARAMETER3
------------------------- ------------------------- -------------------------
188 db file scattered read
file# block# blocks

SQL>

This tells us that when a process is waiting for a multi-block read from disk to complete (as in
the case of a full table scan where the data blocks were not already in the buffer cache), we can
see the file from which the blocks are being read as well as the starting block and number of
blocks.

The following query was run while a session was performing a full table scan:

SQL> SELECT * FROM v$session_wait WHERE sid = 16;

       SID       SEQ# EVENT
---------- ---------- ------------------------------
P1TEXT                                 P1 P1RAW
------------------------------ ---------- --------
P2TEXT                                 P2 P2RAW
------------------------------ ---------- --------
P3TEXT                                 P3 P3RAW     WAIT_TIME SECONDS_IN_WAIT
------------------------------ ---------- -------- ---------- ---------------
STATE
-------------------
        16        303 db file scattered read
file#                                  17 00000011
block#                               2721 00000AA1
blocks                                  8 00000008         -1               0
WAITED SHORT TIME

SQL>

You can see that the process was not waiting at the moment this query was run, but its last wait
had been for an I/O request to read eight blocks from file 17 starting at block 2721. That I/O
request had completed in less than 0.01 second. (Note the -1 in the WAIT_TIME column when the
STATE is WAITED SHORT TIME.) Why did Oracle choose to read eight blocks at a time? Because
the db_file_multiblock_read_count instance parameter was set to eight.
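
You can confirm the setting on your own instance with a quick query:

SELECT value FROM v$parameter WHERE name = 'db_file_multiblock_read_count';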

TRACING WAIT EVENT ACTIVITY

DBAs should be familiar with the SQL trace facility built into Oracle. By using the commands
below, you can enable SQL trace for your session or another session:

• ALTER SESSION SET sql_trace = TRUE;

• EXECUTE SYS.dbms_system.set_sql_trace_in_session (sid, serial#, TRUE);

When SQL trace is enabled for a session, the Oracle process writes detailed trace information
(including timing data if timed statistics are enabled) to a trace file in a directory specified by
the user_dump_dest instance parameter. These trace files are plain text files and human
readable, but rather tedious and repetitive. You can optionally run trace files through a
processor such as TKPROF instead of looking at them directly.

Oracle has the ability to direct a process to include additional information in the trace file,
including wait event information. In earlier releases of Oracle, we activated the wait event
tracing facility by setting debug event 10046. Debug event 10046 still works in Oracle 9i, but
recent releases of Oracle include a built-in PL/SQL package that provides a friendlier means of
activating wait event tracing.

Setting debug events allows a DBA to instruct an Oracle instance to take on a special, atypical
behavior. Debug events can be used, for example, to cause Oracle to write a system level dump
file whenever an ORA-600 error occurs or to skip over corrupted blocks in a table when
performing a full table scan. Most debug events should never be set unless you are directed to
do so by Oracle Support. Some debug events can put your database at risk.

Debug event 10046 affects the amount of information written to trace files. It is a very safe
debug event, and one of the few that you are allowed to set without special permission from
Oracle Support. Debug event 10046 can be set to the following values:

Debug Event Setting                            Effect
10046 trace name context forever, level 1      Enables ordinary SQL trace
10046 trace name context forever, level 4      Enables SQL trace with bind variable values
                                               included in trace file
10046 trace name context forever, level 8      Enables SQL trace with wait event information
                                               included in trace file
10046 trace name context forever, level 12     Equivalent of level 4 and level 8 together
10046 trace name context off                   Turns off tracing

You can set the 10046 debug event to trace your session and collect wait event information in
the trace file with either of the following commands:

• EXECUTE SYS.DBMS_SUPPORT.START_TRACE

• ALTER SESSION SET events '10046 trace name context forever, level 8';

You can set the 10046 debug event to trace another session on the database with any of the
following commands:

• EXECUTE SYS.DBMS_SUPPORT.START_TRACE_IN_SESSION (sid, serial#)

• oradebug setorapid [Oracle PID from v$process]
  oradebug session_event 10046 trace name context forever, level 8

• EXECUTE SYS.DBMS_SYSTEM.SET_EV (sid, serial#, 10046, 8, '')

Calling DBMS_SUPPORT as shown here is equivalent to activating debug event 10046 at level 8.
You may include optional additional parameters in the procedure call to activate level 4 or 12.
The DBMS_SUPPORT package is not installed in the database by default. You need to run the
dbmssupp.sql script found in the rdbms/admin directory as SYS before you can call
DBMS_SUPPORT. This script was left out of many releases of Oracle 8i by mistake, but it is
available separately as a patch.
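
As a rough sketch, installing the package as SYS and starting a trace in another session with
both waits and binds captured might look like the following (the waits and binds parameter
names reflect our understanding of the DBMS_SUPPORT interface; check the package specification
in your release):

@?/rdbms/admin/dbmssupp.sql
EXECUTE SYS.DBMS_SUPPORT.START_TRACE_IN_SESSION (sid, serial#, waits => TRUE, binds => TRUE)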

It should also be pointed out that the SET_EV procedure in the DBMS_SYSTEM package is not
officially supported by Oracle Corporation—they prefer that you use DBMS_SUPPORT instead.
The reason is that DBMS_SYSTEM.SET_EV allows you to set any debug event in any session. As
mentioned above some debug events can potentially be dangerous, and DBMS_SYSTEM.SET_EV
lets you set such events in any session.

Unlike timed statistics, tracing consumes a significant amount of system resources. Therefore it
is important to use the tracing facility sparingly. Trace only the sessions you need to trace, and
only for as long as you need. Turn off tracing as soon as it is no longer needed with any of the
following commands:

• EXECUTE SYS.DBMS_SUPPORT.STOP_TRACE

• ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

• ALTER SESSION SET sql_trace = FALSE;

• EXECUTE SYS.DBMS_SUPPORT.STOP_TRACE_IN_SESSION (sid, serial#)

• oradebug setorapid [Oracle PID from v$process]
  oradebug session_event 10046 trace name context forever, level 0

• EXECUTE SYS.DBMS_SYSTEM.SET_EV (sid, serial#, 10046, 0, '')

When you set debug event 10046 to a level of 8 or 12 (or call START_TRACE or
START_TRACE_IN_SESSION in DBMS_SUPPORT), the Oracle process will write a line into the trace
file every time it finishes waiting for an event. The line in the trace file will contain almost the
same information that would have appeared in the v$session_wait view, but perhaps in a slightly
less friendly format. You can also see in the trace file which cursor (and therefore which SQL
statement) the wait event was associated with.

Here is an excerpt from a trace file generated by setting debug event 10046 to level 12 on an
Oracle 8i database:

=====================
PARSING IN CURSOR #1 len=80 dep=0 uid=502 oct=3 lid=502
tim=2293771931 hv=2293373707 ad='511dca20'
SELECT /*+ FULL */ SUM (LENGTH(notes))
FROM customer_calls
WHERE status = :x
END OF STMT
PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=0,tim=2293771931
BINDS #1:
bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0
size=24 offset=0
bfp=09717724 bln=22 avl=02 flg=05
value=43
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=2293771931
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8
WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1665 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=1673 p3=8
WAIT #1: nam='db file scattered read' ela= 0 p1=17 p2=1681 p3=8
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1761 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1769 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1777 p3=8
WAIT #1: nam='db file scattered read' ela= 0 p1=17 p2=1785 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=1841 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1849 p3=8

The “ela=” figures indicate the duration of the wait. In trace files generated by Oracle 8i and
earlier, the elapsed time is shown in centiseconds. Beginning in Oracle 9i, the elapsed time is
shown in microseconds.

You can see that when the session executed the query, there was a wait shorter than one
centisecond for a message to be sent to the client, followed by a bunch of waits for I/O requests
against file 17. Most of the I/O requests were multi-block reads, reading eight blocks at a time
while performing a full table scan of the CUSTOMER_CALLS table.

You can probably imagine how large and overwhelming trace files can get when anything but a
trivial application is traced. This is why Oracle provides the TKPROF utility. TKPROF takes a trace
file as input and generates a nicely formatted, easy to read report. Unfortunately, the TKPROF
that comes with Oracle 8i and earlier releases ignores wait event information written in the trace
file. To analyze the wait event information collected by a trace in an Oracle 8i or earlier
environment, you will need to either pore through the trace file manually or write your own trace
file parser and formatter. (We’ve heard of people writing perl scripts to do this, but we have
never come across them.)

The TKPROF utility that comes with Oracle 9i is capable of reporting wait event information
summarized by distinct statement. By default Oracle 9i TKPROF ignores wait event information
in the trace file as earlier releases did. However, if the “waits=yes” command line argument is
provided, a separate table of wait event statistics will appear in the report for each distinct
statement. Here is a sample excerpt from a TKPROF report showing wait event information:

********************************************************************************
SELECT A.customer_id, A.customer_name, COUNT (*) purchases,
MAX (B.transaction_date) last_purchase_date
FROM customers A, purchase_history B
WHERE B.customer_id = A.customer_id
GROUP BY A.customer_id, A.customer_name
ORDER BY A.customer_id

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.06       0.12          0         27          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch     1461     21.41      25.80       7801       5905          5      21893
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total     1463     21.47      25.93       7801       5932          5      21893

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 17

   Rows  Row Source Operation
-------  ---------------------------------------------------
  21893  SORT GROUP BY
1525494   HASH JOIN
  31212    TABLE ACCESS FULL CUSTOMERS
1525494    INDEX FAST FULL SCAN PURCHASE_HISTORY_PK (object id 7824)

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    1461        0.00          0.00
  db file sequential read                         4        0.00          0.01
  db file scattered read                        750        0.07          2.25
  direct path write                             271        0.83          2.79
  direct path read                              254        0.00          0.01
  SQL*Net message from client                  1461        1.28         22.40
********************************************************************************

You can see that the query was parsed and executed once, and there were 1461 fetches to read
21893 rows. The Oracle process used 21.47 CPU seconds and 25.93 seconds of elapsed time to
parse, execute, and fetch the query. The elapsed time was greater than the CPU time because
the Oracle process had to wait on several wait events. The listing at the bottom of the report
shows the process waited on network roundtrips to the client and reads and writes to disk.

All of these waits seem to make sense: There was one network roundtrip to the client for each
fetch call, and it took SQL*Plus a long time (over 22 seconds) to display the 21,893 rows of data
on my display. The db file sequential reads and db file scattered reads resulted from the table
access and index fast full scan. The direct path writes and reads resulted from the hash join and
sort operations. If you add the CPU time of 21.47 seconds and the wait times (5.06 seconds not
including the SQL*Net message from client waits) you come up with an elapsed time of 26.53
seconds. However, the report shows the total elapsed time as 25.93 seconds. This is a good
example of the round-off error that is not unusual when thousands of very short time intervals
are added together.
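
For reference, a TKPROF command of the general form below produces a report like the excerpt
above (the file names here are hypothetical, and the sort option is optional):

tkprof prod_ora_12345.trc report.prf waits=yes sort=exeela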

In some situations you may not be able to identify exactly which session to trace. This happens
frequently when your database is supporting a web application that has frequent short
connections, or an application server tier that maintains a pool of database connections all
logged on as the same Oracle user. You may have a specific query you want to collect wait event
information for, but you may not know which session will be executing the query. One way to
deal with this situation is to trace all of the sessions being started by your application for a brief
period. This can be done easily with the following SQL*Plus script:

SPOOL traceall.sql
SET HEADING OFF FEEDBACK OFF

SELECT 'EXECUTE SYS.dbms_system.set_ev (' || TO_CHAR (sid) ||
       ', ' || TO_CHAR (serial#) || ', 10046, 8, '''')'
FROM v$session
WHERE username = 'WEB_USER';

SPOOL OFF
SET FEEDBACK ON

@traceall.sql

This script will start wait event tracing on all sessions connected to the database as the
WEB_USER user. To stop tracing simply change the 8 after the 10046 to a 0 and run the script
again. You should think about how many sessions this script will trace and how that will affect
server load before running on a production system.

After you’ve traced a number of sessions, you can scan the trace files for occurrences of a
specific query with a command like the following:

grep -n 1234567890 *.trc

Replace 1234567890 with the hash value or address of the statement you are interested in. (You
can get these values by querying v$sql.) The output of the above grep command will tell you
which trace files the query appears in and on which lines. You can then go to that point in the
trace files and get detailed wait event information.

Trace files are easiest to work with when you use a dedicated server connection to the database.
If you connect to the database using Oracle’s shared server architecture then different SQL
statements may be executed by different server processes. Each server process writes to its own
trace file. Thus the trace information for the session can be spread over several trace files.

HOW WAIT EVENT INFORMATION CAN BE USED TO BOOST SYSTEM PERFORMANCE

In the examples below, we will apply the techniques described above for collecting wait event
information, and we will evaluate the data in order to better understand how the system is
functioning and determine what we can do to boost system performance. Hopefully these
examples will give you insights into researching performance and other types of problems on
your systems.

EXAMPLE #1: BUFFER BUSY WAITS

A magazine publisher had developed a web site that offered web access to many of the articles
in the magazine. At times, the website would get completely bogged down and response time
would become very poor. Operating system statistics on the database server indicated that
during busy times, server idle time would be near zero. We used Statspack to gather database
statistics on an hourly basis. Excerpts from one Statspack report are as follows:

            Snap Id     Snap Time      Sessions
            ------- ------------------ --------
Begin Snap:      61 11-Dec-02 13:00:52      145
  End Snap:      71 11-Dec-02 14:00:26      145
   Elapsed:               59.57 (mins)

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 38400 log_buffer: 1048576
db_block_size: 8192 shared_pool_size: 80000000

Load Profile
~~~~~~~~~~~~                    Per Second   Per Transaction
                                ----------   ---------------
Redo size:                        6,390.41          5,731.32
Logical reads:                   33,962.40         30,459.63
Block changes:                       44.72             40.11
Physical reads:                     536.77            481.41
Physical writes:                    163.02            146.21
User calls:                         397.75            356.73
Parses:                             237.64            213.13
Hard parses:                         23.84             21.38
Sorts:                              248.38            222.77
Logons:                               0.41              0.36
Executes:                           213.52            191.50
Transactions:                         1.11

% Blocks changed per Read:    0.13    Recursive Call %:   86.07
Rollback per transaction %:   0.23    Rows per Sort:      30.73

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %:   98.39    Redo NoWait %:      99.97
Buffer Hit %:      98.42    In-memory Sort %:  100.00

Top 5 Wait Events
~~~~~~~~~~~~~~~~~
                                                           Wait      % Total
Event                                          Waits      Time (cs)  Wt Time
-------------------------------------------- ------------ ------------ -------
buffer busy waits                                1,962,372    1,278,649   50.03
db file sequential read                          1,336,870    1,050,878   41.12
db file scattered read                              47,717       49,326    1.93
direct path write                                    8,070       40,574    1.59
latch free                                          38,220       31,012    1.21

Buffer Pool Statistics for DB:
-> Pools D: default pool, K: keep pool, R: recycle pool
Free Write Buffer
Buffer Consistent Physical Physical Buffer Complete Busy
P Gets Gets Reads Writes Waits Waits Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D 12,292 49,863,696 470 3,530 0 0 32
K 1,509,385 20,691,972 1,509,366 11 0 0 1,962,117
R 5 13 0 0 0 0 0

We noticed that the two dominant wait events were buffer busy waits and db file sequential
read. These two wait events accounted for over 23,000 seconds of wait time during the one
hour sample period. We decided to drill down to learn more about these wait events. Looking at
v$event_name, we see that the parameters for the buffer busy waits event are as follows:

SQL> SELECT * FROM v$event_name WHERE name = 'buffer busy waits';

    EVENT# NAME                 PARAMETER1      PARAMETER2      PARAMETER3
---------- -------------------- --------------- --------------- ---------------
        75 buffer busy waits    file#           block#          id

SQL>

For this wait event, the parameter information from v$event_name points you in the right
direction but does not necessarily give you the full story. The first and second parameters tell
you the file and block number of the data block in the database that the session was trying to
access when the wait event occurred. The third parameter is a numeric reason code indicating
why the buffer in the buffer cache was busy and the process had to wait. Metalink bulletin
34405.1 explains what the different reason codes mean.

The following query will list all sessions waiting for a given wait event at the split second the
query is begun:

SELECT sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
FROM v$session_wait
WHERE event = 'buffer busy waits'
ORDER BY sid;

Because only a couple (if any) sessions are likely to be encountering this wait at any one
moment in time, we usually need to run this query several times in quick succession. One of the
times, we got the following output:

SQL> /

 SID EVENT             STATE SECONDS_IN_WAIT  WAIT_TIME     P1     P2     P3
---- ----------------- ----- --------------- ---------- ------ ------ ------
  12 buffer busy waits WAITE               1          0     30  62157    130
  31 buffer busy waits WAITE               1          0     30  23558    130

SQL>

We performed the following query for each session wait found, substituting P1 for the file
number and P2 for the block number:

SELECT owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = &absolute_file_number
AND &block_number BETWEEN block_id AND block_id + blocks - 1;

Using this query we found that session 12 above was waiting on a busy buffer in the SAMPLES
table in the PRODMGR schema:

SQL> /
Enter value for absolute_file_number: 30
Enter value for block_number: 62157

OWNER             SEGMENT_NAME        SEGMENT_TYPE
----------------- ------------------- ------------
PRODMGR           SAMPLES             TABLE

SQL>

In order to figure out what reason code 130 means, we look at a reference table listed in
Metalink bulletin 34405.1. For convenience this table appears below. We can see that the buffer
is busy because the block is being read by another session.

<=8.0.6      >=8.1.6   Reason
-----------  --------  -----------------------------------------------------------------
0            0         A block is being read.
1003         100       We want to NEW the block but the block is currently being read by
                       another session (most likely for undo).
1007         200       We want to NEW the block but someone else is using the current
                       copy so we have to wait for them to finish.
1010         230       Trying to get a buffer in CR/CRX mode, but a modification has
                       started on the buffer that has not yet been completed.
1012         -         A modification is happening on a SCUR or XCUR buffer, but has
                       not yet completed.
1012         231       CR/CRX scan found the CURRENT block, but a modification has
(duplicate)            started on the buffer that has not yet been completed.
1013         130       Block is being read by another session and no other suitable
                       block image was found, so we wait until the read is completed.
                       This may also occur after a buffer cache assumed deadlock. The
                       kernel can't get a buffer in a certain amount of time and
                       assumes a deadlock. Therefore it will read the CR version of the
                       block.
1014         110       We want the CURRENT block either shared or exclusive but the
                       block is being read into cache by another session, so we have to
                       wait until their read() is completed.
1014         120       We want to get the block in current mode but someone else is
(duplicate)            currently reading it into the cache. Wait for them to complete
                       the read. This occurs during buffer lookup.
1016         210       The session wants the block in SCUR or XCUR mode. If this is a
                       buffer exchange or the session is in discrete TX mode, the
                       session waits for the first time and the second time escalates
                       the block as a deadlock and so does not show up as waiting very
                       long. In this case the statistic "exchange deadlocks" is
                       incremented and we yield the CPU for the "buffer deadlock" wait
                       event.
1016         220       During buffer lookup for a CURRENT copy of a buffer we have
(duplicate)            found the buffer but someone holds it in an incompatible mode so
                       we have to wait.

Next we looked for sessions waiting on the db file sequential read event—sessions waiting on
index lookups or table accesses by ROWID:

SELECT sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
FROM v$session_wait
WHERE event = 'db file sequential read'
ORDER BY sid;

Following the same approach as used with the buffer busy waits, we then drilled down to find
which segments in the database sessions were accessing with the single block I/Os. In this case
it was again the SAMPLES table.

From this we concluded that the buffer busy waits appeared related to the single block disk
reads—buffer block reads were waiting for other buffer block reads to complete, and those in
turn were waiting for sequential reads from disk to complete. Put another way, session A was
waiting on a single block disk I/O to bring a block into the buffer cache. Meanwhile, session B
wanted to read the same block. Since the block was not yet in the buffer cache and another
session was already reading it from disk, session B waited on a buffer busy waits event while
session A waited on the db file sequential read event.

When we looked at the SAMPLES table further, we found that it contained a LONG column and
there was a fair amount of row chaining—some rows were larger than the database block size.
Many queries against multiple tables were doing nested loops joins with the SAMPLES table. This
was causing lots of single block reads from the SAMPLES table.

The values stored in the LONG column of the SAMPLES table were quite big. One of the many
weaknesses of the LONG data type is that data is stored inline in the table segment regardless of
size. This led to a very large SAMPLES table segment even though the table did not have a large
number of rows. While most queries against the SAMPLES table were not fetching the LONG
column, Oracle still had to bring the LONG data into the buffer cache because it was stored in
the same blocks as the other columns. This led to a lot more physical I/O and churn in the buffer
cache than should have been necessary.

One way to reduce buffer cache churn and physical I/O would have been to replace the LONG
column with a CLOB column. CLOB data can be stored in a separate segment from the rest of
the table data. Thus the SAMPLES table segment would be a lot smaller and more likely to fit
entirely in the buffer cache. This would reduce the activity rate in the buffer cache and the
amount of physical I/O.
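
In Oracle 8i and later the conversion can even be done in place with a single statement; a
sketch, using a hypothetical column name since the actual LONG column of the SAMPLES table is
not named in this paper:

ALTER TABLE prodmgr.samples MODIFY (sample_data CLOB);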

Unfortunately, changing the database schema and the application code to replace the LONG
data type with CLOB was not possible in a quick enough timeframe to address the immediate
performance problem. So we had to develop a quick fix that could be deployed temporarily until
the data type could be changed. First we enlarged the KEEP buffer pool, reducing physical reads
as all of the SAMPLES table could stay in the buffer cache. Next, we created an index on all of
the columns in the SAMPLES table except the LONG column. This allowed Oracle to satisfy most
queries against the SAMPLES table without having to read the excessively large table segment,
which greatly reduced the amount of data accessed during each query.
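
Statements along these lines would implement the two quick fixes (the column names are
hypothetical; the KEEP pool itself is sized with the buffer_pool_keep instance parameter):

ALTER TABLE prodmgr.samples STORAGE (BUFFER_POOL KEEP);

CREATE INDEX prodmgr.samples_ix
ON prodmgr.samples (sample_id, sample_type, status);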

These two easy and quick changes virtually eliminated the buffer busy waits and reduced db file
sequential read wait events as total physical reads dropped 90%. Response time for the most
common queries involving the SAMPLES table dramatically improved, and overall web site
response times returned to an acceptable level. Excerpts from a Statspack report after the two
changes were made are as follows:

            Snap Id     Snap Time      Sessions
            ------- ------------------ --------
Begin Snap:    1192 20-Dec-02 13:00:49      102
  End Snap:    1202 20-Dec-02 14:00:18      102
   Elapsed:               59.48 (mins)

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 38400 log_buffer: 1048576
db_block_size: 8192 shared_pool_size: 80000000

Load Profile
~~~~~~~~~~~~                    Per Second   Per Transaction
                                ----------   ---------------
Redo size:                        2,496.24          3,453.13
Logical reads:                   30,270.86         41,874.69
Block changes:                       17.08             23.62
Physical reads:                      36.90             51.04
Physical writes:                     79.62            110.14
User calls:                         313.37            433.50
Parses:                             173.91            240.57
Hard parses:                         17.04             23.57
Sorts:                              169.33            234.25
Logons:                               0.36              0.50
Executes:                           149.11            206.27
Transactions:                         0.72

% Blocks changed per Read:    0.06    Recursive Call %:   84.55
Rollback per transaction %:   0.31    Rows per Sort:      62.86

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %:  100.00    Redo NoWait %:      99.98
Buffer Hit %:      99.88    In-memory Sort %:   99.89

Top 5 Wait Events
~~~~~~~~~~~~~~~~~
                                                           Wait      % Total
Event                                          Waits      Time (cs)  Wt Time
-------------------------------------------- ------------ ------------ -------
direct path write                                    6,467       13,545   30.61
log file sync                                        4,914        7,493   16.93
library cache pin                                    1,175        6,090   13.76
direct path read                                     5,488        3,428    7.75
latch free                                          14,528        2,931    6.62

Buffer Pool Statistics for DB
-> Pools D: default pool, K: keep pool, R: recycle pool
Free Write Buffer
Buffer Consistent Physical Physical Buffer Complete Busy
P Gets Gets Reads Writes Waits Waits Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D 10,178 39,561,072 330 1,122 0 0 10
K 1 13,075,944 1 4 0 0 0
R 0 0 0 0 0 0 0

As you can see, with a similar amount of querying activity (roughly 300-400 user calls and
30,000-33,000 logical reads per second) we have reduced total wait event time to about 442
seconds during the one-hour sample period. (Direct path writes totaled 13,545 centiseconds and
represented 30.61% of total waits; 135.45/0.3061 = 442 seconds of total wait time.) This is quite
a difference from the 23,000 seconds of wait time during the sample period we were seeing
before the KEEP pool was enlarged and the extra index was added to the SAMPLES table.

Over time, application changes were made so that even fewer queries against the SAMPLES
table required retrieving the LONG column. Eventually, the LONG column was replaced with a
CLOB.

EXAMPLE #2: MORE BUFFER BUSY WAITS, PLUS LATCH CONTENTION FOR GOOD MEASURE

The magazine publisher’s application demonstrated buffer busy waits where multiple Oracle
processes want to read the same data block from disk at the same time. The first Oracle process
would issue the I/O request and wait on a db file sequential read event, while subsequent Oracle
processes would wait on a buffer busy waits event until the data block had made its way into the
buffer cache. Now we will look at an example where multiple Oracle processes try to write to
different copies of the same data block in the buffer cache at the same time. Since Oracle has
row-level locking, multiple users can update different rows in the same table at the same time
without locking each other out. Multiple users can even update different rows in the same data
block concurrently. But, as we will see in this example, contention and buffer busy waits can
sometimes occur.

A genetic research company used an Oracle database to store research data. Periodically, they
would load large volumes of raw data into the database. Then they would run a processing
application on multiple workstations. Each workstation would fetch raw data from the database,
process it, and put the processed data back in the database. The company had planned to add
more workstations to speed up the processing, but found that adding more workstations actually
seemed to reduce the throughput. The Oracle database was being used merely as a repository,
we were told. All Oracle had to do was fetch the raw data and then store the processed data.
The actual processing was being performed on the workstations outside of the realm of Oracle.
So it would seem that the Oracle database should not be the bottleneck.

We found about 100 active database sessions, each initiated by a separate workstation. We
modified the application code running on workstation 30 to activate debug event 10046 for wait
event tracing. We did this by inserting the following statement into the code just after
connecting to the database:

ALTER SESSION SET events '10046 trace name context forever, level 8';
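
On Oracle 9i, the DBMS_SUPPORT package offers an equivalent interface (the package is
installed by running dbmssupp.sql if it is not already present):

-- Equivalent of event 10046 level 8: wait events included, bind values not
EXECUTE dbms_support.start_trace(waits => TRUE, binds => FALSE);

-- ... run the workload to be traced ...

EXECUTE dbms_support.stop_trace;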

We also modified the application to exit after iterating through its main loop 500 times. We ran
the modified application on the workstation and retrieved the trace file that Oracle had created
in the directory specified by the user_dump_dest parameter. The application ran for less than
one minute before exiting. Since this was an Oracle 9i database, we were able to run TKPROF to
generate a report including wait event statistics with the following command:

tkprof prodgen_ora_16466.trc report_16466.prf waits=yes

An interesting excerpt from the TKPROF report was as follows:

UPDATE processing_stations
SET status = 'ACTIVE',
status_date = SYSDATE,
data_set_id_being_processed = :b1
WHERE station_id = 30

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 500 0.23 10.14 0 3616 1010 500
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 501 0.24 10.14 0 3616 1010 500

Optimizer goal: CHOOSE
Parsing user id: 40

Rows Row Source Operation
------- ---------------------------------------------------
500 UPDATE
500 INDEX UNIQUE SCAN PROCESSING_STATIONS_PK (object id 9372)

Elapsed times include waiting on following events:

Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
buffer busy waits 26 0.71 7.87
latch free 17 0.57 2.08
log file switch completion 3 0.09 0.20

The UPDATE statement here seemed quite trivial—changing three column values on one row,
given a primary key value. Oracle only needed to do about nine logical reads per execution, and
no physical reads were required at all. Yet it still took Oracle over 10 seconds to execute the
statement 500 times. If we had not specified the “waits=yes” TKPROF option, we might have
concluded that the database server was CPU-starved. But a quick look at the wait event portion of the
TKPROF report shows that most of the 10 seconds of elapsed execution time was spent waiting
on busy buffers and latches.

Some of the buffer busy waits listed in the trace file were as follows:

WAIT #2: nam='buffer busy waits' ela= 527727 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 498765 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 137611 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 124165 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 5237 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 264050 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 270177 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 330912 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 156317 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 710696 p1=18 p2=10 p3=220

Since this trace file was generated by an Oracle 9i database, the elapsed times are shown in
units of microseconds. The wait shown at the bottom of the list above had an elapsed time of
710,696 microseconds, or approximately 0.71 seconds. This was the longest buffer busy wait for
this statement, as indicated by the “Max. Wait” figure in the TKPROF report.

From the previous example, recall that the parameters for the buffer busy waits event indicate
file number, block number, and reason code. All of the buffer busy waits listed above involve
block 10 in data file 18, with a reason code of 220. By querying the data dictionary, we found
that block 10 in data file 18 was allocated to the PROCESSING_STATIONS table, the table being
updated by the statement that encountered the buffer busy waits. According to Metalink bulletin
34405.1, reason code 220 indicates that the buffer is busy because another session is using a
copy of the buffer in an incompatible mode.
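
The lookup that maps a file and block number to its segment is a simple query against
dba_extents (though it can take a while on a database with many extents):

SELECT owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = 18
AND 10 BETWEEN block_id AND block_id + blocks - 1;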

From the TKPROF report we can see that the application updated the PROCESSING_STATIONS
table 500 times during the period traced. Each update affected one row. While performing the
updates, there were 26 times that the process had to wait on a buffer busy waits event. From
looking at the trace file we know that all of the buffer busy waits occurred because data block 10
in data file 18 was already in the buffer cache, but in an incompatible mode. These 26 waits
totaled 7.87 seconds.

A query against the data dictionary revealed the following statistics for the
PROCESSING_STATIONS table:

SQL> SELECT SYSDATE - last_analyzed, blocks, avg_row_len, avg_space,
2 chain_cnt, num_rows
3 FROM user_tables
4 WHERE table_name = 'PROCESSING_STATIONS';

SYSDATE-LAST_ANALYZED BLOCKS AVG_ROW_LEN AVG_SPACE CHAIN_CNT NUM_ROWS
--------------------- ---------- ----------- ---------- ---------- ----------
2.132118056 1 62 1686 0 100

SQL>

We can see that optimizer statistics were gathered recently and the table contains 100 very
short rows, all stored in one data block. If the application session that we traced updated the
PROCESSING_STATIONS table 500 times in less than a minute, we might suspect that the
database sessions from the other 99 workstations are doing roughly the same thing. This adds
up to over 50,000 updates to the same data block per minute. Since each session is updating a
different row in the table, there is no waiting on row-level locks (which would show up in the wait
event interface as enqueue waits). However, only one process at a time may write to a buffer in
the buffer cache. Therefore, a lot of time is being wasted contending for this one very hot buffer.

All of this raises the question, “Why is the application updating a row in the
PROCESSING_STATIONS table 500 times in less than one minute?” It turns out that the
application uses this table to coordinate efforts between the workstations. There is one row in
the table for each workstation. The application running on each workstation updates its row in
PROCESSING_STATIONS to indicate which data set it is processing so that other workstations will
not attempt to process the same data set. Also, each workstation periodically updates the status
date so that if a workstation crashes, it will be easy to tell when it crashed and what data it was
processing during the crash.

Processing of one data set could take anywhere from a fraction of a second to twenty minutes.
The application code was written to not only update the PROCESSING_STATIONS table when a
new data set was processed, but also to frequently update the timestamp during the processing
of one data set. All stakeholders agreed that 500 updates per minute might be excessive. If the
application code could be modified to update the table only once per second, then writes to the
table would reduce by roughly 90% and contention for the hot buffer in the buffer cache would
likely go away.

However, application changes needed to be tested thoroughly before deployment, so a
short-term fix was needed until the application change could be implemented. Luckily, a
simple one was available.

We rebuilt the PROCESSING_STATIONS table with the PCTFREE storage parameter set to 99 and
PCTUSED set to 1. This caused Oracle to leave 99% of the available space in each data block
empty for future row expansion. This change caused the 100 rows in the table to be spread over
100 data blocks instead of all rows being neatly packed into one data block. A query against the
data dictionary confirmed this:

SQL> SELECT SYSDATE - last_analyzed, blocks, avg_row_len, avg_space,
2 chain_cnt, num_rows
3 FROM user_tables
4 WHERE table_name = 'PROCESSING_STATIONS';

SYSDATE-LAST_ANALYZED BLOCKS AVG_ROW_LEN AVG_SPACE CHAIN_CNT NUM_ROWS


--------------------- ---------- ----------- ---------- ---------- ----------
.130868056 100 62 8014 0 100

SQL>
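
For reference, the rebuild itself was along these lines (a sketch; moving a table marks its
indexes unusable, so the primary key index had to be rebuilt and statistics regathered):

ALTER TABLE processing_stations MOVE PCTFREE 99 PCTUSED 1;

ALTER INDEX processing_stations_pk REBUILD;

ANALYZE TABLE processing_stations COMPUTE STATISTICS;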

Tracing another 500 iterations of the main loop in the application on workstation 30 after
rebuilding the PROCESSING_STATIONS table yielded the following TKPROF output:

UPDATE processing_stations
SET status = 'ACTIVE',
status_date = SYSDATE,
data_set_id_being_processed = :b1
WHERE station_id = 30

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 500 0.20 2.22 0 500 1009 500
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 501 0.20 2.22 0 500 1009 500

Optimizer goal: CHOOSE
Parsing user id: 40

Rows Row Source Operation
------- ---------------------------------------------------
500 UPDATE
500 INDEX UNIQUE SCAN PROCESSING_STATIONS_PK (object id 9379)

Elapsed times include waiting on following events:

Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
latch free 2 0.35 0.61

The application is still updating the PROCESSING_STATIONS table very rapidly. But now each row
in the table is stored in a separate data block, so each workstation is working with its own hot
buffer in the buffer cache. The sessions are no longer competing with each other for the
opportunity to write to the same buffer. Waits on the buffer busy waits event have been
eliminated. As a side effect, logical reads have been reduced to one third of what they were
originally.

The original TKPROF reports also showed waits on the latch free wait event, a sign of latch
contention. A query against v$event_name shows that the parameters to the latch free wait
event are as follows:

SQL> SELECT * FROM v$event_name WHERE name = 'latch free';

EVENT# NAME PARAMETER1 PARAMETER2 PARAMETER3
---------- -------------------- --------------- --------------- ---------------
3 latch free address number tries

SQL>

Some of the latch free wait events in the trace file generated before the PROCESSING_STATIONS
table was rebuilt were as follows:

WAIT #2: nam='latch free' ela= 47004 p1=15113593728 p2=97 p3=0
WAIT #2: nam='latch free' ela= 14629 p1=15113593728 p2=97 p3=1
WAIT #2: nam='latch free' ela= 20652 p1=15113593728 p2=97 p3=2
WAIT #2: nam='latch free' ela= 37737 p1=15113593728 p2=97 p3=3

A query against v$latch or v$latchname will give us the name of latch number 97:

SQL> SELECT name
2 FROM v$latch
3 WHERE latch# = 97;

NAME
----------------------------------------------------------------
cache buffers chains

SQL>

So we can see that the four wait events listed from the trace file above amount to a wait of
approximately 0.12 seconds for the cache buffers chains latch. (Remember that in Oracle 9i trace
files, elapsed times are listed in microseconds while in trace files from Oracle 8i and earlier the
elapsed times are in centiseconds.) This is not unexpected, since we had 100 sessions
contending for the right to modify one buffer in the buffer cache.

Comparing the TKPROF report before rebuilding the PROCESSING_STATIONS table with after, we
see that rebuilding the table reduced the latch contention but did not eliminate it. It is likely that
the long-term solution of modifying the application to reduce the update frequency will eliminate
or further reduce the latch contention.

EXAMPLE #3: LOG FILE WAITS

An application performed occasional loads into a small data warehouse. Unfortunately, the loads
were taking a much longer time on the production system than they had in a test environment.
To find out what was happening on the production database during a data load, we ran the
following script that displays the wait event activity for the entire instance over a 30 second
interval. The list of events in the NOT IN clause simply filters out the idle events, making the
output a little easier to read:

DROP TABLE previous_events;

CREATE TABLE previous_events
AS
SELECT SYSDATE timestamp, v$system_event.*
FROM v$system_event;

EXECUTE dbms_lock.sleep (30);

SELECT A.event,
A.total_waits - NVL (B.total_waits, 0) total_waits,
A.time_waited - NVL (B.time_waited, 0) time_waited
FROM v$system_event A, previous_events B
WHERE B.event (+) = A.event
AND A.event NOT IN
(
'client message', 'dispatcher timer', 'gcs for action',
'gcs remote message', 'ges remote message', 'i/o slave wait',
'jobq slave wait', 'lock manager wait for remote message',
'null event', 'Null event', 'parallel query dequeue', 'pipe get',
'PL/SQL lock timer', 'pmon timer', 'PX Deq Credit: need buffer',
'PX Deq Credit: send blkd', 'PX Deq: Execute Reply',
'PX Deq: Execution Msg', 'PX Deq: Signal ACK',
'PX Deq: Table Q Normal', 'PX Deque Wait', 'PX Idle Wait',
'queue messages', 'rdbms ipc message', 'slave wait',
'smon timer', 'SQL*Net message to client',
'SQL*Net message from client', 'SQL*Net more data from client',
'virtual circuit status', 'wakeup time manager'
)
ORDER BY time_waited;

Running this script during a data load yielded the following output:

EVENT TOTAL_WAITS TIME_WAITED
----------------------------------- ----------- -----------
LGWR wait for redo copy 0 0
SQL*Net break/reset to client 2 0
SQL*Net more data to client 0 0
buffer busy waits 0 0
db file scattered read 6 0
direct path read 27 0
direct path write 34 0
direct path write (lob) 0 0
file identify 6 0
file open 93 0
library cache pin 0 0
local write wait 0 0
process startup 0 0
rdbms ipc reply 0 0
refresh controlfile command 0 0
single-task message 0 0
switch logfile command 0 0
control file sequential read 61 1
latch free 2 1
db file sequential read 6 7
control file parallel write 41 31
log file single write 6 164
db file parallel write 13 220
enqueue 6 486
log buffer space 24 2007
log file sequential read 30 2655
log file switch completion 33 2883
log file parallel write 19 3561
log file sync 113 10249

During the 30 second sample interval, Oracle processes spent roughly 215 seconds waiting on
events related to the online redo log. The likely culprit in this situation is some combination of
slow disks and contention on the disks holding the online redo log files. After a little research, we
found that the redo logs had been striped on a RAID array on the faster test system. On the
production system, however, the redo logs were all on a single disk that was shared with
another very active database. When the online redo logs were moved to separate disks, the data
loads ran more rapidly and the wait events pertaining to the redo log decreased significantly.
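
The relocation itself was routine; a sketch, with hypothetical file names and sizes, looks
like this:

-- Add new log groups on the dedicated disks
ALTER DATABASE ADD LOGFILE GROUP 4 ('/u05/oradata/prod/redo04.log') SIZE 100M;
ALTER DATABASE ADD LOGFILE GROUP 5 ('/u06/oradata/prod/redo05.log') SIZE 100M;

-- Then, for each old group: switch logs until the group shows as
-- INACTIVE in v$log, and drop it
ALTER SYSTEM SWITCH LOGFILE;
ALTER DATABASE DROP LOGFILE GROUP 1;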

EXAMPLE #4: DIRECT PATH READ AND WRITE WAITS

A company experienced delays when querying summarized information about their orders.
Querying individual orders was fast, but queries involving totals or rollups seemed to take
unacceptably long. We ran our Database Rx monitoring tool at regular intervals to collect
performance information from the system while the users were performing their typical tasks.
(Database Rx is a proactive monitoring and alert notification tool that we developed for use with
our remote database administration clients. For a free demo, see our website at
dbrx.dbspecialists.com/guest.) A Performance Summary report produced the following output:

[Database Rx Performance Summary report output not reproduced here.]
Looking at the Top 5 Wait Events section, we see that the leading wait events are direct path
write, db file scattered read, and direct path read. These three wait events combined accounted
for roughly 99% of the non-idle wait time on the system during the sample interval. From our
knowledge of the queries in the application, the db file scattered read (i.e., full table scan)
events didn’t surprise us. The absence of db file sequential read waits also did not surprise us as
the relevant blocks for index lookups and row fetches by ROWID were likely present in the buffer
cache.

We figured we might be able to reduce the db file scattered read waits with a larger buffer cache
or judicious use of the KEEP pool in the short term and application code review in the longer
term. What seemed amiss, however, were the large numbers of direct path read and write wait
events. Querying v$session_wait while applications were running yielded the following:

SQL> SELECT sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
2 FROM v$session_wait
3 WHERE event = 'direct path write'
4 ORDER BY sid;

SID EVENT STATE SECONDS_IN_WAIT WAIT_TIME P1 P2 P3
---- ----------------- ----- --------------- --------- ------ ------ ------
39 direct path write WAITI 0 0 201 65 7
47 direct path write WAITI 0 0 201 2248 7

SQL>

Querying v$event_name shows the parameters for the direct path write event:

SQL> SELECT * FROM v$event_name WHERE name = 'direct path write';

EVENT# NAME PARAMETER1 PARAMETER2 PARAMETER3
---------- -------------------- --------------- --------------- ---------------
115 direct path write file number first dba block cnt

SQL>

We see that the first parameter is supposed to be the file number of the data file containing the
blocks being written, but we couldn’t find data file 201 using the following query:

SQL> SELECT tablespace_name, file_id "AFN", relative_fno "RFN"
2 FROM dba_data_files
3 WHERE file_id = 201;

no rows selected

SQL>

This is because the database is using temp files in the temporary tablespace, so we used the
following query:

SQL> SELECT tablespace_name, file_id + value "AFN", relative_fno "RFN"
2 FROM dba_temp_files, v$parameter
3 WHERE name = 'db_files';

TABLESPACE_NAME AFN RFN
------------------------------ ---------- ----------
TEMP 201 1

SQL>

This indicated that the TEMP tablespace was experiencing direct path write waits, which is not
surprising as direct path reads and writes usually occur in connection with temporary segments.
Direct path I/O operations were responsible for most of the waits on the system and appeared to
be a bottleneck. Upon further examination, we discovered that the sort_area_size parameter was
set to the default value of 65536. Since the database server had a fair amount of RAM free and
there generally weren’t too many sessions connected simultaneously, we increased the
sort_area_size to 10 Mb, and the direct path waits disappeared as sorts no longer had to be
written to disk.
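
The change did not require an instance restart; for example (the value shown is the 10 Mb
setting we used):

-- For an individual session:
ALTER SESSION SET sort_area_size = 10485760;

-- Instance-wide, for sessions that connect from now on (sort_area_size
-- can only be changed at the system level with DEFERRED):
ALTER SYSTEM SET sort_area_size = 10485760 DEFERRED;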

If increasing the sort_area_size had not been an option, we might have looked at the disk
storage where the TEMP tablespace temp files were located to see if there was I/O contention
with other systems. We might also have looked at application code to see if the amount of
sorting and hashing could be reduced.

This example also demonstrates something that might seem odd at first glance. During the
Database Rx sample period, there were more physical reads than logical reads. How can this
be—don't all physical reads also require logical reads? Well, this is where direct path reads get
their name. As Oracle says in reference note 50415.1 on Metalink, “Direct path reads are
generally used by Oracle when reading directly into PGA memory (as opposed to into the buffer
cache).” Since direct path reads bypass the buffer cache, they aren't counted as logical reads.
This is the reason that some people question the traditional definition of the buffer cache hit
ratio, because it doesn't take account of direct path reads. We won't get into that debate, but be
aware that direct path reads could distort your cache hit ratio. In this example, the cache hit
ratio would be negative!
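
To see why, consider the traditional computation, sketched below: the physical reads
statistic includes direct path reads, while the logical read statistics do not.

SELECT 1 - phy.value / (cur.value + con.value) "Buffer Cache Hit Ratio"
FROM v$sysstat phy, v$sysstat cur, v$sysstat con
WHERE phy.name = 'physical reads'
AND cur.name = 'db block gets'
AND con.name = 'consistent gets';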

EXAMPLE #5: DATABASE LINK WAIT EVENTS

Several of the SQL*Net wait events are considered idle events. For example, the “SQL*Net
message from client” event indicates that the process is waiting for the application to submit a
SQL statement for processing. However, the SQL*Net wait events pertaining to database links
are not idle events—they come into play when queries use database links to access remote
databases. The “SQL*Net message from dblink” event, for example, indicates that the local
process has submitted a request to a remote database and is waiting for a response.

A company had deployed a data import application for receiving batches of customer
transactions from its partners. There were five Oracle databases, one per geographic region.
Customer transactions were sorted by geographic location, and loaded into the database serving
that region. Due to human error, sometimes the same customer transaction would get imported
into multiple Oracle databases. They had developed a reconciliation report to address this. The
report was simply a query that compared transactions between two databases and listed records
found on both. Unfortunately, the report took over 30 minutes to run and the runtime was
expected to get much worse as the customer base grew.

Looking at the report code, we found the main query, which appeared to be the one using the
lion’s share of resources. We logged onto SQL*Plus as an application user in one window and
logged onto SQL*Plus again as a DBA in another window. From the DBA session we queried
v$session and found the SID corresponding to our application user SQL*Plus session. Then we
launched the query from the application user session and monitored v$session_event from the
other. Since the wait event interface does not account for CPU usage, we also monitored
v$sesstat.
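
Finding the SID was a simple query against v$session, along these lines (the username is
hypothetical):

SELECT sid, serial#, username, osuser, terminal
FROM v$session
WHERE username = 'APP_USER';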

After the report query completed 31 minutes later, we queried v$session_event one last time
and the results were as follows:

SQL> SELECT event, total_waits, time_waited, max_wait
2 FROM v$session_event
3 WHERE sid = 47
4 ORDER BY event;

EVENT TOTAL_WAITS TIME_WAITED MAX_WAIT
------------------------------ ----------- ----------- ----------
SQL*Net message from client 32 4435 2432
SQL*Net message from dblink 1525516 104919 31
SQL*Net message to client 33 0 0
SQL*Net message to dblink 1525516 466 9
db file sequential read 27199 8025 28
latch free 40 5 4
log file sync 1 2 2

7 rows selected.

SQL> SELECT A.name, B.value
2 FROM v$statname A, v$sesstat B
3 WHERE A.statistic# = 12
4 AND B.statistic# = A.statistic#
5 AND B.sid = 47;

NAME VALUE
------------------------------ ----------
CPU used by this session 67937

SQL>
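
One caution about the query above: statistic numbers can change between Oracle releases, so
filtering on the statistic name is more portable than hard-coding statistic# 12:

SQL> SELECT A.name, B.value
2 FROM v$statname A, v$sesstat B
3 WHERE A.name = 'CPU used by this session'
4 AND B.statistic# = A.statistic#
5 AND B.sid = 47;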

We can ignore the SQL*Net message from client wait event because this just indicates time
Oracle spent waiting for us to submit the query in SQL*Plus. However, there were over 1.5
million network roundtrips to a database link totaling over 1053 seconds of wait time. (The 1053
seconds includes both network latency and waiting on the remote database server to respond to
requests.) There were over 27,000 waits on single block disk I/Os totaling about 80 seconds of
wait. There were also latch free and log file sync wait events, but the wait times were
insignificant. The query from v$sesstat shows that about 679 seconds of CPU time were used.

Knowing nothing about the report query or database size, the number of network roundtrips
seemed very high and the number of single block disk I/Os seemed like it might be high. Before
we could go any further, we really needed to look at the query and the tables involved. The
report query was as follows:

SELECT customer_id, batch_serial_number, batch_date, load_date,
batch_comment, control_total, control_amount
FROM customer_transfer_batches A
WHERE exists
(
SELECT 1
FROM customer_transfer_batches@prdwest B
WHERE B.customer_id = A.customer_id
AND B.batch_serial_number = A.batch_serial_number
)
ORDER BY customer_id, batch_serial_number;

Estimated statistics on the CUSTOMER_TRANSFER_BATCHES table showed that it had
approximately 1,526,000 rows spanning approximately 21,825 blocks. Correlating the wait event
information with the table size, it was clear that Oracle was reading the local table one block at a
time and querying the remote table once per local row. The execution plan for the query
confirmed this:

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT
1 0 FILTER
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMER_TRANSFER_BATCHES'
3 2 INDEX (FULL SCAN) OF 'CUSTOMER_TRANSFER_BATCHES_PK' (UNIQUE)
4 1 REMOTE* PRDWEST

4 SERIAL_FROM_REMOTE SELECT "CUSTOMER_ID","BATCH_SERIAL_NUMBER"
FROM "CUSTOMER_TRANSFER_BATCHES" "B"
WHERE "BATCH_SERIAL_NUMBER"=:1
AND "CUSTOMER_ID"=:2

Performing 1.5 million network roundtrips and making the remote database execute 1.5 million
queries did not seem to be the most efficient way to go. It would probably be more efficient for
the local database to pull all of the required data from the remote database in one shot and then
join it to the local table. We probably could have added an optimizer hint to the query to get
Oracle to rewrite the subquery as a join, but instead we chose to rewrite the query ourselves in a
form that made it easier for Oracle’s query optimizer to do the right thing. The rewritten query
was as follows:

SELECT customer_id, batch_serial_number, batch_date, load_date,
batch_comment, control_total, control_amount
FROM customer_transfer_batches
WHERE (customer_id, batch_serial_number) IN
(
SELECT customer_id, batch_serial_number
FROM customer_transfer_batches
INTERSECT
SELECT customer_id, batch_serial_number
FROM customer_transfer_batches@prdwest
)
ORDER BY customer_id, batch_serial_number;
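
For completeness, the hint-based alternative we decided against might have looked something
like the following (a sketch only, using the HASH_SJ semi-join hint; we did not test this
variant):

SELECT customer_id, batch_serial_number, batch_date, load_date,
batch_comment, control_total, control_amount
FROM customer_transfer_batches A
WHERE EXISTS
(
SELECT /*+ HASH_SJ */ 1
FROM customer_transfer_batches@prdwest B
WHERE B.customer_id = A.customer_id
AND B.batch_serial_number = A.batch_serial_number
)
ORDER BY customer_id, batch_serial_number;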

We ran the revised query and monitored v$session_event and v$sesstat as before, with the
following results:

SQL> SELECT event, total_waits, time_waited, max_wait
2 FROM v$session_event
3 WHERE sid = 49
4 ORDER BY event;

EVENT TOTAL_WAITS TIME_WAITED MAX_WAIT
------------------------------ ----------- ----------- ----------
SQL*Net message from client 46 3680 2481
SQL*Net message from dblink 24 31 18
SQL*Net message to client 47 0 0
SQL*Net message to dblink 24 0 0
SQL*Net more data from dblink 5978 1337 13
db file scattered read 3430 675 8
db file sequential read 182 60 2
direct path read 148 233 11
direct path write 920 3572 33
log file sync 1 2 2

10 rows selected.

SQL> SELECT A.name, B.value
2 FROM v$statname A, v$sesstat B
3 WHERE A.statistic# = 12
4 AND B.statistic# = A.statistic#
5 AND B.sid = 49;

NAME VALUE
------------------------------ ----------
CPU used by this session 3227

SQL>

This time the query completed in 92 seconds. Both elapsed time and CPU usage were reduced
by about 95%. The waits on the database link were reduced from about 1053 seconds to under
14 seconds, indicating less network overhead and a lot less demand being placed on the remote
database server. Waits on db file sequential read and db file scattered read combined were
significantly shorter than the waits on db file sequential read in the original query because large
numbers of single block I/Os were replaced with fewer multi-block I/Os.

However, there were about 38 seconds of direct path I/O waits. As we’ve seen, direct path I/O
usually occurs in connection with temporary segments. Oracle had to do quite a bit of sorting to
implement the INTERSECT operation added to the rewritten query, so the direct path I/O waits
were not unexpected.

The performance gains achieved by the query rewrite were significant, and we were tempted to
end the tuning exercise right here. However, the company indicated that this report was critical
to their operation and they wanted to make it run even faster if possible. So, our next step was
to speed up the sort operation and thereby reduce the direct path I/O waits.

Looking at v$parameter, we saw that the sort_area_size was set to 1 Mb. A larger sort area
would allow Oracle to perform more of the sort in memory and less on disk, thus reducing the
direct path I/O. To see the effect of a larger sort area, we repeated the timing exercise, first
setting sort_area_size to 100 Mb dynamically in the SQL*Plus session where the report query
was run. The query completed in 55 seconds this time, and statistics collected from
v$session_event and v$sesstat were as follows:

SQL> SELECT event, total_waits, time_waited, max_wait
2 FROM v$session_event
3 WHERE sid = 46
4 ORDER BY event;

EVENT TOTAL_WAITS TIME_WAITED MAX_WAIT
------------------------------ ----------- ----------- ----------
SQL*Net message from client 47 442 287
SQL*Net message from dblink 25 25 14
SQL*Net message to client 48 0 0
SQL*Net message to dblink 25 0 0
SQL*Net more data from dblink 6050 1378 26
db file scattered read 3430 945 8
db file sequential read 191 59 1
log file sync 1 3 3

8 rows selected.

SQL> SELECT A.name, B.value
2 FROM v$statname A, v$sesstat B
3 WHERE A.statistic# = 12
4 AND B.statistic# = A.statistic#
5 AND B.sid = 46;

NAME VALUE
------------------------------ ----------
CPU used by this session 3296

SQL>

You can see that increasing the sort area size did not reduce the CPU time, database link wait
time, or db file scattered/sequential read times. However, the direct path I/O waits disappeared
completely. This indicates that all sorting was performed in memory. Of course, it may not be
feasible to keep the sort entirely in memory as the CUSTOMER_TRANSFER_BATCHES table grows.
However, a larger sort area will still speed up the sort and overall query speed even if the entire
sort cannot be performed in memory.

This example demonstrated the practice of iterative tuning: curing one bottleneck often
reveals another performance issue that was masked by the larger one. Boosting performance of
complex systems is usually an iterative process.

CONCLUSION

Because wait events touch all areas of Oracle server operation, the wait event interface can
provide insights into just about any kind of performance problem you might experience. Other
metrics, such as cache hit ratios, can give you valuable information, but it is the wait event
interface that ties everything together. In the examples discussed here, wait event data directed
us to a swift diagnosis of each problem.

The wait event interface, with its four dynamic performance views and tracing facility, enables
you to see how an Oracle process is spending its time when it is not using the CPU. While wait
event information may not give you everything you might need to cure every performance
problem, it does give you access to a wealth of information that can guide you on the path to
better system performance.

The basic concepts behind the wait event interface have not changed since Oracle 7. But each
release of Oracle brings a few more enhancements that make the wait event interface easier to
use or more informative. In Oracle 9i, TKPROF has been enhanced to parse wait event
information from trace files and format it in a useful report. Also, some timing information is
tracked in microseconds for more resolution, and the DBMS_SUPPORT package has been
included for easier trace activation. However, these are just bells and whistles. The wait event
interface has been a solid tool for Oracle performance management since Oracle 7.

ABOUT THE AUTHORS

Roger Schrag, OCP, has been an Oracle DBA and application architect for over twelve years,
starting out at Oracle Corporation on the Oracle Financials development team and moving on to
production DBA and database architect positions at various companies in the San Francisco Bay
Area. In 1995, Roger founded Database Specialists, Inc., a consulting firm specializing in
business solutions based on Oracle technology, including on-site and remote database
administration services. Roger is a frequent speaker at OracleWorld and the IOUG Live
conferences. He is also President of the Northern California Oracle Users Group. You may contact
Roger by email at [email protected].

Terry Sutton, OCP, has been an Oracle DBA for nine years, and has worked in the information
technology area for 16 years. Since 2000, Terry has been a Senior Staff Consultant at Database
Specialists, performing assignments ranging from production database administration to
emergency troubleshooting with a particular focus on Oracle database performance tuning. You
may contact Terry by email at [email protected].

Note from the authors: We are always looking for interesting wait event situations to learn from.
One can never be exposed to too many examples or techniques. If you are trying to tackle a
problem using the wait event interface, feel free to email us the specifics of your situation. We’ll
do our best to look over what you send us and give you our thoughts if you wish. But please
understand there aren’t enough hours in the day for any one person (or two of us, actually) to
solve the world’s Oracle performance problems. We’ll try to help you if we can.
