A session that reads or modifies a buffer in the SGA must first
acquire the cache buffers chains latch and traverse the buffer chain until it
finds the necessary buffer header. Then it must acquire a buffer lock or a pin
on the buffer header in shared or exclusive mode, depending on the operation it
intends to perform. Once the buffer header is pinned, the session releases the
cache buffers chains latch and performs the intended operation on the buffer
itself. If a pin cannot be obtained, the session waits on the buffer busy waits wait event. This wait event does not apply
to read or write operations that are performed in sessions’ private PGAs.
Keep the following key thoughts in mind when dealing with the buffer busy waits event.
-
The Oracle session is waiting to pin a buffer. A buffer must be pinned before it can be read or modified. Only one process can pin a buffer at any one time.
-
buffer busy waits indicate read/read, read/write, or write/write contention.
-
The appropriate action to take depends on the reason encoded in the P3 parameter.The following pieces of information are essential for diagnosing the buffer busy waits problem when it is the leading bottleneck that slows a process down.
-
The primary reason code that represents why a process fails to get a buffer pin.
-
The class of block that the buffer busy waits wait event is for.
-
The SQL statement that is associated with the buffer busy waits event.
-
The segment that the buffer belongs to.
This is typical in applications that spawn multiple concurrent threads or sessions, and each one executes the same query that goes after the same data set. In this case, you can check the sessions’ logon time in the V$SESSION view, and chances are you will find them only a few seconds apart. When multiple sessions request the same data block that is not in the buffer cache, Oracle is smart enough to prevent every session from making the same operating system I/O call. Otherwise, this can severely increase the number of system I/Os. Instead, Oracle allows only one of the sessions to perform the actual I/O, while others wait for the block to be brought into the buffer cache. The other sessions wait for the block on the buffer busy waits event and the session that performs the I/O waits on the db file sequential read (or the db file scattered read) wait event. You will notice that the buffer busy waits and the db file sequential read events share the same P1 (file#) and P2 (block#) values.
Reason code 220 indicates there are multiple sessions trying to
concurrently modify different rows within the same block that is in the buffer
cache. This symptom is typical in applications with high DML concurrency.
Unfortunately, a block can be pinned by only one process at any one time. The
other concurrent processes must wait on the buffer busy
waits wait event until the first change is complete. This is a good thing;
otherwise the block will be corrupted.
If the majority of the buffer busy
waits wait events are centered on data blocks (class #1) and the reason code
is 130, this shows the application runs multiple sessions that query the same
data set at the same time. (You will only know this if you query the
V$SESSION_WAIT view repeatedly or trace the session with the 10046 event or use
the data sampling methods) This is an application
issue. There are three things you can do to minimize this problem:
-
Reduce the level of concurrency or change the way the work is partitioned between the parallel threads.
-
Optimize the SQL statement to reduce the number of physical and logical reads.
-
Increase the number of FREELISTS and FREELIST GROUPS.
From our experience, it is very difficult to get the application
to reduce the level of concurrency. It may not be a good idea because it limits
scalability. However, there are differences between scalability and a blind
attempt by the application to improve performance by spawning multiple sessions.
So far, SQL tuning has worked wonderfully to reduce the occurrences of buffer busy waits. Check the SQL execution plan and optimize
the SQL statement to use the most effective join method and access paths that
reduce the number of physical and logical reads.
If the majority of the buffer busy
waits wait events are centered on data blocks and the reason code is 220,
this indicates there are multiple sessions performing DML on the same object at
the same time. In addition, if the database block size is large (for example,
16K and above), it can only intensify this symptom as larger blocks generally
contain more rows per block. There are three things you can do to minimize this
problem:
-
Reduce the level of concurrency or change the portioning method.
-
Reduce the number of rows in the block.
-
Rebuild the object in another tablespace with a smaller block size (Oracle9i Database and above).
Again, as mentioned earlier, it may not be practical to limit
scalability by reducing the level of concurrency.
If the data blocks belong to tables or indexes, then consider
rebuilding the objects to reduce the number of rows per block and spread the
data over a larger number of blocks. For example, you can rebuild a table or an
index with a higher PCTFREE. In some cases, we have rebuilt indexes with PCTFREE
as high as 50 percent. The downside to this is that index range scans and index
fast full scans will be slower. You can also alter the table to minimize the
number of rows per block with the ALTER TABLE table_name
MINIMIZE RECORDS_PER_BLOCK command. Starting in Oracle9i
Database, you can move or rebuild the object in another tablespace with a
smaller block size. While these actions can minimize the buffer busy waits problem, they will definitely increase full
table scans time and disk space utilization. As the saying goes, there is no
such thing as a free lunch.
If the majority of the buffer busy
waits wait events are centered on data segment headers (that is, the table
or index segment header and not the undo segment header), this usually means
some tables or indexes in the database have high segment header activities.
Processes visit segment headers for two main reasons—to get or modify the
process FREELISTS information and to extend the high watermark (HWM). There are
three things you can do to minimize this problem:
-
Increase the number of process FREELISTS and FREELIST GROUPS of the identified object.
-
Make sure the gap between PCTFREE and PCTUSED is not too small.
-
Make sure the next extent size is not too small.
If you do not want to mess with FREELISTS and FREELIST GROUPS, you can rely on the Automatic Segment Space Management (ASSM) feature to scatter the incoming data from the insert statements.
Finally, you should check the next extent size of the
identified segment. A high insert rate combined with a small next extent size
can cause frequent insertion of new entries into the extent map located in the
segment header. Consider altering or rebuilding the object with a larger next
extent size. If the object resides in a locally managed tablespace, consider
moving the object into a reasonable uniform-size locally managed tablespace.
If the majority of the buffer busy waits wait events are
centered on undo segment headers, this indicates there are either too few
rollback segments in the database or their extent sizes are too small, causing
frequent updates to the segment headers. If you use the system-managed undo
introduced in Oracle9i Database, you shouldn’t have to
deal with this problem as Oracle will create additional undo segments according
to demand.
If the majority of the buffer busy
waits wait events are centered on undo blocks, this usually means there are
multiple concurrent sessions querying data that is being updated at the same
time. Essentially the query sessions are fighting for the read consistent images
of the data blocks. This is an application issue and there is nothing amiss in
the database. The problem should go away when the application can run the query
and DML at different times.
Oracle maintains a number of instance-level statistics on buffer busy waits. These statistics can give you a rough idea
of what you are dealing with, but the information may not be specific enough for
you to formulate a corrective action. They are mentioned here for the sake of
completeness.
The view X$KCBWAIT (kernel cache buffer wait) is the base view for
the V$WAITSTAT view, which keeps track of buffer busy
waits contentions by block class. The class with the highest count deserves
your attention, but unless you are also monitoring the buffer
busy waits symptom at a lower level, you don’t have a clear direction to
proceed. Let’s say the data block class has the highest count of all. Which
segment was affected most, and why were the sessions unable to pin the buffers?
Did they fail to get the pins while attempting to read or change the blocks?
Unfortunately, Oracle does not keep track of buffer busy
waits by block class and reason code. We hope someday Oracle will provide a
matrix of buffer busy waits by SQL statement, segment
name, and block class.
No comments:
Post a Comment