0% found this document useful (0 votes)
106 views

Logical I/O: Julian Dyke Independent Consultant

Logical I / Os are read operations Buffers are cached in shared memory. Most Logical I / Os can be satisfied from cache. The remainder will result in physical I /os. Logical reads include Current Reads Consistent Reads. Consistent Reads can be used to construct consistent versions. Consistent versions can have multiple versions of same block in buffer cache can be single multi can be traced using events 10200 / 10201.

Uploaded by

Nandan Acharya
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
106 views

Logical I/O: Julian Dyke Independent Consultant

Logical I / Os are read operations Buffers are cached in shared memory. Most Logical I / Os can be satisfied from cache. The remainder will result in physical I /os. Logical reads include Current Reads Consistent Reads. Consistent Reads can be used to construct consistent versions. Consistent versions can have multiple versions of same block in buffer cache can be single multi can be traced using events 10200 / 10201.

Uploaded by

Nandan Acharya
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 42

Logical I/O

Julian Dyke
Independent Consultant
Web Version
1

2005 Julian Dyke

Agenda

Introduction

Logical I/Os

Buffer Cache Behaviour

Statistics

Conclusion

2005 Julian Dyke

Logical I/Os

Logical I/Os are read operations

Buffers are cached in shared memory

Most logical I/Os can be satisfied from cache

The remainder will result in physical I/Os

Logical I/Os include


current reads
consistent reads

2005 Julian Dyke

Current Reads

Current reads

Current version of block

Can be updated

Can be dirty

Includes all changes

Only one current version of block in buffer cache

Only one current version of block across all instances

Can be used to construct consistent versions

2005 Julian Dyke

Consistent Reads

Consistent reads

Potentially historic version of block

Consistent to a specific System Change Number (SCN)

Cannot be updated

Cannot be dirty

Can be used to construct consistent versions

Can have multiple versions of same block in buffer cache

Can be
single
multi

block (sequential reads)

block (scattered reads)

Can be traced using events 10200 / 10201

2005 Julian Dyke

Logical I/O statistics

session logical reads statistic


Total number of logical reads in session
Unreliable at system level
At session level
session logical reads = db block gets + consistent gets

db block gets statistic


Number of current reads

consistent gets statistic


Number of consistent reads

2005 Julian Dyke

Buffer Pools

There are up to eight buffer pools


DEFAULT
KEEP, RECYCLE
Oracle 8.0 and above
2K, 4K, 8K, 16K and 32K
Oracle 9.0 and above

32K not available on all platforms

Cannot have non-standard block size same as DEFAULT block


size

2005 Julian Dyke

Buffer Pool Headers

One for each buffer pool (usable or unusable)

Externalized in
V$BUFFER_POOL
V$BUFFER_POOL_STATISTICS

Based on X$KCBWBPD

Created in shared pool permanent memory when instance is


started

Contain one or more working sets

2005 Julian Dyke

X$KCBWBPD

Externalises buffer pool header


ADDR

RAW(4)

INDX

NUMBER

INST_ID

NUMBER

BP_NAME

VARCHAR2(20)

BP_ID

NUMBER

BP_BLKSZ

NUMBER

BP_GRANSZ

NUMBER

BP_BUFPERGRAN

NUMBER

BP_LO_SID

NUMBER

BP_HI_SID

NUMBER

BP_SET_CT

NUMBER

BP_SIZE

NUMBER

BP_STATE

NUMBER

BP_CURRGRANS

NUMBER

BP_TGTGRANS

NUMBER

BP_PREVGRANS

NUMBER

2005 Julian Dyke

Buffer Pool Name


Buffer Pool ID
Block Size
Granule Size
Buffers per Granule
Minimum Working Set ID
Maximum Working Set ID
Number of Working Sets
Number of Buffers

Hash Buckets

Hash value of each block calculated from


Data Block Address (DBA)
Block Class

Number of hash buckets dependent on number of buffers in


cache e.g.
# buffers
# hash buckets

10

500

6000

64

1024

Each hash bucket contains


Cache Buffers Chains latch
Pointer to array of double linked lists

2005 Julian Dyke

Hash Buckets
# hash chains

BH

BH
cache
buffers
chain
latch
BH

11

2005 Julian Dyke

BH

Buffer Headers

12

Each buffer header describes contents of one buffer

All buffers accessed via buffer header

Buffer header contains pointers to


Buffer
Cache Buffers Chains latch

Buffer header includes double linked lists for


Cache Buffers Chain list
Replacement list
Users list
Waiters list

2005 Julian Dyke

X$BH

13

Externalises buffer headers


ADDR

RAW(4)

INDX

NUMBER

Hash List Address

INST_ID

NUMBER

Block Size

HLADDR

RAW(4)

BLSIZ

NUMBER

NXT_HASH

RAW(4)

PRV_HASH

RAW(4)

NXT_REPL

RAW(4)

PRV_REPL

RAW(4)

TS#

NUMBER

FILE#

NUMBER

Relative File Number

DBARFIL

NUMBER

Block Number

DBABLK

NUMBER

OBJ

NUMBER

Object ID

BA

RAW(4)

Buffer Address

CR_SCN_BAS

NUMBER

2005 Julian Dyke

Hash List
Replacement List
Tablespace#
Absolute File Number

Working Sets

14

Introduced in Oracle 8.1.5

Each buffer pool contains one or more working sets

Working set header


created in shared pool permanent memory
associated with one DBWn process
protected by cache buffers lru chain latch

Each working set maintains separate set of LRU lists

2005 Julian Dyke

LRU Lists

15

In Oracle 9.2 each working set maintains 4 LRU lists

LRU - replacement list - normal blocks

LRU-W - write list - dirty blocks

LRU-XO - object list - buffers involved in


DROP
TRUNCATE

LRU-XR - range list - buffers involved in


ALTER TABLESPACE BEGIN BACKUP
ALTER TABLESPACE END BACKUP
ALTER TABLESPACE OFFLINE
ALTER TABLESPACE READ ONLY

2005 Julian Dyke

Main and Auxiliary Lists

16

Each LRU contains


main list
auxiliary list

Auxiliary list includes


dirty buffers identified by DBWn processes
buffers being written

Buffers are moved from main to auxiliary list by DBWn


processes to avoid unnecessary scans

Processes scan auxiliary lists first for free buffers

Buffers also allocated to auxiliary list


at startup
after FLUSH_CACHE

2005 Julian Dyke

Working Set Lists


Working Set
Header
Hot
Replacement
List

Cold

MAIN
AUX

Write
List

MAIN
AUX

Object
List

MAIN
AUX

Range
List

MAIN
AUX

17

2005 Julian Dyke

Buffer
Header

Replacement List

In Oracle 8.1.5 and above a mid-point insertion algorithm is


used
Buffer cache has a hot end and a cold end
Buffers are inserted at mid-point
Mid-point is head of cold end
Starts at hot end - moves down cache
Maximum mid-point determined by _db_percent_hot_default
Default value is 50%
Head of
Hot End

Hot End
18

2005 Julian Dyke

Head of
Cold End

Replacement List

Cold End

X$KCBWDS

19

Externalises working set header

Working Set ID

ADDR

RAW(4)

INDX

NUMBER

INST_ID

NUMBER

SET_ID

NUMBER

DBWR_NUM

NUMBER

BLK_SIZE

NUMBER

AUX Replacement List

NXT_REPL

RAW(4)

PRV_REPL

RAW(4)

Number of buffers on
MAIN Replacement List

NXT_REPLAX

RAW(4)

PRV_REPLAX

RAW(4)

CNUM_REPL

RAW(4)

ANUM_REPL

RAW(4)

Insertion Point

COLD_HD

RAW(4)

HBMAX

NUMBER

Maximum number of
Hot Buffers

HBUFS

NUMBER

Number of Hot Buffers

NXT_WRITE

RAW(4)

2005 Julian Dyke

Database Writer Number


MAIN Replacement List

Number of buffers on
AUX Replacement List

Touch Count

20

Each buffer header maintains


touch count
timestamp

Touch count represents number of 3 second intervals in


which buffer has been accessed since
buffer last read into cache
touch count last reset

Each time buffer is accessed


if timestamp more than 3 seconds ago
increment touch count
set timestamp to current time

2005 Julian Dyke

Touch Count

21

When buffer reaches tail of cold end


If touch count >= 2 then
buffer is moved to hot end
Otherwise used as next free buffer

Hot criteria determined by


_db_aging_hot_criteria
default value is 2 touches

Time interval determined by


_db_aging_touch_time
default value is 3 seconds

2005 Julian Dyke

Single versus Multi-Block Reads

22

Single block reads


Used with current reads
Can be used with consistent reads
Waits recorded by db file sequential read

Multi block reads


Frequently used with consistent reads
Maximum number of physical blocks read specified by
DB_FILE_MULTIBLOCK_READ_COUNT
Waits recorded by db file scattered read
Blocks moved to cold end of buffer cache

2005 Julian Dyke

Single-Block Reads
Head of
Hot End
92
71
42

34
92
71

72
34
92

45
72
34

52
42
45
33
72
87
11

71
52
42
45
33
72
11

66
71
52
42
45
33
11

49
66
71
52
42
45
11

3
0

3
4
0

2
4

2
4
1

2
1
4

0
2
1

0
1
2

Read Block 42
11
33
34
87
Get
Insert
Update
Move
Set
Gettouch
Insert
first
first
block
buffer
buffer
touch
buffer
available
available
count
71
42
at
atcount
contents
to
head
head
head
buffer
buffer
of
of cold
cold
from
end
for
of
on
from
end
hot
block
block
cold
cold
end42
71
42
34
end
end
to zero

23

2005 Julian Dyke

STOP

Block
Number

Head of
Cold End

42
33
87
11
1

Touch
Count

Consistent Reads
Current Block

Head of
Hot End
40

Head of
Cold End
56

34
17
27

27
34
17

95
27
34

33
27
95

132
128

132

150

150

150

Insert
Get
Read
Apply
first
current
consistent
buffer
undo
available
to
at
version
rollback
head
version
buffer
of
ofcold
of
from cold
block
end
27 SCN
to
into
endbuffer
132
128

2005 Julian Dyke

STOP

Consistent Block

17
27

Read Block 27 - SCN 128


132

24

Block
Number

System
Change
27
Number
128
132
150

85
33
95

Multi-Block Reads
DB_FILE_MULTIBLOCK_READ_COUNT = 4
Head of
Hot End

ReadBlock
Block7
28
5
6
Read
1
3
4
Insert
Read
Move
Move
Get
Read
Insert
Move
next
next
block
buffers
next
block
block
buffers
four
four
four
17
2
3
4
5
86available
at
to
available
to
to
at
blocks
blocks
head
cold
cold
cold
headend
end
of
end
of
into
into
Get
first
four
cold
buffers
buffers
coldend
end
from cold
cold end
end
from

25

2005 Julian Dyke

STOP

Head of
Cold End

7
2
1
3
4
8
5
6

6
2
3
7
1
5

5
2
1
6

1
5

41
5

2
3
6

2
3
7

4
1
8

1
5

1
2
5
6

2
1
3
5
6
7

2
3
4
7
6
8
1
5

Dirty Blocks

26

When blocks are updated they are marked dirty

Changes immediately written to redo buffer

Changes written back to disk asynchronously by DBWn


process

DBWn process
scans from cold end of MAIN replacement list
moves dirty blocks to auxiliary list
writes dirty blocks back to disk

Written blocks remain on auxiliary list until re-used

2005 Julian Dyke

Buffer Pinning

27

In Oracle 8.0 and above, Oracle uses pinning to reduce


number of logical I/Os

If buffer will be accessed again by the statement, it is pinned


in the buffer cache

Frequently used with index scans

Only appears to be used with consistent gets


not observed with current gets

If pinning was not implemented, number of logical I/Os would


significantly increase

2005 Julian Dyke

Buffer Pinning Statistics

28

buffer is not pinned count statistic


Number of pin-able buffers not pinned by this session
when visited
Equivalent to number of logical I/Os (for that part of
statement)

buffer is pinned count statistic


Number of buffers already pinned by this session when
visited

Number of buffers visited =


buffer is not pinned count + buffer is pinned count

2005 Julian Dyke

Consistent Gets Statistics

29

consistent gets - examination statistic


Number of consistent gets that could be immediately
performed without pinning the buffer
Generally apply to indexes
Require one latch get
Included in consistent gets statistic

no work - consistent read gets statistic


Number of consistent gets that could be performed
without requiring rollback or cleanout
Generally apply to tables
Require two latch gets
Included in consistent gets statistic

2005 Julian Dyke

Full Table Scan


SELECT SUM(c2) FROM t1;
0
SELECT STATEMENT
1 0
TABLE ACCESS (FULL) OF 'T1'

Read Block 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
Segment
Data
Empty
Block
Block
Header 3
1
2

Segment
Header

Data
Blocks

session logical reads


consistent gets
no work - consistent read gets
buffer is not pinned count
table scans (short tables)
table scans rows gotten
table scans blocks gotten

Empty
Blocks

19
20
12
13
15
4
2
23
22
18
21
10
16
17
11
14
1
3
5
6
7
8
9
1
5
9
12
13
15
19
20
10
16
17
11
14
4
2
22
23
18
21
8
3
6
7
6
16
17
10
12
1
2
4
5
9
18
15
13
14
11
3
7
8
10
12
2
3
4
5
9
16
17
13
14
11
1
6
7
8
18
15
1
20
36
40
48
12
16
24
28
32
52
56
44
8
64
68
4
72
60
10
12
9
13
14
11
6
7
8
16
17
1
2
3
4
5
18
15
Unused
Blocks

Table T1
High Water
Mark

30

2005 Julian Dyke

STOP

Full Table Scan - Summary

31

In Oracle 9.2
segment header initially read 3 times
segment header read again every 10 extents

All blocks are read up to high water mark

For longer tables blocks can be prefetched

Algorithm differs for Automatic Segment Space Managed


tablespaces

2005 Julian Dyke

Unique Scan
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (UNIQUE SCAN) OF 'I1'

Read Table
Index Block 34
1
Data
Block
Branch
Leaf
Block
Block
Index I1

session logical reads


consistent gets
consistent gets - examination
buffer is not pinned count
index fetch by key
table fetch by rowid
rows fetched by callback
Branch
Block
Leaf
Blocks

Table T1
32

2005 Julian Dyke

STOP

2
1
3
3
1
2
3
1
2
2
1
1
1
1

Index Organised Table


SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
INDEX (UNIQUE SCAN) OF 'I1'

session logical reads


consistent gets
consistent gets - examination
index fetch by key

Read Index Block 3


1
Leaf
Block
Branch
Block

Index I1

Branch
Block
Leaf
Blocks

33

2005 Julian Dyke

STOP

2
1
1
2
2
1
1

Single Table Hash Cluster


SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (HASH) OF 'T1'

Read Table Block 7

session logical reads


consistent gets
no work - consistent read gets
cluster key scans
cluster key scan block gets
buffer is not pinned count

Data Block

Table T1
Leaf
Blocks

34

2005 Julian Dyke

STOP

1
1
1
1
1
1

Clustering Factor

Measures relationship between index entries and


corresponding data blocks

Used by CBO to calculate cost of using index

Good clustering factor approaches number of blocks in table;


Bad clustering factor approaches number of rows in table

CBO will favour indexes with a better clustering factor

Bad Clustering Factor

35

2005 Julian Dyke

Good Clustering Factor

Range Scan - Bad Clustering Factor


SELECT c2 FROM t1 WHERE c3 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I2'

Read Table
Index Block 22
23
6
10
14
18
1
Leaf Block
Data
Block
(Pinned)
Branch
Block
Index I2

session logical reads


consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
Branch
Block
Leaf
Blocks

Table T1
36

2005 Julian Dyke

STOP

5
6
2
3
4
7
18
5
6
2
3
4
18
7
11
3
4
6
1
2
5
11
4
5
7
1
2
3
6
3
5
1
2
4
3
4
6
1
2
5

Range Scan - Good Clustering Factor


SELECT c2 FROM t1 WHERE c4 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I3'

Read Table
Index Block 9
83
4
1
Leaf Block
Data
Block
(Pinned)
(Pinned)
Branch
Block
Index I3

session logical reads


consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
Branch
Block
Leaf
Blocks

Table T1
37

2005 Julian Dyke

STOP

4
33
1
2
4
1
2
33
11
2
11
11
3
22
1
9
6
7
8
1
3
5
42
6
5
1
2
3
42

Clustering Factor - Summary

38

Bad
Clustering
Factor

Good
Clustering
Factor

session logical reads

consistent gets

consistent gets - examination

no work - consistent gets

index scans kdiixs1

buffer is not pinned count

buffer is pinned count

table fetch by rowid

Higher clustering factor


Reduces number of logical I/Os required
Increases number of buffers that can be pinned

2005 Julian Dyke

Row Prefetching

39

For queries returning more than one row specify maximum


number of rows per round trip

If prefetch size too small


Increased number of round trips
Degrades performance

If prefetch size too large


Increased number of packets
May degrade performance

2005 Julian Dyke

Row Prefetching

Applies to
OCI
Pro*C
JDBC
PL/SQL
SQL*Plus

OCI_ATTR_PREFETCH_ROWS
Host Array
setRowPrefetch ()
BULK COLLECT
SET ARRAYSIZE

OCI default prefetch value is 1 (returns 2 rows per fetch)


res = OCIAttrSet
(
(dvoid *)stmt,
(ub4)OCI_HTYPE_STMT,
(dvoid *)&prefetchRows,
(ub4)0,
(ub4)OCI_ATTR_PREFETCH_ROWS,
(OCIError *)err
);

40

2005 Julian Dyke

Row Prefetching

41

Example - full table scan


1000 row table
31 blocks (+ segment header)
Prefetch
Size

Consistent Gets

1003

518

337

276

227

10

130

20

82

50

53

100

43

250

37

500

35

1000

34

2005 Julian Dyke

Consistent Gets

Prefetch Size

Thank you for your interest


For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com

42

2005 Julian Dyke

You might also like