
InnoDB Structures

The document provides a detailed overview of the B+Tree structure and its components, including leaf and non-leaf pages, directory structures, and record management. It describes how records are inserted, deleted, and updated within the B+Tree, along with the implications on garbage collection and space management. Additionally, it outlines the structure of index files and the organization of data within InnoDB tables.

Uploaded by

Rahul Sharma

High-level Overview

Caching:
  Page Cache: Buffer Pool, Buffer Pool LRU, Buffer Pool Flush List
  Adaptive Hash Indexes
  Data Dictionary Cache
  Additional Mem Pool

Transaction System:
  Log Buffer
  Log Group: iblogfile0, iblogfile1, iblogfile2

Storage:
  ibdata1 (space 0):
    IBUF_HEADER, IBUF_TREE
    TRX_SYS, RSEG_HDR, UNDO_LOG
    Doublewrite Buffer: Block 1 (64 pages), Block 2 (64 pages)
    Data Dict.: DICT_HDR, SYS_TABLES, SYS_COLUMNS, SYS_INDEXES, SYS_FIELDS
  Tables with file_per_table
B+Tree Simplified Leaf Page
Page 6 (Level 0, Leaf)
Infimum → Record (Key: 0, Value: A) → Record (Key: 1, Value: B) → Supremum
Each arrow is a "Next" pointer.
B+Tree Simplified Non-Leaf Page
Page 3 (Level N, Non-Leaf)
Infimum → Record (Min Key: 0, Page: 4) → Record (Min Key: 4, Page: 5) → Supremum
Each arrow is a "Next" pointer.
B+Tree Simplified Level
Level 0 (Leaf), pages linked by Next Page and Prev Page pointers (I = Infimum, S = Supremum):
I Page 6 S (keys 0, 1; values A, B) ⇄ I Page 7 S (keys 2, 3; values C, D) ⇄ I Page 8 S (keys 4, 5; values E, F)
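B+Tree Structure
(I = Infimum, S = Supremum; "≥k →p" is a node pointer record: minimum key k, child page p.)

Level 2 (Root):
  I Page 3 S: ≥0 →4, ≥4 →5
Level 1 (Internal), pages linked by Next Page / Prev Page:
  I Page 4 S: ≥0 →6, ≥2 →7
  I Page 5 S: ≥4 →8, ≥6 →9
Level 0 (Leaf):
  I Page 6 S: 0 (A), 1 (B)
  I Page 7 S: 2 (C), 3 (D)
  I Page 8 S: 4 (E), 5 (F)
  I Page 9 S: 6 (G), 7 (H)
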
Levels are numbered starting from 0 at the leaf pages, incrementing up the tree.
Pages on each level are doubly-linked with previous and next pointers in ascending order by key.
Records within a page are singly-linked with a next pointer in ascending order by key.
Infimum represents a value lower than any key on the page, and is always the first record in the singly-linked list of records.
Supremum represents a value higher than any key on the page, and is always the last record in the singly-linked list of records.
Non-leaf pages contain the minimum key of the child page and the child page number, called a "node pointer".
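
The descent through node pointers described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical in-memory dictionary (`PAGES`) that mirrors the example tree; the function name `lookup` and the data layout are illustrative and are not InnoDB's own API.

```python
from bisect import bisect_right

# Hypothetical in-memory model of the example tree: page number -> (level, records).
# Non-leaf records are (min_key, child_page) node pointers; leaf records are (key, value).
PAGES = {
    3: (2, [(0, 4), (4, 5)]),
    4: (1, [(0, 6), (2, 7)]),
    5: (1, [(4, 8), (6, 9)]),
    6: (0, [(0, "A"), (1, "B")]),
    7: (0, [(2, "C"), (3, "D")]),
    8: (0, [(4, "E"), (5, "F")]),
    9: (0, [(6, "G"), (7, "H")]),
}


def lookup(key, root=3):
    """Return (value or None, list of pages visited) for a key."""
    page_no = root
    path = [page_no]
    while True:
        level, records = PAGES[page_no]
        if level == 0:
            # Leaf page: scan the records in key order.
            for k, v in records:
                if k == key:
                    return v, path
            return None, path
        # Non-leaf page: follow the last node pointer whose min key is <= key.
        min_keys = [k for k, _ in records]
        idx = max(bisect_right(min_keys, key) - 1, 0)
        page_no = records[idx][1]
        path.append(page_no)
```

For the example tree, `lookup(5)` visits pages 3, 5 and 8 before finding the value on the leaf.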
B+Tree Detailed Page Structure
Page 3 (Leaf+Root)

System records:
  Infimum  @ Offset 99:  Order: 0, N_Owned: 1, Next Record = 26
  Supremum @ Offset 112: Order: 1, N_Owned: 4

User records:
  Offset  Key  Trx ID  Roll Ptr  Value  Order  N_Owned  Min_Rec  Deleted  Next Record
  125     0    1       1000      "A"    2      0        false    false    = 32
  157     1    1       1012      "B"    3      0        false    false    = 32
  189     2    1       1024      "C"    4      0        false    false    = -77

Page header:
  Index ID: 15      N_Heap: 5         Level: 0
  N_Recs: 3         Max_Trx_ID: 1     Direction: Right
  N_Direction: 2    N_Dir_Slots: 2    Directory: [99, 112]
  Prev: NULL        Next: NULL        Last Insert: 189
  Heap Top: 216     Garbage Off.: 0   Garbage Size: 0

InnoDB table format is Barracuda with "compact" record structure, non-compressed.


Table created with: CREATE TABLE t (i INT NOT NULL, s CHAR(10) NOT NULL, PRIMARY KEY(i)) ENGINE=InnoDB;
Table populated with: INSERT INTO t (i, s) VALUES (0, "A"), (1, "B"), (2, "C");
Record size: 5 (header) + 4 (PK) + 6 (TRX_ID) + 7 (ROLL_PTR) + 10 (non-key fields) = 32 bytes
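
The record-size arithmetic and the relative "Next Record" values in this example can be checked with a small sketch. The offsets and deltas below are transcribed from the example page; the function names are illustrative, and the model ignores everything except the next pointers.

```python
# Values transcribed from the example page. Each "next" value is relative
# to the offset of the record that holds it.
INFIMUM, SUPREMUM = 99, 112
NEXT_DELTA = {99: 26, 125: 32, 157: 32, 189: -77}


def record_size(pk_bytes, non_key_bytes, header=5, trx_id=6, roll_ptr=7):
    """Total bytes for a clustered leaf record with fixed-length fields."""
    return header + pk_bytes + trx_id + roll_ptr + non_key_bytes


def walk(start=INFIMUM):
    """Follow relative next pointers from infimum until supremum is reached."""
    offsets = []
    pos = start
    while pos != SUPREMUM:
        pos = pos + NEXT_DELTA[pos]
        offsets.append(pos)
    return offsets
```

Calling `walk()` visits offsets 125, 157 and 189 and ends at 112 (supremum), which is why the last record's next value is negative: supremum sits before the user records on the page.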
B+Tree Page Directory Structure
Page Directory (slot address: record offset):
@ 16374: 99 (Infimum), @ 16372: 221, @ 16370: 349, @ 16368: 477, @ 16366: 605, @ 16364: 733, @ 16362: 112 (Supremum)

Infimum @ Offset 99: N_Owned: 1
Supremum @ Offset 112: N_Owned: 5

A page directory slot "owns" records prior to itself, and represents them in the directory.

Records (K: key, O: N_Owned):
@ 125 @ 157 @ 189 @ 221 @ 253 @ 285 @ 317 @ 349

K: 0 K: 1 K: 2 K: 3 K: 4 K: 5 K: 6 K: 7
O: 0 O: 0 O: 0 O: 4 O: 0 O: 0 O: 0 O: 4

@ 381 @ 413 @ 445 @ 477 @ 509 @ 541 @ 573 @ 605

K: 8 K: 9 K: 10 K: 11 K: 12 K: 13 K: 14 K: 15
O: 0 O: 0 O: 0 O: 4 O: 0 O: 0 O: 0 O: 4

@ 637 @ 669 @ 701 @ 733 @ 765 @ 797 @ 829 @ 861

K: 16 K: 17 K: 18 K: 19 K: 20 K: 21 K: 22 K: 23
O: 0 O: 0 O: 0 O: 4 O: 0 O: 0 O: 0 O: 0

N_Dir_Slots (@ 38): 7

Infimum always owns only itself, so will always have a slot in the page directory with N_Owned = 1.
Supremum always owns the last few records in the page, and is allowed to own fewer than 4 records (if the page has fewer).
All directory slots will own a minimum of 4 and maximum of 8 records, except supremum, which may own fewer.
The page directory grows "downwards" from offset 16376, the beginning of the FIL trailer; the first directory entry starts at 16374.
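
One way to picture how the directory is used: binary-search the slots, then walk the few records owned by the matching slot. The sketch below assumes the 24-record example page above (records every 32 bytes from offset 125); `slot_offset`, `next_record` and `find` are illustrative names, not InnoDB functions.

```python
from bisect import bisect_left

PAGE_SIZE = 16384
FIL_TRAILER_START = PAGE_SIZE - 8          # 16376
INFIMUM, SUPREMUM = 99, 112

# Example page: keys 0..23 stored at offsets 125, 157, ..., 861.
RECORD_KEYS = {125 + 32 * k: k for k in range(24)}
DIRECTORY = [99, 221, 349, 477, 605, 733, 112]   # slot 0 .. slot 6


def slot_offset(slot):
    """Byte offset of a 2-byte directory slot; slot 0 sits just below the FIL trailer."""
    return FIL_TRAILER_START - 2 * (slot + 1)


def next_record(offset):
    """Next pointer for the example page only (records are laid out in key order)."""
    if offset == INFIMUM:
        return 125
    if offset == 861:
        return SUPREMUM
    return offset + 32


def slot_key(offset):
    if offset == INFIMUM:
        return float("-inf")
    if offset == SUPREMUM:
        return float("inf")
    return RECORD_KEYS[offset]


def find(key):
    """Return (record offset or None, slot index) for a key."""
    slot_keys = [slot_key(o) for o in DIRECTORY]
    slot = bisect_left(slot_keys, key)      # first slot whose record key is >= key
    pos = DIRECTORY[slot - 1]               # owned records follow the previous slot's record
    while True:
        pos = next_record(pos)
        if pos == SUPREMUM:
            return None, slot
        if RECORD_KEYS[pos] == key:
            return pos, slot
        if RECORD_KEYS[pos] > key:
            return None, slot
```

Because each slot owns at most 8 records, the linear walk after the binary search stays short regardless of how many records the page holds.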
B+Tree Record Initial State
Offset 99 Offset 112

Infimum Supremum

@ 126 @ 159 @ 192

K: 1, D: No K: 2, D: No K: 3, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 225 @ 258 @ 291

K: 4, D: No K: 5, D: No K: 6, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 324 @ 357 @ 390

K: 7, D: No K: 8, D: No K: 9, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 44 @ 46
Garbage Off.: 0 Garbage Size: 0

SQL: create table t (i int not null, s varchar(100) not null, primary key(i)) engine=innodb;
SQL: insert into t (i, s) values (i, "abcdefghij"); repeated for i in 1..9
B+Tree Record Delete 1
Offset 99 Offset 112

Infimum Supremum

@ 126 @ 159 @ 192

K: 1, D: No K: 2, D: No K: 3, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 225 @ 258 @ 291

K: 4, D: No K: 5, D: Yes K: 6, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 324 @ 357 @ 390

K: 7, D: No K: 8, D: No K: 9, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 44 @ 46
Garbage Off.: 258 Garbage Size: 33

SQL: delete from t where i = 5;


Row is marked as deleted.
Garbage size is incremented by total row size.
Garbage offset is pointed to row, and row next pointer is pointed back to self.
B+Tree Record Delete 2
Offset 99 Offset 112

Infimum Supremum

@ 126 @ 159 @ 192

K: 1, D: No K: 2, D: No K: 3, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 225 @ 258 @ 291

K: 4, D: Yes K: 5, D: Yes K: 6, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 324 @ 357 @ 390

K: 7, D: No K: 8, D: No K: 9, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 44 @ 46
Garbage Off.: 225 Garbage Size: 66

SQL: delete from t where i = 5; delete from t where i = 4;


Garbage size is incremented by total row size for each delete.
Garbage offset is pointed to row @ 258 initially, and row next pointer is pointed back to self.
Garbage offset is updated to row @ 225, and row next pointer is pointed to previous garbage offset (garbage is added to head of list).
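
The bookkeeping for the two deletes can be simulated with a small model. This is a sketch under the slide's description only: offsets are kept as absolute values, a record whose next pointer equals its own offset ends the list, and the class and method names are hypothetical.

```python
class PageGarbage:
    """Toy model of the garbage offset, garbage size and garbage list on one page."""

    def __init__(self):
        self.garbage_offset = 0     # 0 means the garbage list is empty
        self.garbage_size = 0
        self.garbage_next = {}      # record offset -> next garbage record offset

    def delete(self, offset, size):
        # New garbage goes to the head of the list. The first deleted record
        # points back to itself, as described on the slide.
        self.garbage_next[offset] = self.garbage_offset or offset
        self.garbage_offset = offset
        self.garbage_size += size

    def garbage_list(self):
        out = []
        pos = self.garbage_offset
        while pos:
            out.append(pos)
            nxt = self.garbage_next[pos]
            pos = 0 if nxt == pos else nxt
        return out
```

Replaying the example (`delete(258, 33)` then `delete(225, 33)`) leaves the garbage offset at 225, the garbage size at 66, and the list order 225 then 258.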
B+Tree Record Update - Smaller
Offset 99 Offset 112

Infimum Supremum

@ 126 @ 159 @ 192

K: 1, D: No K: 2, D: No K: 3, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 225 @ 258 (followed by Wasted Space) @ 291

K: 4, D: No K: 5, D: No K: 6, D: No
V: abcdefghij V: abcde V: abcdefghij

@ 324 @ 357 @ 390

K: 7, D: No K: 8, D: No K: 9, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 44 @ 46
Garbage Off.: 0 Garbage Size: 5

SQL: update t set s="abcde" where i = 5;


Garbage size is incremented by size of row shrinkage, but wasted space is not tracked in garbage list.
B+Tree Record Update - Larger
Offset 99 Offset 112

Infimum Supremum

@ 126 @ 159 @ 192

K: 1, D: No K: 2, D: No K: 3, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 225 @ 258 @ 291

K: 4, D: No K: 5, D: Yes K: 6, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 324 @ 357 @ 390

K: 7, D: No K: 8, D: No K: 9, D: No
V: abcdefghij V: abcdefghij V: abcdefghij

@ 423

K: 5, D: No
V: abcdefghijklmno

@ 44 @ 46
Garbage Off.: 258 Garbage Size: 33

SQL: update t set s="abcdefghijklmno" where i = 5;


The old row @ 258 is marked as deleted and added to the garbage list, and a new, larger row is inserted at the top of the heap (@ 423).
Index File Segment Structure
Page 3 (INDEX)

FSEG Header
Roots
Index

Leaf Inode Internal Inode


(fseg_id=2) (fseg_id=3)
Page 2 (INODE)

Inode 5 (fseg_id=2) Inode 6


Frag Array Full List Not Full List Free List (fseg_id=3)
Inodes

Single Pages Length = 5 Length = 1 Length = 0 Frag Array


Used = 2 First Last First Last First Last Full
XDES XDES XDES XDES XDES XDES Not Full
0 1 ... 31
Free

Page 0 (FSP_HDR) Page 16384 (XDES)


Descriptors
Extent

0 1 2 3 4 5 ... 255 0 1 2 3 4 5 ... 255

Space (Extents):
Extent 0 / Page ≥0 (64 Pages): Page 0, 1, 2, ... Page 62, Page 63
Extent 1 / Page ≥64 (64 Pages)
...
Extent 256 / Page ≥16384 (64 Pages)
Extent 257 / Page ≥16448 (64 Pages)

History Structure
Page 5 (TRX_SYS)
Transaction
System

Rollback Segment List


Slot 0 Slot 1 Slot 2
(0, 43) (0, 44) (0, 45)

Page 43 (SYS) Page 44


Undo Segment Array History List (SYS)
Segments
Rollback

Length = 1 Array
1022
1023

... Full
0
1
2

First Last
Not Full
Free
Undo Segments

Page 301 (UNDO_LOG)


Undo Page List
Length = 1

First Last
Undo Logs

Space (Extents):
Extent 0 / Page ≥0 (64 Pages): Page 0, 1, 2, ... Page 62, Page 63
Extent 1 / Page ≥64 (64 Pages)
...
Extent 256 / Page ≥16384 (64 Pages)
Extent 257 / Page ≥16448 (64 Pages)


Record History
Record (TRX_ID = 3, id = 1, a = "C")
  → Undo Update (TRX_ID = 2, id = 1, a = "B")
  → Undo Update (TRX_ID = 1, id = 1, a = "A")
  → Undo Insert (id = 1)

History above represents the following SQL statements:


INSERT INTO t (id, a) VALUES (1, "A");
UPDATE t SET a="B" WHERE id = 1;
UPDATE t SET a="C" WHERE id = 1;
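
The version chain above can be modelled as a linked list that a reader walks backwards until it reaches a version old enough to see. The sketch below is a deliberate simplification: it assumes visibility is decided by a single maximum transaction ID, which is not how real read views work, and the `Version` class and `version_as_of` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    trx_id: int
    row: dict
    older: Optional["Version"] = None   # None: the row did not exist before (undo insert)


# The chain from the example: current record, then two update undo records.
# The end of the chain stands for the insert undo record.
v1 = Version(trx_id=1, row={"id": 1, "a": "A"})
v2 = Version(trx_id=2, row={"id": 1, "a": "B"}, older=v1)
current = Version(trx_id=3, row={"id": 1, "a": "C"}, older=v2)


def version_as_of(record, max_trx_id):
    """Walk back through history to the newest version with trx_id <= max_trx_id."""
    v = record
    while v is not None and v.trx_id > max_trx_id:
        v = v.older
    return v.row if v is not None else None
```

With this model, a reader limited to transaction 2 sees `a = "B"`, and a reader limited to transaction 0 sees no row at all.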
Space File Overview
0
FSP_HDR: Filespace Header / Extent Descriptor
16 KiB
IBUF_BITMAP: Insert Buffer Bookkeeping
32 KiB
INODE: Index Node Information
48 KiB

More pages ...


256 MiB
XDES: Extent descriptor for next 16,384 pages
IBUF_BITMAP: IBUF Bitmap for next 16,384 pages

More pages ...


512 MiB
XDES: Extent descriptor for next 16,384 pages
IBUF_BITMAP: IBUF Bitmap for next 16,384 pages

More pages ...


ibdata1 File Overview
0
FSP_HDR: Filespace Header / Extent Descriptor

Fixed Page Number Allocations


16 KiB
IBUF_BITMAP: Insert Buffer Bookkeeping
32 KiB
INODE: Index Node Information
48 KiB
SYS: Insert Buffer Header
64 KiB
INDEX: Insert Buffer Root
TRX_SYS: Transaction System Header
SYS: First Rollback Segment
SYS: Data Dictionary Header

More pages ...


Page 64
Double Write Buffer Block 1 (64 pages)
Page 128
Double Write Buffer Block 2 (64 pages)
Page 192

More pages ...


IBD File Overview
0

Fixed Pages
FSP_HDR: Filespace Header / Extent Descriptor
16 KiB
IBUF_BITMAP: Insert Buffer Bookkeeping
32 KiB
INODE: Index Node Information
48 KiB
INDEX: Root page of first index
64 KiB
INDEX: Root page of second index
INDEX: Node pages ...
INDEX: Leaf pages ...
ALLOCATED: Reserved but unused pages ...

More pages ...


List Base Node
N
List Length (4)
N+4
Page Number (4)
First

N+8
Offset (2)
N+10
Page Number (4)
Last

N+14
Offset (2)
N+16
List Node
N
Prev Page Number (4)
N+4
Offset (2)
N+6
Page Number (4)
Next

N+10
Offset (2)
N+12
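
The two list structures above map directly onto fixed-size binary layouts. The following is a minimal parsing sketch using Python's `struct` module, assuming big-endian integers (the byte order InnoDB uses on disk); the function names and dictionary keys are illustrative.

```python
import struct

# ">" selects big-endian with no padding between fields.
LIST_BASE_NODE = struct.Struct(">IIHIH")   # length, first (page, offset), last (page, offset)
LIST_NODE = struct.Struct(">IHIH")         # prev (page, offset), next (page, offset)


def parse_list_base_node(buf: bytes, n: int) -> dict:
    length, first_page, first_off, last_page, last_off = LIST_BASE_NODE.unpack_from(buf, n)
    return {"length": length,
            "first": (first_page, first_off),
            "last": (last_page, last_off)}


def parse_list_node(buf: bytes, n: int) -> dict:
    prev_page, prev_off, next_page, next_off = LIST_NODE.unpack_from(buf, n)
    return {"prev": (prev_page, prev_off), "next": (next_page, next_off)}
```

The struct sizes come out to 16 and 12 bytes, matching the N+16 and N+12 end offsets in the diagrams.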
Basic Page Overview
0
FIL Header (38)
38

Other headers and page data,


depending on page type.

Total usable space: 16,338 bytes.

16376
FIL Trailer (8)
16384
FIL Header/Trailer
0
Checksum (4)
4
Offset (Page Number) (4)
8
Previous Page (4)
12
Next Page (4)
16
LSN for last page modification (8)
24
Page Type (2)
26
Flush LSN (0 except space 0 page 0) (8)
34
Space ID (4)
38

...
16376
Old-style Checksum (4)
16380
Low 32 bits of LSN (4)
16384
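
As a worked example of the layout above, the header and trailer can be unpacked from a raw 16 KiB page. This sketch assumes big-endian fields and the default 16,384-byte page size; `parse_fil` and its dictionary keys are illustrative names.

```python
import struct

PAGE_SIZE = 16384
FIL_HEADER = struct.Struct(">IIIIQHQI")   # 38 bytes at offset 0
FIL_TRAILER = struct.Struct(">II")        # 8 bytes at offset 16376


def parse_fil(page: bytes) -> dict:
    if len(page) != PAGE_SIZE:
        raise ValueError("expected a 16 KiB page")
    (checksum, page_no, prev_page, next_page,
     lsn, page_type, flush_lsn, space_id) = FIL_HEADER.unpack_from(page, 0)
    old_checksum, lsn_low32 = FIL_TRAILER.unpack_from(page, PAGE_SIZE - FIL_TRAILER.size)
    return {
        "checksum": checksum,
        "page_no": page_no,
        "prev_page": prev_page,
        "next_page": next_page,
        "lsn": lsn,
        "page_type": page_type,
        "flush_lsn": flush_lsn,
        "space_id": space_id,
        "old_checksum": old_checksum,
        # The trailer repeats the low 32 bits of the header LSN.
        "lsn_matches_trailer": (lsn & 0xFFFFFFFF) == lsn_low32,
    }
```

Comparing the header LSN with the trailer's low 32 bits gives a cheap consistency check that both ends of the page come from the same write.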
FSP_HDR/XDES Overview
0
FIL Header (38)
38
FSP Header (zero-filled for XDES pages) (112)
150
XDES Entry 0 (pages 0- 63) (40)
190
XDES Entry 1 (pages 64- 127) (40)
230
XDES Entry 2 (pages 128- 191) (40)
270
XDES Entry 3 (pages 192- 255) (40)
310

...
10310
XDES Entry 254 (pages 16256-16319) (40)
10350
XDES Entry 255 (pages 16320-16383) (40)
10390

(Empty Space: 5,986 bytes)


16376
FIL Trailer (8)
16384
FSP Header
38
Space ID (4)
42
(Unused) (4)
46
Highest page number in file (size) (4)
50
Highest page number initialized (free limit) (4)
54
Flags (4)
58
Number of pages used in "FREE_FRAG" list (4)
62
List base node for "FREE" list (16)
78
List base node for "FREE_FRAG" list (16)
94
List base node for "FULL_FRAG" list (16)
110
Next Unused Segment ID (8)
118
List base node for "FULL_INODES" list (16)
134
List base node for "FREE_INODES" list (16)
150
XDES Entry
N
File Segment ID (8)
N+8
List node for XDES list (12)
N+20
State (4)
N+24
Page State Bitmap (16)
N+40
2 bits per page, 1=free, 2=clean
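
Combining the FSP_HDR/XDES page layout with the 40-byte entry size, the descriptor for any page can be located with simple arithmetic. The sketch below assumes 16 KiB pages, 64 pages per extent and one descriptor page per 16,384 pages, as in the diagrams; `xdes_location` is an illustrative name. It does not decode the bitmap itself.

```python
PAGES_PER_EXTENT = 64
PAGES_PER_XDES_PAGE = 16384
XDES_ARRAY_START = 150      # after the FIL header (38) and FSP header (112)
XDES_ENTRY_SIZE = 40


def xdes_location(page_no: int):
    """Return (descriptor page number, entry index, byte offset of the entry)."""
    xdes_page = page_no - page_no % PAGES_PER_XDES_PAGE
    entry = (page_no % PAGES_PER_XDES_PAGE) // PAGES_PER_EXTENT
    entry_offset = XDES_ARRAY_START + XDES_ENTRY_SIZE * entry
    return xdes_page, entry, entry_offset
```

For example, page 16448 is described by entry 1 on page 16384, at byte offset 190 within that page.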
IBUF_BITMAP Overview
0
FIL Header (38)
38

Change Buffer Bitmap (pages 0-16383) (8192)


(4 bits per page)

8230

(Empty Space: 8,146 bytes)

16376
FIL Trailer (8)
16384
IBUF_BITMAP Page Entry
Free Space (2 bits)

Buffered Flag (1 bit)


Change Buffer Flag (1 bit)
INODE Overview
0
FIL Header (38)
38
List node for INODE Page list (12)
50
INODE 0 (192)
242
INODE 1 (192)
434
INODE 2 (192)
626

...
15986
INODE 83 (192)
16178
INODE 84 (192)
16370
(Empty Space, 6 bytes)
16376
FIL Trailer (8)
16384
INODE Entry
N
FSEG ID (8)
N+8
Number of used pages in "NOT_FULL" list (4)
N+12
List base node for "FREE" list (16)
N+28
List base node for "NOT_FULL" list (16)
N+44
List base node for "FULL" list (16)
N+60
Magic Number = 97937874 (4)
N+64
Fragment Array Entry 0 (4)
N+68

...

N+188
Fragment Array Entry 31 (4)
N+192
INDEX Overview
0
FIL Header (38)
38
INDEX Header (36)
74
FSEG Header (20)
94
System Records (26)
120

User Records

Records are un-ordered physically but singly-linked to each other


via "next" pointers to the byte offset of the next record in
ascending order.

Heap
Top
Free Space

Page Directory

The page directory grows downwards from the FIL trailer in


ascending order by key. The number of entries is stored in the
INDEX header.

16376
FIL Trailer (8)
16384
INDEX Header
38
Number of Directory Slots (2)
40
Heap Top Position (2)
42
Number of Heap Records / Format Flag (2)
44
First Garbage Record Offset (2)
46
Garbage Space (2)
48
Last Insert Position (2)
50
Page Direction (2)
52
Number of Inserts in Page Direction (2)
54
Number of Records (2)
56
Maximum Transaction ID (8)
64
Page Level (2)
66
Index ID (8)
74
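
The 36-byte INDEX header can be unpacked in one step. This is a sketch assuming big-endian fields; the field names are illustrative, and the split of the third field into a 15-bit heap record count and a high-bit format flag reflects my reading of "Number of Heap Records / Format Flag".

```python
import struct

INDEX_HEADER = struct.Struct(">HHHHHHHHHQHQ")   # 36 bytes at page offset 38
FIELDS = (
    "n_dir_slots", "heap_top", "n_heap_and_format", "garbage_offset",
    "garbage_size", "last_insert", "direction", "n_direction",
    "n_recs", "max_trx_id", "level", "index_id",
)


def parse_index_header(page: bytes) -> dict:
    header = dict(zip(FIELDS, INDEX_HEADER.unpack_from(page, 38)))
    raw = header.pop("n_heap_and_format")
    header["n_heap"] = raw & 0x7FFF               # low 15 bits: number of heap records
    header["compact_format"] = bool(raw & 0x8000)  # high bit: format flag
    return header
```

A page is a leaf when the parsed `level` is 0, and `n_dir_slots` tells a reader how many 2-byte slots to read backwards from the FIL trailer.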
FSEG Header
74
Leaf Pages Inode Space ID (4)
78
Leaf Pages Inode Page Number (4)
82
Leaf Pages Inode Offset (2)
84
Internal (non-leaf) Inode Space ID (4)
88
Internal (non-leaf) Inode Page Number (4)
92
Internal (non-leaf) Inode Offset (2)
94
INDEX System Records
94
Info Flags (4 bits)
Number of Records Owned (4 bits)
95
Order (13 bits)
Record Type (3 bits)
97
Next Record Offset (2)
99
"infimum\0" (8)
107
Info Flags (4 bits)
Number of Records Owned (4 bits)
108
Order (13 bits)
Record Type (3 bits)
110
Next Record Offset (2)
112
"supremum" (8)
120
INDEX Page Directory
N-((d+1)*2)
Directory Slot d (2)
...
N-4
Directory Slot 1 (2)
N-2
Directory Slot 0 (2)
N
TRX_SYS Overview
0
FIL Header (38)
38
Transaction ID (8)
46
TRX_SYS FSEG Entry (10)
56
Rollback Segment 0: Space (4)
60
Rollback Segment 0: Page (4)
64
...
Rollback Segment 127: Space (4)
Rollback Segment 127: Page (4)
1080

(Empty Space: 13304 bytes)

14384
Master Log Info (112)
14496
(Empty Space: 888 bytes)
15384
Binary Log Info (112)
15496
(Empty Space: 688 bytes)
16184
Doublewrite Buffer Info (38)
16222
(Empty Space: 154 bytes)
16376
FIL Trailer (8)
16384
TRX_SYS MySQL Log Info
N
Magic Number (4)
N+4
Log Offset (8)
N+12
Log Name (100)
N+112
TRX_SYS Doublewrite Buffer Info
16184
Doublewrite Buffer FSEG Entry (10)
16194
Doublewrite Magic Number (4)
16198
Block 1 Start Page (4)
16202
Block 2 Start Page (4)
16206
Doublewrite Magic Number (4)
16210
Block 1 Start Page (4)
16214
Block 2 Start Page (4)
16218
Space ID Stored Magic Number (4)
16222
SYS_RSEG_HEADER Overview
0
FIL Header (38)
38
Rollback Segment Header (34)
72
Undo Segment Slot 0 (4)
...
Undo Segment Slot 1023 (4)
4168

(Empty Space: 12208 bytes)

16376
FIL Trailer (8)
16384
Rollback Segment Header
38
Max Size (4)
42
History Size (4)
46
History List Base Node (16)
62
Rollback Segment FSEG Entry (10)
72
UNDO_LOG Overview
0
FIL Header (38)
38
UNDO Page Header (18)
56
UNDO Segment Header (30)
86

Undo Records

16376
FIL Trailer (8)
16384
Undo Page Header
38
Undo Page Type (2)
40
Latest Log Record Offset (2)
42
Free Space Offset (2)
44
Undo Page List Node (12)
56
Undo Segment Header
56
State (2)
58
Last Log Offset (2)
60
Undo Segment FSEG Entry (10)
70
Undo Segment Page List Base Node (16)
86
Undo Log Header
0
Transaction ID (8)
8
Transaction Number (8)
16
Delete Marks Flag (2)
18
Log Start Offset (2)
20
XID Flag (1)
21
DDL Transaction Flag (1)
22
Table ID if DDL Transaction (8)
30
Next Undo Log Offset (2)
32
Prev Undo Log Offset (2)
34
History List Node (12)
46
XID Format (4)
If XID Flag

50
TRID Length (4)
54
BQUAL Length (4)
58
XID Data (128)
186
Undo Record
N-2
Previous Record Offset (2)
N
Next Record Offset (2)
N+2
Type + Extern Flag + Compilation Info (1)
N+3
Undo Number (1-11)
Table ID (1-11)

(Variable length undo record data.)


Undo Record for Update
N-2
Previous Record Offset (2)
N
Next Record Offset (2)
N+2
Type + Extern Flag + Compilation Info (1)
N+3
Undo Number (1-11)
Table ID (1-11)
Key Field 1 Length (1-5)
Key Field 1 Content (n)
Key Fields

...
Key Field N Length (1-5)
Key Field N Content (n)
Updated Field Count
Updated Field 1 Field Number (1-5)
Updated Field 1 Length (1-5)
Updated Fields

Updated Field 1 Content (n)


...
Updated Field N Field Number (1-5)
Updated Field N Length (1-5)
Updated Field N Content (n)
(Variable Length Record Header)
Rec. Offset ->
(Variable Length Record Data)
Record Format - Overview
Variable field lengths (1-2 bytes per var. field)
Nullable field bitmap (1 bit per nullable field)
"Header":
N-5
Info Flags (4 bits)
Number of Records Owned (4 bits)
N-4
Order (13 bits)
Record Type (3 bits)
N-2
Next Record Offset (2)
N
(Record Data)
Record Format - Header
Variable field lengths (1-2 bytes per var. field)
Nullable field bitmap (1 bit per nullable field)
N-5
Info Flags (4 bits)
Number of Records Owned (4 bits)
N-4
Order (13 bits)
Record Type (3 bits)
N-2
Next Record Offset (2)
N
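
The fixed 5-byte part of the header can be decoded with a little bit arithmetic. This sketch assumes big-endian storage, the field order shown above (high bits first within each byte), and a signed next-record offset relative to the record's own offset N; `decode_record_header` is an illustrative name.

```python
import struct

# 1 byte (info flags + n_owned), 2 bytes (order + record type), 2 bytes (signed next offset)
FIXED_HEADER = struct.Struct(">BHh")


def decode_record_header(page: bytes, origin: int) -> dict:
    """Decode the 5 bytes that end at the record's offset N (here called origin)."""
    flags_owned, order_type, next_rel = FIXED_HEADER.unpack_from(page, origin - 5)
    return {
        "info_flags": flags_owned >> 4,
        "n_owned": flags_owned & 0x0F,
        "order": order_type >> 3,
        "record_type": order_type & 0x07,
        "next_relative": next_rel,
        "next_offset": origin + next_rel,
    }
```

Applied to the record at offset 189 in the earlier detailed-page example (Order 4, next = -77), the decoded next offset is 112, the supremum record.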
Rollback Pointer (ROLL_PTR)
0
Insert Flag (1 bit)
Rollback Segment ID (7 bits)
1
Page Number (4)
5
Page Offset (2)
7
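
The 7-byte roll pointer packs four fields into a single big-endian value, which can be pulled apart with shifts and masks. The sketch below assumes the insert flag is the most significant bit, as the field order above suggests; `decode_roll_ptr` is an illustrative name.

```python
def decode_roll_ptr(raw: bytes) -> dict:
    """Split a 7-byte ROLL_PTR into its four fields."""
    if len(raw) != 7:
        raise ValueError("ROLL_PTR is 7 bytes")
    value = int.from_bytes(raw, "big")
    return {
        "is_insert": bool(value >> 55),            # 1 bit
        "rseg_id": (value >> 48) & 0x7F,           # 7 bits
        "page_no": (value >> 16) & 0xFFFFFFFF,     # 4 bytes
        "page_offset": value & 0xFFFF,             # 2 bytes
    }
```

The page number and offset together identify the undo record that holds the previous version of the row.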
Record Format - Clustered Key - Leaf Pages
(Variable Length Record Header)
N
Cluster Key Fields (k)
N+k
Transaction ID (6)
N+k+6
Roll Pointer (7)
N+k+13
Non-Key Fields (j)
N+k+13+j
Record Format - Clustered Key - Non-Leaf Pages
(Variable Length Record Header)
N
Cluster Key Min. Key on Child Page (k)
N+k
Child Page Number (4)
N+k+4
Record Format - Secondary Key - Leaf Pages
(Variable Length Record Header)
N
Secondary Key Fields (k)
N+k
Cluster Key Fields (j)
N+k+j
Record Format - Secondary Key - Non-Leaf Pages
(Variable Length Record Header)
N
Secondary Key Min. Key on Child Page (k)
N+k
Cluster Key Min. Key on Child Page (j)
N+k+j
Child Page Number (4)
N+k+j+4
INDEX Overview
0
FIL Header (38)
38
INDEX Header (36)
74
FSEG Header (20)
94

Compressed Record Data

Modification Log

(Free Space)

Trailer Data

Dense Page Directory


Compressed Index Data

Uncompressed Index Records (each with Key fields, System Fields, Non-key fields):
  Record 1: "A"
  Record 2: "C"
  Record 3: "B"
  Record 4: "D"

Compressed Data:
  ZLIB Header
  Description 1, Description 2, Description 3, Description 4
  Record Data 1, Record Data 2, Record Data 3, Record Data 4

Uncompressed Data:
  TRX_ID/ROLL_PTR 4, TRX_ID/ROLL_PTR 3, TRX_ID/ROLL_PTR 2, TRX_ID/ROLL_PTR 1
  Slot 3, Slot 2, Slot 1, Slot 0
Modification Log
Entry 1: Heap Number (1-2), Record Data
Entry 2: Heap Number (1-2), Record Data
...
Entry N: Heap Number (1-2), Record Data
End Marker (1) = 0
Trailer Data
Entry N: TRX_ID (6), Roll Pointer (7)
...
Entry 2: TRX_ID (6), Roll Pointer (7)
Entry 1: TRX_ID (6), Roll Pointer (7)



Dense Page Directory
Slot N: Deleted Flag (1 bit), Owned Flag (1 bit), Record Offset (14 bits)
...
Slot 1: Deleted Flag (1 bit), Owned Flag (1 bit), Record Offset (14 bits)
Slot 0: Deleted Flag (1 bit), Owned Flag (1 bit), Record Offset (14 bits)
The dense page directory contains one entry per record, in the key's collation order.
All directory slots will own a minimum of 4 and maximum of 8 records.
The page directory grows "downwards" from the end of the page.
Record Format - Change Buffer - Leaf Pages
Space ID (4)
Field Marker (1)
Page Number (4)
Metadata: Operation Counter (2), Operation Type (2), Flags (1)
Type Info. 1: Data Type (1), "Precise" Data Type (1), Length (2), Collation Code (2)
...
Type Information N
Secondary Index Fields (j)
