
B-Tree in Database Management Systems (DBMS)

Uploaded by

royale9063
Copyright
© All Rights Reserved

B-Tree in Database Management Systems (DBMS)

PRESENTED BY:
SAURAV ANAND (2224556)
SHIVAM KUMAR (2224562)
Slide 1
Introduction (1-a)
•Invented by Rudolf Bayer and Edward M. McCreight (early 1970s): The B-Tree was introduced by German computer scientist
Rudolf Bayer together with Edward M. McCreight while they were working at Boeing Research Labs; their paper appeared in 1972.

•Purpose: It was designed to efficiently manage and access large amounts of data stored on
disk, particularly in situations where the data could not fit into memory.

•Name Origin: The "B" in B-Tree is often thought to stand for "balanced" or "broad," but Bayer
himself did not specify a particular meaning for the letter.

•Impact on Databases: The invention of the B-Tree significantly influenced the development of
database indexing methods, leading to its widespread adoption in database management
systems.

•Foundation for Variants: The original B-Tree concept led to several important variations,
including B+ Trees and B* Trees, which are commonly used in modern DBMSs.
Slide 2
Introduction (1-b)
• B-Trees are fundamental data structures in database management
systems (DBMS), designed to handle large volumes of data
efficiently.
• They play a critical role in indexing, which is vital for quick data
retrieval in databases.
• Unlike simpler tree structures, B-Trees maintain a balanced
structure, ensuring that operations like search, insert, and delete
are consistently fast, even as the data grows.
• In databases, where performance is key, the balanced nature of B-
Trees helps maintain logarithmic time complexity for operations,
making them highly efficient for managing large datasets.
Slide 3
B-Tree or "BLESSING TREE"
Slide 4
What is a B-Tree?
• A B-Tree is a self-balancing tree data structure that maintains sorted data and allows
searches, insertions, deletions, and sequential access in logarithmic time.
• Unlike binary search trees (BSTs), where each node has at most two children, B-Trees
can have multiple children per node.
• This characteristic makes B-Trees highly efficient for managing large datasets, as the
tree's height remains logarithmic relative to the number of records.
• A B-Tree's structure is defined by its order, which determines the maximum number of
children each node can have.
• All leaf nodes are at the same level, meaning the tree’s height is kept minimal,
facilitating quick access to data.
• The keys within each node are kept in sorted order, and nodes are connected via
pointers, allowing the tree to efficiently manage large amounts of data. This balance
and structure make B-Trees particularly effective for use in databases, where they
significantly improve the efficiency of data retrieval operations. As a result, B-Trees are
widely used in DBMS for indexing, where they help manage and quickly access large
volumes of data.
Slide 5
Why Use B-Trees in a DBMS?
• In databases, performance is often constrained by how quickly data can be
retrieved from disk storage.
• B-Trees are optimized for this purpose, as their balanced structure ensures that
operations such as searching, inserting, and deleting can be performed in
logarithmic time, even as the size of the data grows.
• This efficiency is crucial because, in large databases, even slight delays in data
retrieval can lead to significant performance bottlenecks.
• B-Trees minimize these delays by maintaining a balanced tree structure that
keeps the height of the tree low, ensuring that the number of disk I/O operations
required to access data remains minimal.
• Additionally, B-Trees are designed to store keys and pointers in a way that aligns
well with how data is stored on disk, further optimizing disk access times.
• Their ability to efficiently manage large amounts of data while maintaining quick
access times is why B-Trees are a staple in modern DBMS.
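To make the "low height means few disk reads" argument concrete, here is a minimal sketch that computes the worst-case number of levels for a given key count. It assumes the order-m definition used in this deck (non-root nodes have at least ceil(m/2) children, the root at least two), from which a tree with h levels holds at least 2 * ceil(m/2)^(h-1) - 1 keys. The function name `max_levels` is only illustrative.

```python
def max_levels(num_keys: int, order: int) -> int:
    """Worst-case number of node levels in a B-Tree of the given order.

    Assumes order >= 3, so every non-root internal node has at least
    d = ceil(order / 2) >= 2 children.
    """
    if order < 3:
        raise ValueError("order must be at least 3")
    d = -(-order // 2)  # integer ceil(order / 2)
    levels = 1
    # A tree with (levels + 1) levels needs at least 2 * d**levels - 1 keys.
    # Keep adding levels while the key count could still fill such a tree.
    while 2 * d ** levels - 1 <= num_keys:
        levels += 1
    return levels


if __name__ == "__main__":
    for order in (3, 200):
        print(order, max_levels(1_000_000, order))
```

Under these assumptions, one million keys need at most 3 levels at order 200 but up to 19 levels at order 3. Since each level visited typically costs one disk read, a wide node (sized to a disk block) is what keeps lookups cheap.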
Slide 6
B-Tree Structure
• The structure of a B-Tree is what gives it its efficiency and balance.
• A B-Tree consists of nodes, each of which contains a certain number of
keys and pointers.
• The keys within a node are always stored in sorted order, and the
pointers link to child nodes, which contain further keys and pointers.
• The root node is the topmost node in the tree, and from the root, the
tree branches out into multiple levels. Each internal node other than the
root has at least ceil(m/2) children and at most m children, where m is
the order of the tree; a non-leaf root may have as few as two children.
• The leaf nodes, which are the lowest level in the tree, contain keys but
no child pointers. One of the key characteristics of B-Trees is that all
leaf nodes are at the same level, which ensures that the tree remains
balanced.
Slide 7
Properties of B-Tree
•Each internal node (except the root) has at least ceil(m/2) and at most m children, where m is the order of
the B-Tree.

•The root node must have at least two children if it is not a leaf node.

•Every node (except the root) must have at least ceil(m/2) - 1 keys.

•A node can have at most (m – 1) keys.

•All leaf nodes are at the same level, ensuring the tree is balanced and that the height is minimal.
•B-Trees maintain logarithmic time complexity (O(log n)) for search, insertion, and deletion operations, even
as the dataset grows.
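The properties above can be checked mechanically. The following is a small sketch, not a reference implementation: the `Node` class and `check_btree` function are hypothetical names, a leaf is represented as a node with no children, and the check covers key counts, child counts, sorted keys, and equal leaf depth (it does not verify that child key ranges fit between the parent's keys).

```python
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty list => leaf


def check_btree(node: Node, m: int, is_root: bool = True) -> int:
    """Return the number of levels below and including `node`.

    Raises ValueError if a structural property of an order-m B-Tree fails.
    """
    min_keys = 1 if is_root else ceil(m / 2) - 1
    if not (min_keys <= len(node.keys) <= m - 1):
        raise ValueError(f"node {node.keys} has an invalid key count")
    if node.keys != sorted(node.keys):
        raise ValueError(f"node {node.keys} is not sorted")
    if not node.children:
        return 1
    # k keys always separate exactly k + 1 children, which also enforces
    # the minimum and maximum child counts.
    if len(node.children) != len(node.keys) + 1:
        raise ValueError(f"node {node.keys} has a wrong number of children")
    depths = {check_btree(child, m, is_root=False) for child in node.children}
    if len(depths) != 1:
        raise ValueError("leaves are not all at the same level")
    return depths.pop() + 1


if __name__ == "__main__":
    tree = Node([10, 20], [Node([1, 5]), Node([12]), Node([25, 30, 35])])
    print(check_btree(tree, m=4))  # number of levels in this order-4 tree
```

Returning the depth from each recursive call is a convenient way to confirm that all leaves sit at the same level without a second traversal.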
Slide 8
B-Tree Insertion Overview
BTreeInsert(Tree, Key):
(t is the minimum degree: a node holds at most 2*t - 1 keys, i.e. order m = 2*t)
1. Root = Tree.Root
2. If Root is full (has 2*t - 1 keys):
   a. Allocate a new node S
   b. Set Tree.Root = S; S has only one child (the old root)
   c. Split the old root and move the median key up into S
   d. InsertNonFull(S, Key)
3. Else:
   a. InsertNonFull(Root, Key)

InsertNonFull(Node, Key):
1. If Node is a leaf:
   a. Insert Key at its sorted position in the node's key array
2. Else (Node is an internal node):
   a. Find the child whose key range contains Key
   b. If that child is full:
      i. Split the child and move the median key up into Node
      ii. Choose which half (left or right of the median) to descend into
   c. Recursively call InsertNonFull on the chosen child
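The insertion steps above can be sketched in runnable form as follows. This is one possible in-memory rendering under stated assumptions: it uses the minimum degree t from the pseudocode, the class and method names (`BTreeNode`, `BTree`, `_split_child`, `in_order`) are illustrative, and disk reads and writes are left out.

```python
import bisect


class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


class BTree:
    def __init__(self, t=2):
        self.t = t  # minimum degree: at most 2*t - 1 keys per node
        self.root = BTreeNode()

    def insert(self, key):
        if len(self.root.keys) == 2 * self.t - 1:
            # Full root: grow the tree by one level, then split the old root.
            new_root = BTreeNode(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_non_full(self.root, key)

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        median = full.keys[t - 1]
        right.keys = full.keys[t:]        # upper t - 1 keys
        full.keys = full.keys[:t - 1]     # lower t - 1 keys
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        parent.keys.insert(i, median)     # median moves up
        parent.children.insert(i + 1, right)

    def _insert_non_full(self, node, key):
        if node.leaf:
            bisect.insort(node.keys, key)
            return
        i = bisect.bisect_right(node.keys, key)
        if len(node.children[i].keys) == 2 * self.t - 1:
            self._split_child(node, i)
            if key > node.keys[i]:        # median is now node.keys[i]
                i += 1
        self._insert_non_full(node.children[i], key)

    def in_order(self, node=None):
        node = self.root if node is None else node
        if node.leaf:
            return list(node.keys)
        out = []
        for i, child in enumerate(node.children):
            out.extend(self.in_order(child))
            if i < len(node.keys):
                out.append(node.keys[i])
        return out


if __name__ == "__main__":
    tree = BTree(t=2)
    for k in [10, 20, 5, 6, 12, 30, 7, 17]:
        tree.insert(k)
    print(tree.root.keys, tree.in_order())
```

Splitting full nodes on the way down means the parent always has room for the median key, so a single top-to-bottom pass is enough and no splits need to propagate back up.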
Slide 9
Detailed Insertion Example
Slide 10
B-Tree Deletion Overview

•Deletion from a Leaf Node:
Simple Case: Remove the key directly if the node still has enough keys after deletion.
Underflow Case: If the node has too few keys after deletion, rebalancing is needed.

•Deletion from an Internal Node:
Replace Key: Replace the key with its in-order predecessor (the largest key in the left subtree) or successor
(the smallest key in the right subtree), then delete that predecessor/successor from its leaf node.
Underflow Case: If that leaf has too few keys after the deletion, rebalancing is required.

•Rebalancing:
Borrowing from Sibling: If a sibling has extra keys, borrow one to maintain balance.
Merging with Sibling: If siblings don’t have extra keys, merge the underflowing node with a sibling, and adjust
the parent.

•Root Adjustments:
Root with One Child: If the root becomes empty, its only child becomes the new root, reducing the tree’s
height.
Root Remains Balanced: If the root still has multiple children, the tree remains balanced without height
reduction.
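The rebalancing step can be sketched on its own, separate from a full delete routine. The code below is a simplified illustration under assumptions: `Node` and `fix_underflow` are hypothetical names, `min_keys` stands for ceil(m/2) - 1, and the caller is expected to repeat the fix on the parent if a merge leaves it short, and to replace an empty root with its only child.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list => leaf


def fix_underflow(parent, i, min_keys):
    """Repair parent.children[i] after it dropped below min_keys keys."""
    child = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None

    if left is not None and len(left.keys) > min_keys:
        # Borrow from left: separator moves down, left's largest key moves up.
        child.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
        if left.children:
            child.children.insert(0, left.children.pop())
        return "borrowed from left"

    if right is not None and len(right.keys) > min_keys:
        # Borrow from right: separator moves down, right's smallest key moves up.
        child.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        if right.children:
            child.children.append(right.children.pop(0))
        return "borrowed from right"

    # No sibling can lend a key: merge with a sibling around the separator.
    if left is not None:
        i -= 1
        keep, absorb = left, child
    else:
        keep, absorb = child, right
    keep.keys += [parent.keys.pop(i)] + absorb.keys
    keep.children += absorb.children
    parent.children.pop(i + 1)
    return "merged"


if __name__ == "__main__":
    parent = Node([10, 20], [Node([5]), Node([]), Node([25, 30])])
    print(fix_underflow(parent, 1, min_keys=1), parent.keys)
```

Borrowing goes through the parent's separator key rather than moving a key directly between siblings, which is what keeps the keys in each subtree correctly ordered relative to the parent.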
Slide 11
Detailed Deletion Example
Slide 12
Search Operation in B-Tree
BTreeSearch(Node, Key):
(n is the number of keys in Node)
1. i = 1
2. While i <= n and Key > Node.Key[i] do:
   i = i + 1
3. If i <= n and Key == Node.Key[i] then:
   Return (Node, i)
4. If Node is a leaf then:
   Return "Key not found"
5. Else:
   Read child node Node.C[i] from disk
   Return BTreeSearch(Node.C[i], Key)
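The search procedure translates almost line for line into code. The sketch below is an in-memory illustration with assumed names (`Node`, `btree_search`); it uses 0-based indexing instead of the 1-based indexing of the pseudocode, replaces the linear scan with a binary search inside each node, and omits the disk read.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list => leaf


def btree_search(node, key):
    """Return (node, index) if key is found, otherwise None."""
    # First position whose key is >= the search key (steps 1-2).
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node, i                      # step 3: found in this node
    if not node.children:
        return None                         # step 4: leaf reached, not found
    return btree_search(node.children[i], key)  # step 5: descend


if __name__ == "__main__":
    root = Node([10, 20], [Node([5, 6, 7]), Node([12, 17]), Node([30])])
    hit = btree_search(root, 17)
    print(hit[0].keys, hit[1])
    print(btree_search(root, 8))
```

Binary search within a node does not change the number of nodes visited, which is what dominates the cost when each node is a disk block.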
Slide 13
B-Tree vs Other Tree Structures
Slide 14
Applications of B-Trees in DBMS
• B-Trees are used in large databases to access data stored on
disk.
• Searching a data set takes significantly less time with a
B-Tree index than with a full scan.
• Because each level indexes the level below it, B-Trees
provide multilevel indexing.
• Many database and file-system servers also rely on B-Tree indexes.
• B-Trees are used in CAD systems to organize and
search geometric data.
• B-Trees are also used in other areas such as natural
language processing, computer networks, and
Slide 15
Advantages of B-Trees
• B-Trees have a guaranteed time complexity of O(log n)
for basic operations like insertion, deletion, and
searching, which makes them suitable for large data
sets and real-time applications.
• B-Trees are self-balancing.
• They support high concurrency and high throughput.
• They use storage efficiently: every node except the root is
at least about half full.
• B-Trees are also designed to minimize disk reads and
writes, which are the slowest part of data retrieval.
Slide 16
Limitations of B-Trees
• Not the best choice for every workload.
• Can be slower than simpler structures in some cases, such as
hash tables for exact-match lookups on in-memory data.
• B-Trees can introduce overhead in environments
where data is frequently modified.
• Each insertion or deletion may trigger node splits or
merges, which can add computational overhead.
• Each node in a B-Tree contains not only the keys but
also pointers to child nodes, which can consume
significant memory.
Slide 17
Real-World Examples
•B-Trees are widely used in DBMSs like MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.

•Implemented for both primary and secondary indexing to optimize data retrieval.

•Reduce disk I/O by storing data in disk-optimized blocks.

•Efficiently handle large volumes of data, maintaining fast search, insert, and delete operations.

•MySQL's InnoDB uses B+ Trees for indexing; PostgreSQL and Oracle also rely on B-Trees for index
management.

•Enhance query execution speed and maintain data integrity during concurrent operations.
