Introduction of B-Tree

Last Updated : 29 Jan, 2025

A B-Tree is a specialized m-way tree designed to optimize data access, especially on disk-based storage systems.

  • In a B-Tree of order m, each node can have up to m children and m-1 keys, allowing it to efficiently manage large datasets.
  • The value of m is decided based on disk block and key sizes.
  • A key feature of a B-Tree is that a single node can store a large number of keys. This high fan-out keeps the tree short, which reduces the number of costly disk operations.
  • B-Trees allow fast data retrieval and updates, making them a good choice for systems that require efficient and scalable data management.
  • By maintaining a balanced structure at all times, B-Trees deliver consistent and efficient performance for critical operations such as search, insertion, and deletion.
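
As a rough illustration of how m can follow from the disk block and key sizes, the sketch below assumes a simplified node layout that holds only m child pointers and m-1 keys, with no per-node header. The function name `max_order` and the example sizes are hypothetical and not taken from any particular storage engine.

```python
def max_order(block_size, key_size, pointer_size):
    """Largest order m whose node fits in one disk block.

    Assumed (simplified) layout: m child pointers plus m - 1 keys, so
    m * pointer_size + (m - 1) * key_size <= block_size.
    """
    if block_size < 2 * pointer_size + key_size:
        raise ValueError("block too small to hold even an order-2 node")
    # Solving the inequality for m gives this integer division.
    return (block_size + key_size) // (key_size + pointer_size)


# Illustrative sizes only: 4096-byte block, 8-byte keys, 8-byte pointers.
print(max_order(4096, 8, 8))  # 256
```

With these assumed sizes a node of order 256 uses 256 * 8 + 255 * 8 = 4088 bytes, while order 257 would need 4104 bytes and no longer fit in the block.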

Following is an example of a B-Tree of order 5.


Properties of a B-Tree

A B-Tree of order m can be defined as an m-way search tree that satisfies the following properties:

  1. All leaf nodes of a B-Tree are at the same level, i.e. they have the same depth.
  2. The keys within each node are stored in ascending order.
  3. Every non-leaf node (except the root) has at least ⌈m/2⌉ children.
  4. Every node (except the root) has at least ⌈m/2⌉ - 1 keys.
  5. If the root is a leaf (the only node in the tree), it has no children and at least one key. If the root is a non-leaf node, it has at least 2 children and at least one key.
  6. A non-leaf node with n-1 keys has exactly n non-NULL children.

We can see in the above diagram that all the leaf nodes are at the same level, that no non-leaf node has an empty sub-tree, and that every non-leaf node has one fewer key than it has children.
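
The properties above can be checked mechanically. The following is a minimal sketch, assuming a simplified node with a `keys` list and a `children` list (empty for a leaf); the class layout and the name `is_valid_btree` are hypothetical. It checks the key counts, child counts, key order within a node, and leaf depth, but not the ordering of keys between a parent and its subtrees.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # an empty list marks a leaf


def is_valid_btree(root, m):
    """Return True if the tree satisfies the structural B-Tree properties for order m."""
    min_keys = (m + 1) // 2 - 1  # ceil(m / 2) - 1 keys for every non-root node
    leaf_depths = set()

    def check(node, depth, is_root):
        n_keys = len(node.keys)
        # Keys inside a node must be strictly ascending.
        if any(a >= b for a, b in zip(node.keys, node.keys[1:])):
            return False
        # At least one key, and at most m - 1 keys.
        if n_keys < 1 or n_keys > m - 1:
            return False
        # Non-root nodes must meet the minimum occupancy.
        if not is_root and n_keys < min_keys:
            return False
        if not node.children:
            leaf_depths.add(depth)
            return True
        # A non-leaf node has exactly one more child than it has keys.
        if len(node.children) != n_keys + 1:
            return False
        return all(check(c, depth + 1, False) for c in node.children)

    # All leaves must have been found at a single depth.
    return check(root, 0, True) and len(leaf_depths) == 1


good = Node([50], [Node([10, 20]), Node([60, 70, 80])])
underfull = Node([50], [Node([10]), Node([60, 70])])
print(is_valid_btree(good, 5))       # True
print(is_valid_btree(underfull, 5))  # False: a leaf holds fewer than 2 keys
```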

Interesting Facts about B-Tree

  • The minimum height of a B-Tree that holds n keys, where m is the maximum number of children a node can have, is: h_{min} = \lceil \log_m (n + 1) \rceil - 1
  • The maximum height of a B-Tree that holds n keys, where t is the minimum number of children a non-root node can have, is: h_{max} = \lfloor \log_t \frac{n + 1}{2} \rfloor, with t = \lceil \frac{m}{2} \rceil
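
To make the two bounds concrete, here is a small sketch that evaluates them for given n and m. It uses integer loops instead of floating-point logarithms so that exact powers are not misrounded. The function name `height_bounds` is hypothetical, and n is taken to be the number of keys.

```python
def height_bounds(n, m):
    """Return (h_min, h_max) for a B-Tree of order m that holds n keys."""
    if n < 1 or m < 3:
        raise ValueError("expects n >= 1 and m >= 3")
    t = (m + 1) // 2  # t = ceil(m / 2)

    # h_min = ceil(log_m(n + 1)) - 1: smallest h with m ** (h + 1) >= n + 1
    h_min = 0
    while m ** (h_min + 1) < n + 1:
        h_min += 1

    # h_max = floor(log_t((n + 1) / 2)): largest h with 2 * t ** h <= n + 1
    h_max = 0
    while 2 * t ** (h_max + 1) <= n + 1:
        h_max += 1

    return h_min, h_max


print(height_bounds(1000, 5))  # (4, 5)
```

For n = 1000 and m = 5 (so t = 3), the tree has height at least 4 because 5^4 = 625 < 1001 <= 5^5, and at most 5 because 2 * 3^5 = 486 <= 1001 < 2 * 3^6.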

Need of a B-Tree

The B-Tree data structure is essential for several reasons:

  • Improved Performance Over M-way Trees:
    While M-way trees can be either balanced or skewed, B-Trees are always self-balanced. This self-balancing property ensures fewer levels in the tree, significantly reducing access time compared to M-way trees. This makes B-Trees particularly suitable for external storage systems where faster data retrieval is crucial.
  • Optimized for Large Datasets:
    B-Trees are designed to handle millions of records efficiently. Their reduced height and balanced structure enable faster sequential access to data and simplify operations like insertion and deletion. This ensures efficient management of large datasets while maintaining an ordered structure.

Operations on B-Tree

B-Trees support various operations that make them highly efficient for managing large datasets. Below are the key operations:

Sr. No.   Operation   Time Complexity
1.        Search      O(log n)
2.        Insert      O(log n)
3.        Delete      O(log n)
4.        Traverse    O(n)

Note: "n" is the total number of elements in the B-tree
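
Traversal is linear because every key is visited exactly once. A minimal in-order traversal sketch, assuming a simplified node with a `keys` list and a `children` list (both names are hypothetical), might look like this:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # an empty list marks a leaf


def traverse(node):
    """Yield every key of the subtree in ascending order."""
    for i, key in enumerate(node.keys):
        if node.children:
            # Keys in child i are smaller than keys[i], so visit them first.
            yield from traverse(node.children[i])
        yield key
    if node.children:
        # The last child holds the keys greater than every key in this node.
        yield from traverse(node.children[-1])


root = Node([50], [Node([10, 20]), Node([60, 70, 80])])
print(list(traverse(root)))  # [10, 20, 50, 60, 70, 80]
```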

Search Operation in B-Tree 

Search is similar to search in a Binary Search Tree. Let k be the key to be searched.

  • Start from the root and recursively traverse down.
  • For every visited node
    • If the current node contains k, return the node.
    • Otherwise, if the node is not a leaf, determine the appropriate child to traverse. This is the child just before the first key greater than k (or the last child if no key is greater than k).
  • If we reach a leaf node and don't find k in the leaf node, then return NULL.

The algorithm is recursive. The keys in each node act as separators: if k is not in the current node, it can only lie in the one subtree whose key range contains it, so every other branch is skipped. Because these keys limit the search, they are also known as limiting values or separation values. If the search reaches a leaf node without finding the desired key, it returns NULL.

Input: Search 120 in the given B-Tree. 

The key 120 is located in the leaf node containing 110 and 120. The search process is complete.

Algorithm for Searching an Element in a B-Tree

C++
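// Capacity for a B-Tree of order 5 (the order is an assumption for this example).
constexpr int MAX_CHILDREN = 5;
constexpr int MAX_KEYS = MAX_CHILDREN - 1;

struct Node {
    int n;                      // number of keys currently stored
    int key[MAX_KEYS];          // keys in ascending order
    Node* child[MAX_CHILDREN];  // child[i] is the subtree just before key[i]
    bool leaf;                  // true if the node has no children
};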

Node* BtreeSearch(Node* x, int k) {
    int i = 0;
    while (i < x->n && k > x->key[i]) {
        i++;
    }
    if (i < x->n && k == x->key[i]) {
        return x;
    }
    if (x->leaf) {
        return nullptr;
    }
    return BtreeSearch(x->child[i], k);
}
C
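#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 5                /* order 5 is assumed for this example */
#define MAX_KEYS (MAX_CHILDREN - 1)

typedef struct Node {
    int n;                            /* number of keys in this node */
    int key[MAX_KEYS];
    struct Node *child[MAX_CHILDREN];
    bool leaf;
} Node;

Node *BtreeSearch(Node *x, int k) {
    int i = 0;
    while (i < x->n && k > x->key[i])
        i++;
    if (i < x->n && k == x->key[i])
        return x;                     /* found k in node x */
    if (x->leaf)
        return NULL;
    return BtreeSearch(x->child[i], k);
}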
Java
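class Node {
    // Capacity for a B-Tree of order 5 (the order is an assumption for this example).
    static final int MAX_CHILDREN = 5;
    static final int MAX_KEYS = MAX_CHILDREN - 1;

    int n;                                  // number of keys currently stored
    int[] key = new int[MAX_KEYS];          // keys in ascending order
    Node[] child = new Node[MAX_CHILDREN];  // child references
    boolean leaf;                           // true if the node has no children

    // Returns the node that contains k, or null if k is not in the subtree rooted at x.
    static Node BtreeSearch(Node x, int k) {
        int i = 0;
        // Skip every key smaller than k.
        while (i < x.n && k > x.key[i]) {
            i++;
        }
        if (i < x.n && k == x.key[i]) {
            return x;
        }
        if (x.leaf) {
            return null;
        }
        return BtreeSearch(x.child[i], k);
    }
}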
Python
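# Capacity for a B-Tree of order 5 (the order is an assumption for this example).
MAX_CHILDREN = 5
MAX_KEYS = MAX_CHILDREN - 1

class Node:
    def __init__(self):
        self.n = 0                          # number of keys currently stored
        self.key = [0] * MAX_KEYS           # keys in ascending order
        self.child = [None] * MAX_CHILDREN  # child references
        self.leaf = True                    # a new node has no children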

def BtreeSearch(x, k):
    i = 0
    while i < x.n and k > x.key[i]:
        i += 1
    if i < x.n and k == x.key[i]:
        return x
    if x.leaf:
        return None
    return BtreeSearch(x.child[i], k)
C#
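class Node {
    // Capacity for a B-Tree of order 5 (the order is an assumption for this example).
    public const int MAX_CHILDREN = 5;
    public const int MAX_KEYS = MAX_CHILDREN - 1;

    public int n;                                  // number of keys currently stored
    public int[] key = new int[MAX_KEYS];          // keys in ascending order
    public Node[] child = new Node[MAX_CHILDREN];  // child references
    public bool leaf;                              // true if the node has no children

    // Returns the node that contains k, or null if k is not in the subtree rooted at x.
    public static Node BtreeSearch(Node x, int k) {
        int i = 0;
        // Skip every key smaller than k.
        while (i < x.n && k > x.key[i]) {
            i++;
        }
        if (i < x.n && k == x.key[i]) {
            return x;
        }
        if (x.leaf) {
            return null;
        }
        return BtreeSearch(x.child[i], k);
    }
}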
JavaScript
// Capacity for a B-Tree of order 5 (the order is an assumption for this example).
const MAX_CHILDREN = 5;
const MAX_KEYS = MAX_CHILDREN - 1;

// A node stores n keys in ascending order plus references to its children.
class Node {
    constructor() {
        this.n = 0;
        this.key = new Array(MAX_KEYS);
        this.child = new Array(MAX_CHILDREN);
        this.leaf = true; // a new node has no children yet
    }
}

// Returns the node that contains k, or null if k is not in the subtree rooted at x.
function BtreeSearch(x, k) {
    let i = 0;
    // Skip every key smaller than k.
    while (i < x.n && k > x.key[i]) {
        i++;
    }
    if (i < x.n && k === x.key[i]) {
        return x;
    }
    if (x.leaf) {
        return null;
    }
    return BtreeSearch(x.child[i], k);
}
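
To see the search in action, the sketch below repeats the Python version of `BtreeSearch` and runs it on a small hand-built tree. The tree is hypothetical (it is not the one from the figure) and the order 5 is an assumption; it only mirrors the example by placing 120 in a leaf together with 110. The constructor that takes `keys` and `children` is a convenience added for this illustration.

```python
MAX_CHILDREN = 5            # order 5 is assumed for this example
MAX_KEYS = MAX_CHILDREN - 1


class Node:
    def __init__(self, keys, children=None):
        children = children or []
        self.n = len(keys)
        # Pad the arrays to their fixed capacity, as in the Node layout above.
        self.key = list(keys) + [0] * (MAX_KEYS - len(keys))
        self.child = list(children) + [None] * (MAX_CHILDREN - len(children))
        self.leaf = not children


def BtreeSearch(x, k):
    i = 0
    # Skip every key smaller than k.
    while i < x.n and k > x.key[i]:
        i += 1
    if i < x.n and k == x.key[i]:
        return x
    if x.leaf:
        return None
    return BtreeSearch(x.child[i], k)


leaf_a = Node([10, 20])
leaf_b = Node([60, 70])
leaf_c = Node([110, 120])
root = Node([50, 100], [leaf_a, leaf_b, leaf_c])

print(BtreeSearch(root, 120) is leaf_c)  # True
print(BtreeSearch(root, 125))            # None
```

Searching for 120 compares it with 50 and 100 in the root, finds no key greater than 120, and therefore descends into the last child, where the key is found. Searching for 125 follows the same path and returns `None` at the leaf.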



Applications of B-Trees

  • B-Trees are used in large databases to access data stored on disk.
  • Searching a large data set takes significantly less time with a B-Tree index than with a linear scan.
  • B-Trees make multilevel indexing possible.
  • Many database servers and file systems use B-Trees or their variants for indexing.
  • B-Trees are used in CAD systems to organize and search geometric data.
  • B-Trees are also used in other areas such as natural language processing, computer networks, and cryptography.

Advantages of B-Trees

  • B-Trees have a guaranteed time complexity of O(log n) for basic operations like insertion, deletion, and searching, which makes them suitable for large data sets and real-time applications.
  • B-Trees are self-balancing.
  • They support high concurrency and high throughput.
  • They use storage efficiently: every node except the root is kept at least about half full.

Disadvantages of B-Trees

  • B-Trees are designed for disk-based storage and can have high disk usage, since nodes may be only partially full.
  • They are not the best choice in every case: for small datasets, search in a B-Tree might be slower than in a binary search tree, as each node may contain multiple keys that have to be examined.

Conclusion

In conclusion, B-Trees are a smart and efficient way to handle large amounts of data. Their balanced structure and ability to store multiple keys in one node make searching, adding, and deleting data fast and reliable. B-Trees are especially useful for systems like databases and file storage, where quick access to data is important.

