B+ Tree


A B+ Tree is an advanced variant of the B-tree data structure, widely used in database systems and file systems due to its efficient data retrieval properties. While B-trees allow keys to be stored in both internal and leaf nodes, B+ Trees store all data (or values) in the leaf nodes only. This difference brings several advantages in terms of searching, insertion, and deletion operations.


What is a B+ Tree?

A B+ Tree is a self-balancing tree data structure that maintains sorted data and allows for efficient searching, insertion, and deletion operations. It is an extension of the B-tree, with the following key differences:

  1. Data Storage: In a B+ tree, all actual data values are stored at the leaf level, while internal nodes only store keys (index values).
  2. Linked List of Leaf Nodes: The leaf nodes are connected in a linked list, allowing for efficient range queries (traversing the leaves in sequential order).
  3. Sorted Data: Like B-trees, B+ trees maintain sorted data, which allows for fast searching and retrieval.

Properties of a B+ Tree

Here are the main properties of a B+ Tree:

  1. Balanced Tree: All leaf nodes are at the same level, ensuring that the tree is balanced.
  2. Internal Nodes: Internal nodes store only keys (index values), and they direct the search to the correct leaf node.
  3. Leaf Nodes: Leaf nodes store actual data (records) and are linked to each other to facilitate sequential access.
  4. Order of the Tree: The order (m) of the tree determines the maximum number of children a node can have. It defines how many keys can be stored in a node and the number of child pointers.
  5. Efficient Range Queries: Due to the linked leaf nodes, B+ trees are particularly efficient for range queries, where all records within a specified range need to be retrieved.

Advantages of B+ Trees

  • Efficient Search: Since all actual records are in the leaf nodes and these nodes are ordered and linked, searching for a value is efficient.
  • Efficient Range Queries: The linked list structure of the leaf nodes allows for fast sequential access to records in a range.
  • Balanced Structure: Like the B-tree, a B+ tree is also self-balancing, ensuring that all leaf nodes are at the same level, which helps in maintaining efficient query performance.
  • Better Disk Utilization: The separation of keys and data allows B+ trees to minimize the number of disk accesses for data retrieval. Only the leaf nodes contain data, while internal nodes are smaller and require fewer disk accesses.

B+ Tree Structure

A B+ tree consists of:

  • Root node: The topmost node in the tree that may or may not be full. It contains keys to guide searches to the correct subtree.
  • Internal nodes: Nodes that do not store actual data but contain keys that help in navigation towards the correct leaf node.
  • Leaf nodes: The bottom-most nodes that store actual records or pointers to records and are linked together in a doubly-linked list.

B+ Tree Operations

1. Insertion in a B+ Tree

Insertion in a B+ tree follows these steps:

  1. Search for the correct position: Start at the root and traverse the tree downwards to find the correct leaf node where the new key should be inserted.
  2. Insert the key: If the leaf node has space, insert the key and its corresponding record.
  3. Split if necessary: If the leaf node becomes full, split it into two nodes, and promote the middle key to the parent node.

2. Deletion in a B+ Tree

Deletion in a B+ tree involves:

  1. Search for the key: Start at the root and traverse the tree to find the key.
  2. Delete the key: Remove the key and its corresponding record from the leaf node.
  3. Rebalance if necessary: If a node becomes underfilled, borrow a key from a sibling or merge nodes.

3. Searching in a B+ Tree

To search for a key in a B+ tree:

  1. Start from the root and traverse the tree, following the child pointers based on the key comparisons.
  2. Once a leaf node is reached, perform the actual key search in the leaf node.

Python Code Example: B+ Tree Implementation

Here’s a Python implementation of a simple B+ Tree that demonstrates insertion and search operations.

class BPlusTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t  # Minimum degree (defines range for the number of keys)
        self.leaf = leaf  # True if the node is a leaf node
        self.keys = []  # List of keys in the node
        self.children = []  # List of children nodes
        self.next = None  # For linking leaf nodes in a linked list

class BPlusTree:
    def __init__(self, t):
        self.t = t  # Minimum degree
        self.root = BPlusTreeNode(t, True)  # Root node is initially a leaf node

    def search(self, key):
        return self.search_in_node(self.root, key)

    def search_in_node(self, node, key):
        if node.leaf:
            # Search in the leaf node
            for i in range(len(node.keys)):
                if node.keys[i] == key:
                    return key
            return None
        else:
            # Traverse the internal node to find the correct child node
            i = 0
            while i < len(node.keys) and key > node.keys[i]:
                i += 1
            return self.search_in_node(node.children[i], key)

    def insert(self, key):
        root = self.root
        if len(root.keys) == (2 * self.t) - 1:
            # If the root is full, split it
            new_node = BPlusTreeNode(self.t, False)
            new_node.children.append(self.root)
            self.split(new_node, 0)
            self.root = new_node

        # Insert the key in the non-full root
        self.insert_non_full(self.root, key)

    def insert_non_full(self, node, key):
        if node.leaf:
            # Find the correct position and insert the key in the leaf node
            i = len(node.keys) - 1
            while i >= 0 and key < node.keys[i]:
                i -= 1
            node.keys.insert(i + 1, key)
        else:
            # Traverse to the correct child node
            i = len(node.keys) - 1
            while i >= 0 and key < node.keys[i]:
                i -= 1
            i += 1
            child = node.children[i]
            if len(child.keys) == (2 * self.t) - 1:
                # If the child is full, split it
                self.split(node, i)
                if key > node.keys[i]:
                    i += 1
            self.insert_non_full(node.children[i], key)

    def split(self, parent, index):
        # Split the full child node into two nodes
        full_node = parent.children[index]
        new_node = BPlusTreeNode(self.t, full_node.leaf)
        mid = self.t - 1

        # Promote the middle key to the parent
        parent.keys.insert(index, full_node.keys[mid])
        parent.children.insert(index + 1, new_node)

        # Move the keys and children to the new node
        new_node.keys = full_node.keys[mid + 1:]
        full_node.keys = full_node.keys[:mid]
        if not full_node.leaf:
            new_node.children = full_node.children[mid + 1:]
            full_node.children = full_node.children[:mid + 1]

    def inorder_traversal(self):
        # Perform an in-order traversal to print keys in sorted order
        self.inorder_traversal_helper(self.root)

    def inorder_traversal_helper(self, node):
        if node:
            i = 0
            while i < len(node.keys):
                if not node.leaf:
                    self.inorder_traversal_helper(node.children[i])
                print(node.keys[i], end=" ")
                i += 1
            if not node.leaf:
                self.inorder_traversal_helper(node.children[i])

# Example Usage
btree = BPlusTree(3)  # Create a B+ tree with minimum degree 3
keys = [10, 20, 5, 6, 12, 30, 7, 17]

for key in keys:
    btree.insert(key)

# In-order Traversal (prints keys in sorted order)
print("In-order Traversal of B+ Tree:")
btree.inorder_traversal()

# Search for a key
key = 12
found_key = btree.search(key)
print(f"\nSearch for key {key}: {'Found' if found_key else 'Not Found'}")