Retrieval (Search)
Searching in a B-Tree follows the same principles as in a Binary Search Tree, but with multi-way branching due to each node having children.
The algorithm recursively descends the tree based on comparisons:
- As each node contains multiple keys sorted, a binary search within the node is used to locate the appropriate interval for the target key.
- If the key matches one in the node, the search terminates successfully.
- If not, the key must lie between two existing keys or fall outside the key range and the algorithm follows the child pointer that covers that range.
- If the correct child does not exist (i.e. the current node is a leaf), the search terminates unsuccessfully.
This structure enables fast lookups with minimal disk reads, as each node access potentially narrows the search range significantly.
Example Of Search
Inserting Into A B-Tree
Insertion in a B-Tree involves traversing to the appropriate leaf node and placing the new key, while preserving all B-tree properties, particularly:
- No node exceeds its maximum key capacity and,
- All leaves remain at the same depth (i.e., the tree stays balanced).
Therefore, the algorithm cannot insert into a full node or create arbitrary new leaves. Instead, it follows a proactive strategy:
- Traverse from the root to the appropriate leaf, making comparisons as in a standard search.
- Before descending into any child node, check if it is full (i.e., contains keys).
- If a node is full, do not step on it, split it before descending:
- Divide the node into two around its median key.
- Promote the median key to the parent node, which is guaranteed to have space due to this proactive strategy.
- After all necessary splits, descend to the correct non-full leaf and insert the new key in sorted order (using binary search to find the insertion point).
This algorithm never descend into a full node, allowing insertion without violating the B-tree’s degree constraints. Additionally, as the tree’s height increases from the root (when it splits), the leaves are always at the same depth.
Splitting Preserves B-Tree Bounds
- A full node has keys. When split, one key is promoted to the parent, and the remaining keys are divided evenly into two nodes with keys each, satisfying the lower and upper bounds. This will always be a clean split.
- When a new root is created during a split, it starts with just one key. This is valid, as the root is exempt from the minimum key constraint.
Deletion
Deletion in a B-Tree is analogous to insertion but must avoid descending into nodes with the minimum number of keys (), to prevent underflow. The root is once again, exempt from this restriction.
Deletion is handled through three main cases:
Case 1: x Is At A Leaf Node With Keys
If the element to delete, x, is at a leaf node that is not at minimum capacity (), then simply delete x. This is the only case where deletion occurs, all other cases eventually transform into this one.
Case 2: x Is In An Internal Node
If x is in an internal node, then replace x with a suitable in-order replacement:
- Case 2a: If the left child of
xhas keys (more than the lower bound):- Find the predecessor
win the left subtree, replacexwithw, and recursively deletew.
- Find the predecessor
- Case 2b: If the right child of
xhas keys (more than the lower bound):- Find the successor
yin the right subtree, replacexwithy, and recursively deletey.
- Find the successor
- Case 2c: If both children have exactly keys:
- Merge
xand its two children into a single node, then recursively deletex.
- Merge
When both children exceed the lower bound, either Case 2a or 2b may be chosen.
Case 3: Descending Into A Subtree With Keys
If the recursive deletion path reaches a subtree containing a child node with exactly keys (the lower bound), the algorithm must first ensure the node has sufficient keys to avoid underflow before descending further.
- Case 3a: If the subtrees immediate sibling has keys (i.e. more than the lower bound):
- Let be the deficient child (with keys), and or be the richer sibling (with keys).
- Perform a rotation among the triple as follows:
- Move the appropriate parent key down into to restore capacity.
- Promote the leftmost key from (or rightmost from ) into the parent to preserve key order.
- Reassign child pointers accordingly: the leftmost/rightmost subtree associated with the promoted key must be transferred to the rotated key in .
- Case 3b: If the subtrees immediate sibling has keys
- Merge the node with its immediate sibling and the separating parent key into a single node.
- This new merged node will have keys, satisfying all constraints
- If the parent was the root and becomes empty, the merged node becomes the root (the trees height decreases).
- Merge the node with its immediate sibling and the separating parent key into a single node.
A "sibling" refers to the adjacent subtree of a shared parent key.
Complexity Analysis
Let be the total number of keys in the B-tree, and the minimum degree.
Height
The height of a B-tree is because each node has at least children (except possibly the root), leading to exponential growth in capacity as the depth increases.
Time Complexity
Searching requires node accesses, with each node searched using binary search in time. The combined complexity is .
Insertion involves traversing from root to leaf over levels. At most one split occurs per level, each costing . Thus, the total insertion complexity is .
Deletion also descends levels and may perform merges or rotations costing per level. This results in an overall complexity of .
Space Complexity
A B-tree stores all keys distributed across its nodes, each containing between and keys, resulting in space overall. The node structure and child pointers add only constant overhead per key, so the total space used scales linearly with the number of stored keys.
Although node splits and merges cost , when is fixed, both insertion and deletion run in time overall.
In practice, is chosen large (e.g., ) to keep tree height small and optimise cache and disk performance.
