In C++, B-trees are balanced tree data structures that maintain sorted data and allow searches, sequential access, insertions, and deletions in logarithmic time. B-trees are generalizations of binary search trees (BST) in that a node can have more than two children. B-trees are optimized for systems that read and write large blocks of data. In this article, we will learn how to implement a B-tree in C++ programming language.
Properties of a B-TreeA B-tree is a self-balancing search tree where nodes can have multiple children. It maintains balance by ensuring all leaf nodes are at the same level. The number of children of a node is constrained to a predefined range.
A B-tree has the following properties:
- Every node has at most m children.
- Every non-leaf node (except the root) has at least ⌈m/2⌉ children.
- The root node contains at least two children.
- A non-leaf node with k children contains k-1 keys.
- All leaf nodes appear on the same level and are empty.
Implementation of B-Tree in C++A B-tree can be implemented using a node structure that contains an array of keys and an array of child pointers. The number of keys in a node is always one less than the number of child pointers. The following diagram represents the structure of a B-tree:’
 Diagram of a B-Tree Representation of B-Tree in C++To represent a B-Tree in C++, we will use a class BTreeNode to define the node structure and another class BTree to implement the B-tree operations. We’ll implement a class template to keep the B-tree generic so that it can store multiple data types.
template <typename T>
class BTreeNode {
public:
T* keys;
int t;
BTreeNode<T>** Children;
int n;
bool leaf;
}; here:
- keys: represents the array of keys.
- t: defines the minimum degree
- children: is an array of child pointers for a node.
- leaf: denotes whether a node is leaf or node.
- n: denotes the current number of keys
Implementation of Basic Operations of B-Tree in C++Following are some of the common applications of the B-Tree in C++:
Operation Name
| Description
| Time Complexity
| Space complexity
|
---|
Insert
| Inserts a new element into the tree
| O(log n)
| O(1)
|
---|
Delete Node
| Deletes a given node from the tree
| O(log n)
| O(1)
|
---|
Search
| Searches for an element in the tree
| O(log n)
| O(1)
|
---|
Traverse
| Traverses the B-Tree and prints the elements
| O(n)
| O(1)
|
---|
Split Child
| Splits a full child node during insertion of a node
| O(1)
| O(1)
|
---|
Merge
| Combine two under-full sibling nodes and a parent key into a single node
| O(log n)
| O(1)
|
---|
Borrow
| Transfer a key between sibling nodes via the parent to balance node occupancy.
| O(log n)
| O(1)
|
---|
Implementation of Insert Function- If the root is full, create a new root and split the old root.
- Call insertNonFull on the appropriate node.
- In insertNonFull:
- If the node is a leaf, insert the key in the correct position.
- If the node is not a leaf, find the correct child and:
- If the child is full, split it.
- Recursively call insertNonFull on the appropriate child.
Implementation of Delete Node Function- Search for the key to be deleted.
- If the key is in a leaf node:
- Remove the key if the node has more than minimum keys.
- If not, borrow a key from a sibling or merge with a sibling.
- If the key is in an internal node:
- Replace with predecessor or successor from the appropriate child.
- Recursively delete the predecessor or successor.
- Ensure all nodes maintain the minimum number of keys.
- If the root is left with no keys, make its only child the new root.
Implementation of Search Function- Start from the root.
- For each node:
- If the key is present in the node, return the node.
- If the node is a leaf, return null.
- Recursively search the appropriate child.
Implementation of Traverse Function- Start from the root node.
- For each node:
- Traverse i-th child.
- Print i-th key.
- Repeat for all keys and children.
Implementation of Split Child FunctionSplitting occurs when a node becomes full and needs to be divided into two nodes.
Consider a B-tree of order 5 (maximum 4 keys per node). Let’s say we have a full node:
[10 | 20 | 30 | 40] When we try to insert 25, we need to split this node.We will follow the below steps to do this:
- Create a new node
- Move the upper half of the keys to the new node
- Push the middle key up to the parent
After splitting:
[30]
/ \
[10 | 20] [40] And then we can insert 25:
[30]
/ \
[10 | 20 | 25] [40] - Create a new node.
- Move the upper half of the keys from the full node to the new node.
- If the full node is not a leaf, move the corresponding children as well.
- Insert the middle key into the parent node.
- Update the parent’s child pointers to include the new node.
Implementation of Merge FunctionMerging occurs when a node has fewer than the minimum required keys and can’t borrow from siblings.
Consider a B-tree of order 3 (minimum 1 key per node). Let’s say we have this situation:
[30]
/ \
[10] [40] If we need to delete 40, we can’t leave an empty node. So, we merge the remaining nodes:
[10 | 30] - Identify the two nodes to be merged and their parent.
- If merging with right sibling:
- Move the separating key from the parent to the end of the left node.
- Copy all keys and child pointers from the right node to the left node.
- Update the parent to remove the separating key and the pointer to the right node.
- If merging with left sibling:
- Move the separating key from the parent to the end of the left node.
- Copy all keys and child pointers from the current node to the left node.
- Update the parent to remove the separating key and the pointer to the current node.
- Delete the empty node.
- If the parent is now under-full, recursively apply merge or borrow operations to it.
Implementation of Borrow FunctionBorrowing occurs when a node has fewer than the minimum required keys but can take a key from a sibling.
Consider a B-tree of order 3:
[30]
/ \
[10 | 20] [40] If we need to delete 40, instead of merging, we can borrow it to the other node:
[20]
/ \
[10] [30 | 40] - Identify the borrowing node, the sibling to borrow from, and their parent.
- If borrowing from right sibling:
- Move the separating key from the parent to the end of the current node.
- Move the first key from the right sibling to the parent as the new separating key.
- If the sibling is not a leaf, move its first child pointer to be the last child pointer of the current node.
- If borrowing from left sibling:
- Move the separating key from the parent to the start of the current node.
- Move the last key from the left sibling to the parent as the new separating key.
- If the sibling is not a leaf, move its last child pointer to be the first child pointer of the current node.
- Update key counts for both nodes.
C++ Program to Implement B-TreeThe following program demonstrates the implementation of B-Tree in C++:
C++
// C++ Program to Implement B-Tree
#include <iostream>
using namespace std;
// class for the node present in a B-Tree
template <typename T, int ORDER>
class BTreeNode {
public:
// Array of keys
T keys[ORDER - 1];
// Array of child pointers
BTreeNode* children[ORDER];
// Current number of keys
int n;
// True if leaf node, false otherwise
bool leaf;
BTreeNode(bool isLeaf = true) : n(0), leaf(isLeaf) {
for (int i = 0; i < ORDER; i++)
children[i] = nullptr;
}
};
// class for B-Tree
template <typename T, int ORDER>
class BTree {
private:
BTreeNode<T, ORDER>* root; // Pointer to root node
// Function to split a full child node
void splitChild(BTreeNode<T, ORDER>* x, int i) {
BTreeNode<T, ORDER>* y = x->children[i];
BTreeNode<T, ORDER>* z = new BTreeNode<T, ORDER>(y->leaf);
z->n = ORDER / 2 - 1;
for (int j = 0; j < ORDER / 2 - 1; j++)
z->keys[j] = y->keys[j + ORDER / 2];
if (!y->leaf) {
for (int j = 0; j < ORDER / 2; j++)
z->children[j] = y->children[j + ORDER / 2];
}
y->n = ORDER / 2 - 1;
for (int j = x->n; j >= i + 1; j--)
x->children[j + 1] = x->children[j];
x->children[i + 1] = z;
for (int j = x->n - 1; j >= i; j--)
x->keys[j + 1] = x->keys[j];
x->keys[i] = y->keys[ORDER / 2 - 1];
x->n = x->n + 1;
}
// Function to insert a key in a non-full node
void insertNonFull(BTreeNode<T, ORDER>* x, T k) {
int i = x->n - 1;
if (x->leaf) {
while (i >= 0 && k < x->keys[i]) {
x->keys[i + 1] = x->keys[i];
i--;
}
x->keys[i + 1] = k;
x->n = x->n + 1;
} else {
while (i >= 0 && k < x->keys[i])
i--;
i++;
if (x->children[i]->n == ORDER - 1) {
splitChild(x, i);
if (k > x->keys[i])
i++;
}
insertNonFull(x->children[i], k);
}
}
// Function to traverse the tree
void traverse(BTreeNode<T, ORDER>* x) {
int i;
for (i = 0; i < x->n; i++) {
if (!x->leaf)
traverse(x->children[i]);
cout << " " << x->keys[i];
}
if (!x->leaf)
traverse(x->children[i]);
}
// Function to search a key in the tree
BTreeNode<T, ORDER>* search(BTreeNode<T, ORDER>* x, T k) {
int i = 0;
while (i < x->n && k > x->keys[i])
i++;
if (i < x->n && k == x->keys[i])
return x;
if (x->leaf)
return nullptr;
return search(x->children[i], k);
}
// Function to find the predecessor
T getPredecessor(BTreeNode<T, ORDER>* node, int idx) {
BTreeNode<T, ORDER>* current = node->children[idx];
while (!current->leaf)
current = current->children[current->n];
return current->keys[current->n - 1];
}
// Function to find the successor
T getSuccessor(BTreeNode<T, ORDER>* node, int idx) {
BTreeNode<T, ORDER>* current = node->children[idx + 1];
while (!current->leaf)
current = current->children[0];
return current->keys[0];
}
// Function to fill child node
void fill(BTreeNode<T, ORDER>* node, int idx) {
if (idx != 0 && node->children[idx - 1]->n >= ORDER / 2)
borrowFromPrev(node, idx);
else if (idx != node->n && node->children[idx + 1]->n >= ORDER / 2)
borrowFromNext(node, idx);
else {
if (idx != node->n)
merge(node, idx);
else
merge(node, idx - 1);
}
}
// Function to borrow from previous sibling
void borrowFromPrev(BTreeNode<T, ORDER>* node, int idx) {
BTreeNode<T, ORDER>* child = node->children[idx];
BTreeNode<T, ORDER>* sibling = node->children[idx - 1];
for (int i = child->n - 1; i >= 0; --i)
child->keys[i + 1] = child->keys[i];
if (!child->leaf) {
for (int i = child->n; i >= 0; --i)
child->children[i + 1] = child->children[i];
}
child->keys[0] = node->keys[idx - 1];
if (!child->leaf)
child->children[0] = sibling->children[sibling->n];
node->keys[idx - 1] = sibling->keys[sibling->n - 1];
child->n += 1;
sibling->n -= 1;
}
// Function to borrow from next sibling
void borrowFromNext(BTreeNode<T, ORDER>* node, int idx) {
BTreeNode<T, ORDER>* child = node->children[idx];
BTreeNode<T, ORDER>* sibling = node->children[idx + 1];
child->keys[child->n] = node->keys[idx];
if (!child->leaf)
child->children[child->n + 1] = sibling->children[0];
node->keys[idx] = sibling->keys[0];
for (int i = 1; i < sibling->n; ++i)
sibling->keys[i - 1] = sibling->keys[i];
if (!sibling->leaf) {
for (int i = 1; i <= sibling->n; ++i)
sibling->children[i - 1] = sibling->children[i];
}
child->n += 1;
sibling->n -= 1;
}
// Function to merge two nodes
void merge(BTreeNode<T, ORDER>* node, int idx) {
BTreeNode<T, ORDER>* child = node->children[idx];
BTreeNode<T, ORDER>* sibling = node->children[idx + 1];
child->keys[ORDER / 2 - 1] = node->keys[idx];
for (int i = 0; i < sibling->n; ++i)
child->keys[i + ORDER / 2] = sibling->keys[i];
if (!child->leaf) {
for (int i = 0; i <= sibling->n; ++i)
child->children[i + ORDER / 2] = sibling->children[i];
}
for (int i = idx + 1; i < node->n; ++i)
node->keys[i - 1] = node->keys[i];
for (int i = idx + 2; i <= node->n; ++i)
node->children[i - 1] = node->children[i];
child->n += sibling->n + 1;
node->n--;
delete sibling;
}
// Function to remove a key from a non-leaf node
void removeFromNonLeaf(BTreeNode<T, ORDER>* node, int idx) {
T k = node->keys[idx];
if (node->children[idx]->n >= ORDER / 2) {
T pred = getPredecessor(node, idx);
node->keys[idx] = pred;
remove(node->children[idx], pred);
} else if (node->children[idx + 1]->n >= ORDER / 2) {
T succ = getSuccessor(node, idx);
node->keys[idx] = succ;
remove(node->children[idx + 1], succ);
} else {
merge(node, idx);
remove(node->children[idx], k);
}
}
// Function to remove a key from a leaf node
void removeFromLeaf(BTreeNode<T, ORDER>* node, int idx) {
for (int i = idx + 1; i < node->n; ++i)
node->keys[i - 1] = node->keys[i];
node->n--;
}
// Function to remove a key from the tree
void remove(BTreeNode<T, ORDER>* node, T k) {
int idx = 0;
while (idx < node->n && node->keys[idx] < k)
++idx;
if (idx < node->n && node->keys[idx] == k) {
if (node->leaf)
removeFromLeaf(node, idx);
else
removeFromNonLeaf(node, idx);
} else {
if (node->leaf) {
cout << "The key " << k << " is not present in the tree\n";
return;
}
bool flag = ((idx == node->n) ? true : false);
if (node->children[idx]->n < ORDER / 2)
fill(node, idx);
if (flag && idx > node->n)
remove(node->children[idx - 1], k);
else
remove(node->children[idx], k);
}
}
public:
BTree() { root = new BTreeNode<T, ORDER>(true); }
// Function to insert a key in the tree
void insert(T k) {
if (root->n == ORDER - 1) {
BTreeNode<T, ORDER>* s = new BTreeNode<T, ORDER>(false);
s->children[0] = root;
root = s;
splitChild(s, 0);
insertNonFull(s, k);
} else
insertNonFull(root, k);
}
// Function to traverse the tree
void traverse() {
if (root != nullptr)
traverse(root);
}
// Function to search a key in the tree
BTreeNode<T, ORDER>* search(T k) {
return (root == nullptr) ? nullptr : search(root, k);
}
// Function to remove a key from the tree
void remove(T k) {
if (!root) {
cout << "The tree is empty\n";
return;
}
remove(root, k);
if (root->n == 0) {
BTreeNode<T, ORDER>* tmp = root;
if (root->leaf)
root = nullptr;
else
root = root->children[0];
delete tmp;
}
}
};
int main() {
BTree<int, 3> t;
t.insert(10);
t.insert(20);
t.insert(5);
t.insert(6);
t.insert(12);
t.insert(30);
t.insert(7);
t.insert(17);
cout << "Traversal of the constructed tree is: ";
t.traverse();
cout << endl;
int k = 6;
(t.search(k) != nullptr) ? cout << k << " is found" << endl
: cout << k << " is not found" << endl;
k = 15;
(t.search(k) != nullptr) ? cout << k << " is found" << endl
: cout << k << " is not found" << endl;
t.remove(6);
cout << "Traversal of the tree after removing 6: ";
t.traverse();
cout << endl;
t.remove(13);
cout << "Traversal of the tree after removing 13: ";
t.traverse();
cout << endl;
return 0;
}
Output
Traversal of the constructed tree is: 5 7 17
6 is not found
15 is not found
The key 6 is not present in the tree
Traversal of the tree after removing 6: 5 7 17
The key 13 is not present in the tree
Traversal of the tree after removing 13: 5 7 17 Applications of B-TreeFollowing are some of the common applications of a B-Tree:
- File Systems: Many file systems use B-trees or variants to manage file and directory entries.
- Multilevel Indexing: B-trees are used in multilevel indexing in database management systems for faster data retrieval.
- Database Systems: B-trees are widely used in databases and file systems to store and retrieve large blocks of data efficiently.
- Search Engines: B-trees can be used in the implementation of search engines for efficient storage and retrieval of data.
- Data Compression: B-trees are used in certain data compression algorithms.
- Blockchain: Some blockchain implementations use B-tree variants for efficient data storage and retrieval.
Related Articles:
|