A KD Tree (k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. This structure is particularly useful for applications involving multidimensional search keys, such as range searches and nearest neighbor searches.
In this article, we will learn how to implement a KD Tree in C++ language along with its basic operations and applications.
A KD Tree is a binary tree in which every node represents a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as half-spaces. Points to the left of this hyperplane are represented by the left subtree of that node and points to the right of the hyperplane are represented by the right subtree.
Properties of KD Trees- Every node in the tree is a k-dimensional point.
- Every non-leaf node generates a splitting hyperplane that divides the space into two parts.
- Points to the left of the splitting hyperplane are represented by the left subtree of that node and points to the right of the hyperplane are represented by the right subtree.
- The hyperplane direction is chosen in the following way: it is perpendicular to the axis corresponding to the depth of the node (modulo k).
- The tree is balanced when constructed with points that are uniformly distributed.
The following diagram represents the structure of a KD tree in C++:
 KD Tree  Implementation of KD Tree in C++A KD Tree can be implemented using a node structure that represents a point in k-dimensional space and pointers to its left and right child nodes.
Representation of KD Tree in C++ Node Structure KD Tree To represent a KD Tree in C++, we will use implement a class KDTree: That will consist of: a structure Node to represent points, and member functions to provide basic functionality. To keep the KD Tree generic, we will use templates so that it can store multiple data types.
template <size_t K> class KDTree { private: struct Node { array<double, K> point; Node* left; Node* right; };
Node* root; Here:
- K: the number of dimensions.
- Node: a structure representing a node in the KD Tree.
- root: pointer to the root node of the tree.
- point: an array to store the coordinates.
Basic Operations of KD Tree in C++Following are some of the basic operations of KD Tree that are required to manipulate its elements:
Operation
| Description
| Time Complexity
| Space Complexity
|
---|
Insert
| Inserts a new point into the KD Tree
| O(log n) average, O(n) worst
| O(log n)
|
---|
Search
| Searches for a point in the KD Tree
| O(log n) average, O(n) worst
| O(log n)
|
---|
Nearest Neighbor
| Finds the nearest neighbor to a given point
| O(log n) average, O(n) worst
| O(log n)
|
---|
Range Search
| Finds all points within a given range
| O(√n + k) average, O(n) worst
| O(n)
|
---|
- Start at the root, comparing the new point’s first dimension with the root’s first dimension.
- If the new point’s value is less than the root’s, go to the left child; otherwise, go to the right child.
- At the next level, compare the second dimension. Continue this process, cycling through dimensions.
- When a leaf is reached, create a new node and insert the new point.
- Start at the root, comparing the search point’s first dimension with the root’s first dimension.
- If the search point’s value is less than the root’s, go to the left child; otherwise, go to the right child.
- At the next level, compare the second dimension. Continue this process, cycling through dimensions.
- If an exact match is found, return true. If a leaf is reached without finding a match, return false.
Implementation of Nearest Neighbor Search Function- Traverse the tree as in a normal search to find the leaf node closest to the query point.
- Unwind the recursion, considering at each step whether there could be any points on the other side of the splitting plane that are closer to the query point than the current best.
- If there could be, recurse down the other branch, adding any closer points found to the current best.
- The final best point is the nearest neighbor.
Implementation of Range Search Function- Start at the root. If the current node is within the range, add it to the result.
- Recursively search the left and/or right subtrees if the range intersects their respective spaces.
- Prune the search if the current node’s space does not intersect the query range.
C++ Program to Implement KD TreeThe following program demonstrates the implementation of a KD Tree in C++:
C++
// C++ Program to Implement KD Tree
#include <iostream>
#include <array>
#include <cmath>
using namespace std;
// Template class for KDTree with K dimensions
template <size_t K>
class KDTree {
private:
// Node structure representing each point in the KDTree
struct Node {
// Point in K dimensions
array<double, K> point;
// Pointer to left child
Node* left;
// Pointer to right child
Node* right;
// Constructor to initialize a Node
Node(const array<double, K>& pt) : point(pt), left(nullptr), right(nullptr) {}
};
Node* root; // Root of the KDTree
// Recursive function to insert a point into the KDTree
Node* insertRecursive(Node* node, const array<double, K>& point, int depth) {
// Base case: If node is null, create a new node
if (node == nullptr) return new Node(point);
// Calculate current dimension (cd)
int cd = depth % K;
// Compare point with current node and decide to go left or right
if (point[cd] < node->point[cd])
node->left = insertRecursive(node->left, point, depth + 1);
else
node->right = insertRecursive(node->right, point, depth + 1);
return node;
}
// Recursive function to search for a point in the KDTree
bool searchRecursive(Node* node, const array<double, K>& point, int depth) const {
// Base case: If node is null, the point is not found
if (node == nullptr) return false;
// If the current node matches the point, return true
if (node->point == point) return true;
// Calculate current dimension (cd)
int cd = depth % K;
// Compare point with current node and decide to go left or right
if (point[cd] < node->point[cd])
return searchRecursive(node->left, point, depth + 1);
else
return searchRecursive(node->right, point, depth + 1);
}
// Recursive function to print the KDTree
void printRecursive(Node* node, int depth) const {
// Base case: If node is null, return
if (node == nullptr) return;
// Print current node with indentation based on depth
for (int i = 0; i < depth; i++) cout << " ";
cout << "(";
for (size_t i = 0; i < K; i++) {
cout << node->point[i];
if (i < K - 1) cout << ", ";
}
cout << ")" << endl;
// Recursively print left and right children
printRecursive(node->left, depth + 1);
printRecursive(node->right, depth + 1);
}
public:
// Constructor to initialize the KDTree with a null root
KDTree() : root(nullptr) {}
// Public function to insert a point into the KDTree
void insert(const array<double, K>& point) {
root = insertRecursive(root, point, 0);
}
// Public function to search for a point in the KDTree
bool search(const array<double, K>& point) const {
return searchRecursive(root, point, 0);
}
// Public function to print the KDTree
void print() const {
printRecursive(root, 0);
}
};
int main() {
// Create a KDTree with 2 dimensions
KDTree<2> tree;
// Insert points into the KDTree
tree.insert({3, 6});
tree.insert({2, 2});
tree.insert({4, 7});
tree.insert({1, 3});
tree.insert({2, 4});
tree.insert({5, 4});
tree.insert({7, 2});
// Print the KDTree structure
cout << "KD Tree structure:" << endl;
tree.print();
// Search for specific points in the KDTree
array<double, 2> searchPoint = {2, 4};
cout << "\nSearching for point (2, 4): "
<< (tree.search(searchPoint) ? "Found" : "Not found") << endl;
searchPoint = {6, 3};
cout << "Searching for point (6, 3): "
<< (tree.search(searchPoint) ? "Found" : "Not found") << endl;
return 0;
}
OutputKD Tree structure:
(3, 6)
(2, 2)
(1, 3)
(2, 4)
(4, 7)
(5, 4)
(7, 2)
Searching for point (2, 4): Found
Searching for point (6, 3): Not found
Applications of KD TreeFollowing are some of the common applications of KD tree:
- KD trees are used to efficiently find the closest point(s) in a multidimensional space to a given query point.
- Used to quickly identify all points within a specified range or region in a multidimensional space.
- Used in algorithms like k-nearest neighbors (KNN) for classification and regression tasks.
- Used to index and query multidimensional data in geographic information systems (GIS) and similar applications.
- KD trees are also used to speed up the process of finding clusters in multidimensional data sets.
- Used in image compression algorithms and for feature matching in computer vision tasks.
|