A
Binary Search Tree (BST) is a binary tree (max. 2 childs per node) with every node a key and associated value. Also a BST has the property that for every node, the left subtree contains only nodes with a smaller (or equal) key and the right subtree contains only nodes with strictly larger keys. This property will allow us to perform insertion and deletion operations in time proportional to the tree height. Later more on time complexity.
The BST class is declared as follows:
- public class BinarySearchTree<Tkey, Tvalue> where Tkey : IComparable<Tkey>
The nodes in the BST can be keyed by any type that implements IComparable. There is no restriction on the type of the value. Now the simple node definition:
- protected class BinaryKeyValueNode<Tkey, Tvalue> where Tkey : IComparable<Tkey>
- {
- public Tkey Key { get; set; }
- public Tvalue Value { get; set; }
- public BinaryKeyValueNode<Tkey, Tvalue> Parent { get; set; }
- public BinaryKeyValueNode<Tkey, Tvalue> LeftChild { get; set; }
- public BinaryKeyValueNode<Tkey, Tvalue> RightChild { get; set; }
- public BinaryKeyValueNode(Tkey key, Tvalue value)
- {
- Value = value;
- Key = key;
- }
- }
So every node consists of a key and a value and keeps a reference to both subtrees (child nodes) and its parent. These are the fields and constructor for our BST:
- private Random random;
- protected BinaryKeyValueNode<Tkey, Tvalue> Root { get; set; }
- public int Count { get; protected set; }
-
- public BinarySearchTree()
- {
- Root = null;
- random = new Random(1);
- Count = 0;
- }
We maintain a count (optional) and a reference to the root node (initially null). Insertion goes as follows:
- public void Insert(Tkey key, Tvalue value)
- {
- BinaryKeyValueNode<Tkey, Tvalue> parent = null;
- BinaryKeyValueNode<Tkey, Tvalue> current = Root;
- int compare = 0;
- while (current != null)
- {
- parent = current;
- compare = current.Key.CompareTo(key);
- current = compare < 0 ? current.RightChild : current.LeftChild;
- }
- BinaryKeyValueNode<Tkey, Tvalue> newNode = new BinaryKeyValueNode<Tkey, Tvalue>(key, value);
- if (parent != null)
- if (compare < 0)
- parent.RightChild = newNode;
- else
- parent.LeftChild = newNode;
- else
- Root = newNode;
- newNode.Parent = parent;
- Count++;
- }
The reference
current is the subtree in which we try to insert and
parent is the parent of the
current subtree. As long as
current points to a node (line 6), we go either to the left or to the right, based on the comparison between the key of the current node and the key to the value that we want to insert. If the current key is greater than the key we are inserting then we go to the right subtree, otherwise to the left (line 10). When we have reached an external node, the current will equal null. Now we create the new node and set all the right references (lines 12-20).
Finding a value, given its key, works very much the same as in the following:
- public Tvalue FindFirst(Tkey key)
- {
- return FindNode(key).Value;
- }
-
- public Tvalue FindFirstOrDefault(Tkey key)
- {
- var node=FindNode(key, false);
- return node == null ? default(Tvalue) : node.Value;
- }
-
- protected BinaryKeyValueNode<Tkey, Tvalue> FindNode(Tkey key, bool ExceptionIfKeyNotFound = true)
- {
- BinaryKeyValueNode<Tkey, Tvalue> current = Root;
- while (current != null)
- {
- int compare = current.Key.CompareTo(key);
- if (compare == 0)
- return current;
- if (compare < 0)
- current = current.RightChild;
- else
- current = current.LeftChild;
- }
- if (ExceptionIfKeyNotFound)
- throw new KeyNotFoundException();
- else
- return null;
- }
Starting from the root, we compare the given key to the key of the current node. If they are equal, we can return it. If the given key is strictly larger we continue the search in the right subtree, otherwise in the left. Next we have the delete operation:
- protected void DeleteNode(BinaryKeyValueNode<Tkey, Tvalue> node)
- {
- if (node == null)
- throw new ArgumentNullException();
- if (node.LeftChild != null && node.RightChild != null)
- {
- BinaryKeyValueNode<Tkey, Tvalue> replaceBy = random.NextDouble() > .5 ? InOrderSuccesor(node) : InOrderPredecessor(node);
- DeleteNode(replaceBy);
- node.Value = replaceBy.Value;
- node.Key = replaceBy.Key;
- }
- else
- {
- var child = node.LeftChild == null ? node.RightChild : node.LeftChild;
- if (node.Parent.RightChild == node)
- node.Parent.RightChild = child;
- else
- node.Parent.LeftChild = child;
- }
- Count--;
- }
-
- protected BinaryKeyValueNode<Tkey, Tvalue> InOrderSuccesor(BinaryKeyValueNode<Tkey, Tvalue> node)
- {
- BinaryKeyValueNode<Tkey, Tvalue> succesor = node.RightChild;
- while (succesor.LeftChild != null)
- succesor = succesor.LeftChild;
- return succesor;
- }
-
- protected BinaryKeyValueNode<Tkey, Tvalue> InOrderPredecessor(BinaryKeyValueNode<Tkey, Tvalue> node)
- {
- BinaryKeyValueNode<Tkey, Tvalue> succesor = node.LeftChild;
- while (succesor.RightChild != null)
- succesor = succesor.RightChild;
- return succesor;
Deletion of a node is easy if it has either zero or one child. We can just connect the parent and the child and forget about the deleted node (line 12-18). The BST property will still hold. If the node that needs to be deleted has 2 childs, then we need to find an appropriate node to replace it by that is either the node with the lowest key in the right subtree (in-order-successor), or the node with the highest key in the left subtree (in-order-predecessor) (lines 5-11). We decide which one with a coinflip.
We now have all the essential operations that must be done on a BST. If one would like to traverse the entire tree, to enumerate all its elements, then there are multiple ways to do so, namely depth-first or breadth-first. Here is the implementation to all 3 ways to make a depth-first tree traversal:
- public IEnumerable<Tvalue> TraverseTree(DepthFirstTraversalMethod method)
- {
- return TraverseNode(Root, method);
- }
-
- protected IEnumerable<Tvalue> TraverseNode(BinaryKeyValueNode<Tkey, Tvalue> node, DepthFirstTraversalMethod method)
- {
- IEnumerable<Tvalue> TraverseLeft = node.LeftChild == null ? new Tvalue[0] : TraverseNode(node.LeftChild, method),
- TraverseRight = node.RightChild == null ? new Tvalue[0] : TraverseNode(node.RightChild, method),
- Self = new Tvalue[1] { node.Value };
- switch(method)
- {
- case DepthFirstTraversalMethod.PreOrder:
- return Self.Concat(TraverseLeft).Concat(TraverseRight);
- case DepthFirstTraversalMethod.InOrder:
- return TraverseLeft.Concat(Self).Concat(TraverseRight);
- case DepthFirstTraversalMethod.PostOrder:
- return TraverseLeft.Concat(TraverseRight).Concat(Self);
- default:
- throw new ArgumentException();
- }
- }
-
- public enum DepthFirstTraversalMethod
- {
- PreOrder,
- InOrder,
- PostOrder
- }
If you want to learn more about tree traversal, read the
wikipedia. Traversing a BST in-order will rusult in a sorted enumeration of all node-values.
On Time Complexity
The three basic operations, insert, find and delete, all have time complexity O(h) (linearly dependant), where h is the height of the tree. The height of the tree is bound by the 2-log of the count (lower bound) and the count (upper bound). The expected height of the tree is the square root of the count (that grows faster than the 2-log), after many random insertions and deletions. The performance of a BST can be improved uon by self-balancing; keeping the height small (2-log). I will implement a self-balancing BST in my next article soon.