Overview
A binary tree is a hierarchical data structure composed of nodes. Each node can have up to two children, typically called the left and right child. Binary trees are foundational in computer science, supporting a wide array of algorithms and specialized structures like binary search trees, heaps, and expression trees.
For precise definitions of terms like root, leaf, height, depth, level, and path, see tree_terminology. This page focuses on the structure and general algorithms for binary trees.
Key Properties
- Node degree: Each node has at most two children.
- Height (h): Longest path length from the root to a leaf. An empty tree often has height −1 or 0 by convention.
- Size (n): Number of nodes in the tree.
- Level capacity: At depth d, there are at most 2^d nodes.
- Upper bound on size: For height h, n ≤ 2^(h+1) − 1.
- Special forms: Full (every node has 0 or 2 children), Complete (all levels filled except possibly last, filled left-to-right), Perfect (all leaves at same depth, all internal nodes have 2 children).
- Balance: A tree is balanced if its height is O(log n), which keeps many operations efficient.
Representations
Pointer/Reference-based
// Example in a C-like style
struct Node {
Value value;
Node* left;
Node* right;
};
This is flexible and works for any shape of binary tree.
Array-based (for complete trees)
Use an array where index i stores a node, with children at 2i+1 and 2i+2 (0-based). This is common for heaps due to memory locality and implicit structure.
Traversals
- Preorder (Root, Left, Right): Useful for copying/serializing trees.
- Inorder (Left, Root, Right): Yields sorted order in binary search trees.
- Postorder (Left, Right, Root): Useful for deleting/freeing trees and evaluating expression trees.
- Level-order (Breadth-First): Visits nodes by level using a queue.
# Recursive traversal in Python
class Node:
def __init__(self, val, left=None, right=None):
self.val, self.left, self.right = val, left, right
def preorder(root):
if not root: return
print(root.val)
preorder(root.left)
preorder(root.right)
def inorder(root):
if not root: return
inorder(root.left)
print(root.val)
inorder(root.right)
def postorder(root):
if not root: return
postorder(root.left)
postorder(root.right)
print(root.val)
def level_order(root):
if not root: return
from collections import deque
q = deque([root])
while q:
node = q.popleft()
print(node.val)
if node.left: q.append(node.left)
if node.right: q.append(node.right)
Core Operations and Algorithms
- Search (general binary tree): Without an ordering property, worst-case O(n). Use DFS or BFS.
- Insert: In a general binary tree, insertion position is application-specific (e.g., first available spot for a complete shape). In ordered variants (like BSTs), insertion follows a key-based rule.
- Delete: Removing a node in a general binary tree may require re-linking a subtree or structural rules specific to the application (e.g., heap deletion uses last element replacement then heapify).
- Compute height/size/leaves: Typically O(n) via recursion or iterative traversal.
- Mirror (invert) tree: Swap left and right pointers at every node (O(n)).
- Lowest Common Ancestor (LCA): In a general tree, O(n) via two-path tracing or single-pass recursion; faster with parent pointers or preprocessing.
Complexity
- Let h be the height and n the number of nodes.
- Traversal: O(n) time, O(h) auxiliary space for recursion/stack; BFS uses O(w) queue space where w is max width.
- Search/Insert/Delete (general): O(n) time; in structured variants (e.g., balanced BSTs) typical operations are O(log n).
Applications
- Expression trees: Represent and evaluate arithmetic/logical expressions.
- Priority queues (heaps): Complete binary trees stored as arrays.
- Compression (Huffman trees): Optimal prefix codes from a binary tree.
- Range queries: Segment trees and Fenwick trees are binary-based structures for sums/min/max.
- Decision processes: Binary decision trees for classification and game search.
- Syntax and parsing: Abstract syntax trees (ASTs) often contain binary operators as binary subtrees.
Recursive vs. Iterative Approaches
Many binary tree algorithms are naturally recursive, reflecting the tree’s self-similarity. Iterative versions use explicit stacks for DFS or queues for BFS to control space and avoid call-stack limits.
Common Pitfalls
- Confusing binary trees with binary search trees—ordering rules apply only to BSTs.
- Unbalanced growth leading to height O(n) and degraded performance.
- Stack overflow with deep recursion; consider iterative traversals.
- Forgetting null checks when re-linking nodes during insert/delete.
Practice Ideas
- Implement the four standard traversals (both recursive and iterative).
- Write functions to compute height, size, leaf count, and to mirror a tree.
- Serialize and deserialize a binary tree using preorder with null markers or level-order.
- Compare pointer-based vs. array-based representations on complete vs. sparse trees.
Related Variants and Extensions
- Binary Search Trees (BSTs): Maintain an in-order key property for fast search.
- Balanced Trees: AVL and Red-Black trees keep height logarithmic.
- Heaps: Specialized binary trees for priority queues.
- Segment/Fenwick Trees: Binary-based structures for efficient range queries.