Lowest Common Ancestors and Variations

5 min readMar 20, 2020

The lowest common ancestor (LCA) is the question where given two nodes and the root of the tree. Write a function to determine the first node that contains both nodes.

Input: 
   tree: node
   n1: node
   n2: nodeOutput: 
   ancestor_node: nodeNotes:
   both n1 and n2 are in the tree

Assuming both nodes belong to the same tree, we can always just return the root node of the tree. Since by definition the root node must be an ancestor node to nodes n1 and n2.

The root node is not always the lowest common ancestor. A node further down the tree could be the first node that contains both n1 and n2. The following diagrams illustrate some of the possible positions of the lowest common ancestor.

We can see how the nodes can be the ancestor of each other, and how they may occur at different levels. To get started with explaining how we generally solve this problem, we’ll first define the standard binary tree node’s signature:

class BinaryTreeNode:
    def __init__(self, data: int,
                 left: BinaryTreeNode,
                 right: BinaryTreeNode):
        self.data = data
        self.left = left
        self.right = right

Solution for Standard Binary Tree

To locate the first such node, a quick brute force solution would be to go to every tree node and search for n1 and n2. To ensure that we have the LCA, we have to keep performing this search until we find all possible ancestors and pick the last one. This can quickly get out of hand if the tree is large and the LCA is at the bottom of the tree.

So lets re-frame our first solution, instead of searching for n1 or n2 for every node, we can traverse the tree once and report to every node how many of n1 and n2 we have seen in our traversal path. This way on the first node where we haven seen n1 and n2, the node in question will naturally be the lowest common ancestor.

You can see an example of this in the following animation.

Animated LCA search https://observablehq.com/@nma/lowest-common-ancestors

The code above performs a post-order traversal, where we first visit the left and right sub nodes before checking the current node. At every step, we have a function to keep a count every time we see n1 or n2, we also take the count reported by our left and right child and sum them up. The first node where we have a count of 2 will be our LCA.

LCA post-order traversal algorithm

What happens when we add Parent Pointers to the Binary Tree?

With the basic version we can also modify our BinaryTreeNode to include a parent pointer. Modifying our tree node class below.

class BinaryTreeNode:
    def __init__(self, 
                 data: int,
                 left: BinaryTreeNode, 
                 right: BinaryTreeNode,
                 parent: BinaryTreeNode):
        self.data = data
        self.left = left
        self.right = right
        self.parent = parent

So whats does the this do for us?

Well with the ability to go back up the tree through the parent pointer, this means we can find our ancestor by going up! The issue here is that if our nodes are at different depths we can actually miss the ancestor if one traversal passes the ancestor before the other. So what our new algorithm does is figure out the depth of each ancestor by first traversing the parent pointer up to the root for both n1 and n2.

This way we grab the depths of each of n1 and n2. If they are at different depths we can go up n1 or n2 until the depths are the same. Then we can have both nodes traverse up at the same time until we reach the ancestor!

LCA with parent pointers

When the depths are the same then we traverse both nodes up until they meet or until one of them hits the root node.

Can we make the Parent Pointers more efficient using a Hash?

One of the issues in the above code is that we are traversing up to the root node no matter how close together our nodes are. We can actually be right next to each other and still traverse the entire tree height. This can drastically affect performance if the tree is large.

What we can do is trade a bit of space for time and use a hashset to keep track of nodes that we have seen. This changes the run time from O(h) to O(d), where d is the distance between the nodes n1 and n2.

What happens when the Binary Tree is a Binary Search Tree?

This actually makes searching for the ancestor incredibly easier. The Binary Search Tree (BST) property allows us to check less conditions because it enforces that for every node, the left side is less than or equal to the current node, and the right side is greater than or equal to the current node.

The gist here is that once we determine the first such node that exists between the two values (n1, n2) it will be the LCA. If we think about it briefly, by the BST property no other nodes left of the current node will have values less than the node, and no other nodes right of the current node will have values greater than the current node. So if n1 ≤ current ≤ n1, then we have found our LCA.

We can see how much simpler the previous code snippet is compared to our original traversal based solution with the BST property.

Summary

These are just a few examples of how simple tweaks to the tree property or data structure can change the implementation of the same function so drastically.

So if you ever get asked to grab the LCA on a tree, make sure to check what properties the nodes have!