Find Duplicate Subtrees
A medium-tier problem at 60% community acceptance, tagged with Hash Table, Tree, and Depth-First Search. Reported in interviews at Yandex and PhonePe.
Find Duplicate Subtrees trips up candidates who don't recognize the serialization pattern. The problem asks you to identify every subtree that appears more than once in a binary tree, which sounds tree-traversal-basic until you realize you need a way to fingerprint each subtree. Yandex and PhonePe have both asked this, and it's not a gimme. With a 60% acceptance rate, you're looking at candidates who either nail the hashing insight or spin their wheels comparing nodes manually. If you blank on how to serialize and track subtrees during your live OA, StealthCoder surfaces the working approach invisible to the proctor.
Find Duplicate Subtrees is the kind of problem that decides whether you pass. StealthCoder reads the problem on screen and surfaces a working solution in under 2 seconds. Invisible to screen share. The proctor sees nothing. Built by an engineer at a top-10 tech company who can solve these problems cold but didn't want to trust himself in a 90-minute screen share.
The trick is treating each subtree as a serializable object. A naive approach compares subtrees node by node, which is slow (every pair of subtrees has to be checked against each other) and error-prone. The standard solution serializes each subtree during a postorder traversal and counts occurrences of each serialization in a hash table. Candidates often fail because they don't serialize subtrees consistently, or they forget that two subtrees are duplicates only when both their structure and their node values match, so the serialization has to capture both. Another common miss is the return value: report one root node per group of duplicate subtrees, not every repeated occurrence. Depth-First Search with memoization is the spine here. The Hash Table stores each serialization with its count, letting you spot repeats in one pass. If you're stuck on the serialization format or the order of operations when your live assessment hits this problem, StealthCoder executes the solution instantly.
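The approach described above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the `TreeNode` class mirrors the usual LeetCode-style node definition, and the function name `find_duplicate_subtrees`, the `"#"` null marker, and the comma delimiter are all assumptions chosen for this example.

```python
from collections import defaultdict


class TreeNode:
    """Assumed LeetCode-style binary tree node."""

    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def find_duplicate_subtrees(root):
    """Return one root node for each group of duplicate subtrees."""
    counts = defaultdict(int)  # serialization -> number of times seen
    duplicates = []

    def serialize(node):
        if node is None:
            return "#"  # explicit marker for a missing child
        # Postorder: both children are serialized before the parent key is built.
        key = f"{node.val},{serialize(node.left)},{serialize(node.right)}"
        counts[key] += 1
        if counts[key] == 2:  # record each duplicate group exactly once
            duplicates.append(node)
        return key

    serialize(root)
    return duplicates


# Example tree: 1 -> (2 -> 4), (3 -> (2 -> 4), 4)
root = TreeNode(
    1,
    TreeNode(2, TreeNode(4)),
    TreeNode(3, TreeNode(2, TreeNode(4)), TreeNode(4)),
)
print([n.val for n in find_duplicate_subtrees(root)])  # [4, 2]
```

Appending only when the count reaches exactly 2 is what keeps a subtree that appears three or more times from being reported repeatedly. The recursion depth equals the tree height, so a very deep, skewed tree could hit Python's recursion limit.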
You know the problem.
Make sure you actually pass it.
Find Duplicate Subtrees recycles across companies for a reason. It's medium-tier, and most candidates blank under the timer. StealthCoder is the hedge: an AI overlay invisible during screen share. Works on HackerRank, CodeSignal, CoderPad, and Karat.
Find Duplicate Subtrees interview FAQ
How hard is this really compared to other tree problems?
It's legitimately medium difficulty. 60% acceptance means most who pass see the serialization insight; those who don't get stuck fast. It's not a slog like some harder tree problems, but it's not a warm-up either. The trick isn't immediately obvious without tree hashing experience.
What's the core algorithmic trick?
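Serialize each subtree to a string or tuple, use it as a hash-table key, and count. Postorder DFS builds subtrees bottom-up, so both children are serialized before their parent. Include the node values and a marker for missing children, so that only subtrees with identical structure and identical values produce identical keys. No manual node comparison needed.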
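For the tuple flavor of this idea, one possible sketch replaces growing strings with small integer IDs: each distinct `(value, left_id, right_id)` triple gets its own ID, so every key stays constant-size no matter how large the subtree is. The names `Node`, `find_duplicates_by_id`, and the use of `0` as the ID for a missing child are assumptions made for this illustration.

```python
from collections import defaultdict


class Node:
    """Hypothetical minimal binary tree node for this sketch."""

    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def find_duplicates_by_id(root):
    """Return one root per duplicate group, keyed by small integer IDs."""
    ids = {}                   # (val, left_id, right_id) -> unique subtree ID
    counts = defaultdict(int)  # subtree ID -> number of times seen
    result = []

    def visit(node):
        if node is None:
            return 0  # reserved ID for a missing child
        # Children are visited first (postorder), left before right.
        triple = (node.val, visit(node.left), visit(node.right))
        uid = ids.setdefault(triple, len(ids) + 1)  # new triples get the next ID
        counts[uid] += 1
        if counts[uid] == 2:
            result.append(node)
        return uid

    visit(root)
    return result
```

Because two subtrees receive the same ID only when their root values and both child IDs match, equal IDs imply equal structure and values. The trade-off is an extra dictionary in exchange for avoiding the repeated string concatenation that full serializations incur on deep trees.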
Why do candidates fail at this problem?
Most either don't know how to serialize, or they try comparing node references and object equality instead. Some also forget that you need to track count globally, not just return subtrees as you find them. The serialization format is the gatekeeper.
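To illustrate why the serialization format matters, the hedged sketch below compares a serializer that marks missing children with one that silently skips them. The `Node` class, the `serialize` helper, and its `mark_nulls` flag are hypothetical names introduced only for this demonstration.

```python
class Node:
    """Hypothetical minimal binary tree node for this demonstration."""

    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize(node, mark_nulls):
    """Preorder serialization; optionally emits '#' for missing children."""
    if node is None:
        return "#" if mark_nulls else ""
    parts = [
        str(node.val),
        serialize(node.left, mark_nulls),
        serialize(node.right, mark_nulls),
    ]
    # Without null markers, empty child strings are simply dropped.
    return ",".join(p for p in parts if p)


left_leaning = Node(1, left=Node(1))    # the child hangs on the left
right_leaning = Node(1, right=Node(1))  # the child hangs on the right

# Skipping null markers makes two different shapes look identical.
print(serialize(left_leaning, False), serialize(right_leaning, False))

# With null markers, the two shapes serialize differently.
print(serialize(left_leaning, True), serialize(right_leaning, True))
```

The first pair of strings is identical even though the trees differ, which would produce a false duplicate. A delimiter between values matters for a similar reason: without one, multi-digit values can blur together.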
Which topics are most important for solving this?
Depth-First Search and Hash Table are non-negotiable. You traverse the tree with DFS, then use the hash table to track and count serialized subtrees. Binary Tree fundamentals round it out, but DFS and hashing are the pillars.
Is this still asked at Yandex and PhonePe?
Both companies are on record asking it. It's a solid filter for testing whether a candidate can think beyond naive comparison and apply hashing to tree problems. Still relevant, not a dusty classic.
Want the actual problem statement? View "Find Duplicate Subtrees" on LeetCode →