Hi Friends,
Welcome to the 117th issue of the Polymathic Engineer newsletter.
This week, we talk about a powerful algorithm design technique that every software engineer should know: backtracking.
The outline will be as follows:
What is backtracking
The algorithm template
Easy to Medium Backtracking Problems
Pruning
Hard Problems
Project-based learning is the best way to develop technical skills. CodeCrafters is an excellent platform for practicing exciting projects, such as building your version of Redis, Kafka, DNS server, SQLite, or Git from scratch.
Sign up, and become a better software engineer.
What is backtracking
Backtracking is a systematic approach to solving problems, exploring all the possible solutions step-by-step, and avoiding repetitions. However, this is unlike a brute-force approach or an exhaustive search because it doesn't blindly try every option. Instead, it tries a different path whenever the current one doesn't work.
You can think of backtracking like finding your way through a maze. You stay on one road, making a choice at each intersection until you reach a dead end or find the way out. Whenever you get stuck, you return to the last intersection and try going another direction. This process continues until you find the exit or explore all possible paths.
In programming, you can implement this approach using recursion. At each recursion level, you make a choice and go through a different path with another recursive call.
The whole process can be represented as a decision tree. The points where you make choices are nodes in the tree, and the branches are the recursive calls that go from a decision point to the next one.
A backtracking algorithm is essentially a way of navigating this decision tree to find solutions that meet the criteria. It's nothing different than a depth-first search of the decision tree.
If all this sounds a bit abstract to you, everything makes a lot more sense when you go through an example.
Suppose you have the English lowercase letters a-z and need to generate all strings of length n using the letters. Since each letter could be one of a-z, there are 26^n possible strings. You can imagine all such possibilities as a tree. The root is the empty string, and then all nodes have 26 children, with the path from the root to a leaf representing a string. The depth of the tree is n, and the leaf nodes represent solutions to the initial problem.
Suppose you also have a constraint so that instead of all solutions of length n, you only want ones with only vowels. An exhaustive search would still generate all strings of length n and then check each one to meet the criteria. With backtracking, you discard all subtrees without vowels, reducing possible solutions from O(26^n) candidates to O(5^n).
The algorithm template
Now that we've got a handle on backtracking let's dive into how it works and how we can implement it in code.
Every backtracking algorithm usually follows some structure like this:
Check if you've reached a solution. If so, do something with it. For example, print the solution or add it to a list.
If not, consider all the possible options to extend the current candidate solution.
For each option, build the candidate and recursively try to solve the problem. Then undo your changes to the candidate.
Let's put this into code. Here's the a general template for a backtracking function in pseudocode:
def backtrack(input, candidate):
if is_a_solution(candidate):
process_solution(candidate)
return
for item in input:
build(candidate, item)
backtrack(candidate)
undo(candidate, item)
This might look a bit abstract, so let's break it down:
find_solution
is a boolean function that tests whether the current candidate forms a complete solution for the given problem. It usually uses the expected size of a solution plus additional constraints.process_solution
is what you do after finding a solution, like adding it to a list of results.build
andundo
are two functions that modify the current candidate to generate the next candidate and clean up this data structure after the recursive call.is_valid
is an optional boolean function that checks if the next candidate that will be built is valid according to the problem's rules. We will talk more about this function later on.
The magic happens in the recursive backtrack
call. This is exactly where you go one step deeper into the decision tree. Every call to the function represents a node in the tree, while each iteration in the for loop represents a child of the current node.
Calling backtrack in that loop represents moving to a child, while the line where you undo the modifications is equivalent to moving back up the tree from a child to its parent.
For each given node, the path from the root to that node is a candidate solution that is being built. The leaf nodes are complete solutions and represent when the base case is reached. The root is an empty candidate that represents the scope from which the first backtrack call is being made.