Multiplying Matrices
Three different approaches to matrix multiplication: brute force, divide and conquer, and the Strassen algorithm.
Hi Friends,
Welcome to the 60th issue of the Polymathic Engineer newsletter.
This week, we will talk about an uncommon topic that is extremely important in computer science: matrix multiplication.
Even if matrices were originally a way to describe systems of linear equations, they became common in many other fields. For example, they are often used to represent graphs and are essential in computer graphics since a digital image is basically a matrix.
Multiplication between matrices is a very common operation, and it is also an interesting case study since it can be executed with many different algorithms.
The outline will be as follows:
The brute force approach
Multiplying matrices with divide and conquer
The Strassen method
Posts that made me think
Brute force approach
Suppose you want to multiply two matrices, A and B, with dimensions n x n. The multiplication of A and B generates a matrix C whose elements are given by the formula:

c_ij = a_i1 * b_1j + a_i2 * b_2j + ... + a_in * b_nj

If you look at the formula, it is relatively easy to implement an algorithm that calculates the matrix C. You must calculate n² values in C, and each value is the sum of n products of elements from matrices A and B.
Here is a Python code implementing the algorithm:
def matrix_multiplication(A, B):
    if len(A[0]) != len(B):
        raise ValueError("Columns in A must be equal to rows in B")
    rows_A = len(A)
    cols_A = len(A[0])
    cols_B = len(B[0])
    # C starts as a rows_A x cols_B matrix of zeros
    C = [[0] * cols_B for _ in range(rows_A)]
    for i in range(rows_A):
        for j in range(cols_B):
            for k in range(cols_A):
                C[i][j] += A[i][k] * B[k][j]
    return C
Here, the statement C[i][j] += A[i][k] * B[k][j] is executed n³ times because of the three nested loops, and thus the time complexity of the algorithm is O(n³).
Looking at the formula, you might think O(n³) is the lower bound time complexity for any matrix multiplication algorithm. However, this assumption would be incorrect.
To see why, you need first to understand how the divide and conquer approach to matrix multiplication works.
Divide and Conquer
The divide and conquer approach breaks a big problem into smaller and easy-to-solve subproblems. Then, the solution of these small subproblems is combined to get the solution for the whole problem.
The approach is divided into three parts:
Divide the problem into a number of smaller subproblems.
Conquer the subproblems by solving them recursively until they are small enough to be solved directly.
Combine the solutions to the subproblems into a solution for the original problem.
Let’s see now how these three steps apply to the matrix multiplication problem. To keep things simple, we assume that the size n of the matrices A and B is an exact power of 2.
The divide step partitions each of A, B, and C into four submatrices of size n/2 x n/2. We can then rewrite the matrix multiplication C = A * B as

| C_11  C_12 |   | A_11  A_12 |   | B_11  B_12 |
| C_21  C_22 | = | A_21  A_22 | * | B_21  B_22 |

Here, A_11 includes the upper-left values of the A matrix, A_12 the upper-right values, A_21 the bottom-left values, and A_22 the bottom-right values. The same applies to matrix B and matrix C.
The conquer step keeps dividing the submatrices recursively until each one is reduced to a single value.
The combine step finds a solution by combining the results of the conquer step at each recursion level. In the base case, matrices are reduced to a single value, and the multiplication of two values gives the solution.
In all the other cases, the solution is given by the formulas:

C_11 = A_11 * B_11 + A_12 * B_21
C_12 = A_11 * B_12 + A_12 * B_22
C_21 = A_21 * B_11 + A_22 * B_21
C_22 = A_21 * B_12 + A_22 * B_22
The formulas show how each recursion step requires 8 multiplications of n/2 x n/2 matrices and 4 additions. If we use the notation T(n) for the cost of multiplying two n x n matrices, we can describe the time complexity of the solution with the recurrence:

T(n) = 8T(n/2) + O(n²)

where 8T(n/2) is the part related to the 8 multiplications and O(n²) is the part related to the additions. Without going further into math details, it can be shown that the time complexity is O(n^log₂8) = O(n³), which is the same as the brute force approach.
Posting the whole algorithm implementation here would be too cumbersome, but you can find my Python code on my GitHub.
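Still, the core recursive structure fits in a few lines. Here is a minimal sketch (helper names are mine, and it assumes square matrices whose size is a power of 2):

```python
def split(M):
    """Partition a square matrix into four n/2 x n/2 quadrants."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def add(X, Y):
    """Element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def dc_multiply(A, B):
    n = len(A)
    if n == 1:  # base case: 1x1 matrices reduce to a scalar product
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # 8 recursive multiplications and 4 additions per level
    C11 = add(dc_multiply(A11, B11), dc_multiply(A12, B21))
    C12 = add(dc_multiply(A11, B12), dc_multiply(A12, B22))
    C21 = add(dc_multiply(A21, B11), dc_multiply(A22, B21))
    C22 = add(dc_multiply(A21, B12), dc_multiply(A22, B22))
    # stitch the four quadrants back into one matrix
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```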
Strassen algorithm
The Strassen algorithm is a recursive method based on the divide and conquer approach but has a better time complexity.
The Strassen algorithm also divides each matrix into four submatrices of size n/2 x n/2, but it uses a different way to combine the results. These are the formulas used by the Strassen algorithm:

C_11 = M_1 + M_4 - M_5 + M_7
C_12 = M_3 + M_5
C_21 = M_2 + M_4
C_22 = M_1 - M_2 + M_3 + M_6

where

M_1 = (A_11 + A_22) * (B_11 + B_22)
M_2 = (A_21 + A_22) * B_11
M_3 = A_11 * (B_12 - B_22)
M_4 = A_22 * (B_21 - B_11)
M_5 = (A_11 + A_12) * B_22
M_6 = (A_21 - A_11) * (B_11 + B_12)
M_7 = (A_12 - A_22) * (B_21 + B_22)
As with the divide and conquer approach, these formulas work when both A and B have size n x n and n is a power of 2. However, if those conditions are not satisfied, it's always possible to pad the matrices with zeros to make the method work.
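As a small illustration of the padding idea, a helper like the following (the name is mine, not from the original code) extends any matrix with zeros up to the next power-of-two size:

```python
def pad_to_power_of_two(M):
    """Pad a matrix with zeros so it becomes m x m, with m a power of 2."""
    rows, cols = len(M), len(M[0])
    m = 1
    while m < max(rows, cols):  # smallest power of 2 covering both dimensions
        m *= 2
    return [[M[i][j] if i < rows and j < cols else 0
             for j in range(m)] for i in range(m)]
```

After multiplying the padded matrices, the extra rows and columns of the result are all zeros and can simply be sliced away.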
Since the Strassen algorithm requires only 7 multiplications at each step instead of 8, the recurrence becomes T(n) = 7T(n/2) + O(n²), and it is possible to show that its asymptotic time complexity is O(n^log₂7) ≈ O(n^2.81).
This seems like a minor improvement over the previous approaches, but it is significant for large input sizes.
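Putting the formulas together, a minimal sketch of the algorithm could look like this (helper names are mine; it assumes square matrices whose size is a power of 2):

```python
def _split(M):
    """Partition a square matrix into four n/2 x n/2 quadrants."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def _add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def _sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = _split(A)
    B11, B12, B21, B22 = _split(B)
    # only 7 recursive products instead of 8
    M1 = strassen(_add(A11, A22), _add(B11, B22))
    M2 = strassen(_add(A21, A22), B11)
    M3 = strassen(A11, _sub(B12, B22))
    M4 = strassen(A22, _sub(B21, B11))
    M5 = strassen(_add(A11, A12), B22)
    M6 = strassen(_sub(A21, A11), _add(B11, B12))
    M7 = strassen(_sub(A12, A22), _add(B21, B22))
    # combine the products into the four quadrants of C
    C11 = _add(_sub(_add(M1, M4), M5), M7)
    C12 = _add(M3, M5)
    C21 = _add(M2, M4)
    C22 = _add(_add(_sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

In practice, implementations switch to the straightforward O(n³) method below a cutoff size, since the recursion overhead dominates for small matrices.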
Posts that made me think
Good software architects are also software developers. They are not in an ivory tower and don't avoid code to focus on higher-level things.
I have friends who, after being laid off, made life changes that they wouldn't have made otherwise. But even if things can get better in the long term, I wouldn't wish anyone to go through that experience.