Hi Friends,
Welcome to the 138th issue of the Polymathic Engineer newsletter.
This week, we talk about linear interpolation, which is one of the most important algorithms in computer graphics. The outline will be as follows:
Linear Interpolation
Bilinear Interpolation: The 2D Approach
Trilinear Interpolation: Moving to 3D
Project-based learning is the best way to develop technical skills. CodeCrafters is an excellent platform for practicing exciting projects, such as building your version of Redis, Kafka, DNS server, SQLite, or Git from scratch.
Sign up, and become a better software engineer.
Linear Interpolation
Interpolation is a fundamental technique that allows us to estimate values at arbitrary positions between known data points. It is often used to fill in the blanks in computer graphics and image processing when information is only available in certain places.
Most of the time, image data is stored on a regular 2D or 3D grid, with values saved at the points where the lines intersect. However, we often have to examine numbers that are scattered throughout the grid.
When a sample point falls exactly on a grid vertex, we can use the stored value. But if it falls inside a cell, we have to figure out a value by averaging the values of all the vertices nearby. The process of finding numbers in the middle is called interpolation.
For instance, think about what happens when you want to show a small picture on a bigger screen. Every pixel in the original picture has to be stretched to fill more than one pixel on the screen. But what values should those new pixels have? This is where interpolation comes in: it creates smooth transitions between the known pixel values.
One of the best ways to get familiar with how interpolation works is to start from the simple 1D case in which data is given on a straight line. We know two points, a and b. The goal is to find the number that is between them.
This type of interpolation is called linear interpolation, and can be written in math as: result = (1 - t) * a + t * b. Here, t is a number between 0 and 1 that represents a position between a and b. When t = 0, the result equals a; when t = 1, the result equals b.
Even though it is relatively simple, linear interpolation is the basis for more advanced interpolation techniques. Bilinear and trilinear interpolation used for 2D and 3D grids both build on the idea of linear interpolation by applying it along each axis.
In the following sections, we will discuss how linear interpolation extends to higher dimensions and how to make the most of implement these techniques efficiently in graphics applications.
Bilinear Interpolation: The 2D Approach
When we move from 1D lines to 2D images or textures, we need a more sophisticated approach. This is where bilinear interpolation comes in handy. As the name suggests, the idea is to perform linear interpolation twice, once along each axis.
Let's say we have a 2D grid and we want to find a value at a point inside a square cell. The four corners of this cell are c00, c10, c01, and c11. The goal is to find the value of a point inside the cell whose coordinates are (cx, cy). Here's how bilinear interpolation works:
We perform two linear interpolations along the x-axis. We interpolate between c00 and c10 to get the point a, and between c01 and c11 to get the point b.
Then, we perform one more linear interpolation along the y-axis between points a and b to get our final result.
It doesn't matter which axis you start with. You could first interpolate along the y-axis and then along the x-axis, and the result would be identical.
A perfect example to see bilinear interpolation in action is image resizing. Here is a simple python function that performs bilinear interpolation in this context:
def bilinear_resize(image, new_width, new_height):
height, width, channels = image.shape
resized = np.zeros((new_height, new_width, channels), dtype=np.uint8)
# Calculate scaling factors
x_scale = width / new_width
y_scale = height / new_height
for new_y in range(new_height):
for new_x in range(new_width):
# Map new coordinates back to original image coordinates
orig_x = new_x * x_scale
orig_y = new_y * y_scale
# Get integer parts (corners of the cell)
x0 = int(orig_x)
y0 = int(orig_y)
x1 = min(x0 + 1, width - 1)
y1 = min(y0 + 1, height - 1)
# Get fractional parts (tx, ty)
tx = orig_x - x0
ty = orig_y - y0
# Get the four corner pixels
c00 = image[y0, x0]
c10 = image[y0, x1]
c01 = image[y1, x0]
c11 = image[y1, x1]
# Perform bilinear interpolation
# First, interpolate along x-axis
a = c00 * (1 - tx) + c10 * tx
b = c01 * (1 - tx) + c11 * tx
# Then interpolate along y-axis
result = a * (1 - ty) + b * ty
resized[new_y, new_x] = result.astype(np.uint8)
return resized
The visual results of bilinear interpolation depend heavily on whether you are making the image larger (upsampling) or smaller (downsampling).
When you enlarge a small picture, such as a 9x9 grid, to 512x512, almost all output pixels fall somewhere between the coordinates of the original pixels. The algorithm interpolates between the four nearest neighbor pixels, creating smooth transitions where sharp edges become soft gradients. This is the classic "smooth interpolation" effect most people associate with bilinear filtering. The images below, shows a the visual results obtained when comparing the above bilinear interpolation to the default used by the resize function provided by the popular OpenCV library.
When you reduce the size of a big picture, such as a 500×500 grid to 10×10, each output pixel doesn't just look at one exact input pixel, but represents a 50×50 area in the original image. The algorithm must somehow represent all that information in a single output pixel, which works more like area averaging and uses the main color. As you can see in the images below, the boundaries between color blocks remain crisp because each output pixel has enough input data to make a confident decision about what color it should be.
This resizing example demonstrates how bilinear interpolation is fast and simple to implement, but it has limitations. There are times when the results look less smooth than wanted or have patterns that can be seen.
This happens because bilinear interpolation is actually a quadratic function, despite what its name says. It's only truly linear along the edges of the cells. For applications where better results are needed, higher-degree methods like cubic interpolation might be necessary, though they come with increased computational cost.
Trilinear Interpolation: Moving to 3D
When working with three-dimensional data like medical scans, fluid simulations, or volumetric rendering, we need to extend in some way our interpolation technique to handle 3D grids. This is where trilinear interpolation comes in.
Trilinear interpolation is a straightforward extension of bilinear interpolation. The idea is to perform bilinear interpolation twice and then combining the results with one more linear interpolation.
We start with a 3D cell that has eight corners: c000, c100, c010, c110, c001, c101, c011, and c111. Our goal is to find the value at a point (cx, cy, cz) somewhere inside this cube. Here's how it works:
We perform two bilinear interpolations. One for the front face of the cube using c000, c100, c010, and c110 to get point e. Then another for the back face using c001, c101, c011, and c111 to get point f.
We perform one linear interpolation between e and f along the z-axis to get our final result.
Here is a possible implementation:
def trilinear_interpolation(grid_3d, x, y, z):
"""
Perform trilinear interpolation on a 3D grid
grid_3d: 3D numpy array containing the data
x, y, z: coordinates in grid space (0 to grid_size-1)
"""
# Get the dimensions of the grid
depth, height, width = grid_3d.shape
# Get integer parts (corners of the cell)
x0, y0, z0 = int(x), int(y), int(z)
x1 = min(x0 + 1, width - 1)
y1 = min(y0 + 1, height - 1)
z1 = min(z0 + 1, depth - 1)
# Get fractional parts
tx = x - x0
ty = y - y0
tz = z - z0
# Get the eight corner values
c000 = grid_3d[z0, y0, x0]
c100 = grid_3d[z0, y0, x1]
c010 = grid_3d[z0, y1, x0]
c110 = grid_3d[z0, y1, x1]
c001 = grid_3d[z1, y0, x0]
c101 = grid_3d[z1, y0, x1]
c011 = grid_3d[z1, y1, x0]
c111 = grid_3d[z1, y1, x1]
# First, perform bilinear interpolation on the front face (z0)
e = bilinear_interpolate(tx, ty, c000, c100, c010, c110)
# Then, perform bilinear interpolation on the back face (z1)
f = bilinear_interpolate(tx, ty, c001, c101, c011, c111)
# Finally, interpolate between the two faces along z-axis
result = e * (1 - tz) + f * tz
return result
def bilinear_interpolate(tx, ty, c00, c10, c01, c11):
"""Helper function for bilinear interpolation"""
a = c00 * (1 - tx) + c10 * tx
b = c01 * (1 - tx) + c11 * tx
return a * (1 - ty) + b * ty
Like its 2D counterpart, trilinear interpolation is fast and not too hard to implement. It provides reasonable quality and it remains an excellent choice for most applications that need to work with volumetric data. However, it has the same main problem as the bilinear interpolation: the results might not look as smooth as you'd like, especially if the data being used has sharp edges.
References
GitHub repository with the reference code