

Discover more from The Polymathic Engineer
Delivering content
A deep dive into how a content delivery network works. Plus the moving average coding challenge.
Hi Friends,
Welcome to the 32th issue of the Polymathic Engineer newsletter. This issue is focused on the following topics:
how a content delivery network works internally
moving average coding challenge
interesting tweets
Content Delivery Network
Content Delivery Networks (CDNs) deliver a large portion of the internet content, so it's likely you've interacted with them, even if you didn't know it.
CDNs have two primary roles: to cache web content closer to the user and create a low latency overlay network connecting clients and origin servers. Here I want to delve into how exactly a CDN works internally.
Working principle
At a fundamental level, a CDN operates like a cache. When a CDN server receives a request, it first checks if the requested resource is cached locally.
If the resource is available, it is immediately returned to the client. Otherwise, the server fetches it from the origin server, caches it for future use, and then serves it to the client.
Ensuring low latency
The physical distance between a client and a server can greatly impact latency. Even the fastest server will be affected by high latency if the client is far away. To overcome this issue, CDNs position their servers in multiple geographical locations, aiming to bring content closer to the end-users.
So, how does a client determine which CDN cluster is closest to them? The answer typically lies in a technique called global DNS load balancing.
To understand how the client gets access to the global DNS load balancing, let’s suppose that the requested resource is the http://mywebsite.com.
Normally, if a CDN is not used the DNS authoritative name server provides the IP address of the website. However, with a CDN in place, the process is slightly different.
The DNS authoritative name server provides an alias that points to the domain name of the CDN server http://mywebsite.cdn.com. The DNS resolver then requests the authoritative name server to resolve http://mywebsite.cdn.com and gets as an answer the domain name for the CDN's global DNS load balancer.
The CDN's global DNS load balancer uses advanced DNS functionalities that can infer a client's location based on their IP address. With this information, it provides a list of geographically nearby clusters to the client. This list is typically ordered, taking into account factors like network congestion and the health of the cluster.
The overlay network
Surprisingly, the main advantage of a CDN is not just its caching capabilities but the underlying network. Unlike the public internet, which wasn't explicitly designed for performance, CDNs have a performance-oriented overlay network.
CDNs' routing algorithms employ specialized techniques to select network paths with minimal congestion and latency. For instance, they use TCP optimizations and regularly updated data about the health of the underlying network.
Moreover, CDN servers are often located at exchange points where Internet Service Providers (ISPs) connect. This strategic positioning enables the communication between a client and the origin server to occur almost entirely over network links belonging to the CDN.
Layered Caching
A typical CDN consists of multiple layers of content caching. The topmost layer comprises edge server clusters positioned at different geographical locations. Below this, there are intermediate server clusters located at a smaller number of locations.
This multi-layered structure plays a crucial role. The presence of numerous edge clusters allows CDNs to serve geographically dispersed clients. However, as a downside, this can lower the cache hit ratio. To counteract this, the intermediate layers cache a large fraction of the content, reducing the load on the origin server.
In conclusion, CDNs are complex systems that employ a variety of techniques to deliver content efficiently. By understanding how they work, we can better appreciate their role in the smooth operation of the internet.
Coding challenge
This week I want to discuss a popular interview question at Amazon and Meta: Find k-Radius Averages using Sliding Window.
I liked this question because it doesn't require any deep knowledge of algorithms and data structures. You only need to carefully understand the question, break it down and show that you are able to write a clean solution.
Given an array of integers, the goal is to find the average of each element along with its k neighbors on both sides. If an element has less than k neighbors on both sides, the average can’t be computed and should be equal to -1.
The sliding window approach offers a highly efficient solution to compute the k-radius averages for each element in an integer array. The sliding window has size 2 * k + 1, and it is centered around the current element.
Once the output array has been initialized with -1, a clean solution can be implemented iterating over the array using two pointers representing the boundaries of the windows.
The idea is to store the sum of the elements inside the window in a variable and keep it updated according to the formula:
Sum[i+1] = Sum[i] + (rightmost window) - (leftmost window element)
Here is my solution implemented in C#:
The time complexity of this solution is O(n). Initializing the output array with -1 takes O(n) time. Also iterating over the input array linearly to find the k-radius average of each index takes O(n) time.
The space complexity is O(1), since the only non constant memory space used is the output array.
Interesting tweets
Learning computer science concepts is easier and more engaging when you see their real world applications. That’s one of the things that should be improved in colleges and universities. Link
Finding the right balance between extensibility, maintainability and simplicity of the software is hard. Experience is everything here. Link
I respect John’s opinion and I fully agree with him that software architecture, domain knowledge and capability to choose technology are crucial. Especially when you advance to more senior roles. But I am strongly convinced that data structures and algorithms knowledge are the foundation of a strong career as software developer. Of course not everyone needs to know everything, but the basic knowledge should be solid. Link