2 Comments
Josh

I feel like this post gives a good overview of the terms but stops short of anything actionable.

Response time, as an application would measure it, is a function of latency, throughput, protocol overhead, and server/application delay on the other side of the connection.
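
To make that concrete, here's a rough back-of-the-envelope model in Python (my own sketch, not from the post; every number is invented for illustration):

```python
rtt        = 0.050       # 50 ms network round trip
throughput = 12.5e6      # ~100 Mbit/s, expressed in bytes/sec
payload    = 2_000_000   # 2 MB response body
server     = 0.030       # 30 ms of server/application work
handshakes = 3           # TCP + TLS 1.2 is roughly 3 round trips before the request is sent

response_time = handshakes * rtt + payload / throughput + server
print(f"{response_time * 1000:.0f} ms")  # ~340 ms, dominated by handshakes + transfer
```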

When examining why something is slow, a waterfall graph, like the one in Chrome's developer tools, is a great first step for figuring out which of those factors is contributing most to the delay.
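
If you'd rather capture those phases from a script than from DevTools, libcurl reports the same timings. A minimal sketch using the third-party pycurl package, with a placeholder URL:

```python
import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, "https://example.com/")  # placeholder URL
c.setopt(pycurl.WRITEFUNCTION, buf.write)
c.perform()

# Cumulative timestamps, in seconds since the transfer started:
dns   = c.getinfo(pycurl.NAMELOOKUP_TIME)     # DNS resolution done
tcp   = c.getinfo(pycurl.CONNECT_TIME)        # TCP handshake done
tls   = c.getinfo(pycurl.APPCONNECT_TIME)     # TLS handshake done
ttfb  = c.getinfo(pycurl.STARTTRANSFER_TIME)  # first byte from the server
total = c.getinfo(pycurl.TOTAL_TIME)
c.close()

print(f"dns={dns:.3f}s tcp={tcp:.3f}s tls={tls:.3f}s ttfb={ttfb:.3f}s total={total:.3f}s")
# ttfb - tls is roughly the server/application delay; total - ttfb is the transfer.
```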

If network latency is the cause of the issue, you need to tweak the components of that delay function to compensate. Caches or CDNs can directly reduce that latency, but if that isn't possible, look for ways to reduce protocol overhead. If you're using TLS 1.2, upgrade to 1.3. If you're using HTTP/1.1, upgrade to HTTP/2. If you can make it happen in your enterprise network, move to QUIC (HTTP/3).
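
For the TLS piece, most stacks let you pin a minimum version. A minimal Python sketch using the standard ssl module (example.com is a placeholder; requires Python 3.7+ and an OpenSSL build with TLS 1.3 support):

```python
import ssl
import urllib.request

# Pin the TLS floor to 1.3 so you never pay 1.2's extra handshake round trip.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3

with urllib.request.urlopen("https://example.com/", context=ctx) as resp:
    print(resp.status, resp.getheader("Content-Type"))
```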

When talking about protocol overhead, sometimes the biggest overhead is not in the transport (TCP) or session (TLS) layers. A good example is file copies. NFS and SMB2 are common network file share protocols, and they're both very chatty: if you're transferring hundreds of individual files, they negotiate each file individually, which slows things down. The solution: zip up those files. TCP and SMB are really good at moving a small number of large files, so give them that.
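
For instance, in Python the bundling step is a one-liner with the standard shutil module (all paths here are made-up placeholders):

```python
import shutil

# Bundle a directory of many small files into a single archive, then make
# one large copy over the share, so it negotiates once instead of per file.
archive = shutil.make_archive("reports_2024", "zip", root_dir="reports/2024")
shutil.copy(archive, "/mnt/fileshare/reports_2024.zip")
```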

Likewise with database queries, it’s often faster to batch things up. Use a stored procedure to get your processing done on the database itself. If you can’t do that, go ahead and retrieve 5k rows at a time. It’ll be faster on the network.
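
A quick sketch of the batched-fetch idea using the DB-API's fetchmany(); sqlite3 here stands in for whatever networked database driver you actually use, and the table and file names are hypothetical:

```python
import sqlite3

# With a real network driver, fetchmany() turns per-row round trips
# into one round trip per 5k-row batch.
conn = sqlite3.connect("app.db")
cur = conn.cursor()
cur.execute("SELECT id, payload FROM events")

seen = 0
while True:
    rows = cur.fetchmany(5000)   # one batch per round trip
    if not rows:
        break
    seen += len(rows)            # do your per-batch processing here
print(f"processed {seen} rows")
conn.close()
```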

Franco Fernando

Excellent additions, Josh. Thanks a lot for taking the time to write this long comment. Many readers will find it helpful.
