The Polymathic Engineer

Share this post

Reliable Communication

newsletter.francofernando.com

Reliable Communication

How TCP implements a reliable network communication channel.

Franco Fernando
Dec 20, 2022
4
Share this post

Reliable Communication

newsletter.francofernando.com

Hi Friends,

Welcome to the 6th issue of the Polymathic engineer newsletter. We are a community of 700 people now, thanks again for your support and feedback. Today we will talk about:

Thanks for reading The Polymathic Engineer! Subscribe for free to receive new posts and support my work.

  • What is the TCP protocol

  • How a TCP connection is established

  • How TCP implement flow and congestion control

  • Coding challenge

  • Tech threads of the week

  • Interesting tweets of the week

TCP Protocol

The communication between two machines it’s not reliable at the network layer. Data packets are routed from the source to the destination machines through a series of routers. Still, there is no guarantee that these packets will arrive.

This is why TCP is one of the most critical network protocols. It allows a reliable communication channel between 2 processes where data is transmitted in order, not corrupted, and without duplication or gaps.

TCP partitions a data stream of bytes into blocks called segments. Those segments have three essential properties:

  • they are sequentially numbered so that the receiver can detect duplication or gaps

  • they include a checksum that is used to check the integrity of the data

  • they need to be acknowledged by the receiver

Segments that are not acknowledged after a certain amount of time are retransmitted.

TCP Connection

Two processes need to set up a TCP connection before exchanging segments. First, the connection is established through endpoints known as sockets. A socket identifies a process and the machine where it runs, tracking the connection state.

A TCP connection between 2 processes, A and B, is established with a 3-ways handshake:

  1. A sends an SYN segment with a random number x

  2. B sends an SYN/ACK segment with a random number y and x+1

  3. A sends the first data with x+2 and y+1

Establishing a connection requires a full round-trip without application data exchange. The same happens for closing a connection. In addition, when a connection is closed, a socket must stay in a waiting state where all received segments are discarded for a certain amount of time. This is to avoid those segments being part of a newly established connection.

This is why it makes sense to keep a connection open between two processes if it's likely another data exchange will follow. Connection pools are usually used for this.

TCP Flow and Congestion Control

TCP implements a reliable communication channel and mechanisms to prevent the sender from overwhelming the receiver and the underlying network from getting flooded.

Flow control is the mechanism preventing the sender from overwhelming the receiver. The receiver store the segments in a dedicated buffer. The buffer size is communicated to the sender with every acknowledgment. The sender uses this information to avoid sending segments that do not fit into the buffer.

Congestion control is the mechanism preventing the network from getting flooded. The sender tracks the number of segments that can be sent without receiving an ACK. This number is known as congestion windows. The window increases when ACK segments are received within a timeout; otherwise decreases.

The smaller the window, the less bandwidth is used. This means that the shorter the round trip time, the sooner the window grows, and the entire bandwidth gets used. This is one of the reasons why it's so critical having web servers close to the clients.

Coding challenge

The coding challenge of last week was Longest Substring Without Repeating Characters. I love such a problem because its optimal solution combines the sliding window technique with the hash set data structure.

The problem gives as input a string and asks to find the length of the longest substring without repeating characters.

The brute force solution is to analyze all the possible substrings and check which is the longest without repeating characters. Enumerating all the possible substrings takes O(N^2), while the check takes O(N). So the overall solution takes O(N^3) time, where N is the length of the string.

Better performance can be achieved using the sliding window technique. The idea is to maintain a window with all unique characters. The right pointer expands the window at each step. In contrast, the left pointer shrinks the window when a repeated character enters it.

A hash set data structure keeps track of all the characters in the window. This allows us to efficiently detect when the right pointer enters a repeated character in the window. In that case, the leftmost characters are dropped until there are no more duplicate characters.

Once the right pointer iterates the whole string, the maximum window length is the problem's solution.

The time complexity is O(N) because the right pointer iterates over the input string and the left pointer never exceeds the right pointer. Instead, the space complexity is O(min(N, L)), where L is the number of possible characters.

Here is the code of this solution.

The challenge for the next week is Simplify Path.

Tech threads of the week

Patterns for solving recursion problems in coding interviews.

Twitter avatar for @Franc0Fernand0
Fernando 🇮🇹🇨🇭 @Franc0Fernand0
Recursion problems are common in coding interviews. Here are 6 common categories of recursion problems and their solution patterns: ↓
3:28 PM ∙ Dec 14, 2022
596Likes126Retweets

Data partitioning in distributed systems.

Twitter avatar for @Franc0Fernand0
Fernando 🇮🇹🇨🇭 @Franc0Fernand0
Partitioning is the process of storing a large database across multiple machines. Here are the popular partitioning architectures with their benefits and costs: {1/8} ↓
Image
3:08 PM ∙ Dec 17, 2022
559Likes109Retweets

Interesting tweets of the week

Logarithms are one of the most crucial mathematical concepts to know as developers. An interesting thread will introduce logarithms from a historical point of view, explaining their real-life applications. Definitely, something which will make you appreciate more logarithms and mathematics in general.

Twitter avatar for @intelocode
Abhishek Verma 💻 @intelocode
In the early 17ᵗʰ century, when tedious calculations were holding back progress in science, a brilliant idea came to the rescue. Let's explore the fascinating story of how logarithm was discovered and it's tremendous applications ! Quite an underappreciated concept indeed 🧵👇
Image
2:51 PM ∙ Dec 15, 2022
118Likes30Retweets

Reading technical books is fundamental to improving as a developer. My friend Thiago put together a very well-curated list of technical books. I've read many of them and plan to go through others next year.

Twitter avatar for @thiagoghisi
Thiago Ghisi @thiagoghisi
I've read 500+ books and accumulated 55.000+ notes & highlights over the last ten years. 99% of those books were non-fiction & mostly around Software Engineering. Below are The 22 Technical Books that Impacted my Career the most ➕ their highest-density chapters & pages. 🧵
Image
Image
Image
7:10 PM ∙ Dec 18, 2022
1,602Likes242Retweets

Thanks for reading The Polymathic Engineer! Subscribe for free to receive new posts and support my work.

Share this post

Reliable Communication

newsletter.francofernando.com
Comments
TopNewCommunity

No posts

Ready for more?

© 2023 Franco Fernando
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great writing