Hi Friends,
Welcome to the 30th edition of the Polymathic Engineer.
This week I want to address a question I’ve been asked often. Is knowledge of computer networking relevant for a software developer? And which basic concepts are worth knowing?
In addition, I will talk about:
Transport Layer Security protocol
Resources to study computer networking
Interesting tweets
Computer Networking
Computer networking refers to how machines exchange data with each other. It's a vast field, but in my opinion every developer should be familiar with the basics.
Here are the 6 must-know concepts:
Communication Protocols. A protocol is a set of rules defining how 2 parties should interact. In computer networking, the parties are machines and the rules govern the messages they exchange.
Network Protocols. They define the type, format, structure, and order of the messages sent between 2 machines. Messages are transmitted as sequences of bytes known as packets, and each packet is divided into 2 parts: the header contains protocol-related information, and the payload contains the transmitted data. There are many network protocols, grouped in layers according to their functionality. The most important layers are Application, Transport, and Network.
IP Address. IP addresses identify machines connected to a network such as the internet. IPv4 addresses consist of 4 numbers separated by dots, each between 0 and 255. There are special addresses like 127.0.0.1 (the loopback address, which always points to your own machine) and the 192.168.x.x range (reserved for private networks).
IP (Internet Protocol). It is the protocol enabling machine-to-machine communication. The smallest units of data exchanged between 2 machines are IP packets. An IP packet contains a header with the source/destination IP addresses and the packet size, plus a payload carrying a TCP or UDP packet.
TCP (Transmission Control Protocol). It creates a connection enabling an ordered, reliable data transmission between 2 machines. For this purpose, TCP exposes to the application protocols endpoints known as sockets. A socket combines an IP address and a port (a 16-bit number). A pair of sockets uniquely identifies a connection between 2 applications running on different machines.
A TCP packet contains a header with the source/destination ports (the IP addresses are already in the IP header), plus a payload with the application protocol packet.
HTTP (Hypertext Transfer Protocol). It is the most famous application protocol and follows a request-response paradigm. The client makes a request, and the server issues a response with the requested content and status information about the request.
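Several of the concepts above can be seen together in a few lines of Python. What follows is only a minimal sketch using the standard library: it classifies two of the special addresses mentioned earlier, then runs one HTTP-style request and response over a TCP connection on the loopback interface. The function names (describe_address, serve_once, fetch_from_local_server) and the "hello" body are made up for illustration, and the toy server ignores the request content.

```python
import ipaddress
import socket
import threading

def describe_address(text):
    """Classify an IPv4 address with the standard ipaddress module."""
    addr = ipaddress.ip_address(text)
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"

def serve_once(listener):
    """Accept one TCP connection and answer with a minimal HTTP response."""
    conn, _peer = listener.accept()
    with conn:
        conn.recv(4096)  # read the request; this toy server ignores its content
        body = b"hello"
        head = (
            "HTTP/1.1 200 OK\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)

def fetch_from_local_server():
    """Run one request/response exchange over TCP on the loopback interface."""
    # Port 0 asks the operating system for any free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        host, port = listener.getsockname()  # the server socket: IP address + port
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        with socket.create_connection((host, port)) as client:
            client.sendall(
                b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n"
            )
            chunks = []
            while True:
                data = client.recv(4096)
                if not data:  # the server closed the connection
                    break
                chunks.append(data)
        worker.join()
    raw = b"".join(chunks)
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    status_code = int(status_line.split()[1])
    return status_code, body
```

Calling fetch_from_local_server() returns the status code and the body sent by the toy server. The client never deals with IP packets or TCP headers directly: the operating system builds them, and the application only sees a socket it can write bytes to and read bytes from.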
TLS protocol
Transport Layer Security is a robust and widely used protocol for web applications. It implements a secure end-to-end communication channel between application processes. Here is how TLS works under the hood.
TLS is a protocol enabling application processes to communicate securely. It enhances application-level protocols by providing encryption, authentication, and integrity. For example, HTTPS, FTPS, and SMTPS run on top of TLS.
1. Encryption
TLS guarantees that the exchanged data is read only by the 2 communicating processes. Asymmetric encryption is used to generate an encryption key shared by the processes. Symmetric encryption is then used to encrypt the data.
Asymmetric encryption is not used for everything because it is slower and more expensive than symmetric encryption.
The 2 processes periodically renegotiate the shared encryption key. This minimizes the amount of data that can be deciphered if the key is broken.
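To make the idea of agreeing on a shared key more concrete, here is a toy sketch of a Diffie-Hellman exchange, one common way for 2 processes to derive the same secret from asymmetric key pairs. It is an illustration under loud assumptions: the parameters PRIME and GENERATOR and the function names are invented for this example, the numbers are far too small to be secure, and real TLS implementations rely on vetted cryptographic libraries.

```python
import hashlib
import secrets

# Toy public parameters, known to both sides. NOT secure: illustration only.
PRIME = 2**127 - 1   # a Mersenne prime
GENERATOR = 3

def make_keypair():
    """Return a (private, public) pair for one process."""
    private = secrets.randbelow(PRIME - 3) + 2   # random value in [2, PRIME - 2]
    public = pow(GENERATOR, private, PRIME)
    return private, public

def shared_key(own_private, peer_public):
    """Combine our private value with the peer's public value into a 32-byte key."""
    secret = pow(peer_public, own_private, PRIME)
    # Hash the shared secret to obtain key material for a symmetric cipher.
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()
```

Each process calls make_keypair(), sends only the public value to the other side, and then calls shared_key() with its own private value and the peer's public value. Both end up with the same 32 bytes without the key ever crossing the network, and those bytes can then feed a symmetric cipher.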
2. Authentication
TLS allows the communicating parties to verify that the other is who it claims to be. It's implemented with a digital signature based on asymmetric cryptography.
Process A generates a pair of public-private keys, sharing the public key with process B. A signs all messages to B with the private key. B verifies that the messages come from A using the public key. B verifies the ownership of the public key shared by A with a certificate. The certificate is verified by B before establishing the TLS connection.
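The sign-and-verify flow can be sketched with textbook RSA on deliberately tiny numbers. This is a toy under stated assumptions: the constants and function names below are chosen for illustration only, a key this small can be broken instantly, and real signatures use large keys plus padding schemes provided by a cryptographic library.

```python
import hashlib

# Textbook RSA with tiny primes (61 and 53). Illustrative only, trivially breakable.
MODULUS = 61 * 53          # n = 3233, part of both the public and the private key
PUBLIC_EXPONENT = 17       # e, shared with everyone
PRIVATE_EXPONENT = 2753    # d, kept secret by the signer (17 * 2753 = 15 * 3120 + 1)

def digest_as_int(message):
    """Hash the message and reduce it so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % MODULUS

def sign(message):
    """Process A: transform the digest with the private exponent."""
    return pow(digest_as_int(message), PRIVATE_EXPONENT, MODULUS)

def verify(message, signature):
    """Process B: undo the transformation with the public exponent and compare."""
    return pow(signature, PUBLIC_EXPONENT, MODULUS) == digest_as_int(message)
```

Only the holder of the private exponent can produce a signature that passes verify(), while anyone with the public values can check it. The certificate mentioned above is what ties those public values to A's identity.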
3. Integrity
TLS ensures that the exchanged data is not tampered with. It's implemented with a message authentication code (HMAC). The sender creates a digest and includes it in the message. The receiver recomputes the digest and checks that it's equal to the one in the message.
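The digest check can be tried out with Python's standard hmac module. This is a minimal sketch rather than what a TLS library does internally: the function names are made up for illustration, and the key is assumed to be a secret already shared by both sides.

```python
import hashlib
import hmac

def tag_message(key, message):
    """Sender side: compute the digest that travels with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def is_untampered(key, message, tag):
    """Receiver side: recompute the digest and compare it with the received one."""
    expected = tag_message(key, message)
    # compare_digest takes the same time wherever the first difference is,
    # which avoids leaking information through timing.
    return hmac.compare_digest(expected, tag)
```

If even one byte of the message changes in transit, the recomputed digest no longer matches and is_untampered() returns False. Without the shared key, an attacker cannot produce a valid digest for a modified message.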
Handshake
TLS requires a handshake between the involved processes to agree on a cipher suite to use (e.g. the key exchange and encryption algorithms), establish the shared key used for symmetric encryption, and verify the certificates.
The cool thing is that the handshake is not expensive. It takes only 2 round trips with TLS 1.2 and 1 with TLS 1.3. Still, creating a connection always has a cost, so it's good practice to keep servers close to the clients and reuse connections when possible.
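In application code, the handshake is usually handled by a library. As a small sketch, this is how a client-side configuration might look with Python's standard ssl module. The function name make_client_context is an assumption for this example, and the minimum version shown is just one reasonable choice.

```python
import ssl

def make_client_context():
    """Build a TLS client configuration with certificate checks enabled."""
    # The default context verifies the server certificate and its host name.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A client would then wrap a connected TCP socket with context.wrap_socket(sock, server_hostname=host), which performs the handshake before any application data is exchanged. Keeping that wrapped socket open for several requests is one way to reuse the connection and pay the handshake cost only once.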
Resources to study computer networking
There are a couple of books I highly recommend to dig deeper into computer networking. The first one is "Computer Networking: A Top-Down Approach" by James Kurose and Keith Ross. I used it for my CS studies, and it has since been updated to the 8th edition. The second is "Data Communications and Networking" by Behrouz Forouzan. Another good book for beginners is "Computer Networks" by Andrew Tanenbaum.
Interesting Tweets
Writing code without understanding the problem and the solution to implement is rarely a good idea. The more complicated the problem, the more time it pays to invest in planning. Link
AI is, and increasingly will be, an excellent support for software engineers. But it won't be able to solve tricky problems that don't even have a clear solution. There are still many things that engineers need to bring: creativity, design skills, performance tuning, and so on. Link
There is no way to perfectly synchronize clocks across different distributed processes. Link
Tenenbaum was the one I read way back in college. Lovely refresher article for me. Thanks :)