

Hi Friends,
Welcome to the 31st issue of The Polymathic Engineer. We crossed the milestone of 3K subscribers last week. Thanks, everyone, for your attention and support. In this issue, I write about:
an introduction to DynamoDB
what is WebSocket
interesting tweets
DynamoDB
DynamoDB is a fully managed NoSQL database available on Amazon Web Services. It is widely used and powers many well-known platforms like Netflix, Zoom, and Snapchat.
DynamoDB is essentially a database as a service that provides APIs to create tables, insert items, scan tables, and so on. The cool thing is that it manages everything, allowing developers to focus only on the application logic. Data encryption, failure recovery, and upgrades are all handled by the service.
Many people don't know that DynamoDB builds upon the lessons learned from two of Amazon's earlier projects:
- Dynamo: a highly scalable, key-value database for storing shopping cart data
- SimpleDB: a fully managed NoSQL database service
However, both had limitations. Dynamo had to be installed and operated by each team that used it. SimpleDB tables had a small capacity (10 gigabytes), and its read/write latencies were unpredictable because of the mechanism used to index the attributes. DynamoDB was introduced to address these weaknesses.
DynamoDB's data model is quite flexible: data is stored in tables containing items. Each item is composed of attributes, which can be numbers, strings, booleans, binaries, lists, maps, or sets. This flexibility makes it possible to model complex data relationships.
Items are identified by the partition key and the sort key. The partition key is mandatory, while the sort key is optional. You can use both keys to query data. To query on any other attribute, you need to create a secondary index.
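To make the two keys more concrete, here is a minimal in-memory sketch. It is not the DynamoDB API: `ToyTable`, `put_item`, `query`, and the key formats are hypothetical names chosen for illustration, and the sort-key prefix filter only imitates the kind of condition you would express in a real query.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory stand-in for a table keyed by (partition key, sort key)."""

    def __init__(self):
        # partition key -> list of (sort key, attributes), kept ordered by sort key
        self._partitions = defaultdict(list)

    def put_item(self, pk, sk, attributes):
        # An item with the same full key replaces the previous one.
        rows = [(k, v) for k, v in self._partitions[pk] if k != sk]
        rows.append((sk, attributes))
        rows.sort(key=lambda row: row[0])
        self._partitions[pk] = rows

    def query(self, pk, sk_prefix=""):
        # The partition key selects the group of items; the sort key narrows it down.
        rows = self._partitions.get(pk, [])
        return [item for sk, item in rows if sk.startswith(sk_prefix)]


orders = ToyTable()
orders.put_item("customer#42", "order#2023-01-15", {"total": 30})
orders.put_item("customer#42", "order#2023-02-03", {"total": 12})
orders.put_item("customer#7", "order#2023-01-20", {"total": 99})

# All January 2023 orders of customer 42, found using only the two keys.
jan = orders.query("customer#42", "order#2023-01")
```

Note that nothing here can answer "all orders with a total above 50" without visiting every item. That is the kind of access that needs a secondary index.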
The main feature of DynamoDB is its scalability, which is practically unlimited for most workloads. It achieves this by partitioning tables across multiple storage nodes. The node for each item is determined by hashing the partition key. The mapping between partitions and storage nodes can change dynamically.
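The idea of hashing a key to pick a storage node can be sketched in a few lines. This is a simplification under stated assumptions: the hash function DynamoDB uses internally is not described here, so MD5 with a modulo is just a stand-in, and `partition_for`, `node_for`, and the node names are hypothetical.

```python
import hashlib


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key to a partition number (assumed scheme: MD5 + modulo)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical routing table: partition number -> storage node.
partition_to_node = {0: "node-a", 1: "node-b", 2: "node-c", 3: "node-a"}


def node_for(partition_key: str) -> str:
    """Find the node currently responsible for a key."""
    return partition_to_node[partition_for(partition_key, len(partition_to_node))]


# Keys never change partition, but a partition can be reassigned to another
# node by updating the routing table only.
partition_to_node[3] = "node-d"
```

Keeping the routing table separate from the hash is what lets the mapping change without touching the keys.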
Partitions receiving many requests are relocated to more powerful storage nodes. DynamoDB stores each item on three nodes in different availability zones for redundancy. The nodes are kept in sync using a Paxos-based algorithm for consensus and leader election.
The leader periodically sends heartbeats to the other replicas. If they don't detect a heartbeat for a certain period of time, the replicas elect a new leader and add a new replica to the group.
Writes always go to the leader replica, which generates a write-ahead log record and sends it to the other replicas. Once a quorum of replicas has persisted the record to their local write-ahead logs and acknowledged it, the write is acknowledged back to the user. The leader replica also serves strongly consistent reads when they are requested.
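The quorum rule is easier to see in code. The following is only a sketch of the counting logic, assuming a group of three replicas and a majority quorum; `Replica` and `replicated_write` are hypothetical names, and real consensus (Paxos rounds, retries, leader changes) is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    available: bool = True
    wal: list = field(default_factory=list)  # local write-ahead log

    def append(self, record) -> bool:
        """Persist a record to the local log; returns True as the acknowledgement."""
        if not self.available:
            return False
        self.wal.append(record)
        return True


def replicated_write(leader: Replica, followers: list, record) -> bool:
    """Acknowledge the write only if a majority of the group persisted it."""
    quorum = (1 + len(followers)) // 2 + 1
    if not leader.append(record):
        return False  # the leader itself must log the write
    acks = 1 + sum(1 for follower in followers if follower.append(record))
    return acks >= quorum


leader = Replica("leader")
followers = [Replica("replica-1"), Replica("replica-2", available=False)]
ok = replicated_write(leader, followers, {"pk": "customer#42", "total": 30})
```

With three replicas, the write succeeds even if one of them is unreachable, because two acknowledgements already form a quorum.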
DynamoDB archives each node's B-tree and write-ahead log to Amazon S3 to ensure data durability. When a node fails, the leader immediately adds a new node, which copies the failed node's B-tree and write-ahead log. Data integrity is protected through extensive use of checksums.
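As a rough illustration of how a checksum protects a log record, here is a small sketch. The choice of CRC32 and the record layout are assumptions made for the example, since the algorithm DynamoDB uses is not stated here; `seal` and `open_sealed` are hypothetical helpers.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 checksum (assumed layout)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def open_sealed(record: bytes) -> bytes:
    """Return the payload, or raise if it no longer matches its checksum."""
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: record is corrupted")
    return payload


record = seal(b"put customer#42 total=30")
payload = open_sealed(record)  # the intact record round-trips
```

Verifying the checksum every time a record is read, copied, or archived is what lets corruption be caught before it spreads to other replicas.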
DynamoDB has many strengths, but it isn't perfect. For small data sizes, a monolithic database might provide better performance. Also, you need a good understanding of your access patterns beforehand. This is crucial for effectively designing your tables.
For example, the best choice of partition and sort keys depends on how you access the data. DynamoDB has no joins, so if you need related data together, you'll need to denormalize it. If you want to filter on an attribute that isn't part of the partition or sort key, you need to create a secondary index.
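Conceptually, a secondary index is a second lookup structure keyed by a different attribute. The sketch below shows the idea with plain dictionaries; the `items` data and the `build_index` helper are hypothetical, and a real index is maintained by the service rather than rebuilt by hand.

```python
from collections import defaultdict

# Items addressed by their primary key: (partition key, sort key).
items = {
    ("customer#42", "order#1"): {"status": "shipped", "total": 30},
    ("customer#42", "order#2"): {"status": "pending", "total": 12},
    ("customer#7", "order#3"): {"status": "shipped", "total": 99},
}


def build_index(items: dict, attribute: str) -> dict:
    """Group primary keys by the value of a non-key attribute."""
    index = defaultdict(list)
    for key, attributes in items.items():
        if attribute in attributes:
            index[attributes[attribute]].append(key)
    return index


by_status = build_index(items, "status")

# Without the index, this would require scanning every item.
shipped = [items[key] for key in by_status["shipped"]]
```

The index costs extra storage and extra work on every write, which is one more reason to know your access patterns before designing the table.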
WebSockets
WebSocket is a technology that lets web browsers and servers talk to each other in real time. It's like an open phone line where either side can send a message right away. This makes WebSockets a perfect fit for real-time applications like chat rooms or online games.
With WebSockets, the server can send updates without the client requesting them. This efficient, low-latency and full-duplex communication relies on a single TCP connection.
Usually, you don't need to start from scratch to use them. Most programming languages already have a WebSocket implementation available. You create a WebSocket object in the client code and connect it to a supporting server. Once the connection is established, you can send and receive real-time messages.
This bidirectional channel makes communication efficient, reducing network traffic and latency. The WebSocket protocol is defined in RFC 6455 and is supported by all modern browsers. Only older browsers may require a fallback mechanism.
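If you are curious about what happens when the connection is established, RFC 6455 describes an opening handshake: the client sends an HTTP upgrade request with a random `Sec-WebSocket-Key` header, and the server proves it understood the request by answering with a `Sec-WebSocket-Accept` value derived from that key. Here is a small sketch of that derivation using only the Python standard library. The function name `accept_key` is mine; the fixed GUID and the sample key come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from the RFC 6455 handshake example.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

In practice, a WebSocket library does this for you. The sketch is only meant to show that the connection starts as ordinary HTTP before switching to the bidirectional channel.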
Interesting Tweets
Reading social media, you might think that if you don't work at a fancy company or get promoted quickly, you're a failure. But that's not true. Backgrounds and circumstances are different, and everybody has their own path. Link
Don’t worry, you are not weird if you don’t have a setup like this. 95% of the stuff in this photo is not necessary to work as a software engineer. Or at least I don’t have it :) Link
Is it necessary to know complex algorithms for day-to-day coding? I think it depends on the specific job. For example, I prototype computer vision algorithms so knowing complex algorithms and being comfortable with geometry is highly beneficial. Link