Hi Friends,
Welcome to the 25th issue of the Polymathic Engineer newsletter.
Today we will talk about:
Distributed databases architectures
Stacks and Queues data structures
Coding challenge
Interesting tweets
Distributed architectures
All distributed databasesIn a single leader architecture, a single database acts as a leader, receiving and applying all writes from clients. All other replicas (followers) can only receive and answer read requests.
The main benefit is that this topology avoids write conflicts, but a single leader also has some downsides and limitations.
First, the leader shall be able to handle all the writes. So the application should not be write-intensive.
Second, the system latency increase. The leader can only be close to some clients, and the overall round-trip times are higher.
Third, it's necessary to put in place a failover strategy in case the leader goes down. Failover strategies require ensuring the leader failed and deciding the new leader.
In a multi leader architecture, multiple databases act as leaders, receiving and applying the writes from clients.
The main benefit is that more databases perform writes and are closer to the clients. So the latency reduces, and the write throughput increases.
The drawback is that it's necessary to solve conflicts between different writes. There are two types of conflicts: between leaders and between followers.
Conflicts between leaders can be solved using logical clocks to keep them in synch. Conflicts between followers can be solved applying conflicting writes in the same order.
In a leaderless architecture, every database can accept writes and the clients write concurrently to many replicas. As soon the client gets a confirmation from some of them, a write is successful.
The benefit is that failures are tolerated more easily without failover strategies, but having no leader causes other issues.
First, the destination replicas can have staled data if some writes fail. So, clients need read data from several replicas concurrently to deal with this.
The replicas return a kind of version number together with the data. The clients use the version number to decide which data to use.
Second, it's necessary to bring in synch the replicas with staled data. There are 2 common techniques to synch replicas: read repair or background process.
Stacks and Queues
Stacks and Queues are 2 similar and opposite data structures. Let’s revise their key differences and have a look at they can be implemented.
A stack is an ordered collection where elements are added and removed from the same end. So stacks have a Last In First Out (LIFO) behavior. The last element placed inside is the first element to come out.
The operation of inserting an element into a stack is called pushing. Instead, removing an element from a stack is called popping. Another standard operation is peeking, which looks at the last inserted element.
These operations have usually O(1) time complexity, but the actual time complexity depends on the implementation. Many programming languages have built-in stacks available, so you don’t need to implement one.
Otherwise, a dynamic array is the most common and easiest way to implement a stack. The alternative way to implement a stack is using a linked list.
As a side note, stacks have a lot to do with recursion. When executing a recursive function, function calls are pushed on a stack. On each return statement, the current call is popped off the stack. he top of the stack is the current function call.
A queue is an ordered collection where elements are added and removed from opposite sides. So queues have a First In First Out (FIFO) behaviour.
The first element placed inside is the first element to come out. Adding/removing elements is called enqueuing / dequeuing. Many programming languages have a built-in queue available, so again you don’t need to implement one.
Otherwise, queues are trickier to implement than stacks to get O(1) time complexity for the operations. A dynamic array doesn't work because operations on the front are O(N).
An efficient way to implement a queue is using a linked list. The list shall maintain pointers to both the head and tail. The alternative is using a circular array. The array shall maintain the indexes of the first and last inserted elements.
A variant of the queue is a data structure called Deque. In a Deque, it's possible to add or delete elements from both ends. So a Deque can implement both a stack and a queue behaviour. The implementation of a Deque is similar to a standard queue.
Coding challenge
The coding challenge for this week is Spiral Matrix II.
Given a positive integer n
, the problem asks to generate a square matrix filled with elements from 1
to n^2
in spiral order. Here is an example with n = 3.
The best way to solve this problem is simulating the process of filling the matrix according to the given rules.
The key to write understandable code are two:
traverse the matrix using the left, right, upper, and lower boundaries
keep the boundaries updated while moving from the border to the center of the matrix
use the number of iterations: either the number of squares or the max value to write.
Here is a possible implementation:
The time complexity is O(N^2), while the space complexity is O(1).
Interesting Tweets
What's wrong is not reusing others' code, but reusing it without fully understanding it. The probability of introducing bugs or logical errors is very high. Link to the tweet.
Following this advice always helps me to fix stupid flaws before submitting a PR. Link to the tweet.
Only memories of the time spent together remain when our parents get older. It was another epoch, but I don't have many memories of me and my father. So I want my kids to have as many memories as possible of me. Link to the tweet.