Cassandra

A deep dive into the Apache Cassandra database.

Nov 25, 2023

Hi Friends,

Welcome to the 48th edition of the Polymathic Engineer newsletter, the first reserved for paid subscribers. Thanks for your trust and support. I hope you enjoy reading it.

This time, we will focus on one of the most popular NoSQL data stores: Apache Cassandra.

The outline will be as follows:

introduction
architecture
data model
partition and replication
automation and scalability
trade-offs
how to set up and use a cluster

Introduction

Cassandra is a popular NoSQL data store that was originally developed at Facebook and incorporates architectural ideas from Google's Bigtable and Amazon's Dynamo. It is built for scale, and some of its features only work on a multi-node cluster.

The largest Cassandra clusters have tens of thousands of nodes and store petabytes of data. Users of Cassandra include many big tech companies, such as Apple, Netflix, Uber, and Meta.

Architecture

The first thing to keep in mind is that Cassandra has a decentralized architecture in which all nodes in a cluster perform the same functions. Clients can connect to any node, and when they do, that node becomes the session coordinator for the client.
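To make the idea of identical nodes more concrete, here is a minimal toy sketch in plain Python. It is not Cassandra code and does not use any Cassandra API: the `Node` class, the `make_cluster` helper, and the hash-based placement rule are all hypothetical names and simplifications chosen for illustration. The only point it tries to show is that a client can talk to any node, and whichever node it picks coordinates the request.

```python
import hashlib


class Node:
    """A toy node. Every node runs exactly the same code: no leader, no special roles."""

    def __init__(self, name):
        self.name = name
        self.peers = []  # all nodes in the cluster, including this one
        self.data = {}   # rows stored locally on this node

    def _owner(self, key):
        # Hypothetical placement rule (an assumption for this sketch):
        # hash the key and pick exactly one node, with no replication.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.peers[int(digest, 16) % len(self.peers)]

    def write(self, key, value):
        # The node that receives the request acts as coordinator
        # and forwards the write to the node that owns the key.
        owner = self._owner(key)
        owner.data[key] = value
        return owner.name

    def read(self, key):
        # Same coordinator logic for reads; returns None if the key is unknown.
        return self._owner(key).data.get(key)


def make_cluster(names):
    """Build a cluster in which every node knows about every other node."""
    nodes = [Node(name) for name in names]
    for node in nodes:
        node.peers = nodes
    return nodes


cluster = make_cluster(["node-a", "node-b", "node-c"])
cluster[0].write("user:42", "Ada")   # the client connects to node-a
print(cluster[2].read("user:42"))    # a different client connects to node-c: prints "Ada"
```

Because every node uses the same placement rule, it does not matter which node the client contacts: the request ends up at the same owner either way.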
