The Polymathic Engineer

Share this post

Logarithmic power

newsletter.francofernando.com

Logarithmic power

Why logarithms are on of the most important math concepts for developers. And the process Facebook uses to encode 500K+ videos each month.

Franco Fernando
Jan 23
3
Share this post

Logarithmic power

newsletter.francofernando.com

Hi Friends,

Thanks for reading The Polymathic Engineer! Subscribe for free to receive new posts and support my work.

Welcome to the 9th edition of the Polymathic Engineer newsletter.

Today we will talk about:

  • the importance of logarithms in software development

  • the process used by Facebook to encode videos

  • coding challenge

  • three interesting tweets

The importance of logarithms

There are few math concepts as crucial for software developers as logarithms. So this is how they make it possible for an algorithm to run in 1 second instead of 1 day.

The following equation defines the logarithm function: logₖ(x) = y if and only if kʸ = x, where k is the base. Since the default base in programming is 2, let's consider k=2.

Some examples:

• log(4) = 2 since 2² = 4

• log(8) = 3 since 2³ = 8

• log(32) = 5 since 2⁵ = 32

• log(1024) = 10 since 2¹⁰= 1024

• ...

• log(4294967296) = 32

The takeaway is that when y increases by one, x doubles. Consequently, when x gets big (i.e., 4 billion), y remains relatively tiny (i.e., 32). This property is relevant for algorithms and programs.

Indeed each algorithm performs a certain number of basic operations like increasing a pointer, adding two numbers, or accessing an element in an array.

If an algorithm runs in linear time, the number of performed basic operations increases linearly with its input. For example, traversing an array of 8 numbers involves eight operations. Likewise, traversing an array of 64 numbers involves 64 operations.

If an algorithm runs in logarithmic time, the number of performed operations increases differently. For example, repeatedly splitting an array of 8 numbers in half and discarding half takes three operations. However, if the array has 64 numbers, it takes six operations.

If a basic operation takes one nanosecond to perform and the input size is 1 billion, then:

• an O(N) algorithm completes in 1 billion nanoseconds

• an O(logN) algorithm completes in ~30 nanoseconds

The first algorithm takes ~1 day and the second end in a heartbeat.

How Facebook encodes videos

People upload 500K+ videos to Facebook every day. Here is a description of Facebook's process to encode such a massive amount of videos.

The main requirement is to deliver the uploaded videos with high quality and no buffering. For this to happen, Facebook uses two main ingredients:

  • multiple video codecs to encode/decode videos

  • Adaptive Bitrate Streaming to deliver videos

Codecs

Video codecs are techniques to compress and decompress digital video files. Without using them, it would be unfeasible transmitting video data over a network because of its size. Codecs differ in compression efficiency, visual quality, and required computing power. Examples of codecs are H264 (AVC), MPEG-1, and VP9.

Advanced codecs like VP9 provide better compression but require more power. For example, Facebook's encoding system uses advanced codecs to compress video files as much as possible.

Adaptive Bitrate Streaming

Adaptive Bitrate Streaming (ABR) is a technique to deliver videos to clients. The critical step is to encode the videos into multiple resolutions (i.e., 480p, 720p, 1080p). The client's player can then switch between streaming the different encodings.

To do this, the player detects the client's bandwidth and CPU capacity in real-time. It then changes encoding depending on the available resources. As a result, ABR is way better than Progressive Streaming, which streams a single file over the network to the client.

Encoding process

A key question for Facebook is assigning priorities to incoming encoding requests. The idea is to apply more compute-intensive codecs to the most watched videos. To decide which are those videos, Facebook uses a model that relies on two metrics: benefit and cost.

The benefit quantifies the advantages end users will get from advanced encoding, and it's the product of the codec compression efficiency and the predicted watch time. The first is an intrinsic property of each codec. The latter estimates the time a video will be watched and is computed by an ML model.

The cost measures the CPU resources to make all the encoding resolutions available. Indeed, not all encoding jobs require the same resolutions to be considered deliverable. Therefore, the priority of each incoming request corresponds to the benefit divided by the cost.

For more details about the used ML algorithms you can refer to this article on the Facebook’s engineering blog.

Coding Challenge

The coding challenge of last week was a popular interview question asked at Amazon and Facebook: Find Pivot Index.

The problem takes an array of integers as input and asks for its pivot index. The pivot is the lower index where the sum of all the numbers strictly to the left is equal to the sum strictly to the right. If there is no such index, the returned result should be -1.

A great way to reformulate the problem is using the concept of the prefix sum. Indeed the searched index is the lower index i, such as the prefix sum of the element before i is equal to the postfix sum after i.

Prefix and postfix sum can be calculated while iterating through the input array. The prefix sum is the sum of all the elements preceding the current one. The postfix sum is the difference between the sum of all the elements and the prefix sum plus the current element.

Here is the code for the solution.

The coding challenge for the next week is Palindrome Linked List.

Interesting Tweets

I decided to share this tweet for two reasons. First, the link to the free book is available on FreeCodeCamp. Second, because the message is profoundly true, I studied computer science but was not too fond of programming until I didn't take it seriously and started becoming proficient. The more I improved, the more I became passionate.

Twitter avatar for @rpandey1234
Rahul Pandey (joinTaro.com) @rpandey1234
Started going through the "How to Learn To Code" book by @ossia. Already found this gem. You don't just stumble into your passion -- you have to develop it with hard work and persistence. Book is completely free through @freeCodeCamp: freecodecamp.org/news/learn-to-…
Image
5:30 PM ∙ Jan 18, 2023
73Likes14Retweets

There are many misconceptions in software development, which is a good list of the main ones. Nevertheless I’d add two more:

  • Software development means only coding

  • communication is not essential in software development

Twitter avatar for @dmokafa
Daniel Moka⚡ @dmokafa
Common misconceptions in software development: - thinking TDD is equal to testing - trying to be agile by using Scrum - coming up with exact estimations - using microservices for every problem - implementing DevOps by having a separate DevOps team What else would you add?
11:59 AM ∙ Jan 21, 2023
379Likes73Retweets

Thanks for reading The Polymathic Engineer! Subscribe for free to receive new posts and support my work.

Share this post

Logarithmic power

newsletter.francofernando.com
Comments
TopNewCommunity

No posts

Ready for more?

© 2023 Franco Fernando
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great writing