Posts

How to Design YouTube in a System Design Interview: Learn the Thinking, Not Just the Diagram

Designing YouTube is one of the most popular system design interview questions because it tests almost everything: storage, scalability, CDN, async processing, databases, caching, APIs, reliability, and recommendations.

But most candidates make one mistake. They start drawing boxes too early. They say:

Client → Load Balancer → API Gateway → Video Service → Database → S3 → CDN

That is not wrong, but it is not enough. A strong system design interview answer is not about memorizing architecture. It is about showing how you think. In this blog, we will design a YouTube-like system in a way that helps you crack the interview and build the right engineering mindset.

1. Start With the Interview Mindset

When an interviewer says: Design YouTube. They are not expecting you to design all of YouTube in 45 minutes. They want to see whether you can:

- Clarify ambiguous requirements
- Identify the core user flows
- Estimate scale
- Separate read-heavy and write-heavy paths
- Choose the right ...

Concurrency Control from First Principles

3 strategies on a single node, 3 strategies across multiple nodes, with real-life examples, production patterns, and future-ready guidance

Credits / Acknowledgements

This blog is based on detailed discussions and whiteboarding sessions with Sourabh Kumar Banka and Jatin Goyal.

Why you should care (even if things “work fine” today)

Race conditions don’t usually show up in development. They show up when:

- traffic spikes,
- retries kick in,
- background jobs overlap,
- autoscaling adds more instances,
- latency increases (so overlaps happen more often).

If you’re building modern systems (cloud, microservices, async workflows, distributed caches), you’re going to face concurrency whether you like it or not.

The core idea to remember: Every correct system serializes updates somewhere. Your design decision is where that serialization happens and what trade-offs you accept.

First principles: What is a race condition?

A race condition exists when all three are true:

- Shared mutable state: Something...

Concurrency Control from First Principles

3 Strategies on a Single Node, 3 Across Multiple Nodes, with Real-Life Analogies and Future-Ready Patterns

Credits / Acknowledgements

This article is based on deep technical discussions and whiteboarding sessions with Sourabh Kumar Banka and Jatin Goyal.

Why This Matters (Now and in the Future)

Most real production failures are not caused by wrong business logic. They are caused by incorrect ordering of updates. As systems scale (microservices, distributed caches, cloud-native deployments, async retries, autoscaling), concurrency issues increase, not decrease.

If you remember only one thing from this article: Concurrency control is about deciding where updates become ordered (serialized), and intentionally paying the right trade-off.

Every correct system enforces order somewhere:

- Database
- Application
- Distributed coordinator
- Event log
- Workflow engine

If you don’t choose where, contention will choose for you.

First Principles: Why...

Concurrency Control from First Principles

3 ways on a single node, 3 ways across multiple nodes, with real-life examples

Race conditions aren’t a “database problem” or a “threading problem”. They’re a physics-of-computing problem: two actors try to change the same thing, and time doesn’t give you a single, obvious order.

This blog explains concurrency using first principles, then maps that to the 3 most common strategies on a single node and the 3 most common strategies in a distributed (multi-node) system, with practical examples you can reuse.

First principles: what causes a race condition?

A race condition exists when all three are true:

- Shared mutable state: Something can be changed (a DB row, cache entry, file, in-memory map).
- Concurrent actors: Two+ threads/processes/nodes can touch it “at the same time”.
- Non-atomic read → compute → write: The update is not a single indivisible step.

The classic shape is:

1. Read current state
2. Compute new state
3. Write new state

If two actors do this concurrently, you can violate invariant...
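
The read → compute → write shape above can be made concrete with a small Python sketch. This is an illustration only, not code from the post: the `Counter` class and the functions `lost_update_demo` and `locked_demo` are hypothetical names chosen for this example. The first function replays one unlucky interleaving by hand so the lost update is deterministic; the second assumes a single process and uses `threading.Lock` to make each increment one indivisible step.

```python
import threading


class Counter:
    """Shared mutable state: a single integer that several actors update."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment_locked(self):
        # Read, compute, and write happen while holding the lock,
        # so no other thread can interleave between the steps.
        with self._lock:
            current = self.value      # read current state
            new_value = current + 1   # compute new state
            self.value = new_value    # write new state


def lost_update_demo():
    """Replay a bad interleaving of two actors by hand (no threads needed)."""
    counter = Counter()
    a_seen = counter.value        # actor A reads 0
    b_seen = counter.value        # actor B reads 0 before A has written
    counter.value = a_seen + 1    # actor A writes 1
    counter.value = b_seen + 1    # actor B writes 1, overwriting A's update
    return counter.value


def locked_demo(n_threads=8, per_thread=1000):
    """Run many concurrent increments; the lock serializes every update."""
    counter = Counter()

    def worker():
        for _ in range(per_thread):
            counter.increment_locked()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

In the hand-replayed interleaving two increments leave the counter at 1 instead of 2, which is the invariant violation the three conditions describe. With the lock, the updates are serialized inside the process, so `locked_demo()` ends at `n_threads * per_thread`.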