Back-of-Envelope Estimation

🎒 Explain Like I'm 5: Planning a School Trip

Imagine your class is going on a field trip! Before you go, you need to figure out how many buses, lunch boxes, and water bottles you need!

🚌 The Problem:

  • Your school has 1,000 students
  • Each bus holds 50 students
  • Each student needs 1 lunch box and 2 water bottles

🤔 Let's Calculate!

  • Buses needed: 1,000 students ÷ 50 per bus = 20 buses
  • Lunch boxes: 1,000 students × 1 box = 1,000 lunch boxes
  • Water bottles: 1,000 students × 2 bottles = 2,000 water bottles

📚 The Lesson:

This is what engineers do! They estimate what resources they need BEFORE building a system. Instead of buses and lunch boxes, they calculate servers, storage, and bandwidth!

What is Back-of-Envelope Estimation?

Back-of-envelope estimation is a quick, approximate calculation used to size a system's requirements. The name comes from doing the math on whatever scrap of paper is handy: you don't need perfect accuracy, just ballpark numbers that are good enough to make informed decisions!

Why is it Important?

  • Makes system design interviews easier - shows you can think through problems
  • Helps estimate infrastructure costs before building
  • Identifies potential bottlenecks early
  • Guides technology choices (SQL vs NoSQL, caching needs, etc.)

Power of 2 Table (Memorize This!)

Computer memory and storage use powers of 2. These numbers are essential!

2^10 = 1 KB (1,024 bytes)
2^20 = 1 MB (1,048,576 bytes)
2^30 = 1 GB (1,073,741,824 bytes, ~1 billion)
2^40 = 1 TB (1,099,511,627,776 bytes, ~1 trillion)
2^50 = 1 PB (1,125,899,906,842,624 bytes, ~1 quadrillion)
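
To see how far the "round" decimal figures drift from the exact powers of 2, here is a small Python sketch. The helper name binary_vs_decimal is made up for this illustration.

```python
# Compare each power of 2 with the decimal shortcut (1 KB ~ 10^3, 1 MB ~ 10^6, ...).
UNITS = [("KB", 10), ("MB", 20), ("GB", 30), ("TB", 40), ("PB", 50)]


def binary_vs_decimal(exponent: int) -> float:
    """Return how much larger 2**exponent is than its decimal shortcut, in percent."""
    decimal = 10 ** (3 * exponent // 10)  # 2^10 -> 10^3, 2^20 -> 10^6, ...
    return (2 ** exponent / decimal - 1) * 100


for name, exponent in UNITS:
    print(f"2^{exponent} = 1 {name} = {2 ** exponent:,} bytes "
          f"(+{binary_vs_decimal(exponent):.1f}% vs decimal)")
```

The gap grows from about 2.4% at 1 KB to about 12.6% at 1 PB. That is small enough to ignore in a ballpark estimate, but worth knowing about when the numbers get big.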

Latency Numbers Every Programmer Should Know

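Understand roughly how long different operations take (typical order-of-magnitude figures; exact values vary by hardware):

L1 cache: 0.5 ns
L2 cache: 7 ns
RAM: 100 ns
SSD (random read): 150 μs
HDD (seek): 10 ms
Network (same datacenter, round trip): 0.5 ms
Network (cross-continent, round trip): 150 ms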
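
The raw numbers are easier to remember as ratios. The sketch below converts the table above to nanoseconds and compares everything against one RAM access. The dictionary and the helper times_slower are illustrative names, not part of any library.

```python
# Latencies from the table above, all converted to nanoseconds.
LATENCY_NS = {
    "L1 cache": 0.5,
    "L2 cache": 7,
    "RAM": 100,
    "SSD": 150_000,                            # 150 microseconds
    "Network (same datacenter)": 500_000,      # 0.5 ms
    "HDD": 10_000_000,                         # 10 ms
    "Network (cross-continent)": 150_000_000,  # 150 ms
}


def times_slower(operation: str, baseline: str = "RAM") -> float:
    """Return how many baseline operations fit in one `operation`."""
    return LATENCY_NS[operation] / LATENCY_NS[baseline]


for operation in LATENCY_NS:
    print(f"{operation:<28} {times_slower(operation):>14,.3f}x a RAM access")
```

With these figures, one SSD read costs about 1,500 RAM accesses and one cross-continent round trip costs about 1.5 million. That is the intuition behind caching in memory and keeping data close to users.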

Traffic Estimation

Step 1: Calculate Requests Per Second

Example: Twitter-like App

Daily Active Users (DAU): 200 million
Each user reads 100 tweets per day
Each user posts 2 tweets per day

Read Requests:

200M users × 100 reads = 20 billion reads/day
20B ÷ 86,400 seconds = ~230,000 reads/second
Peak traffic (3x): ~700,000 reads/second

Write Requests:

200M users × 2 writes = 400 million writes/day
400M ÷ 86,400 seconds = ~4,600 writes/second
Peak traffic (3x): ~14,000 writes/second
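
The same arithmetic as a small Python sketch. The function name requests_per_second and the 3x default peak factor are assumptions taken from this example, not fixed rules.

```python
SECONDS_PER_DAY = 86_400


def requests_per_second(dau: int, actions_per_user: int, peak_factor: float = 3.0):
    """Return (average, peak) requests per second for one kind of action."""
    daily_requests = dau * actions_per_user
    average = daily_requests / SECONDS_PER_DAY
    return average, average * peak_factor


# Twitter-like app: 200M DAU, 100 reads and 2 writes per user per day.
read_avg, read_peak = requests_per_second(200_000_000, 100)
write_avg, write_peak = requests_per_second(200_000_000, 2)

print(f"Reads:  {read_avg:,.0f}/sec average, {read_peak:,.0f}/sec peak")
print(f"Writes: {write_avg:,.0f}/sec average, {write_peak:,.0f}/sec peak")
```

The unrounded results are roughly 231,000 and 694,000 reads/second and 4,600 and 13,900 writes/second, which round to the ballpark figures above.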

Storage Estimation

Step 2: Calculate Storage Needs

Continuing Twitter Example:

400 million tweets per day
Average tweet: 300 bytes (text)
10% of tweets have images (500KB average)
Keep data for 5 years

Text Storage:

400M tweets × 300 bytes = 120GB/day
120GB × 365 days = 43.8TB/year
43.8TB × 5 years = 219TB

Image Storage:

400M × 10% × 500KB = 20TB/day
20TB × 365 days = 7.3PB/year
7.3PB × 5 years = 36.5PB

Total Storage: 219TB + 36.5PB ≈ 37PB for 5 years (raw data, before replication)
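
Here is the storage math as a Python sketch. It assumes decimal units (1 KB = 1,000 bytes), which is what the arithmetic above uses, and the constant names are made up for this example.

```python
# Decimal units keep the mental math simple: 1 KB = 1,000 bytes.
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15

TWEETS_PER_DAY = 400_000_000
TEXT_BYTES = 300          # average tweet text
IMAGE_PERCENT = 10        # share of tweets that carry an image
IMAGE_BYTES = 500 * KB    # average image size
RETENTION_DAYS = 365 * 5  # keep data for 5 years

text_per_day = TWEETS_PER_DAY * TEXT_BYTES
image_per_day = TWEETS_PER_DAY * IMAGE_PERCENT // 100 * IMAGE_BYTES

text_total = text_per_day * RETENTION_DAYS
image_total = image_per_day * RETENTION_DAYS
total = text_total + image_total

print(f"Text:   {text_per_day / GB:,.0f} GB/day, {text_total / TB:,.0f} TB in 5 years")
print(f"Images: {image_per_day / TB:,.0f} TB/day, {image_total / PB:,.1f} PB in 5 years")
print(f"Total:  {total / PB:,.1f} PB in 5 years")
```

Images dominate: the text is less than 1% of the total, so a better estimate of image size matters far more than a better estimate of tweet length.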

Bandwidth Estimation

Step 3: Calculate Network Bandwidth

Read Bandwidth:

700,000 reads/sec × 300 bytes = 210MB/sec
With 10% images: 210MB/sec + (70K images/sec × 500KB) = ~35GB/sec

Write Bandwidth:

14,000 writes/sec × 300 bytes = 4.2MB/sec
With 10% images: 4.2MB/sec + (1.4K images/sec × 500KB) = ~700MB/sec

Total Peak Bandwidth: ~36GB/sec (about 35GB/sec outgoing for reads + about 0.7GB/sec incoming for writes)
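
The bandwidth steps can be sketched in Python like this. The function name peak_bandwidth and its defaults are assumptions carried over from this example, and units are decimal (1 KB = 1,000 bytes).

```python
KB, MB, GB = 10**3, 10**6, 10**9


def peak_bandwidth(requests_per_sec: int, text_bytes: int = 300,
                   image_percent: int = 10, image_bytes: int = 500 * KB) -> int:
    """Return bytes per second: text for every request plus images for a share of them."""
    text = requests_per_sec * text_bytes
    images = requests_per_sec * image_percent // 100 * image_bytes
    return text + images


read_bw = peak_bandwidth(700_000)   # peak reads/sec from Step 1
write_bw = peak_bandwidth(14_000)   # peak writes/sec from Step 1
total_bw = read_bw + write_bw

print(f"Read:  {read_bw / GB:.2f} GB/sec")
print(f"Write: {write_bw / MB:.1f} MB/sec")
print(f"Total: {total_bw / GB:.1f} GB/sec = {total_bw * 8 / GB:.0f} Gbit/sec")
```

The last line converts bytes to bits (× 8), because network links are usually rated in bits per second.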

Memory/Cache Estimation

Step 4: Estimate Cache Size

80/20 Rule:

80% of traffic goes to 20% of content. Cache the hot 20%!

Cache popular tweets:

Daily tweets: 400M
Cache 20%: 80M tweets
80M × 300 bytes = 24GB

Budget ~25-30GB of cache memory: 24GB of tweet text plus headroom for keys and per-entry overhead
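
A minimal Python sketch of the cache sizing follows. The 20% overhead for keys and bookkeeping is an assumption added for this illustration, and the function name cache_size_bytes is hypothetical.

```python
GB = 10**9


def cache_size_bytes(items_per_day: int, item_bytes: int,
                     hot_percent: int = 20, overhead_percent: int = 0) -> int:
    """Return bytes needed to cache the hot share of one day's items."""
    hot_items = items_per_day * hot_percent // 100
    raw = hot_items * item_bytes
    return raw * (100 + overhead_percent) // 100


text_cache = cache_size_bytes(400_000_000, 300)
with_overhead = cache_size_bytes(400_000_000, 300, overhead_percent=20)  # assumed overhead

print(f"Hot tweets only: {text_cache / GB:.1f} GB")
print(f"With overhead:   {with_overhead / GB:.1f} GB")
```

With the assumed 20% overhead the result is 28.8GB, inside the 25-30GB range quoted above.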

Tips for Estimation

💡 Always round numbers - 1 million is easier than 987,654
💡 Start with Daily Active Users (DAU) - almost everything derives from this
💡 Assume 100K seconds per day to simplify division (actual: 86,400)
💡 Remember the 80/20 rule for caching
💡 Assume peak traffic is 2-3x average
💡 Write down assumptions clearly - interviewers want to see your thinking
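
How much accuracy does the 100K-seconds shortcut cost? A quick Python check, using the read traffic from the Twitter example (variable names are illustrative):

```python
ACTUAL_SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000

daily_reads = 20_000_000_000  # 200M users x 100 reads

exact = daily_reads / ACTUAL_SECONDS_PER_DAY
quick = daily_reads / ROUNDED_SECONDS_PER_DAY
underestimate_percent = (1 - quick / exact) * 100

print(f"Exact: {exact:,.0f} reads/sec")
print(f"Quick: {quick:,.0f} reads/sec")
print(f"The shortcut underestimates by {underestimate_percent:.1f}%")
```

The shortcut comes in about 14% low. That is much smaller than the uncertainty in the 2-3x peak factor, so it is a fair trade for mental math.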

Common Mistakes to Avoid

  • Forgetting to account for replication (typically 3x storage)
  • Not considering peak traffic (often 2-3x average)
  • Mixing up bits and bytes (8 bits = 1 byte)
  • Forgetting metadata overhead (add 20-30% to raw data size)
  • Not accounting for compression (can reduce text storage by 60-70%; already-compressed images and video shrink far less)
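
To show how these adjustments stack up, here is a hedged Python sketch that applies them to the 219TB of raw tweet text from the storage example. The function provisioned_storage and its default values are assumptions picked from the ranges above, not a standard formula.

```python
def provisioned_storage(raw: float, replication: int = 3,
                        metadata_overhead: float = 0.25,
                        compression_saving: float = 0.0) -> float:
    """Scale a raw storage estimate by compression, metadata and replication.

    All three factors are multiplicative, so the order they are applied in
    does not change the result.
    """
    compressed = raw * (1 - compression_saving)
    with_metadata = compressed * (1 + metadata_overhead)
    return with_metadata * replication


raw_text_tb = 219  # 5 years of tweet text from the storage example

no_compression = provisioned_storage(raw_text_tb)
with_compression = provisioned_storage(raw_text_tb, compression_saving=0.6)

print(f"Raw:                         {raw_text_tb} TB")
print(f"3x replicas + 25% metadata:  {no_compression:.1f} TB")
print(f"...and 60% text compression: {with_compression:.1f} TB")
```

Under these assumptions the 219TB becomes about 821TB without compression and about 329TB with it. Forgetting any one of these factors can shift the answer several times over.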

Practice Problems

1. Estimate storage for Instagram (500M DAU, 20% post photos daily)
2. Calculate bandwidth for YouTube (2B users, 1 hour watch time/day)
3. Estimate cache size for Reddit (50M DAU, 100 page views/day)
4. Calculate requests/sec for WhatsApp (2B users, 40 messages/day)