
Distributed Systems

Explain Like I'm 5...

Imagine you and your friends are working on a BIG group project - building a huge LEGO castle! But instead of everyone sitting at the same table, each friend works at their own house on different parts:

The Group Project:

  • Sarah builds the castle towers at her house
  • Mike makes the walls at his house
  • Emma creates the gate at her house
  • They all share their progress using video calls!
  • Even if one friend gets sick, others can keep working!

Why Is This Useful?

  • Too big for one person - need many computers working together!
  • If one computer breaks, others keep the system running
  • Faster! Multiple computers do work at the same time
  • Users around the world get fast responses from nearby computers

What Are Distributed Systems?

A distributed system is a collection of independent computers that appear to users as a single coherent system. These computers communicate and coordinate their actions by passing messages to one another.

  • Multiple computers working together
  • Connected through a network
  • Coordinate by message passing
  • Appear as a unified system to users

Distributed System Architecture

                                Load Balancer
                                      |
               +----------------------+----------------------+
               |                      |                      |
          [Server 1]             [Server 2]             [Server 3]
               |                      |                      |
          +----+-----+           +----+-----+           +----+-----+
          |          |           |          |           |          |
      [Cache 1] [DB Replica] [Cache 2] [DB Replica] [Cache 3] [DB Primary]
          |          |           |          |           |          |
          +----------+-----------+----------+-----------+----------+
                           |                      |
                    [Message Queue]      [Service Discovery]
                           |                      |
                      +----+----+            [ZooKeeper]
                      |         |
                 [Worker 1] [Worker 2]
                      |         |
                 [Analytics] [Logging]

CAP Theorem

The CAP theorem states that a distributed system can only guarantee TWO of these three properties at the same time:

Consistency

All nodes see the same data at the same time. Every read receives the most recent write.

Availability

Every request to a non-failing node receives a response, even if that response may not contain the most recent write. The system stays operational for clients.

Partition Tolerance

The system continues to operate despite network failures that partition the system into isolated groups.

                    Consistency (C)
                         /\
                        /  \
                       /    \
                      /  CA  \
                     /  (Not  \
                    / Realistic)\
                   /_____________\
                  /       |       \
                 /   CP   |   AP   \
                /  Systems| Systems \
               /   (e.g.  | (e.g.   \
              /  MongoDB) |Cassandra)\
             /__________________________\
    Availability (A)              Partition Tolerance (P)

    You can only pick TWO!

The Trade-off:

  • CP Systems (Consistency + Partition Tolerance): May become unavailable during network issues. Examples: MongoDB, HBase, Redis
  • AP Systems (Availability + Partition Tolerance): May return stale data during network issues. Examples: Cassandra, CouchDB, DynamoDB
  • CA Systems (Consistency + Availability): Only work in single-location systems, not truly distributed. Examples: Traditional RDBMS
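
The trade-off can be made concrete with a toy sketch (not tied to any real database): a CP-style write commits only when a majority of replicas acknowledge it and is refused when a partition leaves the writer in the minority, while an AP-style write is acknowledged with whatever replicas are reachable and risks divergence. Everything here, including the replica names, is illustrative.

    # Toy model of the CP vs AP choice during a network partition.
    class Replica:
        def __init__(self, name):
            self.name = name
            self.value = None
            self.reachable = True   # False = cut off by the partition

    def write_cp(replicas, value):
        """CP style: commit only if a majority of replicas acknowledge."""
        reachable = [r for r in replicas if r.reachable]
        if len(reachable) <= len(replicas) // 2:
            return False            # refuse the write: unavailable, but consistent
        for r in reachable:
            r.value = value
        return True

    def write_ap(replicas, value):
        """AP style: acknowledge after writing to whatever replicas we can reach."""
        for r in replicas:
            if r.reachable:
                r.value = value
        return True                 # always available, but replicas may now disagree

    replicas = [Replica("A"), Replica("B"), Replica("C")]
    replicas[1].reachable = False   # partition cuts us off from B and C
    replicas[2].reachable = False

    print(write_cp(replicas, "x=1"))  # False: only 1 of 3 replicas reachable
    print(write_ap(replicas, "x=1"))  # True: A now diverges from B and C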

Consistency Models

Strong Consistency

After an update, all subsequent reads will return the updated value. Highest consistency but may sacrifice availability and performance.

Eventual Consistency

If no new updates are made, eventually all reads will return the last updated value. Better availability and performance, but temporary inconsistencies possible.
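
A minimal sketch of the idea, using three hypothetical in-memory replicas: a write lands on one replica immediately, replication to the others is applied later, and a read served before replication catches up can return stale data.

    # Three in-memory replicas of a single key and a queue of pending updates.
    replicas = [{"x": 0}, {"x": 0}, {"x": 0}]
    pending = []   # queued replication events: (replica_index, key, value)

    def write(key, value):
        """Apply the write to replica 0 now; replicate to the others later."""
        replicas[0][key] = value
        for i in (1, 2):
            pending.append((i, key, value))

    def read(replica_index, key):
        return replicas[replica_index][key]

    def run_replication():
        """Anti-entropy step: apply all queued updates to the lagging replicas."""
        while pending:
            i, key, value = pending.pop(0)
            replicas[i][key] = value

    write("x", 42)
    print(read(0, "x"))   # 42 - the replica that took the write
    print(read(2, "x"))   # 0  - stale read before replication catches up
    run_replication()
    print(read(2, "x"))   # 42 - the replicas have converged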

Causal Consistency

Operations that are causally related are seen by all nodes in the same order. Provides a middle ground between strong and eventual consistency.

Consensus Algorithms

Algorithms that help distributed systems agree on a single value or state, even when some nodes fail:

Paxos

Classic consensus algorithm that's proven correct but complex to implement. Uses proposers, acceptors, and learners to reach agreement.

Raft

Easier-to-understand alternative to Paxos. Uses leader election and log replication. Popular in modern systems like etcd and Consul.

Raft Consensus - Leader Election

    Normal Operation:

    [Leader] --heartbeat--> [Follower]
    [Leader] --heartbeat--> [Follower]
    [Leader] <-----ack----- [Follower]

    Leader Failure:

    [FAILED] X    [Candidate] --request vote--> [Follower]
                  [Candidate] <--grant vote---- [Follower]
                  [Candidate] wins majority, becomes [NEW LEADER]
                  [NEW LEADER] --heartbeat--> [Follower]

    Log Replication:

    Client --> [Leader] --append entries--> [Follower]
               [Leader] --append entries--> [Follower]
               [Leader] <-------ack-------- [Follower]
               [Leader] commits entry --success--> Client
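
The election flow above can be sketched in a few lines. This is a heavily simplified, single-process simulation of one election round (randomized timeouts, a term increment, and a majority vote); real Raft also enforces one vote per term, compares log completeness before granting votes, and runs over the network.

    import random

    class Node:
        def __init__(self, name):
            self.name = name
            self.state = "follower"
            self.term = 0
            self.voted_for = None
            # Randomized election timeouts reduce the chance of split votes.
            self.election_timeout = random.uniform(150, 300)   # milliseconds

    def request_vote(voter, candidate):
        """Grant the vote when the candidate's term is newer than the voter's."""
        if candidate.term > voter.term:
            voter.term = candidate.term
            voter.voted_for = candidate.name
            return True
        return False

    def run_election(nodes):
        # The node whose timeout fires first becomes the candidate.
        candidate = min(nodes, key=lambda n: n.election_timeout)
        candidate.state = "candidate"
        candidate.term += 1
        candidate.voted_for = candidate.name
        votes = 1                          # the candidate votes for itself
        for node in nodes:
            if node is not candidate and request_vote(node, candidate):
                votes += 1
        if votes > len(nodes) // 2:        # majority -> leader for this term
            candidate.state = "leader"
        return candidate

    nodes = [Node("n1"), Node("n2"), Node("n3")]
    leader = run_election(nodes)
    print(leader.name, leader.state, "term", leader.term)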

Service Discovery & Coordination

Apache ZooKeeper

Centralized service for maintaining configuration, naming, synchronization, and group services. Provides distributed coordination primitives.

HashiCorp Consul

Service mesh solution providing service discovery, health checking, and key-value store. Built on Raft consensus.
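
As a concrete illustration, here is a hedged sketch of service registration and lookup on ZooKeeper using the third-party kazoo client. The ZooKeeper address, the /services/user-service path, and the instance address are assumptions for the example. Each instance registers an ephemeral sequential node, so its entry disappears automatically if the process crashes or its session expires.

    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")
    zk.start()

    # Registration: create an ephemeral, sequential znode under the service path.
    # Ephemeral nodes vanish when the session ends, so dead instances drop out.
    zk.ensure_path("/services/user-service")
    zk.create(
        "/services/user-service/instance-",
        b"10.0.0.12:8080",        # the address this instance serves on
        ephemeral=True,
        sequence=True,
    )

    # Discovery: list the registered instances and read each one's address.
    for child in zk.get_children("/services/user-service"):
        data, _stat = zk.get("/services/user-service/" + child)
        print(child, "->", data.decode())

    zk.stop()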

Microservices Architecture

Architectural style where applications are composed of small, independent services that communicate over a network:

  • Each service runs in its own process
  • Services communicate via lightweight protocols (HTTP/REST, gRPC, message queues)
  • Services can be deployed independently
  • Each service can use different technology stacks

Benefits:

  • Independent scaling of services
  • Easier to understand and modify
  • Fault isolation - one service failure doesn't crash entire system
  • Technology diversity - use best tool for each job

Microservices Architecture

                      API Gateway
                          |
        +-----------------+------------------+
        |                 |                  |
   [User Service]   [Product Service]  [Order Service]
        |                 |                  |
    [User DB]        [Product DB]       [Order DB]
        |                 |                  |
        +--------[Message Queue]-------------+
                          |
                    [Email Service]
                          |
                     [SMTP Server]
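
A minimal sketch of the Order Service in the diagram calling the User Service over HTTP. It assumes Flask and requests are installed and that a User Service listens at the hypothetical address http://localhost:5001. The point is the shape: small independent processes, a lightweight protocol between them, and a timeout plus fallback so one slow dependency cannot hang the caller.

    from flask import Flask, jsonify
    import requests

    app = Flask(__name__)

    # Hypothetical address of the separately deployed User Service.
    USER_SERVICE_URL = "http://localhost:5001"

    @app.route("/orders/<int:order_id>")
    def get_order(order_id):
        order = {"id": order_id, "items": ["book"], "user_id": 42}
        try:
            # Call the sibling microservice over plain HTTP with a short timeout.
            resp = requests.get(
                f"{USER_SERVICE_URL}/users/{order['user_id']}", timeout=2.0
            )
            resp.raise_for_status()
            order["user"] = resp.json()
        except requests.RequestException:
            # Graceful degradation: return the order without user details
            # instead of failing the whole request.
            order["user"] = None
        return jsonify(order)

    if __name__ == "__main__":
        app.run(port=5002)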

Fault Tolerance & Redundancy

Techniques to keep systems running despite component failures:

Replication

Keep multiple copies of data across different nodes. If one fails, others continue serving requests.

Health Checks & Heartbeats

Regular signals from nodes to confirm they're alive. Missing heartbeats trigger failover procedures.
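
A tiny sketch of heartbeat-based failure detection; the node names and the 3-second timeout are arbitrary choices for illustration. Nodes report in periodically, and any node whose last heartbeat is older than the timeout is treated as failed and handed off to a failover routine.

    import time

    HEARTBEAT_TIMEOUT = 3.0    # seconds of silence before a node is suspected dead
    last_seen = {}             # node name -> timestamp of its most recent heartbeat

    def record_heartbeat(node):
        last_seen[node] = time.time()

    def detect_failures():
        """Return the nodes whose heartbeats are older than the timeout."""
        now = time.time()
        return [n for n, ts in last_seen.items() if now - ts > HEARTBEAT_TIMEOUT]

    record_heartbeat("node-a")
    record_heartbeat("node-b")   # suppose node-b never reports again after this

    # In a real system this check runs on a schedule and triggers failover.
    print(detect_failures())     # [] now; a node appears once it is silent for ~3 seconds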

Circuit Breaker

Stop calling a failing service to prevent cascading failures. Like a circuit breaker in your house protecting from electrical overload.
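
A minimal circuit breaker sketch; the threshold and cooldown values are illustrative. After a run of consecutive failures the breaker opens and calls fail fast, and after the cooldown it lets one trial call through, closing again only if that call succeeds.

    import time

    class CircuitBreaker:
        """Simplified state machine: CLOSED -> OPEN -> HALF_OPEN -> CLOSED."""

        def __init__(self, failure_threshold=3, reset_timeout=10.0):
            self.failure_threshold = failure_threshold
            self.reset_timeout = reset_timeout
            self.failures = 0
            self.state = "CLOSED"
            self.opened_at = 0.0

        def call(self, func, *args, **kwargs):
            if self.state == "OPEN":
                if time.time() - self.opened_at < self.reset_timeout:
                    raise RuntimeError("circuit open: failing fast")
                self.state = "HALF_OPEN"      # cooldown over, allow one trial call
            try:
                result = func(*args, **kwargs)
            except Exception:
                self.failures += 1
                if self.failures >= self.failure_threshold or self.state == "HALF_OPEN":
                    self.state = "OPEN"
                    self.opened_at = time.time()
                raise
            self.failures = 0                 # any success resets the breaker
            self.state = "CLOSED"
            return result

    breaker = CircuitBreaker()

    def flaky_service():
        raise ConnectionError("downstream service unavailable")

    for _ in range(5):
        try:
            breaker.call(flaky_service)
        except Exception as exc:
            print(type(exc).__name__, breaker.state)   # trips to OPEN, then fails fast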

Bulkhead Pattern

Isolate resources for different parts of the system. Failure in one area doesn't drain resources from others.
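
A small bulkhead sketch; the pool names and sizes are arbitrary. Each downstream dependency gets its own bounded pool of permits, so a slow or failing dependency can exhaust only its own pool while calls to other dependencies keep flowing.

    import threading

    # One bounded pool of permits per dependency - the "bulkheads".
    pools = {
        "payments": threading.BoundedSemaphore(5),
        "search": threading.BoundedSemaphore(20),
    }

    def call_with_bulkhead(dependency, func):
        sem = pools[dependency]
        # Non-blocking acquire: if this dependency's pool is exhausted,
        # reject immediately rather than queueing and dragging everything down.
        if not sem.acquire(blocking=False):
            raise RuntimeError(f"{dependency} bulkhead full, request rejected")
        try:
            return func()
        finally:
            sem.release()

    # Even if "payments" is saturated and rejecting work, calls routed to
    # "search" still have their own permits and continue to succeed.
    print(call_with_bulkhead("search", lambda: "ok"))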

Network Partitions

When network failures split the system into isolated groups that can't communicate:

Split-Brain Problem

Nodes in different partitions each believe they hold the leader role and make conflicting decisions. Solved using quorum-based approaches.

Quorum

Require agreement from a majority of nodes before making decisions. Prevents split-brain because at most one partition can contain a majority.
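
A quick numeric sketch of why majorities prevent split-brain: with N nodes the quorum size is N // 2 + 1, and two disjoint partitions can never both reach it.

    def quorum(n):
        """Smallest number of nodes that forms a majority of n."""
        return n // 2 + 1

    N = 5
    print(quorum(N))             # 3

    # A partition splits the 5 nodes into groups of 3 and 2.
    side_a, side_b = 3, 2
    print(side_a >= quorum(N))   # True  - this side may elect a leader and commit
    print(side_b >= quorum(N))   # False - this side must stop accepting writes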

Real-World Examples

Google Search - Global Distribution

Google operates massive distributed systems across data centers worldwide:

  • Search queries routed to nearest data center for low latency
  • Index replicated across thousands of servers
  • MapReduce processes petabytes of data in parallel
  • Bigtable provides distributed structured storage (strongly consistent within a cluster, eventually consistent across replicated clusters)
  • Spanner offers global strong consistency using atomic clocks

WhatsApp - Message Delivery

WhatsApp handles billions of messages across distributed servers:

  • Messages routed through multiple servers for reliability
  • Eventual consistency - messages sync across devices over time
  • Queue-based architecture handles message delivery
  • End-to-end encryption maintained across distributed infrastructure
  • Presence information (online/offline) uses gossip protocols

Dropbox - File Synchronization

Dropbox synchronizes files across multiple devices and cloud storage:

  • Files chunked and stored across distributed storage nodes
  • Metadata service tracks file versions and locations
  • Conflict resolution when same file edited on different devices
  • Delta sync - only changed parts transmitted
  • Eventual consistency model with conflict notification

Netflix - Global Streaming

Netflix serves content to millions from distributed infrastructure:

  • Content replicated to regional CDN servers worldwide
  • Microservices architecture with hundreds of services
  • Chaos engineering (Chaos Monkey) tests fault tolerance
  • Dynamic failover between AWS regions
  • Eventual consistency for user preferences and viewing history

Key Challenges

  • ⚠️ Network Latency - Communication between nodes takes time
  • ⚠️ Partial Failures - Some components fail while others work
  • ⚠️ Concurrency - Multiple nodes accessing same data simultaneously
  • ⚠️ Clock Synchronization - Hard to agree on time across nodes
  • ⚠️ Data Consistency - Keeping replicas synchronized
  • ⚠️ Debugging - Difficult to trace issues across distributed components

Design Principles

  • Design for Failure - Assume components will fail and plan for it
  • Embrace Asynchrony - Use message queues and async communication
  • Eventual Consistency - Accept temporary inconsistencies for availability
  • Idempotency - Operations safe to retry without side effects (see the sketch after this list)
  • Loose Coupling - Services independent and communicate via well-defined interfaces
  • Monitoring & Observability - Track system health and behavior
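
The idempotency principle is often implemented with client-supplied idempotency keys. A minimal sketch follows; the key format and the payment example are hypothetical. The first request with a given key performs the work, and any retry with the same key returns the stored result instead of repeating the side effect.

    processed = {}   # idempotency key -> result of the first successful attempt

    def charge_card(amount):
        """Stand-in for a side effect that must not be repeated on retries."""
        return {"charged": amount}

    def handle_payment(idempotency_key, amount):
        if idempotency_key in processed:
            return processed[idempotency_key]    # replay the earlier result
        result = charge_card(amount)
        processed[idempotency_key] = result
        return result

    # A client retry after a timeout resends the same key, so the card is
    # charged exactly once even though the request arrived twice.
    print(handle_payment("order-123-attempt-1", 50))
    print(handle_payment("order-123-attempt-1", 50))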

Common Patterns

  • Leader Election - Select one node to coordinate actions
  • Sharding - Partition data across nodes for scalability (see the consistent hashing sketch after this list)
  • Read Replicas - Scale reads by distributing across copies
  • Write-Ahead Logging - Ensure durability before acknowledging writes
  • Gossip Protocol - Spread information through peer-to-peer communication
  • Saga Pattern - Manage distributed transactions through compensating actions
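
Sharding is commonly implemented with consistent hashing so that adding or removing a node remaps only a small fraction of keys. A minimal sketch follows; the shard names are arbitrary, and real systems add virtual nodes and replication on top of this idea.

    import bisect
    import hashlib

    def ring_position(key):
        """Map a string to a position on the hash ring."""
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class HashRing:
        def __init__(self, nodes):
            self.ring = sorted((ring_position(node), node) for node in nodes)

        def node_for(self, key):
            """Walk clockwise from the key's position to the next node on the ring."""
            positions = [pos for pos, _ in self.ring]
            i = bisect.bisect(positions, ring_position(key)) % len(self.ring)
            return self.ring[i][1]

    ring = HashRing(["shard-1", "shard-2", "shard-3"])
    for key in ("user:42", "user:99", "order:7"):
        print(key, "->", ring.node_for(key))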

Trade-offs

  • ⚖️ Consistency vs Availability - Can't maximize both during partitions
  • ⚖️ Latency vs Consistency - Stronger consistency requires more coordination
  • ⚖️ Complexity vs Performance - Optimizations add operational complexity
  • ⚖️ Cost vs Redundancy - More replicas cost more but improve reliability
  • ⚖️ Flexibility vs Standardization - Microservices freedom vs consistency

Best Practices

  1. Use proven consensus algorithms (Raft, Paxos) for critical coordination
  2. Implement comprehensive monitoring and distributed tracing
  3. Design APIs to be idempotent for safe retries
  4. Use circuit breakers to prevent cascading failures
  5. Implement graceful degradation when dependencies fail
  6. Test partition scenarios and failure modes regularly
  7. Use immutable data structures to avoid coordination
  8. Implement proper backpressure and rate limiting
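
For the last practice, a common building block is the token bucket. A minimal sketch follows; the rate and capacity are arbitrary, and a production limiter would also need to be shared across processes.

    import time

    class TokenBucket:
        """Token-bucket rate limiter: refill at a fixed rate, spend one token per request."""

        def __init__(self, rate_per_sec, capacity):
            self.rate = rate_per_sec
            self.capacity = capacity
            self.tokens = capacity
            self.last_refill = time.monotonic()

        def allow(self):
            now = time.monotonic()
            # Refill in proportion to the time elapsed, capped at capacity.
            elapsed = now - self.last_refill
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False     # shed load or tell the client to back off

    limiter = TokenBucket(rate_per_sec=5, capacity=10)
    allowed = sum(limiter.allow() for _ in range(20))
    print(allowed)           # roughly 10 of 20 burst requests admitted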