Home/System Design/Design a URL Shortener

Design a URL Shortener

🎯 Explain Like I'm 5...

Imagine you have a really long book title that's hard to remember. A URL shortener is like giving that book a short nickname that's easy to remember and share!

What is a URL Shortener?

It turns long web addresses into short, easy-to-share links.

Example:

Long URL: https://www.example.com/products/category/shoes/nike/air-max-2024?color=red&size=10

Short URL: https://short.url/abc123

πŸ”¨ How It Works (Simple Version)

  • 1. User gives us a long URL
  • 2. We create a short code (like 'abc123')
  • 3. We save: short code β†’ long URL in our database
  • 4. When someone clicks the short link, we look up the long URL and redirect them!

🎯 Key Features to Design

  • βœ“URL Shortening: Convert long URLs to short ones
  • βœ“URL Redirection: When clicked, go to original URL
  • βœ“Custom Short URLs: Let users choose their own short code
  • βœ“Analytics: Track how many times link was clicked
  • βœ“Expiration: Links can expire after some time

πŸ“‹ Requirements

Functional Requirements:

  • β€’ Generate a unique short URL for each long URL
  • β€’ Redirect users from short URL to original URL
  • β€’ Support custom short URLs (optional)
  • β€’ Track click analytics
  • β€’ Delete/expire old URLs

Non-Functional Requirements:

  • β€’ High availability (99.9% uptime)
  • β€’ Low latency (<100ms for redirects)
  • β€’ Scalable (millions of URLs)
  • β€’ Secure (prevent abuse)

πŸ“Š Capacity Estimation

Assumptions:

  • β€’ 100 million URLs created per month
  • β€’ 100:1 read to write ratio (100 redirects per 1 creation)
  • β€’ URLs stored for 5 years

Calculations:

  • β€’ Write requests: 100M/month β‰ˆ 40 requests/second
  • β€’ Read requests: 40 * 100 = 4,000 requests/second
  • β€’ Storage: 100M URLs/month * 60 months = 6 billion URLs
  • β€’ If each URL = 500 bytes β†’ 6B * 500B = 3TB storage

πŸ”Œ API Design

API Endpoints
javascript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
// 1. Create Short URL
POST /api/shorten
Request Body:
{
"originalUrl": "https://www.example.com/very/long/url",
"customAlias": "mylink" // optional
}
Response:
{
"shortUrl": "https://short.url/abc123",
"originalUrl": "https://www.example.com/very/long/url",
"createdAt": "2024-01-15T10:30:00Z"
}
// 2. Redirect Short URL
GET /{shortCode}
Response: HTTP 301/302 Redirect to original URL
// 3. Get URL Stats
GET /api/stats/{shortCode}
Response:
{
"shortCode": "abc123",
"originalUrl": "https://www.example.com/very/long/url",
"clicks": 1523,
"createdAt": "2024-01-15T10:30:00Z",
"expiresAt": "2025-01-15T10:30:00Z"
}
// 4. Delete URL
DELETE /api/url/{shortCode}
Response: 204 No Content

πŸ’Ύ Database Schema

Database Schema (SQL)
sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- URLs Table
CREATE TABLE urls (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
short_code VARCHAR(10) UNIQUE NOT NULL,
original_url TEXT NOT NULL,
user_id BIGINT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP,
is_custom BOOLEAN DEFAULT FALSE,
INDEX idx_short_code (short_code),
INDEX idx_user_id (user_id)
);
-- Analytics Table
CREATE TABLE url_analytics (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
short_code VARCHAR(10) NOT NULL,
clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ip_address VARCHAR(45),
user_agent TEXT,
referrer TEXT,
INDEX idx_short_code (short_code),
INDEX idx_clicked_at (clicked_at)
);
-- Users Table (Optional)
CREATE TABLE users (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) UNIQUE,
api_key VARCHAR(64) UNIQUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

URL Shortening Algorithm

Option 1: Hash-Based (MD5/SHA256)

Pros: Fast, deterministic

Cons: Collisions possible, long output needs truncation

Option 2: Base62 Encoding

Convert unique ID to Base62 (a-z, A-Z, 0-9)

Pros: Short, no collisions with auto-increment ID

Cons: Predictable sequence

Option 3: Random String Generation

Pros: Unpredictable, secure

Cons: Need to check for collisions

βœ… Recommended Approach: Base62 Encoding

Base62 Encoding Implementation
java
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
public class Base62Encoder {
// Base62 character set: a-z, A-Z, 0-9 (62 characters)
private static final String BASE62_CHARS =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
/**
* Encode a number to Base62 string
* @param num - The number to encode (database auto-increment ID)
* @return Base62 encoded string
*/
public static String encodeBase62(long num) {
if (num == 0) {
return String.valueOf(BASE62_CHARS.charAt(0));
}
StringBuilder encoded = new StringBuilder();
while (num > 0) {
int remainder = (int)(num % 62);
encoded.insert(0, BASE62_CHARS.charAt(remainder));
num = num / 62;
}
return encoded.toString();
}
/**
* Decode a Base62 string back to number
* @param str - The Base62 string to decode
* @return Decoded number
*/
public static long decodeBase62(String str) {
long decoded = 0;
for (int i = 0; i < str.length(); i++) {
char c = str.charAt(i);
int value = BASE62_CHARS.indexOf(c);
decoded = decoded * 62 + value;
}
return decoded;
}
public static void main(String[] args) {
// Example usage:
long id = 125348L; // Auto-increment database ID
String shortCode = encodeBase62(id);
System.out.println("Short code: " + shortCode); // "w7e"
// Decode back
long decodedId = decodeBase62(shortCode);
System.out.println("Decoded ID: " + decodedId); // 125348
// More examples
System.out.println("ID 1: " + encodeBase62(1L)); // "b"
System.out.println("ID 62: " + encodeBase62(62L)); // "ba"
System.out.println("ID 1000000: " + encodeBase62(1000000L)); // "4c92"
// Capacity calculations:
// 7 characters = 62^7 = 3.5 trillion unique URLs!
// 6 characters = 62^6 = 56 billion unique URLs
// 5 characters = 62^5 = 916 million unique URLs
}
}

πŸ›οΈ High-Level System Design


                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   Clients    β”‚
                    β”‚ (Web/Mobile) β”‚
                    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”
                    β”‚Load Balancer β”‚
                    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
           β”‚               β”‚               β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
    β”‚   App      β”‚  β”‚    App     β”‚  β”‚    App    β”‚
    β”‚ Server 1   β”‚  β”‚  Server 2  β”‚  β”‚ Server 3  β”‚
    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
           β”‚               β”‚               β”‚
           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                β”‚                     β”‚
         β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
         β”‚   Cache     β”‚      β”‚  Analytics  β”‚
         β”‚   (Redis)   β”‚      β”‚   Service   β”‚
         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
                β”‚                     β”‚
         β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
         β”‚  Database   β”‚      β”‚   Message   β”‚
         β”‚   (Master)  β”‚      β”‚    Queue    β”‚
         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜      β”‚   (Kafka)   β”‚
                β”‚             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
         β”‚             β”‚
    β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”   β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”
    β”‚Replica β”‚   β”‚Replica β”‚
    β”‚   1    β”‚   β”‚   2    β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Components:

  • β€’ Load Balancer: Distribute traffic
  • β€’ Application Servers: Handle requests
  • β€’ Database: Store URL mappings
  • β€’ Cache (Redis): Fast lookups for popular URLs
  • β€’ Analytics Service: Track clicks

Data Flow:

Create Short URL:

  • 1. Client β†’ Load Balancer β†’ App Server
  • 2. Generate unique ID (auto-increment)
  • 3. Convert ID to Base62
  • 4. Save to database
  • 5. Return short URL

Redirect Short URL:

  • 1. Client clicks short URL
  • 2. Check cache for short code
  • 3. If miss, query database
  • 4. Cache the result
  • 5. Redirect to original URL (HTTP 301/302)

πŸ” Deep Dive Topics

Caching Strategy

Use Redis/Memcached to cache popular URLs. LRU eviction policy.

Scaling Database

Partition data by hash of short code. Use read replicas for high read traffic.

Rate Limiting

Prevent abuse by limiting requests per user/IP (e.g., 100 URLs/hour).

Security Considerations

  • β€’ Validate URLs before shortening
  • β€’ Prevent malicious redirects
  • β€’ Use CAPTCHA for suspicious activity
  • β€’ Implement expiration for old links

βš–οΈ Trade-offs & Decisions

HTTP 301 vs 302 Redirect:

301: 301 (Permanent): Cached by browser, faster but no analytics

302: 302 (Temporary): Always hits server, slower but full analytics

SQL vs NoSQL:

SQL: SQL: ACID, complex queries, harder to scale

NoSQL: NoSQL: Fast, scalable, but eventual consistency

πŸ€” Follow-up Questions

Q: How would you handle custom short URLs?

A: Check availability, reserve in database, prevent conflicts with auto-generated codes

Q: How to prevent the same long URL from getting multiple short URLs?

A: Hash long URL, check if exists in database before creating new short URL

Q: How to implement analytics?

A: Use message queue (Kafka) to async process click events. Store in time-series database.

Q: How to delete/expire URLs?

A: Add 'expiry_date' field. Run cron job to clean expired URLs. Mark as deleted (soft delete).

🌍 Real-World Examples

  • β€’ bit.ly - Popular URL shortener with analytics
  • β€’ tinyurl.com - Simple, no-frills URL shortening
  • β€’ goo.gl - Google's URL shortener (discontinued)