Learn CDN architecture and edge computing: how a CDN works, edge servers, cache headers, invalidation strategies, static vs. dynamic content, geo routing, and edge computing patterns for reducing latency worldwide.
You have an API server in Singapore. A user in Brazil opens your website and pays about 300 ms of latency for the network round trip alone.
It does not matter how well optimized your code is. You cannot cheat physics.
Light travels through optical fiber at a finite speed, roughly 200,000 km/s. The distance from Singapore to Brazil is about 18,000 km, so a round trip takes at least ~180 ms even on a perfectly straight cable, and 300-400 ms in practice because of routing.
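As a rough check on the physics, the sketch below computes the theoretical round-trip floor for a given distance. It assumes light in fiber travels at about 200,000 km/s (roughly two thirds of its vacuum speed) and that the cable follows the shortest path; real cable routes are longer, so practical floors sit higher. The function name is illustrative.

```javascript
// Assumption: signal speed in optical fiber, in km per second.
const FIBER_SPEED_KM_PER_S = 200000;

// Theoretical minimum round-trip time in milliseconds
// for a given one-way distance in kilometers.
function minRoundTripMs(distanceKm) {
  return (2 * distanceKm * 1000) / FIBER_SPEED_KM_PER_S;
}

console.log(minRoundTripMs(18000)); // Singapore to Brazil: 180
console.log(minRoundTripMs(500));   // a nearby edge location: 5
```

The gap between the two results is the whole argument for a CDN: no code optimization removes the distance, but a nearby edge does.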
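The only solution: move the content closer to the user.
This is exactly why the CDN (Content Delivery Network) exists.
This lesson teaches you how a CDN works, how to control caching and invalidation, and how to use edge computing and geo routing.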
CDN = Distributed cache system deployed globally at edge locations.
But a CDN is more than a simple cache. A CDN is:
1. Geographic distribution
2. Specialized caching
3. Network optimization
4. Security layer
Mental model: CDN = cache layer + network layer + security layer at the edge.
flowchart TB
subgraph Users["Users Worldwide"]
U1[User Vietnam]
U2[User Brazil]
U3[User Europe]
end
subgraph Edge["Edge Locations"]
E1[Edge Singapore]
E2[Edge São Paulo]
E3[Edge Frankfurt]
end
subgraph Regional["Regional POPs"]
R1[Regional Asia]
R2[Regional Americas]
R3[Regional EU]
end
Origin[Origin Server]
U1 --> E1
U2 --> E2
U3 --> E3
E1 -->|Cache Miss| R1
E2 -->|Cache Miss| R2
E3 -->|Cache Miss| R3
R1 -->|Cache Miss| Origin
R2 -->|Cache Miss| Origin
R3 -->|Cache Miss| Origin
Tiered CDN architecture (e.g. Cloudflare, Akamai):
Edge locations: closest to the user.
Regional POPs: aggregation layer.
Origin: your actual server.
Request flow in practice:
User (Vietnam)
→ Edge (Singapore) [HIT - 10ms] ✅
User (Thailand)
→ Edge (Bangkok) [MISS]
→ Regional (Asia) [HIT - 50ms] ✅
User (Myanmar - new location)
→ Edge (Myanmar) [MISS]
→ Regional (Asia) [MISS]
→ Origin (Singapore) [200ms] ⚠️
90%+ of requests are served from the edge or regional tier. The origin handles less than 10%.
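The lookup order in the flow above can be sketched with plain in-memory maps. This is a simplified model, not how any particular CDN is implemented: there is no TTL or eviction, one regional tier is shared by all edges, and `createTieredCache` and `fetchFromOrigin` are hypothetical names.

```javascript
// Builds a lookup function over two cache tiers in front of an origin.
function createTieredCache(fetchFromOrigin) {
  const edges = new Map();    // edge name -> Map(path -> body)
  const regional = new Map(); // path -> body, shared by every edge

  return function lookup(edgeName, path) {
    if (!edges.has(edgeName)) edges.set(edgeName, new Map());
    const edge = edges.get(edgeName);

    // 1. Edge hit: the fastest path.
    if (edge.has(path)) return { body: edge.get(path), servedBy: 'edge' };

    // 2. Edge miss: try the regional tier, then fall back to the origin.
    let servedBy = 'regional';
    if (!regional.has(path)) {
      regional.set(path, fetchFromOrigin(path));
      servedBy = 'origin';
    }

    // 3. Populate the edge on the way back so the next request hits.
    const body = regional.get(path);
    edge.set(path, body);
    return { body, servedBy };
  };
}

let originHits = 0;
const lookup = createTieredCache((path) => {
  originHits += 1;
  return `content of ${path}`;
});

console.log(lookup('singapore', '/logo.png').servedBy); // origin
console.log(lookup('bangkok', '/logo.png').servedBy);   // regional
console.log(lookup('singapore', '/logo.png').servedBy); // edge
console.log(originHits);                                // 1
```

Three requests reach the CDN but only one reaches the origin, which is the effect the tiering is designed to produce.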
CDN caching behavior is controlled by HTTP headers.
Cache-Control is the most important header for controlling caching.
Cache-Control: public, max-age=31536000, immutable
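Directives:
public: any cache, including the CDN, may store the response.
private: only the user's browser may store it; shared caches such as a CDN must not.
max-age=<seconds>: how long the response stays fresh. max-age=3600 = cache for 1 hour.
s-maxage=<seconds>: overrides max-age for shared caches (CDNs) only.
immutable: the content will not change while fresh, so the browser can skip revalidation.
no-cache: the response may be stored, but must be revalidated with the origin before each reuse.
no-store: the response must not be stored anywhere.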
Static assets (CSS/JS with versioning):
Cache-Control: public, max-age=31536000, immutable
style.abc123.css → change code → style.def456.css
Images:
Cache-Control: public, max-age=86400
API responses (cacheable):
Cache-Control: public, max-age=300, s-maxage=600
User-specific data:
Cache-Control: private, max-age=0, must-revalidate
Sensitive data:
Cache-Control: no-store
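One way to apply these recommendations consistently is to keep them in a single lookup table on the server. The sketch below is an assumption about how you might organize it; the category names and `cacheControlFor` are hypothetical, while the header values are the ones listed above.

```javascript
// Hypothetical content categories mapped to the header values above.
const CACHE_POLICIES = new Map([
  ['versioned-asset', 'public, max-age=31536000, immutable'],
  ['image', 'public, max-age=86400'],
  ['public-api', 'public, max-age=300, s-maxage=600'],
  ['user-data', 'private, max-age=0, must-revalidate'],
  ['sensitive', 'no-store'],
]);

// Returns the Cache-Control value for a category.
// Unknown categories fall back to the safest option, no-store.
function cacheControlFor(category) {
  return CACHE_POLICIES.get(category) ?? 'no-store';
}

console.log(cacheControlFor('image'));        // public, max-age=86400
console.log(cacheControlFor('not-a-policy')); // no-store
```

Defaulting to `no-store` means a forgotten category costs performance rather than leaking a private response into a shared cache.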
ETag = a fingerprint of the content.
HTTP/1.1 200 OK
ETag: "abc123def456"
Cache-Control: public, max-age=3600
Revalidation flow:
sequenceDiagram
participant Browser
participant CDN
participant Origin
Note over Browser,Origin: Initial Request
Browser->>CDN: GET /image.jpg
CDN->>Origin: GET /image.jpg
Origin-->>CDN: 200 OK, ETag: "abc123"
CDN-->>Browser: 200 OK, ETag: "abc123"
Note over Browser,Origin: Revalidation (after max-age)
Browser->>CDN: GET /image.jpg<br/>If-None-Match: "abc123"
CDN->>Origin: If-None-Match: "abc123"
Origin-->>CDN: 304 Not Modified
CDN-->>Browser: 304 Not Modified
304 Not Modified = the content has not changed, so the cached copy is reused.
Bandwidth saving: the full content is not transferred again.
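The origin's side of this revalidation can be sketched as a pure function. This is a simplified model: it splits `If-None-Match` on commas (so it would mishandle an ETag that itself contains a comma) and ignores the `W/` weak prefix when comparing, which is how `If-None-Match` is compared. `etagMatches` and `respondConditionally` are illustrative names.

```javascript
// True when the client's If-None-Match header lists the current ETag.
function etagMatches(ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === '*') return true;
  const strip = (tag) => tag.trim().replace(/^W\//, ''); // weak comparison
  return ifNoneMatch.split(',').some((tag) => strip(tag) === strip(currentEtag));
}

// Returns 304 without a body when the client's copy is still current,
// otherwise 200 with the full body.
function respondConditionally(currentEtag, ifNoneMatch, body) {
  if (etagMatches(ifNoneMatch, currentEtag)) {
    return { status: 304, headers: { ETag: currentEtag }, body: null };
  }
  return { status: 200, headers: { ETag: currentEtag }, body };
}

console.log(respondConditionally('"abc123"', null, 'image bytes').status);       // 200
console.log(respondConditionally('"abc123"', '"abc123"', 'image bytes').status); // 304
```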
Static content is content that does not change, or changes rarely.
Examples: images, CSS, and JS files.
CDN strategy:
Cache-Control: public, max-age=31536000, immutable
Cache hit rate: 95%+
Best practices:
Versioned filenames: app.v123.js instead of app.js
Content hashes: bundle.abc123.js
Separate domain: static.example.com (cookie-free)
Dynamic content is personalized or changes frequently.
Examples:
Challenge: every user is different, so is caching ineffective?
Solution: Smart caching strategies.
<!-- Cached HTML template -->
<html>
<body>
<header>...</header>
<!-- Dynamic part loaded via JS -->
<div id="user-content"></div>
<footer>...</footer>
</body>
</html>
The template is cached; the user data is fetched separately.
Vary: Accept-Encoding, User-Agent
Cache-Control: public, max-age=300
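The CDN caches a separate copy for each distinct combination of the headers listed in Vary. Be careful with User-Agent: it has thousands of distinct values and fragments the cache, so normalize it (for example to mobile/desktop) before varying on it.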
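The effect of Vary on the cache can be pictured as extra components in the cache key. The sketch below is a model of the idea rather than any CDN's real key format; it assumes request headers arrive as a plain object with lowercase names, and `cacheKey` is a hypothetical helper.

```javascript
// Builds a cache key from the URL plus the request headers named in Vary.
function cacheKey(url, varyHeader, requestHeaders) {
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort(); // stable order, so "A, B" and "B, A" produce the same key
  const parts = names.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

const gzipKey = cacheKey('/page', 'Accept-Encoding', { 'accept-encoding': 'gzip' });
const brotliKey = cacheKey('/page', 'Accept-Encoding', { 'accept-encoding': 'br' });

console.log(gzipKey);               // /page|accept-encoding=gzip
console.log(gzipKey === brotliKey); // false
```

Every distinct header value creates a new key, which is why a header with many possible values lowers the hit rate.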
Example with Edge Side Includes (ESI):
<html>
<body>
<esi:include src="/user/profile" />
<esi:include src="/recommended-products" />
</body>
</html>
The CDN assembles the page from multiple parts, and each part can have its own cache policy.
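The assembly step can be approximated with a string replacement. This is a toy sketch under strong assumptions: it handles only self-closing `<esi:include src="..." />` tags, takes fragments from an in-memory map instead of fetching them, and drops unknown fragments. `assembleEsi` is a hypothetical name, not a CDN API.

```javascript
// Replaces each <esi:include src="..." /> tag with the fragment for that src.
function assembleEsi(template, fragments) {
  return template.replace(
    /<esi:include\s+src="([^"]+)"\s*\/>/g,
    (tag, src) => (fragments.has(src) ? fragments.get(src) : '')
  );
}

const template =
  '<body><esi:include src="/user/profile" /><esi:include src="/recommended-products" /></body>';

const fragments = new Map([
  ['/user/profile', '<p>Hello, An</p>'],
  ['/recommended-products', '<ul><li>Shoes</li></ul>'],
]);

console.log(assembleEsi(template, fragments));
// <body><p>Hello, An</p><ul><li>Shoes</li></ul></body>
```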
Problem: the content changes at the origin, but the CDN keeps serving stale data.
Explicitly tell the CDN to delete its cached copy.
# Cloudflare purge API
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {api_token}" \
  -H "Content-Type: application/json" \
  -d '{"files":["https://example.com/image.jpg"]}'
When to use:
Drawbacks:
Change the URL when the content changes.
<!-- Old version -->
<script src="/app.js?v=1"></script>
<!-- New version -->
<script src="/app.js?v=2"></script>
Better: Content hash
<script src="/app.abc123.js"></script>
<!-- After change -->
<script src="/app.def456.js"></script>
Advantages: immutable caching and instant updates, with no purge required.
This is the best practice for static assets.
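Build tools normally generate these hashed filenames for you. To show the principle, the sketch below derives the name from the file content using a small FNV-1a hash, chosen only so the example runs without dependencies; real bundlers typically use a cryptographic hash. `hashedFileName` is a hypothetical helper.

```javascript
// 32-bit FNV-1a hash rendered as 8 hex characters (a stand-in for a real content hash).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Inserts the content hash before the file extension: app.js -> app.<hash>.js
function hashedFileName(fileName, content) {
  const dot = fileName.lastIndexOf('.');
  const base = dot === -1 ? fileName : fileName.slice(0, dot);
  const ext = dot === -1 ? '' : fileName.slice(dot);
  return `${base}.${fnv1a(content)}${ext}`;
}

const before = hashedFileName('app.js', 'console.log("v1")');
const after = hashedFileName('app.js', 'console.log("v2")');

console.log(before !== after); // true
```

Because the URL changes only when the bytes change, unchanged files keep their cached copies across deploys.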
Cache-Control: max-age=60, stale-while-revalidate=86400
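Behavior: for the first 60 seconds the cached copy is served as fresh. For up to 86,400 seconds after that, the cache serves the stale copy immediately and refreshes it in the background. Beyond that window it must fetch from the origin before responding.
Trade-off: accept eventual consistency in exchange for performance.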
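The decision a cache makes for a stored response can be sketched as a function of the response's age. This is a simplified reading of the header: it handles only `max-age` and `stale-while-revalidate`, and the state names and function names are illustrative.

```javascript
// Parses "max-age=60, stale-while-revalidate=86400" into { 'max-age': 60, ... }.
function parseDirectives(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    directives[name.toLowerCase()] = value === undefined ? true : Number(value);
  }
  return directives;
}

// Classifies a cached response by its age in seconds.
function cacheState(ageSeconds, cacheControl) {
  const directives = parseDirectives(cacheControl);
  const maxAge = directives['max-age'] ?? 0;
  const staleWindow = directives['stale-while-revalidate'] ?? 0;
  if (ageSeconds < maxAge) return 'fresh';
  if (ageSeconds < maxAge + staleWindow) return 'stale-serve-and-revalidate';
  return 'expired';
}

const header = 'max-age=60, stale-while-revalidate=86400';
console.log(cacheState(30, header));    // fresh
console.log(cacheState(3600, header));  // stale-serve-and-revalidate
console.log(cacheState(90000, header)); // expired
```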
Tag related content with keys.
Surrogate-Key: product-123 category-shoes
Purge by tag:
# Update product 123 → purge all related
# (illustrative; the exact purge-by-key request is provider-specific)
curl -X PURGE "https://example.com/" \
  -H "Surrogate-Key: product-123"
Purge multiple URLs at once without knowing the exact URLs.
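Conceptually, the CDN keeps an index from each key to the URLs tagged with it. The sketch below models that index in memory; it is an illustration of the idea, not a provider's implementation, and `TaggedCache` is a hypothetical class.

```javascript
// In-memory cache that indexes entries by their surrogate keys.
class TaggedCache {
  constructor() {
    this.entries = new Map();   // url -> body
    this.urlsByKey = new Map(); // surrogate key -> Set of urls
  }

  // Stores a response and records each space-separated key from the header.
  set(url, body, surrogateKeyHeader) {
    this.entries.set(url, body);
    for (const key of surrogateKeyHeader.split(/\s+/).filter(Boolean)) {
      if (!this.urlsByKey.has(key)) this.urlsByKey.set(key, new Set());
      this.urlsByKey.get(key).add(url);
    }
  }

  // Removes every entry tagged with the key and returns how many URLs it covered.
  purgeByKey(key) {
    const urls = this.urlsByKey.get(key) ?? new Set();
    for (const url of urls) this.entries.delete(url);
    this.urlsByKey.delete(key);
    return urls.size;
  }
}

const cache = new TaggedCache();
cache.set('/products/123', 'product page', 'product-123 category-shoes');
cache.set('/category/shoes', 'listing page', 'category-shoes product-123');
cache.set('/products/456', 'other product', 'product-456 category-shoes');

console.log(cache.purgeByKey('product-123')); // 2
console.log(cache.entries.has('/products/456')); // true
```

The caller never lists the two affected URLs; the tag alone identifies them.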
CDN evolution: from cache → compute.
Edge computing = running code at edge locations, not just caching.
// Cloudflare Workers example
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Pick a variant at random for this request
  const variant = Math.random() < 0.5 ? 'A' : 'B'
  const response = await fetch(request)
  // Headers cannot be copied with object spread, so clone them explicitly
  const headers = new Headers(response.headers)
  headers.set('X-Variant', variant)
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers
  })
}
Benefit: the backend does not need to handle the A/B logic.
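The Worker above picks a variant at random on every request, so one visitor can see both versions across page loads. A common refinement is to derive the variant from a stable identifier. The sketch below assumes you have some user or session ID available (for example from a cookie); `hashString` and `assignVariant` are hypothetical helpers, and FNV-1a is used only because it is short and dependency-free.

```javascript
// 32-bit FNV-1a hash, used here only to spread IDs evenly across buckets.
function hashString(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministic assignment: the same ID always maps to the same variant.
function assignVariant(userId) {
  return hashString(userId) % 2 === 0 ? 'A' : 'B';
}

console.log(assignVariant('user-42') === assignVariant('user-42')); // true
```

Inside a Worker, the result could replace the `Math.random()` call while the rest of the handler stays the same.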
async function handleRequest(request) {
  const token = request.headers.get('Authorization')
  // Verify the JWT at the edge. verifyToken is a placeholder for your own
  // signature and expiry check; it is not a built-in.
  if (!token || !(await verifyToken(token))) {
    return new Response('Unauthorized', { status: 401 })
  }
  // Pass to origin only if valid
  return fetch(request)
}
Benefit: unauthorized requests are blocked at the edge, protecting the origin.
async function handleRequest(request) {
  const url = new URL(request.url)
  // Route based on path
  const backend = url.pathname.startsWith('/api/v2')
    ? 'https://new-api.example.com' // new backend
    : 'https://old-api.example.com' // old backend
  // Keep the query string, and forward the original method, headers, and body
  return fetch(new Request(backend + url.pathname + url.search, request))
}
This enables blue-green deployments and canary releases at the edge.
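async function handleRequest(request) {
  const response = await fetch(request)
  const data = await response.json()
  // Transform the response at the edge
  const transformed = {
    ...data,
    cached_at: Date.now(),
    // request.cf.colo is the code of the Cloudflare data center serving the request
    edge_location: request.cf?.colo ?? 'unknown'
  }
  return new Response(JSON.stringify(transformed), {
    status: response.status,
    headers: { 'Content-Type': 'application/json' }
  })
}
Benefit: the origin returns raw data, and the edge customizes it for each region.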
async function handleRequest(request) {
  const country = request.cf.country // provided by Cloudflare
  if (country === 'VN') {
    return fetch('https://api.example.com/content/vn')
  } else if (country === 'US') {
    return fetch('https://api.example.com/content/us')
  }
  return fetch('https://api.example.com/content/global')
}
Serve localized content without geo-detection logic at the origin.
Cloudflare Workers
AWS Lambda@Edge
Fastly Compute@Edge
Vercel Edge Functions
1. Limited compute time
2. Limited memory
3. No persistent storage
4. Cold start (depending on platform)
Edge computing suits lightweight transformations, not heavy processing.
Geo routing = routing users to the nearest server based on their location.
Route 53, Cloudflare DNS:
User Vietnam → DNS query
→ Returns Singapore server IP (35.240.x.x)
User US → DNS query
→ Returns US server IP (34.120.x.x)
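Latency-based routing: answer with the region that has the lowest measured network latency to the user.
Geolocation-based routing: answer with the region assigned to the user's country or continent.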
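The difference between the two policies is easiest to see side by side. The sketch below is a simplified model of the decision a DNS service makes, not a real DNS API; the region names, latency figures, and country table are made-up sample data.

```javascript
// Latency-based: pick the region with the lowest measured latency.
function pickByLatency(latenciesMs) {
  let best = null;
  for (const [region, ms] of Object.entries(latenciesMs)) {
    if (best === null || ms < best.ms) best = { region, ms };
  }
  return best === null ? null : best.region;
}

// Geolocation-based: pick the region from a fixed country table.
const REGION_BY_COUNTRY = new Map([
  ['VN', 'singapore'],
  ['BR', 'sao-paulo'],
  ['DE', 'frankfurt'],
]);

function pickByCountry(countryCode, fallback = 'singapore') {
  return REGION_BY_COUNTRY.get(countryCode) ?? fallback;
}

console.log(pickByLatency({ singapore: 35, 'sao-paulo': 310, frankfurt: 180 })); // singapore
console.log(pickByCountry('BR')); // sao-paulo
console.log(pickByCountry('ZZ')); // singapore (fallback)
```

Latency-based routing adapts to network conditions, while geolocation-based routing is predictable, which matters when content or compliance rules differ by country.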
Single IP address, multiple locations.
CDN IP: 104.16.0.0
User Vietnam → reaches Singapore POP
User Brazil → reaches São Paulo POP
The network (through BGP routing) automatically delivers traffic to the nearest POP.
Benefit:
Deploy the application in multiple regions and route intelligently.
flowchart TB
subgraph Users
U1[Users Asia]
U2[Users Americas]
U3[Users Europe]
end
subgraph Regions
R1[Singapore Region]
R2[US Region]
R3[EU Region]
end
DB1[(Database Replica)]
DB2[(Database Replica)]
DB3[(Database Replica)]
U1 --> R1
U2 --> R2
U3 --> R3
R1 --> DB1
R2 --> DB2
R3 --> DB3
DB1 -.->|Replicate| DB2
DB2 -.->|Replicate| DB3
DB3 -.->|Replicate| DB1
Benefits:
Challenges:
A CDN is not always necessary. Here is a decision framework. You need a CDN when you have:
1. Global user base
2. Static content heavy
3. Traffic spikes
4. Performance critical
You can skip a CDN when you have:
1. Local user base only
2. Dynamic content only
3. Low traffic
4. B2B internal tools
A middle ground: use the CDN only for static assets and send dynamic content directly to the origin.
Static (images/CSS/JS) → CDN
API requests → Direct to origin
Simple, cost-effective, significant benefit.
| Provider | Global POPs | Strengths | Pricing |
|---|---|---|---|
| Cloudflare | 300+ | DDoS protection, Workers | Free tier available |
| AWS CloudFront | 450+ | AWS integration | Pay-as-you-go |
| Fastly | 70+ | Real-time purge, VCL | Premium pricing |
| Akamai | 4100+ | Enterprise, most POPs | Expensive |
| Bunny CDN | 100+ | Affordable | $0.01/GB |
Cloudflare = the best choice for most startups (generous free tier, powerful Workers).
AWS CloudFront = best if you already use AWS (tight integration).
Fastly = best for enterprises with real-time needs.
1. Version static assets
<script src="/app.abc123.js"></script>
Immutable caching, instant updates.
2. Use a separate domain for static content
www.example.com (dynamic)
static.example.com (CDN)
Cookie-free domain, better caching.
3. Optimize images
4. Use cache headers properly
Cache-Control: public, max-age=31536000, immutable # Static
Cache-Control: public, max-age=300 # Dynamic
5. Monitor cache hit rate
6. Set up purge automation
// Deploy hook (cdn stands for your CDN provider's API client)
await cdn.purge(['/index.html', '/api/latest'])
7. Test from multiple locations
8. Enable HTTP/2 or HTTP/3
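For item 5 above, the hit rate can be computed from the cache status your CDN reports per request. The sketch below assumes you have already collected those statuses as plain strings such as 'HIT' and 'MISS'; the sample data and `cacheHitRate` are illustrative, and real providers report more states than these two.

```javascript
// Share of requests answered from cache, as a percentage.
function cacheHitRate(statuses) {
  if (statuses.length === 0) return 0;
  const hits = statuses.filter((status) => status === 'HIT').length;
  return (hits / statuses.length) * 100;
}

const sample = ['HIT', 'HIT', 'MISS', 'HIT'];
console.log(cacheHitRate(sample)); // 75
```

A rate that drops after a deploy is often the first sign of a changed header or an over-broad purge.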
1. CDN = distributed cache + network optimization at the edge
Not just caching, but also routing, security, and compute.
2. Tiered architecture: Edge → Regional → Origin
The majority of requests are served from the edge, and the origin stays protected.
3. Cache headers control CDN behavior
Cache-Control, ETag, Vary: understand them and use them correctly.
4. Static content: long TTL + versioned URLs
The best practice for immutable caching.
5. Dynamic content: smart caching strategies
ESI, the Vary header, stale-while-revalidate.
6. Cache invalidation: purge API or cache busting
Versioned URLs > purge API (simpler, instant).
7. Edge computing: lightweight logic close to the user
A/B testing, auth, routing, but not heavy processing.
8. Geo routing: latency-based or geolocation-based
Users are automatically routed to the nearest server.
9. Decision framework: global + static content = you need a CDN
Local + dynamic only = you can probably skip it.
Remember: a CDN is not a magic bullet. It is a trade-off between performance, consistency, and cost. Great architects know when it is needed and when it is not.