What is Caching?
Caching is a mechanism that temporarily stores copies of data in a location that can be accessed quickly. It reduces access to the original data source (database, API, etc.) and shortens response times.
Impact of Caching: If a database query takes 100ms, retrieval from cache can be completed in under 1ms.
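The idea can be shown with the simplest possible cache: an in-memory Map in front of a slow loader. This is an illustrative sketch only (real systems use the layered caches and patterns described below); `cachedFetch` and its loader are hypothetical names, not a library API.

```javascript
// The simplest form of caching: an in-memory Map in front of
// a slow data source.
const store = new Map();

async function cachedFetch(key, loader) {
  if (store.has(key)) return store.get(key);   // fast path: in-memory lookup
  const value = await loader(key);             // slow path: e.g. a 100ms DB query
  store.set(key, value);                       // remember for next time
  return value;
}
```

On the second call with the same key, the loader is never invoked; the value comes straight from memory.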
Cache Layers
flowchart TB
Browser["Browser Cache"] --> CDN["CDN Cache"]
CDN --> App["Application Cache (Redis, etc.)"]
App --> DBCache["Database Cache"]
DBCache --> DB["Database"]
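The layers above behave as an ordered lookup: each request checks the fastest layer first and falls back to the next, ending at the database. A minimal in-memory sketch (the Map-based `layers` array and `loadFromDb` stand in for real browser/CDN/Redis caches and are purely illustrative):

```javascript
// Hypothetical stand-ins for browser / CDN / application caches;
// each "layer" is just a Map here.
const layers = [new Map(), new Map(), new Map()];

// Simulated origin database lookup (slowest layer).
async function loadFromDb(key) {
  return { key, source: 'db' };
}

// Check each layer in order; on a hit, backfill the faster
// layers that missed so the next read is cheaper.
async function layeredGet(key) {
  for (let i = 0; i < layers.length; i++) {
    if (layers[i].has(key)) {
      const value = layers[i].get(key);
      for (let j = 0; j < i; j++) layers[j].set(key, value);
      return value;
    }
  }
  // Total miss: go to the origin and populate every layer.
  const value = await loadFromDb(key);
  for (const layer of layers) layer.set(key, value);
  return value;
}
```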
Caching Patterns
Cache-Aside
The application directly manages the cache and database.
async function getUser(userId) {
  // 1. Check cache
  const cached = await cache.get(`user:${userId}`);
  if (cached) {
    return JSON.parse(cached);
  }

  // 2. Cache miss: fetch from DB
  const user = await db.users.findById(userId);

  // 3. Save to cache (skip if the user does not exist,
  //    so we don't cache "null" for an hour)
  if (user) {
    await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
  }
  return user;
}
Pros: Simple, resilient to cache failures
Cons: Latency on cache miss
Read-Through
The cache itself handles data retrieval.
// Cache library configuration
const cache = new Cache({
  loader: async (key) => {
    // Called automatically on cache miss
    const userId = key.replace('user:', '');
    return await db.users.findById(userId);
  }
});
// Usage (Simple!)
const user = await cache.get(`user:${userId}`);
Write-Through
Updates cache and DB simultaneously on write.
async function updateUser(userId, data) {
  // Update the DB
  const user = await db.users.update(userId, data);

  // Also update the cache
  await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
  return user;
}
Pros: High data consistency
Cons: Increased write latency
Write-Behind
Writes to cache immediately, DB update is done asynchronously.
async function updateUser(userId, data) {
  // Update cache immediately
  await cache.setex(`user:${userId}`, 3600, JSON.stringify(data));

  // Enqueue the DB write
  await writeQueue.add({ userId, data });
  return data;
}

// Background worker drains the queue
writeQueue.process(async (job) => {
  await db.users.update(job.userId, job.data);
});
Pros: Fast writes
Cons: Risk of data loss if the queue fails before the DB write
Cache Invalidation
TTL (Time To Live)
Automatically expires after a certain time.
// Expires after 60 seconds
await cache.setex('key', 60, 'value');
Event-Based Invalidation
Explicitly delete cache when data is updated.
async function updateUser(userId, data) {
  await db.users.update(userId, data);

  // Invalidate related caches
  await cache.del(`user:${userId}`);
  await cache.del(`user:${userId}:profile`);
  await cache.del(`users:list`);
}
Pattern-Based Invalidation
// Delete all user-related caches
const keys = await cache.keys('user:123:*');
if (keys.length > 0) {
  // DEL with zero arguments is an error in Redis, so guard first
  await cache.del(...keys);
}
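Note that KEYS scans the entire keyspace and blocks Redis while it runs, so on large datasets SCAN is the safer choice: it walks the keyspace in small batches via a cursor. A sketch assuming an ioredis-style client passed in as `redis` (the parameter and batch size are illustrative):

```javascript
// Iterate the keyspace with SCAN instead of KEYS so Redis is
// never blocked for long; delete each matching batch as we go.
// `redis` is any client exposing scan/del (e.g. ioredis-style).
async function deleteByPattern(redis, pattern) {
  let cursor = '0';
  do {
    // SCAN returns [nextCursor, batchOfKeys]
    const [next, keys] = await redis.scan(cursor, 'MATCH', pattern, 'COUNT', 100);
    cursor = next;
    if (keys.length > 0) {
      await redis.del(...keys);
    }
  } while (cursor !== '0'); // cursor '0' means the scan is complete
}
```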
Cache Problems and Solutions
Cache Stampede (Thundering Herd)
A problem where many requests simultaneously experience cache misses.
// Solution: Use locks
async function getWithLock(key, loader) {
  const cached = await cache.get(key);
  if (cached) return JSON.parse(cached);

  // Acquire lock
  const lockKey = `lock:${key}`;
  const locked = await cache.set(lockKey, '1', 'NX', 'EX', 10);
  if (!locked) {
    // Another process is loading → wait and retry
    await sleep(100);
    return getWithLock(key, loader);
  }

  try {
    const data = await loader();
    await cache.setex(key, 3600, JSON.stringify(data));
    return data;
  } finally {
    await cache.del(lockKey);
  }
}
Probabilistic Early Recomputation
Probabilistically update cache before TTL expires.
async function getWithProbabilisticRefresh(key, loader, ttl) {
  const data = await cache.get(key);
  const remainingTtl = await cache.ttl(key);

  // If the remaining TTL is running low, probabilistically recompute
  if (data && remainingTtl < ttl * 0.1) {
    if (Math.random() < 0.1) {
      // 10% chance of a background refresh
      loader().then(newData => {
        cache.setex(key, ttl, JSON.stringify(newData));
      });
    }
  }

  if (data) return JSON.parse(data);

  // Cache miss: load synchronously and repopulate
  const newData = await loader();
  await cache.setex(key, ttl, JSON.stringify(newData));
  return newData;
}
Cache Key Design
// Good key design
const key = `user:${userId}:profile:v2`;
// Components:
// - Prefix: Entity type
// - Identifier: Unique ID
// - Sub-resource: Specific data
// - Version: Compatibility for schema changes
TTL Design Guidelines
| Data Type | TTL | Reason |
|---|---|---|
| Static content | 1 day - 1 week | Rarely changes |
| User profile | 1 - 24 hours | Low change frequency |
| Configuration | 5 - 30 minutes | Moderately updated |
| Real-time data | 1 - 5 minutes | Frequently changes |
| Session | 30 min - 24 hours | Balance security and UX |
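The guidelines above can be captured as configuration. One common refinement (an assumption here, not from the table) is to add random jitter to each TTL so that many keys cached at the same moment do not all expire together; the constant names below are illustrative.

```javascript
// Base TTLs in seconds, following the table above
// (each picks a representative value from its range).
const TTL_SECONDS = {
  staticContent: 24 * 60 * 60, // 1 day (range: 1 day - 1 week)
  userProfile: 60 * 60,        // 1 hour (range: 1 - 24 hours)
  configuration: 10 * 60,      // 10 minutes (range: 5 - 30 minutes)
  realtime: 60,                // 1 minute (range: 1 - 5 minutes)
  session: 30 * 60,            // 30 minutes (range: 30 min - 24 hours)
};

// Spread expirations with ±10% random jitter so keys cached
// together do not all expire in the same instant.
function ttlWithJitter(base, jitterRatio = 0.1) {
  const delta = base * jitterRatio;
  return Math.round(base - delta + Math.random() * 2 * delta);
}
```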
Summary
Caching is a fundamental technique for performance optimization. By understanding patterns such as Cache-Aside and Write-Through, and by designing appropriate TTL and invalidation strategies, you can build fast, scalable systems. When implementing, weigh the added complexity of caching against its benefits.