TechTrailCamp

Caching Strategies for Distributed Systems

Figure: Caching in a distributed system — Client → CDN cache (CloudFront) → gateway cache (API Gateway) → app cache (in-memory/local) → distributed cache (Redis/ElastiCache) → database (RDS/DynamoDB). Typical latency by layer: CDN ~5ms, Redis ~1ms, DB ~10-50ms, cold DB ~100ms+. Target cache hit ratio: 90-99%. A $50/mo Redis instance can save $1000s in compute and DB costs.

Caching is often the single most impactful performance optimization in distributed systems. A well-placed cache can reduce latency from 100ms to 1ms, cut database load by 90%, and save thousands in infrastructure costs. But caching done incorrectly leads to stale data, consistency bugs, and mysterious production issues.

This guide covers the core caching patterns, when to use each, and the pitfalls that catch even experienced engineers.

Caching Patterns

1. Cache-Aside (Lazy Loading)

The most common pattern. The application checks the cache first. On a cache miss, it reads from the database, then stores the result in the cache for future requests.

Cache-aside: the application manages the cache explicitly. 1. Check the cache first. 2. On a miss, read from the DB. 3. Store the result in the cache.

  • Pros — only caches data that's actually requested; a cache failure doesn't break the app
  • Cons — the first request for any key is always slow (cache miss)
// Cache-Aside pattern in pseudocode
public User getUser(String userId) {
    // 1. Check cache
    User cached = cache.get("user:" + userId);
    if (cached != null) return cached;  // Cache hit!
    
    // 2. Cache miss: read from DB
    User user = database.findById(userId);
    
    // 3. Store in cache with TTL
    cache.set("user:" + userId, user, Duration.ofMinutes(15));
    
    return user;
}

2. Write-Through

Every write goes to both the cache and the database synchronously. The cache is always up to date, but writes are slower because each one hits two stores.

  • Pros — cache is always consistent with DB, no stale reads
  • Cons — higher write latency, caches data that may never be read
  • Best for — data that's read frequently after being written (e.g., user sessions)
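Following the style of the cache-aside snippet above, a minimal write-through sketch might look like this. The in-memory maps stand in for a real cache client (e.g., Redis) and a database, and the method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Write-through sketch: every write updates the DB and the cache together,
// so subsequent reads never see stale data.
public class WriteThrough {
    static final Map<String, String> cache = new HashMap<>();    // stand-in for Redis
    static final Map<String, String> database = new HashMap<>(); // stand-in for the DB

    public static void saveUser(String userId, String user) {
        database.put(userId, user);        // 1. Persist to the DB first
        cache.put("user:" + userId, user); // 2. Then update the cache
        // If step 2 fails, delete the key rather than leave a stale entry
    }

    public static String getUser(String userId) {
        // Reads hit the cache; after a write-through save it is always fresh
        return cache.get("user:" + userId);
    }
}
```

Writing the DB first means a cache failure can at worst leave a miss, never a value the database doesn't have.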

3. Write-Behind (Write-Back)

Writes go to the cache immediately, and the cache asynchronously flushes to the database. Extremely fast writes, but risk of data loss if the cache fails before flushing.

  • Pros — fastest write performance, batches DB writes
  • Cons — risk of data loss, complex failure handling
  • Best for — high-throughput writes where eventual persistence is acceptable (e.g., analytics counters, page views)
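A write-behind sketch for the counter use case above, with the same hypothetical in-memory stand-ins. A production system would drain the pending queue from a background task; here `flush()` is called explicitly to keep the example deterministic:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Write-behind sketch: writes land in the cache and a pending queue;
// flush() later persists dirty keys to the DB in a batch.
public class WriteBehind {
    static final Map<String, Long> cache = new HashMap<>();    // stand-in for Redis
    static final Map<String, Long> database = new HashMap<>(); // stand-in for the DB
    static final Queue<String> pending = new ArrayDeque<>();   // keys awaiting persistence

    public static void incrementCounter(String key) {
        cache.merge(key, 1L, Long::sum); // fast in-cache write
        pending.add(key);                // mark the key dirty
    }

    public static void flush() {
        // Batched persistence: write the current cached value per dirty key
        while (!pending.isEmpty()) {
            String key = pending.poll();
            database.put(key, cache.get(key));
        }
    }
}
```

The data-loss risk is visible here: anything in `pending` is gone if the cache process dies before `flush()` runs.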

4. Read-Through

The cache itself loads data from the database on a miss. The application only talks to the cache, never directly to the DB. DynamoDB DAX is an example of a read-through cache.
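The pattern can be sketched as a cache that owns its loader, so callers never touch the DB directly. The `Map`-backed store and the loader function below are stand-ins for a real cache and a DB query:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Read-through sketch: the cache itself calls the loader on a miss.
// Callers only ever talk to the cache.
public class ReadThroughCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final Function<K, V> loader; // stand-in for a DB query

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // On a miss, the cache loads and stores the value before returning it
        return store.computeIfAbsent(key, loader);
    }
}
```

Libraries like Caffeine provide this shape as a `LoadingCache`; DAX does the same transparently in front of DynamoDB.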

Cache Invalidation

Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." Here are the practical strategies:

TTL-Based Expiration

Set a Time-To-Live on every cache entry. After the TTL expires, the next request triggers a fresh read from the database. Simple and effective for most use cases.

// TTL strategy
cache.set("product:123", productData, Duration.ofMinutes(5));
// After 5 minutes, this key expires automatically
// Next read will be a cache miss → fresh data from DB

Event-Based Invalidation

When data changes, publish an event that triggers cache invalidation. More complex but provides near-real-time consistency.

// On product update
productRepository.save(product);
eventBus.publish(new ProductUpdatedEvent(product.getId()));

// Cache invalidation listener
@EventListener
void onProductUpdated(ProductUpdatedEvent event) {
    cache.delete("product:" + event.getProductId());
}

Version-Based Invalidation

Include a version number in the cache key. When data changes, increment the version. Old cache entries naturally become orphaned and expire via TTL.
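A minimal sketch of versioned keys, with hypothetical names (`cacheKey`, `bumpVersion`) and an in-memory version table standing in for wherever the current version is tracked:

```java
import java.util.HashMap;
import java.util.Map;

// Version-based invalidation sketch: the current version is part of the
// cache key, so bumping the version orphans old entries (they die by TTL).
public class VersionedKeys {
    static final Map<String, Long> versions = new HashMap<>(); // entity:id -> version

    public static String cacheKey(String entity, String id) {
        long v = versions.getOrDefault(entity + ":" + id, 1L);
        return entity + ":v" + v + ":" + id; // e.g. "product:v1:123"
    }

    public static void bumpVersion(String entity, String id) {
        // Absent means version 1, so the first bump moves to version 2
        versions.compute(entity + ":" + id, (k, v) -> v == null ? 2L : v + 1);
        // Old keys like "product:v1:123" are simply never read again
    }
}
```

No delete calls are needed on update, which makes this attractive when invalidation fan-out is hard (e.g., many derived keys per entity).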

Caching Layers in a Distributed System

  • Browser cache — Cache-Control headers for static assets. Zero latency for repeat visits.
  • CDN cache (CloudFront) — edge locations serve static content and API responses close to users. ~5ms latency.
  • API Gateway cache — cache entire API responses. Ideal for read-heavy, infrequently changing endpoints.
  • Application-level cache — in-process cache (e.g., Caffeine in Java, lru-cache in Node). Sub-millisecond but not shared across instances.
  • Distributed cache (Redis/Memcached) — shared cache across all application instances. ~1ms latency. The workhorse of most architectures.
  • Database query cache — some databases cache query results internally. Useful but don't rely on it exclusively.

Redis vs Memcached on AWS

Redis (ElastiCache):

  • Rich data structures (lists, sets, hashes)
  • Persistence (snapshots + AOF)
  • Pub/Sub messaging
  • Lua scripting, transactions
  • Replication + Cluster mode
  • Best for — most use cases, sessions, leaderboards

Memcached (ElastiCache):

  • Simple key-value only
  • Multi-threaded (uses all cores)
  • No persistence (pure cache)
  • Simpler, slightly lower latency
  • Auto-discovery for scaling
  • Best for — simple caching, large object caching

Redis is the default choice for most teams; Memcached wins for pure, simple caching at scale.

Common Caching Pitfalls

1. Cache Stampede (Thundering Herd)

A popular cache key expires, and hundreds of simultaneous requests all miss the cache and hit the database at once. The DB gets overwhelmed.

Solutions:

  • Mutex/lock — only one request fetches from DB; others wait for the cache to be populated
  • Stale-while-revalidate — serve stale data while one request refreshes in the background
  • Jittered TTL — add random variance to TTLs so keys don't all expire at the same time
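The mutex and jittered-TTL solutions above can be sketched together. This assumes a `ConcurrentHashMap`-backed cache, where `computeIfAbsent` gives per-key atomicity: concurrent callers for the same missing key block while one of them runs the DB load:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

// Single-flight sketch: on a miss, only one caller per key runs the
// expensive load; others wait for its result instead of hitting the DB.
public class SingleFlightCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String get(String key, Function<String, String> dbLoad) {
        // computeIfAbsent is atomic per key, so the DB sees one load per miss
        return cache.computeIfAbsent(key, dbLoad);
    }

    public void invalidate(String key) {
        cache.remove(key);
    }

    // Jittered TTL: base TTL plus up to 20% random variance (the 20% figure
    // is an illustrative choice) so hot keys don't all expire together
    public static long jitteredTtlSeconds(long baseSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(baseSeconds / 5 + 1);
    }
}
```

With Redis instead of an in-process map, the same single-flight effect is usually built with a short-lived lock key (e.g., `SET lock:key NX` with a small TTL).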

2. Cache Penetration

Requests for data that doesn't exist bypass the cache every time (there's nothing to cache). An attacker can exploit this to flood your database.

Solution: Cache negative results too. Store a "null" marker with a short TTL for keys that returned no data.
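A negative-caching sketch, again with in-memory stand-ins. A sentinel value marks "known to be absent"; in Redis you would store the sentinel with a short TTL (e.g., 60 seconds) so real data can appear later:

```java
import java.util.HashMap;
import java.util.Map;

// Negative-caching sketch: DB misses are cached as a sentinel, so repeated
// lookups for nonexistent keys stop reaching the database.
public class NegativeCache {
    private static final String NULL_MARKER = "__NULL__"; // sentinel for "not found"
    private final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static int dbCalls = 0; // instrumentation for the example

    public String get(String key, Map<String, String> database) {
        String cached = cache.get(key);
        if (cached != null) {
            // Hit — which may be a cached negative result
            return NULL_MARKER.equals(cached) ? null : cached;
        }
        dbCalls++;
        String value = database.get(key);
        cache.put(key, value == null ? NULL_MARKER : value); // cache the miss too
        return value;
    }
}
```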

3. Hot Key Problem

A single cache key gets disproportionate traffic (e.g., a viral product page). Even Redis can struggle if one key gets millions of reads per second on a single shard.

Solution: Replicate hot keys across multiple shards with key suffixes, or use a local in-memory cache in front of Redis for hot keys.
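The key-suffix approach can be sketched as follows; the replica count of 8 is an arbitrary illustrative choice. Readers pick a random replica key (spreading load across shards), while writers must update every copy:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hot-key sketch: one logical key fans out to N replica keys so no single
// shard absorbs all the reads.
public class HotKeySpreader {
    static final int REPLICAS = 8; // illustrative; tune to your read volume

    // Readers pick a random replica, distributing traffic over shards
    public static String readKey(String key) {
        int slot = ThreadLocalRandom.current().nextInt(REPLICAS);
        return key + "#" + slot;
    }

    // Writers enumerate every replica key so all copies stay in sync
    public static String[] writeKeys(String key) {
        String[] keys = new String[REPLICAS];
        for (int i = 0; i < REPLICAS; i++) keys[i] = key + "#" + i;
        return keys;
    }
}
```

The trade-off is N-fold write amplification and a brief window where replicas disagree, which is usually acceptable for read-heavy hot keys like a viral product page.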

4. Stale Data in Distributed Caches

In multi-region or multi-instance deployments, cache invalidation doesn't happen instantaneously everywhere. One instance may serve stale data while another serves fresh data.

Solution: Accept eventual consistency with short TTLs, or use pub/sub-based invalidation across all instances.

What to Cache (and What Not to)

Cache these:

  • Database query results that are read frequently
  • Computed/aggregated data that's expensive to calculate
  • Session data and authentication tokens
  • API responses from external services
  • Static configuration that rarely changes

Don't cache these:

  • Data that must be real-time consistent (account balances, inventory counts during checkout)
  • Data that changes on every request
  • Large binary blobs (use a CDN or S3 instead)
  • Write-heavy data with low read rates

The best cache is one you don't notice. It should be invisible to the user, transparent to the developer, and measurable by the operations team. If your caching strategy requires a PhD to understand, simplify it.

Practical Checklist

  1. Measure first — identify your slowest queries and most frequent reads before adding caching
  2. Start with cache-aside + TTL — the simplest pattern that works for 80% of use cases
  3. Use Redis on ElastiCache — unless you have a specific reason for Memcached
  4. Set TTLs on everything — never cache without expiration. Even long TTLs (24h) are better than infinite
  5. Monitor hit rates — target 90%+ cache hit ratio. Below 80% means your caching strategy needs adjustment
  6. Plan for cache failure — your app must work (slowly) without the cache. Never make cache a hard dependency
  7. Add caching incrementally — start with the highest-impact queries, measure the improvement, then expand

Conclusion

Caching is not optional in production distributed systems — it's essential. The key is choosing the right pattern for each use case, setting appropriate TTLs, handling invalidation correctly, and planning for failure. Start simple with cache-aside and Redis, measure your hit rates, and optimize from there.

At TechTrailCamp, caching strategies are a core part of our System Design and AWS tracks. You'll implement real caching layers with Redis and CloudFront through hands-on, 1:1 mentoring.

Want to build high-performance systems?

Join TechTrailCamp's 1:1 training and master caching patterns for distributed systems.

Start Your Learning Journey