Caching
Caching stores copies of data in a location that's faster to access than the original source. When an application needs data, it checks the cache first. If the data exists there (a "cache hit"), it's returned immediately. If not (a "cache miss"), the application fetches from the original source and typically stores a copy in the cache for next time. This simple concept underlies much of what makes modern applications fast.
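The hit and miss flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production cache: the SimpleCache class, its hit and miss counters, and the slow_lookup function are hypothetical names introduced here, and the in-process dictionary stands in for whatever faster storage a real system would use.

```python
class SimpleCache:
    """Check the cache first; on a miss, fetch from the source and keep a copy."""

    def __init__(self, fetch_from_source):
        self._fetch = fetch_from_source  # callable that reads the original source
        self._store = {}                 # the "faster location": a dict in memory
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:           # cache hit: return the stored copy
            self.hits += 1
            return self._store[key]
        self.misses += 1                 # cache miss: go to the original source
        value = self._fetch(key)
        self._store[key] = value         # store a copy for next time
        return value


def slow_lookup(user_id):
    # Hypothetical stand-in for a database query or remote API call.
    return {"id": user_id, "name": f"user-{user_id}"}


cache = SimpleCache(slow_lookup)
cache.get(1)  # first request for this key: a miss, so slow_lookup runs
cache.get(1)  # second request: a hit, so slow_lookup is not called again
print(cache.hits, cache.misses)
```

Note that this sketch never removes or refreshes entries, so it only suits data that does not change while the process runs.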
Why it matters
Speed shapes user experience. Web performance studies have repeatedly found that even small delays increase bounce rates and decrease engagement. A page that loads in one second instead of three can mean the difference between a user completing a task and abandoning it.
Caching matters because it's often the most effective way to improve performance. Database queries that take hundreds of milliseconds can be served from cache in single-digit milliseconds. API calls that require network round-trips can be skipped entirely on a cache hit. The performance gains compound: faster responses mean servers handle more requests, which means better scalability without proportional infrastructure costs.
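To see how much the hit rate matters, a rough model helps: average latency is the hit rate times the cache latency plus the miss rate times the source latency. The sketch below uses that simplified model (it ignores the cost of the failed cache lookup on a miss), and the 5 ms and 200 ms figures are illustrative assumptions, not measurements.

```python
def average_latency_ms(hit_rate, cache_ms, source_ms):
    """Expected latency per request under a simple two-tier model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits are served at cache speed, misses at source speed.
    return hit_rate * cache_ms + (1.0 - hit_rate) * source_ms


# Illustrative numbers only: a 5 ms cache read and a 200 ms database query.
for rate in (0.0, 0.5, 0.9, 0.99):
    latency = average_latency_ms(rate, cache_ms=5, source_ms=200)
    print(f"hit rate {rate:.0%}: {latency:.1f} ms on average")
```

Under these assumptions, most of the benefit arrives only at high hit rates, which is why what gets cached matters as much as whether a cache exists.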
How caching works
Every caching system involves decisions about what to store, where to store it, and when to invalidate it:
What to cache. Data that's read frequently but changes infrequently benefits most from caching. User profiles, product catalogs, configuration settings, and computed results are common candidates. Data that changes constantly or is unique to each request benefits less.
Where to cache. Caches exist at multiple levels - browser caches store assets locally, CDN caches store content at edge locations, application caches store computed data in memory, and database caches store query results. Each level has different characteristics and use cases.
When to invalidate. Cached data can become stale when the source changes. Cache invalidation - knowing when to discard or refresh cached data - is notoriously difficult. Strategies include time-based expiration, event-driven invalidation, and versioning.
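Of the invalidation strategies just listed, versioning is the least obvious, so here is one way it might look. The idea is to make a version number part of every cache key, so bumping the version makes all old entries unreachable at once. The VersionedCache class and the "catalog" namespace are hypothetical names used only for this sketch.

```python
class VersionedCache:
    """Invalidate a whole group of entries by bumping a version in the key."""

    def __init__(self):
        self._store = {}
        self._versions = {}  # namespace -> current version number

    def _key(self, namespace, key):
        # The current version is baked into the key that is actually stored.
        return (namespace, self._versions.get(namespace, 0), key)

    def get(self, namespace, key):
        return self._store.get(self._key(namespace, key))

    def set(self, namespace, key, value):
        self._store[self._key(namespace, key)] = value

    def invalidate_namespace(self, namespace):
        # Entries written under the old version are never looked up again.
        # A real cache would rely on TTLs or a size limit to reclaim them.
        self._versions[namespace] = self._versions.get(namespace, 0) + 1


cache = VersionedCache()
cache.set("catalog", "sku-1", "Blue widget")
cache.invalidate_namespace("catalog")   # e.g. after a bulk catalog import
print(cache.get("catalog", "sku-1"))    # the old entry is no longer visible
```

The same idea appears in browser caching when a build hash is added to an asset's file name: a new version means a new name, so the old copy is simply never requested again.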
Types of caches
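Browser cache. Stores static assets (images, scripts, stylesheets) on the user's device. Controlled via HTTP headers such as Cache-Control and ETag. While a cached resource is still fresh, the browser uses it without making a network request; once it is stale, a conditional request can revalidate it without downloading the body again.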
CDN cache. Content Delivery Networks cache content at edge locations geographically close to users. Reduces latency and offloads traffic from origin servers.
Application cache. In-memory stores like Redis or Memcached hold frequently accessed data. Sits between the application and database, dramatically reducing database load.
Database cache. Databases maintain internal caches for frequently accessed data, such as buffer pools that keep hot pages in memory, and some also cache query results or execution plans. Query optimization often involves making better use of these caches.
Computed result cache. Stores the output of expensive computations. If the same calculation is needed repeatedly, cache the result rather than recomputing.
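For computed results inside a single Python process, the standard library's functools.lru_cache decorator is often enough. In the sketch below, shipping_quote and its pricing formula are made up for illustration; the point is only that a repeated call with the same arguments is answered from the cache.

```python
from functools import lru_cache


@lru_cache(maxsize=128)  # keep up to 128 distinct argument combinations
def shipping_quote(weight_kg: int, distance_km: int) -> float:
    """Hypothetical expensive calculation; the arguments form the cache key."""
    # Stand-in for real work such as a pricing model or a heavy report query.
    return round(2.5 + 0.8 * weight_kg + 0.01 * distance_km, 2)


shipping_quote(10, 300)  # computed on the first call
shipping_quote(10, 300)  # same arguments: served from the cache

info = shipping_quote.cache_info()
print(f"hits={info.hits} misses={info.misses}")
```

This approach assumes the function is pure, meaning the same inputs always give the same output. If the result depends on data that can change, the cached value needs an invalidation strategy like those described in the next section.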
Cache invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." This joke persists because cache invalidation genuinely is difficult.
Time-based expiration (TTL). Data expires after a set duration. Simple to implement but means data can be stale until expiration. Works well when some staleness is acceptable.
Event-driven invalidation. When source data changes, explicitly invalidate or update the cache. More complex but keeps data fresher. Requires knowing all places data might be cached.
Write-through caching. Every write updates the cache and the source in the same operation, so the cache never lags behind the source. Keeps cache consistent but adds latency to writes.
Cache-aside pattern. Application manages cache explicitly - checks cache, fetches from source on miss, writes to cache. Most flexible but requires careful implementation.
The right strategy depends on consistency requirements, change frequency, and system complexity.
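Time-based expiration and event-driven invalidation are often combined: the TTL bounds how stale an entry can get, and explicit invalidation removes entries known to be out of date. The sketch below assumes a single-process, in-memory cache; TTLCache, update_profile, and the dictionary standing in for a database are hypothetical names. The clock is injectable so the behavior can be tested without sleeping.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # expired: treat it as a miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)

    def invalidate(self, key):
        # Event-driven invalidation: drop the entry as soon as the source changes.
        self._store.pop(key, None)


def update_profile(database, cache, user_id, profile):
    """Write to the source, then invalidate so the next read refetches."""
    database[user_id] = profile
    cache.invalidate(user_id)


database = {"u1": {"name": "Ada"}}
cache = TTLCache(ttl_seconds=60)
cache.set("u1", database["u1"])
update_profile(database, cache, "u1", {"name": "Ada L."})
print(cache.get("u1"))  # the stale entry is gone; a caller would refetch
```

One simplification to be aware of: get returns None both for a missing key and for a cached value of None, so a real implementation would need a sentinel or a separate lookup result to tell the two apart.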
Caching trade-offs
Caching involves fundamental trade-offs:
Freshness vs. performance. Longer cache durations mean better performance but potentially staler data. The right balance depends on how much staleness users can tolerate.
Memory vs. hit rate. Larger caches hold more data and generally produce more hits, but memory costs money and the gains diminish once the frequently accessed data already fits. Cache sizing requires balancing hit rate against resource costs.
Complexity vs. consistency. Sophisticated invalidation keeps data fresh but adds system complexity. Simple TTL-based expiration is easier but less precise.
Local vs. distributed. Local caches are faster but don't share state across instances. Distributed caches share state but add network overhead.
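The memory versus hit rate trade-off can be explored with a small simulation. The sketch below assumes least-recently-used (LRU) eviction, which is one common policy among several; LRUCache, hit_rate, and the request sequence are illustrative and not taken from any particular system.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._store = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._store:
            return default
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used


def hit_rate(capacity, requests):
    """Replay a sequence of requested keys and return the fraction of hits."""
    cache = LRUCache(capacity)
    hits = 0
    for key in requests:
        if cache.get(key) is not None:
            hits += 1
        else:
            cache.set(key, True)
    return hits / len(requests)


# Made-up access pattern: keys 1 and 2 are popular, the others are occasional.
requests = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3]
for capacity in (1, 2, 3, 4):
    print(f"capacity {capacity}: hit rate {hit_rate(capacity, requests):.0%}")
```

Replaying real access logs through a simulation like this is one way to estimate how much memory a target hit rate would need before provisioning anything.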
Common caching patterns
Cache-aside. Application checks cache first, fetches from database on miss, writes result to cache. Most common pattern for application-level caching.
Read-through. Cache itself fetches from source on miss. Simplifies application code but requires cache infrastructure that supports this pattern.
Write-through. Writes go to cache and source together. Ensures cache consistency but adds write latency.
Write-behind. Writes go to cache immediately, then asynchronously to source. Fast writes but risks data loss if cache fails before the asynchronous write to source completes.
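The two write patterns differ mainly in when the source sees the write. The sketch below contrasts them under simplifying assumptions: a plain dictionary stands in for the database, and the write-behind queue is drained by an explicit flush call rather than by a background worker. The class names are hypothetical.

```python
from collections import deque


class WriteThroughCache:
    """Each write reaches the source before the call returns."""

    def __init__(self, source):
        self._source = source
        self._cache = {}

    def write(self, key, value):
        self._source[key] = value  # the slow write is part of the request
        self._cache[key] = value

    def read(self, key):
        return self._cache.get(key)


class WriteBehindCache:
    """Writes land in the cache first and reach the source later."""

    def __init__(self, source):
        self._source = source
        self._cache = {}
        self._pending = deque()  # writes not yet applied to the source

    def write(self, key, value):
        self._cache[key] = value             # fast path: cache only
        self._pending.append((key, value))   # remembered for a later flush

    def read(self, key):
        return self._cache.get(key)

    def flush(self):
        """Apply queued writes to the source and return how many were written."""
        written = 0
        while self._pending:
            key, value = self._pending.popleft()
            self._source[key] = value
            written += 1
        return written


database = {}
cache = WriteBehindCache(database)
cache.write("cart:42", ["book"])
print(database)        # still empty: the source has not seen the write yet
cache.flush()
print(database)        # the write has now reached the source
```

Anything still sitting in the pending queue when the process dies never reaches the source, which is the data-loss risk that write-behind accepts in exchange for fast writes.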
Caching in product development
Product managers encounter caching in several contexts:
Performance requirements. When defining acceptable response times, caching strategies affect what's achievable and at what cost.
Data freshness requirements. How stale can displayed data be? Real-time dashboards need different caching than product catalogs.
Consistency trade-offs. Users sometimes see stale data due to caching. Understanding when this matters helps prioritize engineering investment.
Scaling decisions. Caching is often cheaper than adding servers. Understanding caching options informs capacity planning.
Tools like Klero help product teams understand which performance issues actually affect users. When customer feedback highlights slow experiences, that's a signal for where caching investment might have the most impact.

