Faster, Cheaper, Scalable: Architecting High-Performance Azure Apps with Caching
Details
“There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.” — Leon Bambrick
Join us as Microsoft MVP Bryn Lewis shows us how caching is the ultimate “cheat code” for cloud architecture. When implemented correctly, it’s the fastest way to slash your Azure consumption costs, reduce database contention, and keep your application responsive under massive load. But move beyond simple lookups – your deployment model and caching strategy can make or break your app’s reliability. In this session, we’ll move from the browser edge to the distributed core:
- Optimizing the Edge: Leverage RFC-standard HTTP semantics to offload traffic to CDNs and browsers, cutting ingress/egress costs before requests even reach your App Service.
- Saving Compute: See how ASP.NET Core Output Caching rescues your CPU from redundant work, allowing you to scale out less frequently and save on your monthly Azure bill.
- Modern Object Strategies: A deep dive into HybridCache and FusionCache. We’ll compare L1/L2 strategies and master “the dark arts” of stampede protection and cache invalidation to ensure high availability.
- The Power of Azure Cache for Redis: We’ll close by configuring Redis as a distributed L2 cache, ensuring your cloud applications stay fast, synchronized, and resilient across multiple instances.
The code I used to double check my assumptions is available on GitHub. This repo demonstrates various .NET caching strategies (FusionCache, HybridCache, OutputCache, Redis, etc.) against a real Azure SQL Server backend.
Dapper Extensions
All the demo projects use my DapperExtensions project, so every cache benchmark hits the database through the same resilient layer, meaning the retry logic “should never” skew the results.
DIYCache – Rolling your own cache in 80 lines
The cache is a ConcurrentDictionary<string,CacheItem<T>>> registered as a singleton. The GET endpoint uses the cache-aside pattern, return if valid and not expired, otherwise hit the database, and store the result with a configurable TTL, then return. A companion DELETE endpoint evicts a specific entry with a single TryRemove call. The cache has no background eviction, or stampede protection, and the size is “unbounded”
Fusion Cache – Scale with configuration
FusionCache is a read through cache with Fast L1/Shared L2 support which hides the checking, fetching, and storing in three separate steps, with a factory lambda to manage the process. Cache invalidation uses tags, each entry is stamped at write time, and a single RemoveByTagAsync call evicts every matching entry. In the sample project Stack Exchange Redis is opt-in via configuration. Add the required connection string and FusionCache becomes a two-tier cache: fast in-memory L1 backed by distributed Redis L2. Add a backplane connection string and invalidation signals propagate across all running instances. The same code works as a single-process cache in development and a fully distributed one in production with no code changes. Realistically it was what I was hoping HybridCache would be
HTTP Head – RFC 9111 IETF “HTTP Caching”.
The HTTPHead project shows how HTTP’s HEAD method and ETags can eliminate unnecessary data transfers. When a client fetches a neighborhood record via GET, it receives an ETag derived from the Azure SQL Server rowversion (replaces the TimeStamp which has been deprecated) column. On subsequent checks, it sends that ETag to the HEAD endpoint, which queries only the version column and returns 304 Not Modified or 200 OK no payload needed. The PUT endpoint uses optimistic concurrency, rejecting updates where the ETag no longer matches. This ensures clients only download data that has actually changed.
Hybrid Cache – If only
HybridCache is a two-tier cache that sits in front of both an in-process L1 cache and an optional Redis L2 cache behind a single GetOrCreateAsync call. In the sample code NeighborHood lookups are cached with a 5-minute in-memory expiry and a 30-minute Stack Exchange Redis expiry, so repeated requests within the same process never leave the machine, while distributed deployments still share a warm cache across instances.
Hybrid Cache Serialization – When less sent to the L2 Cache is more
The HybridCacheSerialization project extends the HybridCache sample by swapping the default JSON serializer for others like Neuecc MessagePack. HybridCache exposes an IHybridCacheSerializer interface, so developers can plug in different serialisers. In the sample the Data Transfer Object(DTO) is decorated with [MessagePackObject] and [Key(n)] attributes to control the binary layout (MessagePack message format is supported by many languages). The payoff is compact, fast binary payloads stored in Stack Exchange Redis instead of verbose JSON. This is worthwhile when cached objects are large, retrieved frequently, or bandwidth between app and cache is a latency/jitter/cost concern.
Object Cache – Barely sufficient
The ObjectCache project is the simplest (non-DIY) option using just IMemoryCache. Neighborhood lookups are wrapped in GetOrCreateAsync: a hit returns the cached object instantly, a miss queries Azure SQL Server and caches the result for 5 minutes. In this example a database miss isn’t just returned as NotFound and forgotten, this is cached too, for 1 minute, so a flood of requests for a non-existent record won’t hammer the database. A DELETE endpoint lets callers evict a specific entry on demand.
Output Cache – Avoiding regeneration, but don’t cross the streams.
The OutputCache project demonstrates ASP.NET Core’s OutputCaching middleware, a response-level cache that stores the fully serialized HTTP responses rather than the underlying objects. Output caching short-circuits the entire endpoint and serves the cached bytes directly. The project has named policies (“short”, “medium”, “neighborhood”) defined at startup and applied to endpoints with .CacheOutput(), inline policies defined inline as a lambda, Stack Exchange Redis can be dropped in
as the backing store with no code changes
MIDDLEWARE ORDER MATTERS- Place AFTER authentication/authorization so user identity and policies respected
Redis Cache – Old school and amazingly Fast
The RedisCache project goes bare-metal, using the Stack Exchange Redis IConnectionMultiplexer directly rather than any .NET caching abstraction. The cache-aside pattern used, check Redis first, then fall back to the database on a miss, then write the result back with a 30-second TTL. This sample uses source-generated JSON serializationvia JsonSerializerContext: serialization and deserialization use pre-compiled code paths rather than runtime reflection, which keeps allocation low and throughput high on the hot path. This also enables Ahead of Time(AoT) compilation support.
ResponseCache – RFC 9110 IETF “HTTP Semantics”
The ResponseCache project covers ASP.NET Core’s older ResponseCaching middleware, which caches responses based on standard HTTP Cache-Control headers rather than any framework-specific API. The endpoint sets Cache-Control: public, max-age=90 directly on the response headers and the middleware handles the rest. ResponseCache has largely been replaced by Output Cache though it matters when managing the caching behaviour of downstream proxies and Content Delivery Networks(CDNs), because the Cache-Control headers it emits are understood by the full HTTP stack.
Response Compression – When less sent to the client is more
The ResponseCompression middleware is server-side complement to caching that reduces payload size rather than request database traffic. The sample supports Gzip (faster,universally supported) and Brotli (better compression ratio, higher CPU cost), with an optionalflag to tune the trade-off between speed and size.
The application/json content-type isn’t compressed by default so it must be added explicitly to the Multipurpose Internet Mail Extension(MIME) type list; EnableForHttps must be opted into deliberately since compressing encrypted responses can expose reflected secrets (the CRIME/BREACH attacks); and Azure App Service containers apply their own platform-level gzip, so enabling this middleware there risks double-compression. Clients must send Accept-Encoding: gzip for compression as it’s not automatic.
The full source is available in the CHCAzureUGC202604 repository alongside the caching demos it supports
