PSO Precaching Deep Dive
The single biggest source of player-visible stutter in shipped UE5 games is shader compilation. UE 5.2+ introduced a runtime PSO precaching system to fix it — but precaching alone isn't enough, and there are several traps that resurrect 100ms+ hitches even on fully-precached builds. This tutorial walks the three-tier mental model (runtime precache, bundled cache, driver cache), the CVars that make each tier work, and the validation workflow for shipping a hitch-free build.
Why PSO compilation stalls a frame
Modern GPUs don't render from raw shader bytecode. They render from Pipeline State Objects (PSOs): pre-compiled, hardware-specific permutations that bake together the vertex shader, pixel shader, render state, vertex format, and target format into a single object the GPU can consume. The first time the engine encounters a new PSO at runtime, the driver has to compile it. That compile happens on the calling thread — which is the render thread. The result is the hitches that have plagued so many UE5 titles.
The cost varies enormously. NVIDIA's parallel shader compilation post on its developer blog reports ray-tracing PSO compile costs of 20 ms to 300 ms per pipeline. Even a "cheap" 20 ms compile in the middle of gameplay is a visibly dropped frame.
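To put those numbers against a frame budget, here is a rough worked calculation. It assumes a fixed 60 Hz display and that the compile stalls the frame for its full duration; the function name is illustrative and not part of any engine API.

```python
def refresh_intervals_spanned(stall_ms: int, refresh_hz: int = 60) -> int:
    """Number of display refresh intervals covered by a stall of `stall_ms`.

    This is ceil(stall_ms / (1000 / refresh_hz)), kept in integer arithmetic
    so exact multiples do not suffer floating-point rounding.
    """
    return (stall_ms * refresh_hz + 999) // 1000


# The two ends of the range quoted above.
for stall_ms in (20, 300):
    intervals = refresh_intervals_spanned(stall_ms)
    print(f"{stall_ms} ms stall spans {intervals} refresh intervals at 60 Hz")
```

Under those assumptions a 20 ms compile spans 2 refresh intervals, so at least one refresh repeats the previous frame, and a 300 ms compile spans 18.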
The architectural fix, and the one that distinguishes UE5 from UE4, is to compile every PSO the player will need before they need it. UE5 has three mechanisms for that, and the rest of this tutorial walks through each in turn.
Three tiers: runtime, bundled, driver
You will need all three for a shipped title. Skipping any one of them is a stutter regression waiting to ship.
1. Runtime PSO precaching (UE 5.2+). When an asset loads, the engine speculatively compiles the PSOs that asset is likely to need based on its component types and rendering state. Set r.PSOPrecaching=1 and r.PSOPrecache.ProxyCreationWhenPSOReady=1. Covers the majority of asset-driven PSOs.
2. The bundled cache (`.spc` / `.upipelinecache`). A pre-recorded list of PSOs your gameplay actually used, captured from a play session, packaged with the build, and compiled at startup or in the menu. Covers the long tail that runtime precaching misses — especially graphics-pipeline globals and content combinations the precache logic doesn't see.
3. The driver cache. NVIDIA, AMD, and Intel drivers each maintain their own per-application cache of compiled PSOs. After the first launch, the driver cache makes subsequent launches faster. This is also the trap: drivers can evict entries from their cache, and "the game was fine yesterday" can become "the game stutters today" without any code change.
In his "Great Hitch Hunt" talk at Unreal Fest 2025, Ari Arnbjörnsson put it this way: "PSO precaching… should always be used, but it's not enough on its own" (per Epic's published transcript). Plan for all three tiers from day one.
Wiring up runtime precaching
The minimum viable runtime precaching configuration:
[/Script/Engine.RendererSettings]
r.PSOPrecaching=1
r.PSOPrecache.Validation=2
r.PSOPrecache.ProxyCreationWhenPSOReady=1
r.PSOPrecache.GlobalShaders=1
r.PSOPrecache.GlobalComputeShaders=1

[DevOptions.Shaders]
NeedsShaderStableKeys=True
What each one does:
- r.PSOPrecaching=1: the master switch.
- r.PSOPrecache.Validation=2: emits stats (stat psoprecache) and Insights bookmarks for tracking. Mandatory if you want to validate coverage.
- r.PSOPrecache.ProxyCreationWhenPSOReady=1: when a scene proxy's PSO isn't yet compiled, hide the proxy or use the default material instead of stalling. Avoids the hitch but introduces a brief visible pop. Most projects accept the pop.
- r.PSOPrecache.GlobalShaders=1 / r.PSOPrecache.GlobalComputeShaders=1: precache global shader permutations. Before 5.5 only global compute shaders were tracked; 5.5+ added graphics globals (per Tom Looman).
- NeedsShaderStableKeys=True in [DevOptions.Shaders]: required for the cooker to emit the .shk files that ShaderPipelineCacheTools Expand reads when building bundled caches.
NeedsShaderStableKeys parsing is fragile
The [DevOptions.Shaders] section is parsed strictly — inline-comment semicolons on the same line as the value will silently break parsing. Set the value on its own line, comments above. Worth checking when "my bundled cache build is producing zero PSOs."
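A check like this is easy to automate. The following is a minimal sketch of a lint pass that flags inline semicolon comments in that section; the function name and the sample text are illustrative, and it only models the rule described above rather than the engine's actual config parser.

```python
def find_inline_comments(ini_text: str, section: str = "DevOptions.Shaders"):
    """Return (line_number, line) for key=value lines in `section`
    that have a ';' somewhere after the '='."""
    hits = []
    current = None
    for lineno, raw in enumerate(ini_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or full-line comment: fine
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new section
            continue
        if current == section and "=" in line:
            _, value = line.split("=", 1)
            if ";" in value:
                hits.append((lineno, line))
    return hits


sample = """\
[DevOptions.Shaders]
; a comment on its own line is fine
NeedsShaderStableKeys=True ; this inline comment is the problem
"""
for lineno, line in find_inline_comments(sample):
    print(f"line {lineno}: inline comment after value: {line}")
```

Running it over the project's engine config before a bundled-cache build gives an early warning instead of an empty cache.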
Capturing a bundled cache from gameplay
The bundled cache is captured from a real play session. The workflow:
- Build a packaged Development or Test build with logging enabled.
- Launch with -logPSO on the command line. Optionally add -clearPSODriverCache to start from a clean driver state.
- Play through your game in a way that exercises every shader-relevant content path: every map, every weapon, every enemy, every UI screen, every particle effect, every cinematic. The capture is only as good as what you exercise.
- Exit. The recorded PSOs land in Saved/CollectedPSOs/*.rec.upipelinecache.
- Run the ShaderPipelineCacheTools commandlet to bundle them into a single .spc:
UnrealEditor-Cmd.exe Project.uproject ^
-run=ShaderPipelineCacheTools ^
Expand <recordings>*.rec.upipelinecache ^
<metadata>*.shk ^
<out>.spc
- Drop the .spc into Build/<Platform>/PipelineCaches/ and recook. The cooker bundles it into the packaged build.
- At runtime, the engine compiles bundled-cache PSOs at startup or in the main menu via r.ShaderPipelineCache.StartupMode 2.
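For build scripts, the Expand step in the workflow above can be assembled programmatically. This is a minimal sketch that only builds the argument list; it assumes the recordings and metadata placeholders are directories, and the helper name and example paths are hypothetical. The wildcard patterns are passed through unchanged, as in the command shown earlier.

```python
def build_expand_command(editor_cmd: str, uproject: str, recordings_dir: str,
                         metadata_dir: str, out_spc: str) -> list:
    """Assemble the ShaderPipelineCacheTools Expand invocation as an argv list."""
    return [
        editor_cmd,
        uproject,
        "-run=ShaderPipelineCacheTools",
        "Expand",
        f"{recordings_dir}/*.rec.upipelinecache",  # recorded PSOs from -logPSO runs
        f"{metadata_dir}/*.shk",                   # stable shader keys from the cook
        out_spc,                                   # bundled cache to write
    ]


# Example paths are placeholders, not real project locations.
cmd = build_expand_command(
    "UnrealEditor-Cmd.exe", "Project.uproject",
    "Saved/CollectedPSOs", "Metadata", "Project.spc",
)
print(" ".join(cmd))
```

Returning a list rather than a single string lets the caller hand it to a process launcher without shell quoting concerns.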
Tom Looman's PSO caching guide and Vermilion's PSO caching in Unreal Engine are both excellent step-by-step references for the full pipeline. They're worth bookmarking.
The relevant CVars for tuning bundled-cache compile bandwidth:
[/Script/Engine.RendererSettings]
r.ShaderPipelineCache.Enabled=1
; Compile in main menu
r.ShaderPipelineCache.StartupMode=2
; ms/frame foreground (default)
r.ShaderPipelineCache.BatchTime=16
; ms/frame background (default)
r.ShaderPipelineCache.BackgroundBatchTime=0
; ms/frame at startup
r.ShaderPipelineCache.PreCompileBatchTime=10
; PSOs per batch
r.ShaderPipelineCache.BatchSize=50
; Only log PSOs runtime precache missed
r.ShaderPipelineCache.ExcludePrecachePSO=1
Validation: stat psoprecache and Insights
You can't trust precaching without validating it. Two tools:
stat psoprecache — a packaged-build stat that prints precached vs. JIT-compiled counts. Not available in editor builds. Requires r.PSOPrecache.Validation 2. The numbers you want to see: zero or near-zero JIT compiles during gameplay.
Unreal Insights PSO track. Capture a trace with -trace=default; in the resulting trace the PSO Insights track shows a bookmark per discovered PSO with its type (graphics/compute) and timing. Any bookmark that fires during gameplay rather than during loading is a precache miss.
Add -clearPSODriverCache to the trace launch so you measure the genuine first-run experience, not "second launch with warm driver cache."
Misses from global shaders are mostly covered in 5.5+ by r.PSOPrecache.GlobalShaders=1 and r.PSOPrecache.GlobalComputeShaders=1; older versions need the bundled cache to fill the gap.
The NVIDIA eviction trap
This is the gotcha that most fully-precached UE5 builds still hit. Epic engineer Tim Stullich documented it in a forum thread: PSOs marked EPSOPrecacheResult::Complete can be silently evicted from the NVIDIA driver cache if they're not used soon enough after compile. When the GPU then encounters that PSO during gameplay, the driver re-compiles it — and the call to RHICreateComputePipelineState takes "100+ milliseconds."
The fix is to keep PSOs alive in CPU memory until first use, so the engine can re-issue them without paying the driver-cache miss cost. This shipped as a new CVar in CL 43263054 (UE 5.6+):
[/Script/Engine.RendererSettings]
; Keep precached PSOs alive in CPU memory until first use.
; Defeats NVIDIA driver-cache eviction. Default is now 2 in 5.6+.
r.PSOPrecache.KeepInMemoryUntilUsed=2
; Soft caps on retained PSOs. Raise on memory-rich platforms.
r.PSOPrecache.KeepInMemoryGraphicsMaxNum=8192
r.PSOPrecache.KeepInMemoryComputeMaxNum=4096
If you're on UE 5.5 or older and seeing first-mission hitches that don't reproduce on subsequent runs, this is almost always the cause. Cherry-pick the CL or upgrade.
Driver cache reset for honest first-run profiling
The most common "but it's smooth on my machine" failure is profiling on a machine where the driver cache is already warm. The first 30 minutes of any new player's experience are with a cold driver cache; that's the case you need to test.
The command-line flag -clearPSODriverCache tells UE to wipe the driver-side PSO cache before the run. Add it to every PSO validation launch:
YourGame.exe -trace=default -clearPSODriverCache
-clearPSODriverCache is broken on UE < 5.6 for NVIDIA
NVIDIA renamed their driver-cache directory at some point and UE's clearing logic didn't follow until 5.6. On older engine versions, the flag silently no-ops on NVIDIA hardware. Workaround: Ari Arnbjörnsson's open-source PSOCacheBuster plugin restores the functionality.
Coverage gotchas (decals, globals, FastVRAM)
Several known holes in PSO precache coverage to be aware of:
- DecalComponent precache coverage was missing pre-5.6. Decals were a frequent source of "first time you place a decal, hitch" stutter. Fixed in 5.6 (per Epic's forum thread on PSO collection).
- Global shader precaching pre-5.5 only covered compute; graphics globals were not tracked. Upgrade or supplement with the bundled cache.
- FastVRAM-bit mismatch noise in 5.4. KTerelst (Epic) confirmed in the forums that FastVRAM bit-21 differences between captured and runtime base-pass shaders log as misses but are spurious on PC — "FastVRAM is a flag on the render target… not really used in the d3d12 PC PSOs" (per this Epic forum thread). Filter these out when triaging miss logs.
- Custom passes that build their own PSOs — render features layered on top of the deferred renderer outside the standard component-creation path can miss precache.
- Ray tracing PSOs are still problematic — per NVIDIA, they're long compiles (20–300 ms each), and runtime precaching for RT pipelines is less mature than for raster.
Open-world combinatorial blowup
Open-world projects have a specific PSO problem: the late binding of streamed content means PSOs aren't necessarily known until you stream the cell that uses them. A community estimate posted to the Epic forums: "around 1 million" PSO permutations are plausible from 20 landscape material layers (per this open-worlds thread).
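One way to reach that order of magnitude, offered as an assumption rather than the forum poster's stated reasoning, is to treat each of the 20 layers as independently present or absent in a given permutation:

```python
layers = 20

# Assumption: every layer is either on or off, and every on/off combination
# counts as a distinct permutation.
permutations = 2 ** layers

print(f"{layers} layers -> {permutations:,} on/off combinations")
```

That gives 1,048,576 combinations, which matches the "around 1 million" figure and shows why each additional layer doubles the worst case.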
Mitigations that shipped open-world UE5 titles use:
- Boot-time spawn rooms. Hidden levels with one of every prop-material combination, loaded at startup so the engine sees and precompiles every needed PSO before gameplay.
- Gameplay-rep capture. Capture a long playthrough of a representative path at QA, then use the bundled cache to ship every PSO that path needed.
- Per-cell precache hints. Custom code that tells the engine which materials are coming when a cell is about to stream in, so the precache pipeline starts before the cell is visible.
- Cap material variety — not every set-dressing prop needs its own master material. Reducing material variety reduces PSO count combinatorially.
For more on cell-streaming behavior, see the World Partition Performance tutorial; the two systems interact in practice.
Validating PSO coverage in CI
The right time to discover that a new material introduced an uncached PSO is in CI, not in QA. The validation pattern:
- Build a packaged Test/Shipping build with r.PSOPrecache.Validation=2.
- Run a scripted scenario in CI (your existing PerfGuard scenarios, or a dedicated PSO-coverage scenario that walks every map).
- Capture an Insights trace with -trace=default -clearPSODriverCache.
- Parse the PSO Insights bookmarks for entries that fire after the loading-screen window closes.
- Fail the build if any PSO compiles outside loading.
- Fail the build if any PSO compiles outside loading.
PerfGuard can drive this exact loop: capture per-scenario, parse stat psoprecache for JIT-compile counts, and gate PRs against a regression baseline. When a PR introduces a material change that breaks coverage, you know which commit to revert — instead of finding out from a Steam review six months later.
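The gating logic itself is small. The sketch below assumes the PSO bookmarks have already been extracted from the trace into (timestamp, label) records and that the loading windows are known; the PsoEvent type, the function names, and the sample data are all hypothetical, and how bookmarks are exported from a trace is outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PsoEvent:
    time_s: float  # seconds since trace start
    name: str      # label of the PSO bookmark


def find_gameplay_compiles(events, loading_windows):
    """Return events that fall outside every (start_s, end_s) loading window."""
    def in_loading(t):
        return any(start <= t <= end for start, end in loading_windows)
    return [e for e in events if not in_loading(e.time_s)]


def gate(events, loading_windows) -> int:
    """Return a process exit code: 0 if coverage is clean, 1 otherwise."""
    misses = find_gameplay_compiles(events, loading_windows)
    for e in misses:
        print(f"PSO compiled during gameplay at {e.time_s:.2f}s: {e.name}")
    return 1 if misses else 0


# Illustrative data: one compile during loading, one during gameplay.
events = [PsoEvent(1.5, "graphics A"), PsoEvent(42.0, "compute B")]
loading = [(0.0, 10.0)]
exit_code = gate(events, loading)
print("exit code:", exit_code)
```

A CI job would pass the returned code to the process exit so that any compile outside a loading window fails the build.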
- Upscaler Tuning — upscaler plugins (DLSS/FSR) introduce additional PSOs; precache coverage is a recurring trap there.
- Gotcha #11: PSO / Shader Compilation Stutter — the field-guide version of this tutorial.
- Diagnosing CPU Regressions — a JIT PSO compile shows up as a render-thread spike; this is how to spot it in Insights.