Virtual Shadow Maps — Performance & Tuning
Virtual Shadow Maps replaced cascaded shadow maps as UE5's default, and they're fast for one specific reason: page caching. Lose the cache and you lose the perf. This tutorial walks the page-table mechanics, the content patterns that invalidate cached pages every frame, the SMRT and page-pool dials that buy you the most ms for the least visual loss, and the debug visualizer workflow for spotting the broken case.
Why VSM exists
UE4 used cascaded shadow maps (CSM): four shadow textures of fixed resolution that each cover a slice of the frustum. Pixels outside any cascade fall back to a distant blurry sun. The cost of CSM is paying for resolution everywhere, even where the camera doesn't see it.
VSM flips the model. Each shadow-casting light gets a single conceptual 16K × 16K virtual shadow texture. Physical pages of that texture are allocated only where the camera actually needs shadow data this frame. The result: high resolution where you look, no cost where you don't.
VSM was authored by Andrew Lauritzen and Ola Olsson at Epic and first shipped at scale in Fortnite Battle Royale Chapter 4 (UE 5.1), per Epic's VSM tech blog. The underlying technique is rooted in the Chalmers paper "Efficient Virtual Shadow Maps for Many Lights" by Olsson, Sintorn et al., but the UE5 implementation is a significant engineering project in its own right, layered on top of Nanite's culling pipeline.
The page table and caching mechanism
The full conceptual 16K × 16K shadow texture per light isn't real — it's virtual address space. The actual storage is a physical page pool with a fixed page count. Default in current UE5: 4096 physical pages per the docs, configurable via:
[/Script/Engine.RendererSettings]
; Physical page pool size. Each page is 128x128 of shadow data.
; Raise to 6144 or 8192 on Nanite-heavy / open-world scenes.
r.Shadow.Virtual.MaxPhysicalPages=4096
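To make the virtual/physical split concrete, here is a minimal Python sketch of a sparse page table. It is a toy model, not the engine's data structure: the class name `ToyPageTable`, the dictionary-backed table, and the first-come allocation policy are all assumptions for illustration. Only the 16K virtual size, the 128×128 page size, and the 4096-page default pool come from the text above.

```python
VIRTUAL_SIZE = 16 * 1024                    # texels per side of the virtual texture
PAGE_SIZE = 128                             # texels per side of one page
PAGES_PER_SIDE = VIRTUAL_SIZE // PAGE_SIZE  # 128 virtual pages per side


class ToyPageTable:
    """Hypothetical sparse mapping from virtual page coords to pool slots."""

    def __init__(self, max_physical_pages=4096):
        self.max_physical_pages = max_physical_pages
        self.table = {}  # (page_x, page_y) -> physical page index

    def request(self, texel_x, texel_y):
        """Return the physical page backing a texel, allocating on first use.

        Returns None when the pool is already full (the overflow case).
        """
        key = (texel_x // PAGE_SIZE, texel_y // PAGE_SIZE)
        if key in self.table:
            return self.table[key]
        if len(self.table) >= self.max_physical_pages:
            return None
        self.table[key] = len(self.table)
        return self.table[key]
```

At its finest level a single virtual texture spans 128 × 128 = 16,384 virtual pages, four times the default pool, which is why only the pages the camera actually needs can be resident.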
The clever part is the cache. A page that's been computed once doesn't need to be recomputed next frame unless something in its volume has changed. r.Shadow.Virtual.Cache=1 is the master toggle (you should never turn it off in production; the page-cache is the entire point of VSM).
An additional optimization, r.Shadow.Virtual.Cache.StaticSeparate=1 (default on), splits cache pages into static and dynamic: a tree that doesn't move keeps its static page warm even when a character walks past it. The cost is paying double the page-pool memory. After 100 frames of no invalidation (r.Shadow.Virtual.Cache.FramesStaticThreshold=100), an object is promoted to the static cache.
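The promotion rule can be pictured with a small counter. This is a sketch under stated assumptions, not engine code: `ToyCachedObject` is a hypothetical name, and treating "after 100 frames" as "100 or more consecutive frames without invalidation" is an interpretation of the threshold described above.

```python
FRAMES_STATIC_THRESHOLD = 100  # mirrors r.Shadow.Virtual.Cache.FramesStaticThreshold


class ToyCachedObject:
    """Hypothetical tracker for how long an object has gone without invalidating."""

    def __init__(self):
        self.frames_since_invalidation = 0

    def tick(self, invalidated_this_frame):
        """Advance one frame; any invalidation resets the counter."""
        if invalidated_this_frame:
            self.frames_since_invalidation = 0
        else:
            self.frames_since_invalidation += 1

    @property
    def is_static(self):
        """True once the object has been quiet for the threshold number of frames."""
        return self.frames_since_invalidation >= FRAMES_STATIC_THRESHOLD
```

In this model a single invalidation resets the counter, so an object that moves even occasionally has to sit still for the full threshold again before it counts as static.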
What invalidates a page
Anything that changes the contents of the volume covered by a page invalidates that page. The complete list:
- Moving a Movable mesh — physics, animation, gameplay-driven actor movement.
- Rotating a directional light — sun motion shifts the clipmap and dirties pages.
- Light source moves — changing position/orientation of any local light.
- Skeletal mesh updates — characters animating in shot.
- World Position Offset in materials — wind on foliage, animated banners, vertex displacement.
- Particle shadows — Niagara emitters that cast shadows continuously dirty pages.
- Visibility changes — an actor toggles hidden/visible.
The Cached Pages debug view (Viewport → View Mode → Virtual Shadow Map → Cached Pages) overlays cached pages in green and invalidated pages in red. Red regions are pages being recomputed every frame. Walk the camera through your level with this view on; anywhere you see persistent red, you have a content problem.
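One way to reason about the red regions is to count which pages a moving caster touches. The sketch below is a simplified 2D model: it assumes the caster's footprint is an axis-aligned rectangle in shadow-map texel space, and that a move dirties the pages under both the old and the new footprint. The function names are hypothetical.

```python
PAGE_SIZE = 128  # texels per side of one page


def pages_overlapping(min_x, min_y, max_x, max_y, page_size=PAGE_SIZE):
    """Return the set of (page_x, page_y) touched by an inclusive texel rect."""
    first_px, last_px = min_x // page_size, max_x // page_size
    first_py, last_py = min_y // page_size, max_y // page_size
    return {
        (px, py)
        for px in range(first_px, last_px + 1)
        for py in range(first_py, last_py + 1)
    }


def pages_dirtied_by_move(old_rect, new_rect):
    """Pages to redraw when a caster moves: the shadow leaves the old
    footprint and appears in the new one, so both sets are dirty."""
    return pages_overlapping(*old_rect) | pages_overlapping(*new_rect)
```

A small caster that straddles a page corner dirties four pages rather than one, which is part of why many small moving objects can cost more than their size suggests.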
SMRT samples — the cheapest knob
VSM produces soft-edged shadows with Shadow Map Ray Tracing (SMRT): each pixel takes multiple samples within a small cone around the light direction and averages them. The default sample count is 8 per ray, for both local and directional lights:
[/Script/Engine.RendererSettings]
r.Shadow.Virtual.SMRT.SamplesPerRayLocal=8
r.Shadow.Virtual.SMRT.SamplesPerRayDirectional=8
AMD's UE Performance Guide recommends dropping these to 4–6 with "nearly imperceptible" visual difference (per GPUOpen). The local samples are typically the bigger win — you tend to have many local lights, only one directional.
Test before/after on your worst-case content. The visual cost is shadow softness/grain on penumbra edges; the perf win is roughly proportional to the sample reduction. Lower this on Low and Medium tiers, leave at default on High and Cinematic.
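For a first estimate before profiling, the "roughly proportional" rule can be written down directly. This is a back-of-envelope sketch that assumes SMRT cost scales linearly with samples per ray; real captures will deviate, and the function name is hypothetical.

```python
def estimate_smrt_cost_ms(baseline_ms, baseline_samples, new_samples):
    """Scale a measured SMRT cost linearly with samples per ray.

    Linear scaling is an assumption; treat the result as a guess to be
    confirmed with a GPU capture, not as a prediction.
    """
    if baseline_samples <= 0 or new_samples <= 0:
        raise ValueError("sample counts must be positive")
    return baseline_ms * new_samples / baseline_samples
```

For example, a pass measured at 2.0 ms with 8 samples would be estimated at 1.0 ms with 4 samples under this model.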
Page pool sizing and overflow diagnosis
Page pool overflow is the failure mode where the engine wants to allocate more pages than the pool holds, and visible shadows go missing or get incorrectly low resolution. The log message reads:
Page allocations were not served, this will produce visual artifacts (missing shadow), increase the page pool limit or reduce resolution bias to avoid.
Diagnose with r.Shadow.Virtual.ShowStats 1, which overlays a real-time HUD of the page allocation count, invalidation count, and pool utilization. If allocation regularly approaches the pool size, raise it.
The trade-off is GPU memory: each physical page is 128×128 of shadow data, ~32 KB at the typical formats. 4096 pages = ~128 MB. Doubling the pool to 8192 = ~256 MB, plus another 2× if StaticSeparate is on.
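The arithmetic above is easy to wrap in a helper when comparing pool sizes. The sketch uses the approximate 32 KB per page figure quoted in this section; the function name is hypothetical and the result is only as accurate as that per-page estimate.

```python
PAGE_KB = 32  # approximate size of one 128x128 page, as quoted in the text


def page_pool_mb(physical_pages, static_separate=False):
    """Approximate page-pool memory in MB for a given physical page count."""
    total_kb = physical_pages * PAGE_KB
    if static_separate:
        total_kb *= 2  # separate static and dynamic copies double the pool
    return total_kb / 1024
```

With these numbers, 6144 pages comes to about 192 MB, and 8192 pages with StaticSeparate on comes to about 512 MB.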
Sizing recommendations from shipping projects:
- Closed-corridor games: 4096 pages is usually sufficient.
- Open-world games: 6144–8192 pages, especially if Nanite-heavy.
- Sun-driven outdoor scenes: bias higher; the directional clipmap chews pages.
- Mobile / low-VRAM platforms: stay at 4096 and reduce page demand instead via r.Shadow.Virtual.ResolutionLodBiasDirectional (more negative = sharper shadows = more pages; less negative = blurrier = fewer pages).
Foliage WPO — the #1 VSM killer
If we wrote one section in this tutorial in red ink, it would be this one. World Position Offset on foliage destroys VSM caching, and the symptom looks like generally-slow shadows rather than a foliage-specific cost. We covered the gotcha pattern in Performance Gotcha #6; this is the deeper version.
Wind WPO is a vertex shader expression. From the engine's perspective, the geometry is moving every frame — and any cached VSM page intersected by that geometry is dirty. On a forest scene with thousands of trees, that's thousands of pages re-rasterized every frame.
The mitigation stack, in order of effectiveness:
1. r.Shadow.Virtual.Cache.MaxMaterialPositionInvalidationRange: default -1 (unlimited). Set to a finite range (try 5000 cm to start). Past this distance, WPO no longer invalidates cache pages. Trees in the distance keep their cache; only nearby foliage costs you.
2. Per-component ShadowCacheInvalidationBehavior=Static: tells the engine that this component's WPO doesn't actually change shadow casting. Use for subtle wind on background trees. Documented in the Fortnite Battle Royale Chapter 4 VSM tech blog as Epic's own shipped fix.
3. Foliage WPO Disable Distance: set on the static mesh asset (Nanite Settings for Nanite meshes; foliage type settings otherwise). Past this distance the engine treats WPO as zero, restoring the cluster to a non-WPO path entirely.
4. r.Shadow.Virtual.NonNanite.IncludeInCoarsePages 0: for non-Nanite foliage, excludes it from the coarse (low-res, far) pages where it doesn't visually contribute much. Epic-recommended in foliage-heavy scenes per the LearningUnreal Epic-derived notes.
5. Convert foliage to Nanite: not always feasible, but Nanite + VSM is the only combination Epic has tuned aggressively. Non-Nanite WPO foliage is the worst case.
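The effect of a finite invalidation range can be sketched as a simple distance filter. This is an illustrative model only: measuring distance from the view, treating the boundary as inclusive, and the function name are all assumptions. The only behavior taken from the text is that -1 means unlimited and that WPO stops invalidating past the range.

```python
def wpo_invalidates_cache(distance_cm, max_range_cm=-1.0):
    """Return True if WPO on an instance at this distance dirties cached pages.

    A negative range means unlimited, matching the documented default of -1.
    """
    if max_range_cm < 0:
        return True
    return distance_cm <= max_range_cm


# Hypothetical tree distances in cm, used to compare the two settings.
tree_distances_cm = [1000, 4000, 5000, 8000, 20000]
invalidating_unlimited = sum(wpo_invalidates_cache(d) for d in tree_distances_cm)
invalidating_ranged = sum(wpo_invalidates_cache(d, 5000) for d in tree_distances_cm)
```

In this toy forest, every tree invalidates with the default setting, while a 5000 cm range leaves only the three nearest trees dirtying pages.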
The Black Myth Wukong shipped configuration is instructive: that team chose to not ship VSM, accepting baked CSM-style shadows instead. A community modder forced VSM on and measured a 3–5 FPS cost at PC settings (per DSOGaming's coverage). VSM is a tool, not a default; if your scene's content makes the cache invalidation rate too high, the right call is to use a different shadow path.
Nanite & cluster culling for shadows
The Nanite shadow path uses HZB-driven cluster culling for shadow rasterization, controlled by r.Shadow.Virtual.UseHZB=2 (default). The two-pass culler dramatically reduces shadow rasterization work compared to the non-Nanite path.
Practical implication: converting non-Nanite content to Nanite is the single largest VSM perf win Epic recommends. A forest of Nanite trees with a cluster culler is much cheaper to shadow than a forest of non-Nanite trees, even when the polygon counts are similar.
Diagnostics for the non-Nanite path:
// Print to screen the worst non-Nanite VSM page hogs.
r.Shadow.Virtual.NonNanite.NumPageAreaDiagSlots -1
This prints which non-Nanite meshes are eating the page budget. Convert those to Nanite first, or apply IncludeInCoarsePages=0 to them.
Light-side controls
Per-light settings that often beat global CVars:
- Cast Shadows = false: many fill lights exist purely to shape the scene artistically and don't need to cast shadows. Each non-shadow-casting light is free of VSM cost. (See Gotcha #4 on movable shadow casters.)
- Shadow Resolution Scale ≤ 1: per-light scalar that biases the resolution VSM allocates. Bring it below 1 for background lights.
- Distance Field Shadows on a per-light basis: for some scenes, switching specific point/spot lights to distance field shadows is cheaper than VSM, especially for lights with stable contributions.
- r.Shadow.Virtual.ResolutionLodBiasDirectional: default -1.5. More negative = sharper directional shadows but more pages allocated. Less negative = blurrier but fewer pages.
- r.Shadow.Virtual.ResolutionLodBiasLocal: same, for local lights.
The debug workflow
The full diagnostic kit in priority order:
1. stat GPU: look for the Shadow Depths bucket and the dedicated VSM sub-line. That's your top-level cost.
2. Cached Pages view (Viewport → View Mode → Virtual Shadow Map → Cached Pages): visualize cache hits vs invalidations.
3. r.Shadow.Virtual.ShowStats 1: on-screen pages/invalidations/pool counters.
4. Worst-case bound test: r.Shadow.Virtual.Cache 0 disables the cache entirely. The new frame time is your "every page reallocated" worst case. Subtract your normal frame time to get cache value.
5. Sun-rotation worst case: r.Shadow.Virtual.Cache.ForceInvalidateDirectional 1 forces directional clipmap recomputation each frame, isolating sun-driven cost.
6. Page-area diagnostic: r.Shadow.Virtual.NonNanite.NumPageAreaDiagSlots -1 identifies which non-Nanite meshes are the biggest page consumers.
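The worst-case bound test reduces to one subtraction, but single frames are noisy, so it helps to average a few captures first. A minimal sketch, assuming you have recorded frame times in milliseconds for both configurations yourself; the function name and the sample values are hypothetical.

```python
from statistics import mean


def cache_value_ms(cache_off_frames_ms, cache_on_frames_ms):
    """Average frame-time saving attributable to the VSM page cache.

    Both arguments are lists of frame times captured on the same camera
    path, once with r.Shadow.Virtual.Cache 0 and once with the cache on.
    """
    return mean(cache_off_frames_ms) - mean(cache_on_frames_ms)


# Made-up capture values for illustration only.
saving = cache_value_ms([24.0, 26.0], [16.0, 18.0])
```

A saving near zero suggests the cache was already being invalidated almost everywhere, which points back to the content checks earlier in this list.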
Console & mobile caveats
VSM has hard requirements:
- Requires D3D12 or Vulkan. DX11 falls back to UE4-style cascaded shadow maps. (Per community shadow notes.)
- Non-Nanite VSM is disabled on most mobile platforms. Mobile typically uses CSM with whatever scalability biases your Mobile.ini sets.
- Apple Metal force-disables Cache.StaticSeparate due to atomic-on-Texture2DArray limitations.
- Switch and other tile-based renderers have specific VSM caveats; many ship without VSM at all.
For mobile or Switch, the right answer is often to author for CSM and accept the lower-quality fallback shadows. Trying to force VSM on a tile-based mobile GPU just produces page overflow and visual corruption.
Shipped-game lessons
Three reference points from shipped UE5 titles:
- Fortnite Battle Royale Chapter 4: the launch vehicle for VSM. Lauritzen and Olsson published a detailed tech blog on the optimization arc, including the ShadowCacheInvalidationBehavior=Static pattern and "Optimized WPO" mode.
- Hellblade II: ships VSM at 30 fps on Series X. Tuned aggressively per the technical review: reduced reflection-and-VSM resolution, tight roughness caps, and content authored to minimize WPO cache invalidation.
- Black Myth Wukong: does not ship VSM in either Quality or Performance modes. Community-modded VSM-on builds measured 3–5 FPS cost. A clear case where the project chose a different shadow path because the content's VSM characteristics were unfavorable.
The lesson: VSM is not a default to enable and forget. It's a system to author around — or to choose not to ship.
Locking the win in CI
VSM regressions are insidious. A designer paints a new wind-WPO foliage type into a level, and your shadow pass cost climbs by 20% — but only when the player walks through that level. Branch-level CI tests on a non-foliage scene won't catch it.
PerfGuard's answer: capture per-stat budgets across each baseline scenario, including the GPU ShadowDepths → VSM bucket and r.Shadow.Virtual.ShowStats-derived invalidated-page counts. When a PR's invalidated-page count jumps, you know exactly which content change dirtied the cache — before it ever reaches a console build.
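The shape of such a budget check is simple enough to sketch. The code below is not PerfGuard's API: the function name, the dictionary format, and the 10% tolerance are all hypothetical, and it assumes you have already extracted one invalidated-page count per scenario from your own captures.

```python
def find_regressions(baseline, candidate, tolerance=0.10):
    """Return scenarios whose invalidated-page count grew beyond tolerance.

    baseline and candidate map scenario name -> invalidated pages per frame.
    Scenarios missing from the candidate run are skipped.
    """
    regressions = {}
    for scenario, base_count in baseline.items():
        new_count = candidate.get(scenario)
        if new_count is None:
            continue
        if new_count > base_count * (1 + tolerance):
            regressions[scenario] = (base_count, new_count)
    return regressions
```

The important design choice is the per-scenario keying: a foliage-heavy level needs its own baseline, otherwise a regression there is averaged away by scenes that never exercise it.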
- Lumen Performance Deep Dive — Lumen and VSM share the same enemies (WPO, dynamic geometry, sun rotation).
- Nanite Performance Deep Dive — the system VSM most relies on for cluster culling.
- Gotcha #6: Foliage WPO Invalidating VSM — the field-guide version of section 6 above.