Advanced ~22 min read UE 5.5 / 5.6

Mobile Performance Deep Dive

Mobile is the worst possible home for "feels fine on my dev kit" optimization. Twenty-two thousand Android device models ship every year, fifty percent of them are low-end, and the wrong scalability tier silently kills frame rate on a quarter of your install base. This tutorial covers the rendering paths that actually run on mobile, the GPU-architecture deltas that decide your perf shape, and the per-device-tier discipline that shipped UE5 mobile titles use to survive.

1

Mobile rendering paths in 5.5/5.6

UE5 mobile supports three meaningfully different render paths. The choice cascades into every other decision in this tutorial:

  • Mobile Forward (default). Single-pass, lowest bandwidth on tile-based GPUs. The shipping default for any project that doesn't have a specific reason otherwise. Limited dynamic-light support, no Lumen, no VSM for non-Nanite.
  • Mobile Deferred. Vulkan/Metal only; needs MobileHDR enabled to allocate the floating-point GBuffer in tile memory. Better support for many lights, more post-processing options, but doubles tile-memory pressure and is gated off OpenGL ES via r.Mobile.AllowDeferredShadingOpenGL=0.
  • Desktop Forward / Deferred on high-end mobile. Vulkan-on-Snapdragon-flagship and iPhone Metal can run the desktop renderer at much higher quality. Used for photorealistic mobile content; expect ~3× the per-frame cost of the mobile path.

The bandwidth-vs-ALU trade is the most important shape difference: tile-based GPUs are bandwidth-bound, not compute-bound, so MobileHDR (which moves the tile buffer from RGBA8 to floating-point) is a much bigger cost than its instruction count suggests.

2

Feature levels & shading paths matrix

The CVar that gates everything is r.Mobile.ShadingPath: 0 is forward (default), 1 is deferred. The deferred path requires Vulkan/Metal AND MobileHDR. Dropping either requirement silently falls back to forward.

DefaultEngine.ini
[/Script/Engine.RendererSettings]
; Default mobile shading path
r.Mobile.ShadingPath=0
; Required for deferred / GTAO / PPR / post-process
r.MobileHDR=True
; Auto-instancing on mobile (must be set in DefaultEngine.ini, ECVF_ReadOnly)
r.Mobile.SupportGPUScene=1
; AA: 0=off 1=FXAA 2=TemporalAA 3=MSAA
r.Mobile.AntiAliasing=3

r.Mobile.SupportGPUScene is read-only. Setting it at runtime silently no-ops — it must live in DefaultEngine.ini. This is one of the most common misconfigurations on mobile-targeting projects (per the CVar wiki).
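For comparison with the forward defaults above, a deferred configuration might look like the following sketch. It uses only the CVars already named in this section. The FXAA choice is an assumption (MSAA is generally tied to the forward path, so verify anti-aliasing behavior on your engine version), and the device still needs Vulkan or Metal for the deferred path to take effect.

```ini
; DefaultEngine.ini (sketch: mobile deferred variant, values illustrative)
[/Script/Engine.RendererSettings]
; 1 = deferred; silently falls back without Vulkan/Metal and MobileHDR
r.Mobile.ShadingPath=1
; Deferred needs the floating-point GBuffer, so MobileHDR stays on
r.MobileHDR=True
; Read-only at runtime, so it still has to be set here
r.Mobile.SupportGPUScene=1
; Assumption: FXAA (1) instead of MSAA (3) on the deferred path
r.Mobile.AntiAliasing=1
```

Because the fallback is silent, it is worth confirming the active shading path on a packaged build rather than trusting the config alone.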

API selection per platform: Vulkan is the modern default on flagship Android devices, including recent Snapdragon/Adreno parts. OpenGL ES is the legacy path for older devices. Metal on iOS/iPadOS is non-negotiable. On Adreno 5xx the engine forces OpenGL ES by default: the Android_Adreno5xx_No_Vulkan profile disables Vulkan to avoid driver crashes.
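A rough sketch of how that API split could be expressed in config. The AndroidRuntimeSettings keys and r.Android.DisableVulkanSupport are engine names that should be checked against your engine version. The profile name Android_Vulkan_Fallback is hypothetical, and the device-matching rules that would route real devices to it are deliberately omitted.

```ini
; DefaultEngine.ini: package both APIs so a GLES fallback exists at all
[/Script/AndroidRuntimeSettings.AndroidRuntimeSettings]
bSupportsVulkan=True
bBuildForES31=True

; AndroidDeviceProfiles.ini: hypothetical profile for SKUs with unstable Vulkan drivers
; (the matching rules that select this profile are not shown)
[Android_Vulkan_Fallback DeviceProfile]
DeviceType=Android
BaseProfileName=Android
; Same mechanism the stock Adreno 5xx profile is described as using
+CVars=r.Android.DisableVulkanSupport=1
```

The same pattern is one way to implement a GLES fallback for any other device family with a known Vulkan driver problem.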

3

Tile-based GPUs: Mali / Adreno / Apple

Every modern mobile GPU is tile-based. The frame is split into screen-aligned tiles (typically 16×16 or 32×32 pixels), each rasterized into on-chip tile memory before being written out to main RAM. The implication: bandwidth dominates. Reducing render-target writes, keeping data in-tile with framebuffer fetch (exposed on both Adreno and Mali) or Pixel Local Storage (an Arm extension on Mali), and avoiding mid-frame full resolves saves more than any ALU optimization.

Vendor specifics:

  • Mali (Arm). 4× MSAA is "close to zero performance penalty" on Mali because the tile buffer natively supports 4 samples per pixel (per Arm's UE mobile blog). Memory residency tends to run higher on Mali than on Adreno for the same scene (forum-confirmed).
  • Adreno (Qualcomm). 4× MSAA is not free here — Adreno tile memory is sized differently. Snapdragon Profiler is the diagnostic tool. Lightspeed/Tencent's Adreno Tile Memory Heap integration in Neverness to Everness (per their GDC 2025 session) is the canonical Adreno deep-dive.
  • Apple. A-series GPUs are bandwidth-strong but power-bound. Xcode's Metal GPU Capture is fast and surfaces tile-memory utilization. iOS thermal throttle is aggressive; design for sustained, not peak, power.

4

Draw call & instancing budgets per tier

Epic's official guideline: under 700 draw calls per view on mobile (per Performance Guidelines for Mobile Devices). Practical tier targets that shipped projects use:

Tier     | Draw calls / view | Tris on screen | Notes
Low-end  | ~250              | 250k           | Snapdragon 6-series, Mali-G52, A11 Bionic and similar
Mid      | ~450              | 500k           | Snapdragon 7-series, Mali-G610, A13–A14
Flagship | ~700              | 1M+            | Snapdragon 8 Gen 2/3, Mali-G715/G720, A16+

Auto-instancing on mobile requires r.Mobile.SupportGPUScene=1. Without it, every draw is its own submission; HISMs and merged static actors are the only path to staying under budget. Flat scenes with many unique materials are worst-case.

The "single emitter beats ten emitters" rule. One Niagara emitter at 1,000 particles is consistently faster than ten emitters at 100, because per-emitter tick and dispatch overhead dominate at low particle counts (per Epic's Niagara optimization tutorial).

5

Texture memory budgets

Mobile uses ASTC (Android/iOS modern) and ETC2 (legacy Android fallback). The streaming pool (r.Streaming.PoolSize) should be sized to 30–40% of the device VRAM budget. Mobile GPUs share system RAM, so "VRAM budget" here means the memory actually available to the app: for a phone with 6 GB total RAM and ~2 GB available to the app, that's roughly a 600–800 MB texture pool.
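The sizing rule can be worked through and turned into per-tier overrides roughly as follows. This is a sketch: the profile names are borrowed from the device-profile excerpt later in this tutorial, the ~1.2 GB lower-tier budget is an assumption made up for illustration, and the pool values are examples, not recommendations.

```ini
; Worked sizing for the example above (2 GB taken as 2048 MB available to the app):
;   30% of 2048 MB = ~614 MB
;   40% of 2048 MB = ~819 MB
; r.Streaming.PoolSize is expressed in MB.

; AndroidDeviceProfiles.ini sketch (values illustrative)
[Android_Adreno7xx DeviceProfile]
; Inside the ~614-819 MB range computed above
+CVars=r.Streaming.PoolSize=700

[Android_Adreno6xx DeviceProfile]
; Assumes ~1200 MB available to the app: 30-40% = 360-480 MB
+CVars=r.Streaming.PoolSize=400
```

Measured residency from memreport on a packaged build should take precedence over these percentages when the two disagree.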

The memreport -full workflow on a packaged build is non-negotiable for finding bloat. Mali devices in particular run memory residency higher than Qualcomm for identical scenes (the forum-confirmed behavior noted in section 3), so budget for the worst case.

Common offenders to audit per platform:

  • UI atlases too large (4K UI textures on a 1080p phone screen).
  • Lightmaps not in mobile-specific groups.
  • Per-character normal maps that should be shared.
  • Unused Anim sequences cooking into the build (audit with obj list class=AnimSequence).

6

Mobile lighting reality — what works

The mobile render paths cannot run Lumen. They cannot run Nanite. They cannot run Virtual Shadow Maps for non-Nanite content. The realistic toolset:

  • One stationary directional light + Cascaded Shadow Maps for the sun. CSM Dynamic Shadow Distance ~4500 cm is a typical Fortnite-mobile reference.
  • Per-component "Receive CSM Shadows" gating. Most static set-dressing should not receive CSM — sample with sky/ambient instead.
  • Stationary point/spot lights with baked direct contribution + dynamic for character only.
  • Sky DFAO and baked GI for indirect lighting where appropriate.
  • Distance-field shadows are desktop-on-mobile only. Don't assume them on the standard mobile path.

For projects upgrading from desktop, this is usually the largest content rework. Skylight + DFAO + SSGI is a workable indirect substitute for Lumen at mobile target quality; baked GI + lightmaps is sturdier still.
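The CSM side of that toolset can be tiered through device profiles. The sketch below assumes that r.Shadow.DistanceScale multiplies the light's Dynamic Shadow Distance and that r.Shadow.CSM.MaxMobileCascades caps cascades on the mobile renderer. Both should be confirmed on your engine version, and the values are illustrative.

```ini
; AndroidDeviceProfiles.ini sketch (values illustrative)
[Android_Adreno6xx DeviceProfile]
; Single cascade for the sun's CSM on the lower tier
+CVars=r.Shadow.CSM.MaxMobileCascades=1
; With a 4500 cm Dynamic Shadow Distance: 4500 * 0.6 = 2700 cm effective
+CVars=r.Shadow.DistanceScale=0.6

[Android_Adreno7xx DeviceProfile]
+CVars=r.Shadow.CSM.MaxMobileCascades=2
; Keeps the full distance authored on the light
+CVars=r.Shadow.DistanceScale=1.0
```

The sg.ShadowQuality buckets can set the same CVars, so explicit overrides like these may interact with the scalability group and are worth checking on device.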

7

Mobile-specific scalability

Lyra-style Mobile Device Profile inheritance is the canonical pattern for the Android long tail. Each device tier inherits from a base mobile profile and overrides specific sg.* values:

Config/Android/AndroidDeviceProfiles.ini (excerpt)
; Base mobile profile
[Mobile DeviceProfile]
DeviceType=Android
BaseProfileName=
+CVars=sg.EffectsQuality=2
+CVars=sg.ShadowQuality=2
+CVars=sg.ResolutionQuality=85

; Lower tier: Adreno 6xx GPU generation (a GPU family, not the Snapdragon 6-series SoC line)
[Android_Adreno6xx DeviceProfile]
BaseProfileName=Mobile
+CVars=sg.EffectsQuality=0
+CVars=sg.ShadowQuality=1
+CVars=sg.ResolutionQuality=70

; Flagship tier: Adreno 7xx GPU generation (e.g. Snapdragon 8 Gen 2)
[Android_Adreno7xx DeviceProfile]
BaseProfileName=Mobile
+CVars=sg.EffectsQuality=3
+CVars=sg.ShadowQuality=3
+CVars=sg.ResolutionQuality=100

r.MobileContentScaleFactor is the global resolution multiplier. Note: it's silently ignored on iOS in some UE5 configurations (per a tracked forum bug) — verify on packaged iOS builds.

Dynamic Resolution complements per-device profiles; on flagship devices set min 70%, max 100% of native; on low-end set min 60%, max 85% to avoid GPU spikes. TSR's compute cost can outweigh the upscale benefit on weaker mobile GPUs — FXAA + lower screen percentage often beats TSR on the lowest tier.
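Those dynamic-resolution bounds could be written into the tier profiles along these lines. This is a sketch under assumptions: the r.DynamicRes.* CVars are engine names, but dynamic-resolution support varies by platform and RHI, and whether r.Mobile.AntiAliasing can be overridden per profile should be verified for your engine version.

```ini
; AndroidDeviceProfiles.ini sketch (values taken from the paragraph above)
[Android_Adreno6xx DeviceProfile]
; 2 = enabled regardless of game user settings
+CVars=r.DynamicRes.OperationMode=2
+CVars=r.DynamicRes.MinScreenPercentage=60
+CVars=r.DynamicRes.MaxScreenPercentage=85
; Assumption: FXAA (1) on the lowest tier instead of TSR
+CVars=r.Mobile.AntiAliasing=1

[Android_Adreno7xx DeviceProfile]
+CVars=r.DynamicRes.OperationMode=2
+CVars=r.DynamicRes.MinScreenPercentage=70
+CVars=r.DynamicRes.MaxScreenPercentage=100
```

These bounds stack with sg.ResolutionQuality and r.MobileContentScaleFactor, so the effective resolution on each tier is best confirmed on a packaged build.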

8

Mobile post-process and effects

Each post-process feature is gated by MobileHDR + a quality CVar:

  • GTAO via r.Mobile.AmbientOcclusionQuality (default 0 = off; 1+ enables; requires MobileHDR + Mobile Ambient Occlusion in Project Settings).
  • Pixel Projected Reflections (PPR) via r.Mobile.PixelProjectedReflectionQuality (requires MobileHDR + Planar Reflection Mode = MobilePPR/MobilePPRExclusive). UE 5.7.2 dropped this Project Settings UI entry — verify per engine version (tracked).
  • Bloom + Tonemap. Both cost tile memory, because each adds floating-point tile-buffer load.
  • Shader complexity targets: 250–300 PS instructions on flagships, 80–120 on low-end (per community measurements).
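The quality CVars in the list above lend themselves to per-tier gating. A minimal sketch, assuming that 0 disables each feature and 1 is the lowest enabled level, and that MobileHDR and the matching Project Settings are already on for the tiers that enable them:

```ini
; AndroidDeviceProfiles.ini sketch (values illustrative)
[Android_Adreno6xx DeviceProfile]
; Lower tier: keep both features off to save tile bandwidth
+CVars=r.Mobile.AmbientOcclusionQuality=0
+CVars=r.Mobile.PixelProjectedReflectionQuality=0

[Android_Adreno7xx DeviceProfile]
; Flagship tier: lowest enabled quality level as a starting point
+CVars=r.Mobile.AmbientOcclusionQuality=1
+CVars=r.Mobile.PixelProjectedReflectionQuality=1
```

Starting flagships at the lowest enabled level and raising it only after a profiler capture keeps the cost of each feature visible.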

Niagara CPU sims work universally on mobile. GPU compute sims are inconsistent on Android — per Epic forums, GPUCompute sims don't display on the Mobile Previewer and behave unevenly across Android drivers. Default to CPU sims with conservative spawn rates on mobile.

UE 5.6 Mali Vulkan crash regression

There's an active issue with random RHIThread crashes inside libGLES_mali.so during descriptor-set updates on Vulkan-Mali devices in UE 5.6 (per Epic forums). Until Epic ships a fix, consider staying on 5.5 for Mali-targeting projects or fall back to OpenGL ES on affected SKUs.

9

Profiling toolchain (Snapdragon, Mali, Xcode)

Insights on a desktop dev kit doesn't tell you what runs on a real phone. The vendor profilers do:

  • Snapdragon Profiler (Adreno). Render-stage trace, perf counters, bandwidth, GPU timing per pass. Required for any Adreno-targeting project.
  • Arm Performance Studio / Mali Graphics Debugger. Tile timeline, FBF (FrameBufferFetch) analysis. Free download from developer.arm.com.
  • Xcode Metal GPU Capture & Frame Capture (iOS). Built into Xcode; tile-memory utilization is one click away.
  • RenderDoc Meta Fork. For Quest and Quest-class XR mobile.
  • Unreal Insights with -tracehost. Connect from desktop to a phone running a Development build.
  • Material Editor's Mali Offline Compiler / Adreno Offline Compiler. Per-material shader cycle count and register pressure, surfaced inside the editor (per Meta's docs).

Shipping checklist for mobile

Before submitting a mobile build, run through this list against a packaged Test/Shipping build on real hardware in each tier:

Pre-ship audit

  1. Render path verified per platform. Forward + Vulkan/Metal on flagship; OpenGL ES forward on low-end Adreno5xx.
  2. r.Mobile.SupportGPUScene=1 set in DefaultEngine.ini (read-only; runtime sets no-op).
  3. Draw calls under 700 per view on flagship target, scaled down per tier.
  4. MobileHDR on/off decision is intentional — not just inherited from a desktop project's defaults.
  5. Lyra-style Device Profile inheritance covering at minimum: low-Adreno, mid-Adreno, flagship-Adreno, low-Mali, mid-Mali, flagship-Mali, A11–A13 Apple, A14+ Apple.
  6. Streaming pool sized 30–40% of device VRAM budget per tier.
  7. Niagara emitters consolidated; CPU sims preferred on Android due to GPUCompute fragility.
  8. Mobile shader complexity under 300 PS instructions on flagship, 120 on low-end.
  9. UE 5.6 Mali Vulkan crash mitigation in place (engine pin or GLES fallback) until Epic ships the fix.
  10. Snapdragon Profiler + Arm Performance Studio captures saved as baseline for the next regression compare.

For continuous regression-tracking across the long tail, PerfGuard baselines per device profile bucket (low-Adreno, mid-Mali, flagship-Vulkan, Apple A-series), so a single CI run flags draw-call ceilings, texture-pool overflows, MobileHDR toggles, and 5.6-style RHI crash signatures before they ship to a quarter of your install base.