Optimize your mobile game performance: Tips on profiling, memory, and code architecture from Unity’s top engineers

June 23, 2021 in Games | 15 min. read

Our Accelerate Solutions team knows the source code inside out and supports a plethora of Unity customers so they can get the most out of the engine. In their work, they dive deep into creator projects to help identify points where performance could be optimized for greater speed, stability, and efficiency. We sat down with this team, made up of Unity’s most senior software engineers, and asked them to share some of their expertise on mobile game optimization.

As our engineers began to share their insight on mobile game optimization, we pretty quickly realized that there was way too much great information for the single blog post we had planned. Instead, we decided to turn their mountain of knowledge into a full-length e-book (which you can download here), as well as a series of blog posts that spotlight some of these 75+ actionable tips.

We kick off the first post in this series by zooming in on how you can improve your game’s performance with profiling, memory, and code architecture. In the next few weeks, we’ll follow up with two more posts: the first covering UI and physics, followed by another on audio and assets, project configuration, and graphics.

Want to check out the complete series now? Download the full e-book for free.

Let’s dig in!

Profiling

What better place to start than profiling and the process of gathering and acting on mobile performance data? This is where optimizing mobile performance truly begins.

Profile early, often, and on the target device

The Unity Profiler provides essential performance information about your application, but it can’t help you if you don’t use it. Profile your project early in development, not just when you are close to shipping. Investigate glitches or spikes as soon as they appear. As you develop a “performance signature” for your project, you’ll be able to spot new issues more easily.

While profiling in the Editor can give you an idea of the relative performance of different systems in your game, profiling on each device gives you the opportunity to gain more accurate insights. Profile a development build on target devices whenever possible. Remember to profile and optimize for both the highest- and lowest-spec devices that you plan to support.

Along with the Unity Profiler, you can use the native profiling tools for iOS and Android (such as Xcode Instruments and the Android Studio profiler) for further performance testing on their respective platforms.

Certain hardware can take advantage of additional profiling tools (e.g., Arm Mobile Studio, Intel VTune, and Snapdragon Profiler). See Profiling Applications Made with Unity for more information.

Focus on optimizing the right areas

Don’t guess or make assumptions about what is slowing down your game’s performance. Use the Unity Profiler and platform-specific tools to locate the precise source of a lag. 

Of course, not every optimization described here will apply to your application. Something that works well in one project may not translate to yours. Identify genuine bottlenecks and concentrate your efforts on what benefits your work.

Understand how the Unity Profiler works

The Unity Profiler can help you detect the causes of any lags or freezes at runtime and better understand what’s happening at a specific frame, or point in time. Enable the CPU and Memory tracks by default. You can monitor supplementary Profiler Modules like Renderer, Audio, and Physics, as needed for your game (e.g., physics-heavy or music-based gameplay).

Use the Unity Profiler to test performance and resource allocation for your application.

Build the application to your device by checking Development Build and Autoconnect Profiler, or connect manually to accelerate app startup time.

Choose the platform target to profile. The Record button tracks several seconds of your application’s playback (300 frames by default). Go to Unity > Preferences > Analysis > Profiler > Frame Count to increase this as far as 2000 if you need longer captures. While this means that the Unity Editor has to do more CPU work and take up more memory, it can be useful depending on your specific scenario.

The Unity Profiler is instrumentation-based: it profiles code timings that are explicitly wrapped in ProfilerMarkers (such as MonoBehaviour’s Start or Update methods, or specific API calls). With the Deep Profiling setting enabled, Unity also profiles the beginning and end of every function call in your script code to tell you exactly which part of your application is causing a slowdown, at the cost of additional profiling overhead.
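
To make the idea of explicitly wrapped timings concrete, here is a minimal sketch of a custom marker using the ProfilerMarker type from the Unity.Profiling namespace. The class name, marker name, and SpawnWave method are hypothetical placeholders, not part of the original article.

```csharp
using Unity.Profiling;
using UnityEngine;

// Hypothetical component used only to illustrate a custom marker.
public class EnemySpawner : MonoBehaviour
{
    // Create the marker once; its name is what shows up in the Profiler window.
    static readonly ProfilerMarker s_SpawnMarker =
        new ProfilerMarker("EnemySpawner.SpawnWave");

    void Update()
    {
        // Auto() begins the sample here and ends it when the using scope exits.
        using (s_SpawnMarker.Auto())
        {
            SpawnWave();
        }
    }

    // Placeholder for the work you want to measure.
    void SpawnWave()
    {
    }
}
```

With a marker like this in place, the sample should appear by name in the Timeline and Hierarchy views without needing Deep Profiling.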

Use the Timeline view to determine if you are CPU-bound or GPU-bound.

When profiling your game, we recommend that you cover both spikes and the cost of an average frame in your game. Understanding and optimizing expensive operations that occur in each frame can be more useful for applications running below the target frame rate. When looking for spikes, explore expensive operations first (e.g., physics, AI, animation) and garbage collection.

Click in the window to analyze a specific frame. Next, use either the Timeline or Hierarchy view for the following: 

  • Timeline shows the visual breakdown of timing for a specific frame. This allows you to visualize how the activities relate to one another and across different threads. Use this option to determine if you are CPU- or GPU-bound.
  • Hierarchy shows the hierarchy of ProfilerMarkers, grouped together. This allows you to sort the samples based on time cost in milliseconds (Time ms and Self ms). You can also count the number of Calls to a function and the managed heap memory (GC Alloc) on the frame.
The Hierarchy view allows you to sort ProfilerMarkers by time cost.

Read a complete overview of the Unity Profiler here. Those new to profiling can also watch this Introduction to Unity Profiling.

Before optimizing anything in your project, save the Profiler .data file. Implement your changes and compare the saved .data before and after the modification. Rely on this cycle to improve performance: profile, optimize, and compare. Then, rinse and repeat.

Use the Profile Analyzer

This tool lets you aggregate multiple frames of Profiler data, then locate frames of interest. Want to see what happens to the Profiler data after you make a change to your project? The Compare view allows you to load and compare two data sets, so you can test changes and verify their effect. The Profile Analyzer is available via Unity’s Package Manager.

Take an even deeper dive into frames and marker data with the Profile Analyzer, which complements the existing Profiler.

Work on a specific time budget per frame 

Each frame will have a time budget based on your target frames per second (fps). Ideally, an application running at 30 fps will allow for approximately 33.33 ms per frame (1000 ms / 30 fps). Likewise, a target of 60 fps leaves approximately 16.67 ms per frame (1000 ms / 60 fps).

Devices can exceed this budget for short periods of time (e.g., for cutscenes or loading sequences), but not for a prolonged duration.

Account for device temperature

For mobile, however, we don’t recommend using this maximum time consistently as the device can overheat and the OS can thermal throttle the CPU and GPU. We recommend that you use only about 65% of the available time to allow for cooldown between frames. A typical frame budget will be approximately 22 ms per frame at 30 fps and 11 ms per frame at 60 fps. 
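
The arithmetic behind these budgets can be sketched in a few lines of plain C#. The FrameBudget class and its method names are hypothetical helpers introduced for illustration; the 0.65 default simply encodes the rule of thumb above.

```csharp
// Hypothetical helper that reproduces the frame budget arithmetic.
public static class FrameBudget
{
    // Full frame time in milliseconds for a given target frame rate.
    public static double FullBudgetMs(int targetFps)
    {
        return 1000.0 / targetFps;
    }

    // Budget after reserving headroom for cooldown between frames.
    // usableFraction defaults to the 65% rule of thumb.
    public static double ThermalBudgetMs(int targetFps, double usableFraction = 0.65)
    {
        return FullBudgetMs(targetFps) * usableFraction;
    }
}
```

For example, FrameBudget.ThermalBudgetMs(30) evaluates to roughly 21.7 ms and FrameBudget.ThermalBudgetMs(60) to roughly 10.8 ms, which round to the 22 ms and 11 ms figures quoted above.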

Most mobile devices do not have active cooling like their desktop counterparts. Physical heat levels can directly impact performance.

If the device is running hot, the Profiler might perceive and report poor performance, even if it is not cause for long-term concern. To avoid overheating while profiling, profile in short bursts. This keeps the device cool and simulates real-world conditions. Our general recommendation is to let the device cool for 10-15 minutes before profiling again.

Determine if you are GPU-bound or CPU-bound

The Profiler can tell you if your CPU is taking longer than your allotted frame budget, or if the culprit is your GPU. It does this by emitting markers prefixed with Gfx as follows:

  • If you see the Gfx.WaitForCommands marker, it means that the render thread is ready, but you might be waiting for a bottleneck on the main thread.
  • If you frequently encounter Gfx.WaitForPresent, it means that the main thread was ready but was waiting for the GPU to present the frame.

Memory

Unity employs automatic memory management for your user-generated code and scripts. Small pieces of data, like value-typed local variables, are allocated to the stack. Larger pieces of data and longer-term storage are allocated to the managed heap.

The garbage collector periodically identifies and deallocates unused heap memory. While this runs automatically, the process of examining all the objects in the heap can cause the game to stutter or run slowly.

Optimizing your memory usage means being conscious of when you allocate and deallocate heap memory, and how you minimize the effect of garbage collection. See Understanding the managed heap for more information.

Capture, inspect, and compare snapshots in the Memory Profiler.

Use the Memory Profiler 

This separate add-on (available as an Experimental or Preview package in the Package Manager) can take a snapshot of your managed heap memory, to help you identify problems like fragmentation and memory leaks.

Click in the Tree Map view to trace a variable to the native object holding onto memory. Here, you can identify common memory consumption issues, like excessively large textures or duplicate assets.

Learn how to leverage the Memory Profiler in Unity for improved memory usage. You can also check out our official Memory Profiler documentation.

Reduce the impact of garbage collection (GC)

Unity uses the Boehm-Demers-Weiser garbage collector, which stops running your program code and only resumes normal execution once its work is complete. 

Be aware of certain unnecessary heap allocations, which could cause GC spikes: 

  • Strings: In C#, strings are reference types, not value types. Reduce unnecessary string creation or manipulation. Avoid parsing string-based data files such as JSON and XML; store data in ScriptableObjects or formats like MessagePack or Protobuf instead. Use the StringBuilder class if you need to build strings at runtime.
  • Unity function calls: Some functions create heap allocations. Cache references to arrays rather than allocating them in the middle of a loop. Also, take advantage of certain functions that avoid generating garbage. For example, use GameObject.CompareTag instead of manually comparing a string with GameObject.tag (as returning a new string creates garbage).
  • Boxing: Avoid passing a value-typed variable in place of a reference-typed variable. Doing so implicitly converts the value type to a type object (e.g., int i = 123; object o = i), which creates a temporary object and the potential garbage that comes with it. Instead, try to provide concrete overloads with the value type you want to pass in. Generics can also be used for these overloads.
  • Coroutines: Though yield does not produce garbage, creating a new WaitForSeconds object does. Cache and reuse the WaitForSeconds object rather than creating it in the yield line.
  • LINQ and Regular Expressions: Both of these generate garbage from behind-the-scenes boxing. Avoid LINQ and Regular Expressions if performance is an issue. Write for loops and use lists as an alternative to creating new arrays.
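
Two of the points above (reusing a WaitForSeconds object and preferring CompareTag) can be illustrated with a short sketch. The class name, the "Player" tag, and the regeneration logic are assumptions made up for this example.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component illustrating two allocation-avoiding habits.
public class HealthRegen : MonoBehaviour
{
    // Allocate the wait object once instead of on every loop iteration.
    static readonly WaitForSeconds s_OneSecond = new WaitForSeconds(1f);

    int _health = 50;

    // Unity runs Start as a coroutine when it returns IEnumerator.
    IEnumerator Start()
    {
        while (_health < 100)
        {
            _health++;
            // Reusing the cached instance avoids a new heap allocation per tick.
            yield return s_OneSecond;
        }
    }

    void OnTriggerEnter(Collider other)
    {
        // CompareTag avoids the string that reading other.tag would allocate.
        if (other.CompareTag("Player"))
        {
            _health = 100;
        }
    }
}
```

Whether these changes matter in a given project is something to confirm in the Hierarchy view by watching the GC Alloc column before and after.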

Time garbage collection if possible

If you are certain that a garbage collection freeze won’t affect a specific point in your game, you can trigger garbage collection with System.GC.Collect.

See Understanding Automatic Memory Management for examples of how to use this to your advantage.
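
As a rough sketch of the idea, the component below requests a collection right after a scene finishes loading, on the assumption that a brief freeze is acceptable at that point in your game. The class name is hypothetical, and the choice of scene load as the safe moment is an assumption rather than a recommendation from the article.

```csharp
using System;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical component: collects garbage when a scene has just loaded.
public class CollectOnSceneLoad : MonoBehaviour
{
    void OnEnable()
    {
        SceneManager.sceneLoaded += OnSceneLoaded;
    }

    void OnDisable()
    {
        SceneManager.sceneLoaded -= OnSceneLoaded;
    }

    void OnSceneLoaded(Scene scene, LoadSceneMode mode)
    {
        // Assumed to be a moment where a pause goes unnoticed by the player.
        GC.Collect();
    }
}
```

For this to fire on every scene change, the GameObject holding the component would need to survive scene loads (for example via DontDestroyOnLoad).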

Use the incremental garbage collector to split the GC workload 

Rather than creating a single, long interruption during your program’s execution, incremental garbage collection uses multiple, much shorter interruptions that distribute the workload over many frames. If garbage collection is impacting performance, try enabling this option to see if it can reduce the problem of GC spikes. Use the Profile Analyzer to verify its benefit to your application.

Use the Incremental Garbage Collector to reduce GC spikes.

Programming and code architecture

The Unity PlayerLoop contains functions for interacting with the core of the game engine. This structure includes a number of systems that handle initialization and per-frame updates. All of your scripts will rely on this PlayerLoop to create gameplay.

When profiling, you’ll see your project’s user code under the PlayerLoop (with Editor components under the EditorLoop).

The Profiler will show your custom scripts, settings, and graphics in the context of the entire engine’s execution.

Get to know the PlayerLoop and the lifecycle of a script.

You can optimize your scripts with the following tips and tricks.

Understand the Unity PlayerLoop 

Make sure you understand the execution order of Unity’s frame loop. Every Unity script runs several event functions in a predetermined order. You should understand the difference between Awake, Start, Update, and other functions that create the lifecycle of a script. 

Refer to the Script Lifecycle Flowchart for event functions’ specific order of execution.

Minimize code that runs every frame 

Consider whether code must run every frame. Move unnecessary logic out of Update, LateUpdate, and FixedUpdate. These event functions are convenient places to put code that must update every frame, but you should extract any logic that does not need to update with that frequency. Whenever possible, only execute logic when things change.

If you do need to use Update, consider running the code every n frames. This is one way to apply time slicing, a common technique of distributing a heavy workload across multiple frames. In this example, we run the ExampleExpensiveFunction once every three frames:
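
The original listing is not reproduced in this text, so the following is a minimal sketch of the technique described. It relies on Time.frameCount and the modulo operator; the class name and the empty body of ExampleExpensiveFunction are placeholders.

```csharp
using UnityEngine;

// Hypothetical component showing simple time slicing by frame count.
public class ExampleBehaviour : MonoBehaviour
{
    // Run the expensive work once every `interval` frames (must be > 0).
    [SerializeField] int interval = 3;

    void Update()
    {
        // Time.frameCount increases by one each frame, so this is true
        // on every third frame when interval is 3.
        if (Time.frameCount % interval == 0)
        {
            ExampleExpensiveFunction();
        }
    }

    // Placeholder for the heavy workload.
    void ExampleExpensiveFunction()
    {
    }
}
```

If several objects use the same interval, adding a per-object offset to the frame count is one way to keep them from all doing their work on the same frame.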

Avoid heavy logic in Start/Awake 

When your first scene loads, these functions get called for each object:

  • Awake
  • OnEnable
  • Start

Avoid expensive logic in these functions until your application renders its first frame. Otherwise, you might encounter longer loading times than necessary.

Refer to the order of execution for event functions for details on the first scene load.

Avoid empty Unity events 

Even empty MonoBehaviours require resources, so you should remove blank Update or LateUpdate methods.

Use preprocessor directives if you are employing these methods for testing:
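
A minimal sketch of this pattern, assuming the built-in UNITY_EDITOR scripting symbol and a placeholder class name:

```csharp
using UnityEngine;

// Hypothetical component with an Editor-only Update method.
public class ExampleBehaviour : MonoBehaviour
{
#if UNITY_EDITOR
    // Compiled only when running in the Unity Editor;
    // the method does not exist in player builds.
    void Update()
    {
        // Temporary test code goes here.
    }
#endif
}
```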

Here, you can freely use the Update in-Editor for testing without unnecessary overhead slipping into your build.

Remove Debug Log statements 

Log statements (especially in Update, LateUpdate, or FixedUpdate) can bog down performance. Disable your Log statements before making a build.

To do this more easily, consider making a Conditional attribute along with a preprocessing directive. For example, create a custom class like this:
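
The original listing is not reproduced in this text. A minimal sketch of such a class might look like the following, where the class name Logging is a placeholder and ENABLE_LOG is the custom symbol mentioned in this section:

```csharp
// Hypothetical wrapper around Debug.Log, guarded by a custom symbol.
public static class Logging
{
    // When ENABLE_LOG is not defined, the compiler removes every call
    // to this method, including evaluation of its arguments.
    [System.Diagnostics.Conditional("ENABLE_LOG")]
    public static void Log(object message)
    {
        UnityEngine.Debug.Log(message);
    }
}
```

Call sites would then use Logging.Log("message") instead of calling Debug.Log directly.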

Adding a custom preprocessor directive lets you partition your scripts.

Generate your log message with your custom class. If you disable the ENABLE_LOG preprocessor symbol in the Player Settings, all of your Log statements disappear in one fell swoop.

Use hash values instead of string parameters 

Unity does not use string names to address Animator, Material, and Shader properties internally. For speed, all property names are hashed into property IDs, and these IDs are actually used to address the properties.

When using a Set or Get method on an Animator, Material, or Shader, use the integer-valued method instead of the string-valued method. The string methods simply perform string hashing and then forward the hashed ID to the integer-valued methods.

Use Animator.StringToHash for Animator property names and Shader.PropertyToID for Material and Shader property names.
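
As a sketch of how this looks in a script, the IDs can be computed once and reused on every call. The property names "Speed" and "_BaseColor", the class name, and the serialized fields are assumptions; substitute the names your Animator Controller and shader actually expose.

```csharp
using UnityEngine;

// Hypothetical component that addresses properties by cached integer ID.
public class PropertyIdExample : MonoBehaviour
{
    // Hash the names once rather than on every Set call.
    static readonly int SpeedId = Animator.StringToHash("Speed");
    static readonly int BaseColorId = Shader.PropertyToID("_BaseColor");

    [SerializeField] Animator animator;
    [SerializeField] Renderer targetRenderer;

    void Update()
    {
        // Integer overloads skip the per-call string hashing.
        animator.SetFloat(SpeedId, 1.5f);
        targetRenderer.material.SetColor(BaseColorId, Color.red);
    }
}
```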

Choose the right data structure

Your choice of data structure impacts efficiency as you iterate thousands of times per frame. Not sure whether to use a List, Array, or Dictionary for your collection? Follow the MSDN guide to data structures in C# as a general guide for choosing the correct structure.

Avoid adding components at runtime 

Invoking AddComponent at runtime comes with some cost. Unity must check for duplicate or other required components whenever adding components at runtime. 

Instantiating a Prefab with the desired components already set up is generally more performant.

Cache GameObjects and components 

GameObject.Find, GameObject.GetComponent, and Camera.main (in versions prior to 2020.2) can be expensive, so it’s best to avoid calling them in Update methods. Instead, call them in Start and cache the results.

Here’s an example that demonstrates inefficient use of a repeated GetComponent call:
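
The original listing is not reproduced in this text. A minimal sketch of the inefficient pattern, with a placeholder class name and a placeholder ExampleFunction, could look like this:

```csharp
using UnityEngine;

// Hypothetical component: looks up the same component every frame.
public class UncachedExample : MonoBehaviour
{
    void Update()
    {
        // GetComponent runs on every frame even though the result never changes.
        Renderer myRenderer = GetComponent<Renderer>();
        ExampleFunction(myRenderer);
    }

    // Placeholder for whatever uses the component.
    void ExampleFunction(Renderer target)
    {
    }
}
```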

Instead, invoke GetComponent only once and cache the result. The cached result can be reused in Update without any further calls to GetComponent.
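
A minimal sketch of the cached version, again with a placeholder class name and a placeholder ExampleFunction:

```csharp
using UnityEngine;

// Hypothetical component: looks up the component once and reuses it.
public class CachedExample : MonoBehaviour
{
    Renderer _myRenderer;

    void Start()
    {
        // Single lookup; the reference is stored for later frames.
        _myRenderer = GetComponent<Renderer>();
    }

    void Update()
    {
        ExampleFunction(_myRenderer);
    }

    // Placeholder for whatever uses the component.
    void ExampleFunction(Renderer target)
    {
    }
}
```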

Use object pools 

Instantiate and Destroy can generate garbage and garbage collection (GC) spikes, and are generally slow operations. Rather than regularly instantiating and destroying GameObjects (e.g., shooting bullets from a gun), use pools of preallocated objects that can be reused and recycled.

In this example, the ObjectPool creates 20 PlayerLaser instances for reuse.

Create the reusable instances at a point in the game (e.g., during a menu screen) when a CPU spike is less noticeable. Track this “pool” of objects with a collection. During gameplay, simply enable the next available instance when needed, disable objects instead of destroying them, and return them to the pool.

The pool of PlayerLaser objects is inactive and ready to shoot.

This reduces the number of managed allocations in your project and can prevent garbage collection problems.
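
The steps above can be sketched as a small pooling component. This is an illustrative sketch rather than the implementation shown in the article's screenshots; the class name, field names, and the Get and Release methods are all hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical minimal pool of reusable GameObjects.
public class SimpleObjectPool : MonoBehaviour
{
    [SerializeField] GameObject prefab;      // e.g., a PlayerLaser prefab
    [SerializeField] int initialSize = 20;

    readonly Queue<GameObject> _inactive = new Queue<GameObject>();

    void Awake()
    {
        // Pay the instantiation cost up front, while a spike is less noticeable.
        for (int i = 0; i < initialSize; i++)
        {
            GameObject instance = Instantiate(prefab, transform);
            instance.SetActive(false);
            _inactive.Enqueue(instance);
        }
    }

    // Hands out the next available instance, growing only if the pool is empty.
    public GameObject Get(Vector3 position, Quaternion rotation)
    {
        GameObject instance = _inactive.Count > 0
            ? _inactive.Dequeue()
            : Instantiate(prefab, transform);
        instance.transform.SetPositionAndRotation(position, rotation);
        instance.SetActive(true);
        return instance;
    }

    // Disables the instance and returns it to the pool instead of destroying it.
    public void Release(GameObject instance)
    {
        instance.SetActive(false);
        _inactive.Enqueue(instance);
    }
}
```

A weapon script would call Get when firing and Release when the projectile expires or hits something, so no Instantiate or Destroy calls occur during normal gameplay.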

Learn how to create a simple Object Pooling system in Unity here.

Use ScriptableObjects 

Store unchanging values or settings in a ScriptableObject instead of a MonoBehaviour. The ScriptableObject is an asset that lives inside of the project that you only need to set up once. It cannot be directly attached to a GameObject.

Create fields in the ScriptableObject to store your values or settings, then reference the ScriptableObject in your MonoBehaviours.

ScriptableObject called Inventory holds settings for various GameObjects

Using those fields from the ScriptableObject can prevent unnecessary duplication of data every time you instantiate an object with that MonoBehaviour.
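
As a sketch of this setup, a settings asset and a MonoBehaviour that reads from it might look like the following. The class names, field names, and menu path are hypothetical, and in a Unity project each class would normally live in its own file named after the class.

```csharp
using UnityEngine;

// Hypothetical shared settings asset; create one via the Assets > Create menu.
[CreateAssetMenu(fileName = "EnemySettings", menuName = "Settings/Enemy Settings")]
public class EnemySettings : ScriptableObject
{
    public float moveSpeed = 3f;
    public int maxHealth = 100;
}

// Hypothetical component; every instance references the same asset
// instead of carrying its own copy of the unchanging values.
public class Enemy : MonoBehaviour
{
    [SerializeField] EnemySettings settings;

    int _health;

    void Start()
    {
        // Only per-instance state is stored on the component itself.
        _health = settings.maxHealth;
    }

    void Update()
    {
        transform.Translate(Vector3.forward * settings.moveSpeed * Time.deltaTime);
    }
}
```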

Watch this Introduction to ScriptableObjects tutorial to see how ScriptableObjects can help your project. You can also find relevant documentation here.

Download the full list of mobile performance tips

In the next blog post, we’ll take a closer look at graphics and GPU optimization. However, if you want to access the entire list of tips and tricks from the team now, our full-length e-book is available here.

If you’re interested in learning more about Integrated Support services and want to give your team direct access to engineers, expert advice, and best practice guidance for your projects, then check out Unity’s success plans here.

Stay tuned for more performance tips

We want to help you make your Unity applications as performant as they can be, so if there are any optimization topics that you’d like to know more about, please keep us posted in the comments.
