Hello! My name is Sakari Pitkänen. I work as a developer on the Toolsmiths Team here at Unity. In this blog post I will tell you how we do automated performance monitoring of the Unity development branches.
With an ever-increasing number of test configurations (platforms, operating systems, versions), it gets increasingly difficult to keep track of everything that is going on. We need visibility, and to get this we need data. Our main reason for collecting performance data from Unity is to prevent performance regressions.
Finding performance regressions
As we do day-to-day development we are not likely to notice if performance is degrading as time goes by, which is a big problem. We want to always try to make the next version of Unity perform better than the current one – and we definitely don’t want anything to be slower without us noticing it.
Most of our current performance tests measure the time taken by some specific functionality of Unity. For example, we can measure the frame time over some number of frames when utilizing a specific rendering functionality. The tests are designed to look for regressions, so the measurements are implemented in such a way that a significant increase in the measured value means we have a performance regression. Each time we run a test, we run it many times and use the median value of the samples, so that a single bad sample won’t show up as a regression. Besides time, we can measure other things, such as memory usage.
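To illustrate why the median is a good fit here, below is a minimal Python sketch. It is not the code we run in our rig; the function name and the sample values are made up for illustration. It shows that one disturbed run barely moves the median, while it would pull the mean up noticeably.

```python
from statistics import mean, median


def summarize(samples):
    """Reduce the repeated measurements of one test to a single value.

    The median is used so that one unusually slow run does not
    dominate the result.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return median(samples)


# Hypothetical frame times in milliseconds; the fourth run was disturbed.
frame_times_ms = [16.4, 16.6, 16.5, 41.0, 16.7]

print("median:", summarize(frame_times_ms))
print("mean:  ", mean(frame_times_ms))
```

With these made-up numbers the median stays close to the typical frame time, whereas the mean lands several milliseconds higher and could be mistaken for a regression.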
Before we dig into the details of how we do this, let’s look at a concrete example: How we found a performance regression and used our data points to verify that it got fixed.
Performance regression and fix
In Unity version 4.3.0 we had a performance regression that affected one specific platform, Windows Standalone. Below is a table that has results for a limited set of tests run on four different platforms and two versions of Unity, 4.2 and 4.3. For all of these tests, the values are medians of measured frame times in milliseconds. The table does not show performance per se; instead it lists the sample values. This means that an increase in a measured value can be considered a performance regression (red) and a decrease can be considered an improvement (green).
The results show that the Windows Standalone platform has a significant performance regression that affects most of the selected tests. From the test names one can already assume the cause of the regression is probably graphics related. Unfortunately, we were still working on the test rig when 4.3.0 was released, so we didn’t get the data to catch this before shipping. That will surely not happen next time, and as we widen the coverage with more configurations we expect to significantly reduce the risk of shipping with performance regressions moving forward.
We did find the cause of this particular regression and promptly fixed it for Unity version 4.3.1. Then we ran the tests again, now comparing the last three released versions of Unity: 4.2, 4.3.0 and 4.3.1.
We could verify that the fix was effective and that these tests show no significant changes in performance between Unity versions 4.2 and 4.3.1.
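The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the 10% tolerance, the function names, the test names and the numbers are all hypothetical and are not taken from our actual results or tooling.

```python
def classify(baseline_ms, candidate_ms, tolerance=0.10):
    """Label the change between two median values of the same test.

    tolerance is the relative change that counts as significant;
    10% is an assumed value chosen only for this example.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    change = (candidate_ms - baseline_ms) / baseline_ms
    if change > tolerance:
        return "regression"    # value went up: slower (red)
    if change < -tolerance:
        return "improvement"   # value went down: faster (green)
    return "unchanged"


def compare_versions(baseline, candidate, tolerance=0.10):
    """Classify every test that has a median in both versions."""
    return {
        name: classify(baseline[name], candidate[name], tolerance)
        for name in baseline
        if name in candidate
    }


# Hypothetical medians in milliseconds, keyed by hypothetical test names.
old_version = {"TestA": 10.0, "TestB": 20.0, "TestC": 5.0}
new_version = {"TestA": 14.0, "TestB": 20.5, "TestC": 4.0}

for name, label in compare_versions(old_version, new_version).items():
    print(name, label)
```

Running the same comparison once against the version with the regression and once against the version with the fix is what lets you confirm that the values returned to their earlier level.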
In Part II of this post I’ll tell you about the rig we have built for performance testing.