This is the second part of my post about how we monitor Unity performance. In Part I, I explained how we monitor for and fix performance regressions. In Part II, I'll go into the details of how we do the monitoring.
Building the rig
We have built a performance test rig that monitors selected development and release branches and reports performance data points to a database.
The performance test rig continuously polls our build automation system for new builds to test. As soon as new builds are ready, it runs tests on all platforms we have set up for performance testing. The results are reported to a database along with related information such as the Unity version used, the platform, and details about the hardware and operating system. We also have a reporting solution that continuously monitors the data in the database and alerts us when there is a significant performance regression. It can tell us in which release branch the regression occurred, and which platforms and tests are affected by it.
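As a rough illustration of what such a regression check can look like, here is a minimal sketch in Python. The function name, data, and threshold are hypothetical, not Unity's actual implementation; the idea is simply to compare new measurements against a baseline from an earlier build on the same configuration:

```python
from statistics import mean, stdev

def is_regression(baseline, candidate, threshold=3.0):
    """Flag a regression when the candidate's mean timing exceeds the
    baseline mean by more than `threshold` baseline standard deviations
    (timings in milliseconds; lower is better)."""
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    return mean(candidate) > base_mean + threshold * base_sd

# Hypothetical frame times (ms) from two builds on the same hardware:
previous_build = [16.6, 16.8, 16.5, 16.7, 16.6]
new_build = [19.2, 19.0, 19.4, 19.1, 19.3]
print(is_regression(previous_build, new_build))  # True: the new build is clearly slower
```

A real check would be more robust (for example, using enough samples and a proper statistical test to avoid false alarms from noisy measurements), but the shape is the same: fixed configuration, varying Unity version, automated comparison.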
The figure below shows the components of the performance test framework. For running the tests, we have dedicated machines covering the different platforms. We use a .NET web service to report test results to the database, and a reporting solution presents the results in nicely formatted reports.
When analyzing test results, we use a fixed set of hardware and software configurations and look for changes across different versions of Unity. For example, when testing the Windows Standalone platform, we use the same hardware and Windows version for all runs; only the Unity version varies.
Since all of the data is saved in the database, we can build reports from it for different uses. For each Unity version we are about to release, we make a chart that compares its data points against previously released versions. We can also see near-real-time status of performance tests for the release branches we are developing in parallel. When analyzing a regression, we can even drill down to the individual measurements, not just the mean value.
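To make that last point concrete, here is a hypothetical sketch of how per-version chart data could be assembled from stored results. The row format and names are invented for illustration; the point is that the report keeps the raw samples alongside the mean:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows as reported to the database:
# (unity_version, test_name, platform, measurement_ms)
rows = [
    ("4.2.2", "asset_import_textures", "win7-standalone", 120.4),
    ("4.2.2", "asset_import_textures", "win7-standalone", 118.9),
    ("4.3.0", "asset_import_textures", "win7-standalone", 131.2),
    ("4.3.0", "asset_import_textures", "win7-standalone", 129.8),
]

def chart_series(rows, test_name, platform):
    """Group measurements by Unity version for one test on one fixed
    configuration, keeping both the mean and the individual samples."""
    by_version = defaultdict(list)
    for version, test, plat, value in rows:
        if test == test_name and plat == platform:
            by_version[version].append(value)
    return {v: {"mean": mean(s), "samples": s}
            for v, s in sorted(by_version.items())}

series = chart_series(rows, "asset_import_textures", "win7-standalone")
```

With the samples preserved, a chart can show the trend across versions while still letting us inspect the spread of individual measurements when something looks off.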
Currently we have three dedicated performance test machines for running the tests. One Mac Mini (OS X 10.8.5) runs tests on the Mac platforms (Editor, Standalone). Two Windows machines (Windows 7, Windows 8.0) run tests on the Windows platforms (Editor, Standalone). And a Nexus 10 device runs tests on Android. We intend to extend this to more platforms and more hardware and software configurations, but more on that later. And before you ask: yes, OS X 10.9 Mavericks is coming. We just haven't gotten to it yet.
Extending existing frameworks
We wrote about one of our test frameworks, the Runtime Test Framework, in an earlier blog post. For performance testing, we extended our existing test frameworks instead of creating a new one. This means we can tap into the existing test suites and use them for performance testing, and everyone who can write tests with the existing frameworks can write performance tests. It also means it is easy to get performance tests running on a new platform as soon as that platform runs our other tests.
The test suites
We have split the tests into two types, Editor Tests and Runtime Tests, based on how they are run and what they can measure. Editor Tests are limited to the platforms the editor runs on, but they can make measurements outside of Unity user scripts; they measure things like editor start-up time and asset import times for different types of assets. Runtime Tests, on the other hand, can run on all the supported runtime platforms. With Runtime Tests it is possible to measure things like the frame time of rendered frames, or the time it takes to deserialize different types of assets. In fact, you can measure anything you can do with user scripts in Unity.
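In Unity itself these runtime measurements are written as user scripts, but the underlying pattern is language-neutral: time each pass through a loop and keep every sample, so the report can show more than just the mean. A rough sketch in Python, with a stand-in workload instead of actual rendering:

```python
import time

def measure_frame_times(render_frame, frames=100):
    """Time each iteration of a frame loop and return all samples in
    milliseconds. `render_frame` is a stand-in for the work done per
    frame; in a Unity runtime test this would be one rendered frame."""
    samples = []
    for _ in range(frames):
        start = time.perf_counter()
        render_frame()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

# Illustrative workload standing in for a rendered frame:
samples = measure_frame_times(lambda: sum(range(10000)), frames=10)
```

Keeping all the samples rather than a single aggregate is what later makes it possible to inspect individual measurements when chasing a regression.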
What's next
First of all, we will add more configurations, both hardware and software. OS X 10.9 is high on the list, and so are the remaining mobile platforms and consoles. We truly believe the performance rig is a lifesaver when it comes to detecting performance regressions as early as possible. Further, the rig makes it very easy to compare performance across different platforms and versions of Unity. We will keep you posted as we find interesting results - and different usages of the rig.