Making AnimationEvent safe for the CoreCLR garbage collector

October 25, 2022 in Technology | 11 min. read

We’re hard at work bringing the latest .NET technology to Unity users. Part of this effort involves making existing Unity code work with the .NET CoreCLR JIT runtime from Microsoft, which includes a highly performant and more efficient garbage collector (GC).

In this blog post, I’ll walk you through some of the recent changes our team has made so that AnimationEvent works safely with the CoreCLR GC.

Managed to native (and back again)

Unity’s engine code is written in both C# (managed) code and C++ (native) code. While crossing the boundary from managed to native code can be tricky and expensive, it provides excellent opportunities to deliver solutions with top-notch performance.

Take, for example, Unity’s AddEvent method on the AnimationClip object. The code for that method is pretty simple:

public void AddEvent(AnimationEvent evt)
{
    if (evt == null)
        throw new ArgumentNullException("evt");
    AddEventInternal(evt);
}

Here, AddEventInternal is a native method. The Unity engine makes that call directly, without normal P/Invoke marshaling, so the C++ code implementing AddEventInternal gets a pointer to memory allocated from the managed heap. Since Unity currently uses the well-known, conservative, non-moving Boehm garbage collector, the object never changes address, and the native code can safely access the managed memory directly.

But what happens when Unity starts to use the precise, moving CoreCLR GC instead? There are two possible problems we could encounter. The CoreCLR might:

  1. Change the layout of the AnimationEvent type in managed memory
  2. Move the objects this AnimationEvent instance references in memory, while the native code is trying to use them

Let’s unpack how we’ve solved both of these problems.

To blit or not to blit?

We need to ensure the CoreCLR runtime gives us a stable representation of the data in an AnimationEvent instance that we can use in native code. This is called the blittable representation – meaning that the bits in memory, both in C# and C++, will be exactly the same. Currently, AnimationEvent is defined like this in C#:

public sealed class AnimationEvent
{
    internal float m_Time;
    internal string m_FunctionName;
    internal string m_StringParameter;
    internal Object m_ObjectReferenceParameter;
    internal float m_FloatParameter;
    internal int m_IntParameter;

    internal int m_MessageOptions;
    internal AnimationEventSource m_Source;
    internal AnimationState m_StateSender;
    internal AnimatorStateInfo m_AnimatorStateInfo;
    internal AnimatorClipInfo m_AnimatorClipInfo;
}

The CoreCLR runtime moves all of the fields that are reference types to the front of the representation in memory, so the GC code can take advantage of cache locality. CoreCLR’s internal layout will then look like this:

public sealed class AnimationEvent
{
    internal string m_FunctionName;
    internal string m_StringParameter;
    internal Object m_ObjectReferenceParameter;
    internal AnimationState m_StateSender;
    internal float m_Time;
    internal float m_FloatParameter;
    internal int m_IntParameter;
    internal int m_MessageOptions;
    internal AnimationEventSource m_Source;
    internal AnimatorStateInfo m_AnimatorStateInfo;
    internal AnimatorClipInfo m_AnimatorClipInfo;
}

Notice that the fields are all the same, but in a different order. To keep reading the object directly, the code in AddEventInternal would need to handle both orders. In addition, the CoreCLR is free to change the way it lays out these fields in the future, so the Unity engine native code might need to change as well.

We can avoid these problems by introducing a new internal type that is only used to pass data for the AnimationEvent across the managed/native boundary:

[StructLayout(LayoutKind.Sequential)]
internal struct AnimationEventBlittable
{
    internal float m_Time;
    internal IntPtr m_FunctionName;
    internal IntPtr m_StringParameter;
    internal IntPtr m_ObjectReferenceParameter;
    internal float m_FloatParameter;
    internal int m_IntParameter;

    internal int m_MessageOptions;
    internal AnimationEventSource m_Source;
    internal IntPtr m_StateSender;
    internal AnimatorStateInfo m_AnimatorStateInfo;
    internal AnimatorClipInfo m_AnimatorClipInfo;
}

Notice that none of the fields in this type are reference types, so CoreCLR won’t reorder them. As the name implies, this is a blittable representation of the data in AnimationEvent.
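To make the layout difference concrete, here is a minimal sketch that checks it with reflection and Marshal.OffsetOf. The types EventAsClass and EventAsBlittableStruct and the helper LayoutCheck are hypothetical stand-ins with only three fields each, not Unity's actual types, and the fields are public only to keep the example self-contained.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a managed class like AnimationEvent.
// C# classes default to automatic layout, so the runtime may reorder fields.
public sealed class EventAsClass
{
    public float m_Time;
    public string m_FunctionName;
    public int m_IntParameter;
}

// Hypothetical stand-in for a blittable struct: no reference-type fields,
// sequential layout requested explicitly.
[StructLayout(LayoutKind.Sequential)]
public struct EventAsBlittableStruct
{
    public float m_Time;
    public IntPtr m_FunctionName;
    public int m_IntParameter;
}

public static class LayoutCheck
{
    // Byte offset of a field in the struct's unmanaged (sequential) layout.
    public static int OffsetOf(string fieldName) =>
        (int)Marshal.OffsetOf<EventAsBlittableStruct>(fieldName);

    // True when the offsets follow the order in which the fields are declared.
    public static bool FieldsAreInDeclarationOrder() =>
        OffsetOf("m_Time") < OffsetOf("m_FunctionName") &&
        OffsetOf("m_FunctionName") < OffsetOf("m_IntParameter");
}
```

In this sketch, typeof(EventAsClass).IsAutoLayout and typeof(EventAsBlittableStruct).IsLayoutSequential both report true, and LayoutCheck.FieldsAreInDeclarationOrder() confirms that the struct's fields sit in memory in the order they are written in the source.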

Can you handle it?

Did you catch what happened to the reference type fields, like m_FunctionName? Instead of being a string, they are now IntPtr.

This is where the solution to our second problem comes in. For the normal AnimationEvent, m_FunctionName is a pointer to a GC-allocated string that can be moved. But in AnimationEventBlittable, it is an IntPtr that holds a GCHandle. The handle stays valid even if the GC moves the string, which allows for safe access to a managed object from native code.
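As a small illustration of that idea, the sketch below wraps an object in a GCHandle, converts the handle to an IntPtr, and later gets the object back. The class HandleRoundTrip and its method names are hypothetical; only the GCHandle calls come from .NET.

```csharp
using System;
using System.Runtime.InteropServices;

public static class HandleRoundTrip
{
    // Allocate a handle that keeps 'target' alive, and return it as an
    // IntPtr that can be stored in a blittable struct.
    public static IntPtr Wrap(object target) =>
        GCHandle.ToIntPtr(GCHandle.Alloc(target));

    // Recover the managed object the handle refers to.
    public static object Unwrap(IntPtr value) =>
        GCHandle.FromIntPtr(value).Target;

    // Free the handle once it is no longer needed.
    // A handle that is never freed keeps its target alive indefinitely.
    public static void Release(IntPtr value) =>
        GCHandle.FromIntPtr(value).Free();
}
```

A call such as HandleRoundTrip.Wrap("OnFootstep") returns a non-zero IntPtr, Unwrap on that value returns the same string, and Release must be called exactly once per Wrap.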

Now you can write a method to convert from AnimationEvent to AnimationEventBlittable:

internal static AnimationEventBlittable FromAnimationEvent(AnimationEvent animationEvent)
{
    var animationEventBlittable = new AnimationEventBlittable
    {
        m_Time = animationEvent.m_Time,
        m_FunctionName = GCHandle.ToIntPtr(GCHandle.Alloc(animationEvent.m_FunctionName)),
        m_StringParameter = GCHandle.ToIntPtr(GCHandle.Alloc(animationEvent.m_StringParameter)),
        m_ObjectReferenceParameter = GCHandle.ToIntPtr(GCHandle.Alloc(animationEvent.m_ObjectReferenceParameter)),
        m_FloatParameter = animationEvent.m_FloatParameter,
        m_IntParameter = animationEvent.m_IntParameter,
        m_MessageOptions = animationEvent.m_MessageOptions,
        m_Source = animationEvent.m_Source,
        m_StateSender = GCHandle.ToIntPtr(GCHandle.Alloc(animationEvent.m_StateSender)),
        m_AnimatorStateInfo = animationEvent.m_AnimatorStateInfo,
        m_AnimatorClipInfo = animationEvent.m_AnimatorClipInfo
    };
    return animationEventBlittable;
}

At this point, the safe AddEvent method looks like this:

public void AddEvent(AnimationEvent evt)
{
    if (evt == null)
        throw new ArgumentNullException(nameof(evt));
    var animationEventBlittable = AnimationEventBlittable.FromAnimationEvent(evt);
    try
    {
        AddEventInternal(animationEventBlittable);
    }
    finally
    {
        // Dispose (not shown above) releases the GCHandles, even if the call throws.
        animationEventBlittable.Dispose();
    }
}
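The Dispose method called above is not shown in this post. As a rough sketch of what such a method has to do, the reduced struct below frees each handle it owns. EventBlittableSketch, its Create method, and its Free helper are hypothetical and cover only two handle fields; this version also allocates and frees handles directly rather than pooling them.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, reduced version of a blittable event struct.
[StructLayout(LayoutKind.Sequential)]
public struct EventBlittableSketch : IDisposable
{
    public float m_Time;
    public IntPtr m_FunctionName;
    public IntPtr m_StringParameter;

    // Wrap each managed reference in a GCHandle stored as an IntPtr.
    public static EventBlittableSketch Create(float time, string functionName, string stringParameter)
    {
        return new EventBlittableSketch
        {
            m_Time = time,
            m_FunctionName = GCHandle.ToIntPtr(GCHandle.Alloc(functionName)),
            m_StringParameter = GCHandle.ToIntPtr(GCHandle.Alloc(stringParameter))
        };
    }

    // Free every handle this struct owns.
    public void Dispose()
    {
        Free(ref m_FunctionName);
        Free(ref m_StringParameter);
    }

    // Free one handle and zero the field so a second Dispose is harmless.
    private static void Free(ref IntPtr handle)
    {
        if (handle == IntPtr.Zero)
            return;
        GCHandle.FromIntPtr(handle).Free();
        handle = IntPtr.Zero;
    }
}
```

Zeroing each field after freeing it guards against a double free when Dispose is called twice on the same variable. Because this is a struct, a copy made before Dispose would still hold the stale handle values, so the struct should be disposed exactly where it was created.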

Finally, AddEventInternal needs a few changes to unwrap those GCHandles and get back to the real, managed object data. With those changes in place, we can fully leverage the CoreCLR GC while still being memory safe – pretty cool!

Is this really faster?

Well, no, no it’s not. I mean, just look at all of that new code you’ll need to run. But this is where the fun part of development comes in.

Could we, at Unity, have made it just as fast as before? Spoiler alert: We did! And there are three important things our engineers have done to get back performance while maintaining safety.

First, observe that AnimationEventBlittable is a struct, not a class (it was a class in the first draft of this code). It is allocated on the stack (not via the GC), which makes it low-cost. Code generation from Mono, IL2CPP, and CoreCLR for the FromAnimationEvent method is great, and the team has yet to find any measurable overhead for the method itself.

Of course, the FromAnimationEvent method does call out to the GCHandle.Alloc method (four times!), and that method is not cheap. All of the .NET virtual machine implementations need to do non-trivial work to allocate GCHandles and update internal data structures to track them. While profiling these changes, we realized something important – the GCHandles don’t live long. Each handle is only needed while the native code is executing, so you can easily reuse them. This implementation pools a small number of GCHandles, and reuses them for each call to FromAnimationEvent. This means the cost to allocate these GCHandles goes to nearly zero for realistic use cases, where FromAnimationEvent will be called many times.
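The pooling idea can be sketched as follows. This is not Unity's implementation: GCHandlePoolSketch and its members are hypothetical, the pool is unbounded, and it is not thread-safe. It relies only on the fact that a GCHandle's Target can be reassigned after allocation.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical pool that reuses GCHandles instead of allocating new ones.
public sealed class GCHandlePoolSketch
{
    private readonly Stack<GCHandle> m_Free = new Stack<GCHandle>();

    // Number of times GCHandle.Alloc was actually called.
    public int AllocatedCount { get; private set; }

    // Get a handle pointing at 'target', reusing a pooled handle if one exists.
    public GCHandle Rent(object target)
    {
        if (m_Free.Count > 0)
        {
            GCHandle handle = m_Free.Pop();
            handle.Target = target; // retarget the existing handle
            return handle;
        }

        AllocatedCount++;
        return GCHandle.Alloc(target);
    }

    // Give a handle back once the native call has finished.
    public void Return(GCHandle handle)
    {
        handle.Target = null; // do not keep the old object alive
        m_Free.Push(handle);
    }
}
```

With this sketch, renting a handle, returning it, and renting again calls GCHandle.Alloc only once, so repeated conversions pay the allocation cost only on the first use. Clearing the target on return matters: a pooled handle that still pointed at the previous object would keep it alive.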

There is also a hidden cost that has yet to be shown in the native code. Recall the earlier discussion around C++ code needing to “unwrap those GCHandles.” Well, it turns out that CoreCLR makes this process really fast. To obtain the target of a GCHandle (i.e., unwrap it), CoreCLR charges you the same cost as a simple pointer dereference – that’s it.

However, our benchmarks were significantly slower with Mono and IL2CPP for the same code… so what gives? We found that GCHandle unwrapping was actually rather expensive for Mono and IL2CPP. Thankfully, unwrapping seldom happened on critical paths before, but with the changes described here, it is now a factor to keep in mind. As such, we’ve implemented CoreCLR’s unwrapping algorithm in Mono and IL2CPP as well.

With all of the changes that have been laid out, our internal benchmarks for AnimationEvent – including both the AddEvent method and other public API methods – show no difference between the previous code and the new code. Sweet!

Performance or safety? Choose both.

The CoreCLR runtime and GC bring the promise of increased performance across the board. Microsoft has been investing heavily in these for .NET, and we’re really excited to bring such improvements to Unity users.

We expect to deliver the full performance of modern .NET applications while maintaining safety and stability in your existing code. The team will continue to apply the techniques learned during this investigation toward other managed/native boundary transitions in Unity engine code (there are many).

For more tips tied to animation and CoreCLR, visit us in the forums or feel free to connect with me directly on Twitter at @petersonjm1. Be sure to watch for new technical blogs from other Unity developers as part of the ongoing Tech from the Trenches series.
