This post explains the technology behind Mecanim Humanoids: how it works, its strengths and limitations, why some choices were made, and hopefully some hints on how to get the best out of it. Please refer to the Unity documentation for general setup and instructions.
The Mecanim Humanoid Rig and Muscle Space are an alternative to standard skeleton node hierarchies and geometric transforms for representing humanoid bodies and animations.
The Humanoid Rig is a description layered on top of a skeleton node hierarchy. It identifies a set of human bones and creates a Muscle Referential for each of them. A Muscle Referential is essentially a pre- and post-rotation with a range and a sign for each axis.
A Muscle is a normalized value in [-1,1] that moves a bone around one axis within the range [min,max]. Note that the normalized Muscle value can go below -1 or above 1 to overshoot the range. The range is not a hard limit; instead, it defines the normal motion span for a Muscle. A specific Humanoid Rig can widen or narrow the range of a Muscle Referential to increase or reduce its motion span.
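As a rough illustration of the idea, here is a minimal Python sketch of how a normalized Muscle value could be turned into an angle. The exact mapping is not described in this post, so the piecewise-linear form below (0 at the reference pose, -1 at min, +1 at max) and the function name `muscle_to_angle` are assumptions made for the example only.

```python
def muscle_to_angle(value, min_deg, max_deg):
    """Map a normalized Muscle value to an angle in degrees.

    Assumption (not stated in the post): 0 is the reference pose,
    -1 reaches min_deg, +1 reaches max_deg, and values outside [-1, 1]
    extrapolate linearly so the range can be overshot.
    """
    # Positive values scale toward max, negative values toward min.
    limit = max_deg if value >= 0.0 else -min_deg
    return value * limit


# Hypothetical range for one axis: -40 to +80 degrees.
print(muscle_to_angle(0.5, -40.0, 80.0))   # halfway toward max
print(muscle_to_angle(1.5, -40.0, 80.0))   # overshoots the range
```

With a zero range (min = max = 0) the function always returns 0, which matches the idea that such an axis has no Muscle.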
The Muscle Space is the set of all normalized Muscle values for the Humanoid Rig. It is a normalized humanoid pose. A range of zero (min = max) for a bone axis means that there is no Muscle for it.
For example, the Elbow does not have a Muscle for its Y axis, as it only stretches in and out (Z axis) and rolls in and out (X axis). In the end, the Muscle Space is composed of at most 47 Muscle values that completely describe a humanoid body pose.
One beautiful thing about Muscle Space is that it is completely abstracted from its original skeleton rig, or from any skeleton rig. It can be applied directly to any Humanoid Rig and it always creates a believable pose. Another beautiful thing is how well Muscle Space interpolates. Compared to a standard skeleton pose, Muscle Space will always interpolate naturally between animation key frames, during state machine transitions or when mixed in a blend tree.
It also performs well computationally, since the Muscle Space can be treated as a vector of scalars that you can interpolate linearly, as opposed to quaternions or Euler angles.
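To make the "vector of scalars" point concrete, here is a small Python sketch that blends two Muscle Space poses with a plain per-component linear interpolation. The pose layout (a flat list of floats) and the name `lerp_pose` are assumptions for illustration, not the Mecanim implementation.

```python
def lerp_pose(pose_a, pose_b, t):
    """Linearly interpolate two Muscle Space poses, component by component.

    Each pose is assumed to be a flat list of normalized Muscle values
    in the same order; t = 0 gives pose_a and t = 1 gives pose_b.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of muscles")
    return [a + (b - a) * t for a, b in zip(pose_a, pose_b)]


# Two tiny hypothetical poses with only two muscles each.
print(lerp_pose([0.0, 1.0], [1.0, -1.0], 0.5))
```

No normalization or shortest-path handling is needed, which is what makes this cheaper than blending quaternions.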
Every new skeleton rig built for a humanoid character or any animation captured will be an approximation of the human body and human motion. No matter how many bones or how good your MOCAP hardware is, the result will be an approximation of the real thing.
Riggers, game companies, schools or software firms will each propose their own version of what they think best represents the human body and its motion, and what best fits their production needs.
Designing the Mecanim Humanoid Rig and Muscle Space meant facing some hard choices. We had to find a compromise between fast runtime and animation quality, and between openness and a standard definition.
Choosing the number of spine bones is a tough one. Why 2 and not 3, or an arbitrary number of spine bones? Let's discard the last option: this is not biomedical research. (Note that you can always use a Generic Rig if you absolutely need this level of precision.) One spine bone is clearly underdefined.
Adding a second one gets you most of the way. A third or even a fourth one will only make a small contribution to the final human pose. Why is this? When looking at how a human spine bends, you will notice that the part of the spine that runs along the rib cage is almost rigid. What remains is a main flexion point at the base of the spine and another at the base of the rib cage. So there are two main flexion points. Looking at a contortionist, even in extreme poses, clearly shows this. Considering all of this, we decided to have 2 spine bones for the Humanoid Rig.
The neck is an easier call than the spine. Note that many game skeleton rigs don’t even have a neck bone and manage to do the job with only a head bone.
As with most skeleton rigs (and even more so for games), the Mecanim Humanoid Rig only supports rotation animation. The bones are not allowed to change their local translation relative to their parent. Some 3D packages induce a certain amount of translation on bones to simulate the elasticity of articulations or squash and stretch animation. We are currently looking at adding translation DoF, as it is a relatively cheap way, in terms of computation, to improve animation quality on less detailed skeleton rigs. It would also allow users to create retargetable squash and stretch animation.
Twist bones are often added to skeleton rigs to prevent skin deformation problems on arms and legs when they are in extreme twist configuration.
Twist bones help to distribute the deformation induced by twist from start to end of the limb.
In the Muscle Space, the amount of twist is represented by a Muscle and it is always associated with the parent bone of a limb. For example, the twist on the forearm happens at the elbow and not at the wrist.
Humanoid Rigs don’t support twist bones, but the Mecanim solver lets you specify a percentage of twist to be taken out of the parent and put onto the child of the limb. It defaults to 50% and greatly helps to prevent skin deformation problems.
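The twist percentage can be pictured with a very small Python sketch. It treats the twist as a single angle around the limb axis and simply splits it between parent and child; this simplification and the name `split_twist` are assumptions for the example, since the actual Twist solver is not detailed here.

```python
def split_twist(twist_deg, child_fraction=0.5):
    """Split a limb twist angle between the parent and child bones.

    child_fraction is the share of the twist moved from the parent onto
    the child (0.5 mirrors the 50% default mentioned in the text).
    Returns (parent_twist_deg, child_twist_deg).
    """
    if not 0.0 <= child_fraction <= 1.0:
        raise ValueError("child_fraction must be between 0 and 1")
    child = twist_deg * child_fraction
    parent = twist_deg - child
    return parent, child


# A 90 degree forearm twist shared evenly between elbow and wrist.
print(split_twist(90.0))
```

Setting the fraction to 0 leaves all the twist on the parent, which is how the Muscle itself is defined.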
Now, what would be the best way to represent the position and orientation of a human body in world space?
In a standard skeleton rig, the world space position and orientation curves live on the topmost bone in the hierarchy (usually hips, pelvis or whatever it is called). While this works fine for a specific character, it becomes inappropriate when retargeting, since from one skeleton rig to another the topmost bone usually has a different position and rotation relative to the rest of the skeleton.
The Muscle Space uses the humanoid center of mass to represent its position in world space. The center of mass is approximated using an average human body-part mass distribution. We assume that, after scale adjustments, the center of mass for a humanoid pose is the same for any humanoid character. It is a big assumption, but it has been shown to work very well for a wide set of animations and humanoid characters.
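The approximation can be sketched as a mass-weighted average of body-part positions. The Python below is only an illustration: the function name `center_of_mass` is hypothetical and the mass fractions in the example are made-up placeholders, not the distribution Mecanim uses.

```python
def center_of_mass(parts):
    """Approximate a centre of mass as a mass-weighted average.

    parts is a list of (mass_fraction, (x, y, z)) pairs, one per body
    part. The fractions do not need to sum to 1; they are normalized.
    """
    total = sum(mass for mass, _ in parts)
    if total <= 0.0:
        raise ValueError("total mass must be positive")
    return tuple(
        sum(mass * pos[axis] for mass, pos in parts) / total
        for axis in range(3)
    )


# Hypothetical two-part body: lower body and upper body of equal mass.
body = [(0.5, (0.0, 1.0, 0.0)), (0.5, (0.0, 2.0, 0.0))]
print(center_of_mass(body))
```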
It is true that for standing or walking animations the centre of mass lies around the hips, but for more dynamic motion like a back flip, you can see how the body moves away from the centre of mass and how the centre of mass feels like the most stable point over the animation.
Similar to what the centre of mass does for the Muscle Space world space position, we use an average body orientation for the world space orientation. The up vector of the average body orientation is computed from the middle points of the hips and shoulders. The front vector is then the cross product of the up vector and the average of the left/right hips and shoulders vectors. It is also assumed that this average body orientation for a humanoid pose is the same for all humanoid rigs. As with the centre of mass, the average body orientation tends to be a stable referential, as lower and upper body orientations naturally compensate for each other when walking, running, etc.
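Here is a minimal Python sketch of that construction. The axis conventions (y up, z forward, the character's left at negative x) and the direction of the side vector (right to left) are assumptions chosen so that a T-Stance yields a front vector along +z; the post does not spell these out.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def mid(a, b):
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def average_body_orientation(left_hip, right_hip, left_shoulder, right_shoulder):
    """Return (up, front) unit vectors for the average body orientation."""
    # Up goes from the middle of the hips to the middle of the shoulders.
    up = normalize(sub(mid(left_shoulder, right_shoulder), mid(left_hip, right_hip)))
    # Average of the hips and shoulders side vectors (assumed right-to-left).
    side = mid(sub(left_hip, right_hip), sub(left_shoulder, right_shoulder))
    front = normalize(cross(up, side))
    return up, front


up, front = average_body_orientation(
    (-0.1, 1.0, 0.0), (0.1, 1.0, 0.0), (-0.2, 1.5, 0.0), (0.2, 1.5, 0.0))
print(up, front)
```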
A more in-depth paper about root motion will follow, but as an introduction, the projection of the centre of mass and average body orientation is used to automatically create root motion. The fact that the centre of mass and average body orientation are stable properties of humanoid animation leads to a stable root motion that can be used for navigation or motion prediction.
One thing is still missing for Muscle Space to be a completely normalized humanoid pose: its overall size. Again, we are looking for a way to describe the size of a humanoid that does not rely on a specific point like the head bone position, since that is not consistent from rig to rig. The center of mass height of a humanoid character in T-Stance is used directly as its scale. The center of mass position of the Muscle Space is divided by this scale to produce the final normalized humanoid pose. Put another way, the Muscle Space is normalized for a humanoid that has a centre of mass height of 1 when in T-Stance. All the positions in the Muscle Space are said to be in normalized meters.
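The scale normalization comes down to a division on the way in and a multiplication on the way out. The Python sketch below shows both directions; the function names and the example heights are hypothetical.

```python
def normalize_position(world_pos, t_stance_com_height):
    """Convert a position in meters to normalized meters.

    t_stance_com_height is the centre of mass height of the source
    character in T-Stance, used as the humanoid scale.
    """
    return tuple(c / t_stance_com_height for c in world_pos)


def denormalize_position(normalized_pos, t_stance_com_height):
    """Convert normalized meters back to meters for a target character."""
    return tuple(c * t_stance_com_height for c in normalized_pos)


# Source character with a 0.9 m centre of mass height, target with 1.2 m.
normalized = normalize_position((0.0, 0.9, 2.7), 0.9)
print(denormalize_position(normalized, 1.2))
```

In this example the taller target character covers proportionally more ground for the same normalized motion.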
When applying a Muscle Space to a Humanoid Rig, hands and feet may end up in a different position and orientation from the original animation, due to the difference in proportions between Humanoid Rigs. This may result in feet sliding or hands not reaching properly. This is why Muscle Space optionally contains the original position and orientation of hands and feet. The hands and feet position and orientation are normalized relative to the Humanoid Root (center of mass, average body rotation and humanoid scale) in the Muscle Space. Those original positions and orientations can be used to fix the retargeted skeleton pose to match the original world space position using an IK pass.
The main goal of the IK Solver on arms and legs is to reach the original hands and feet position and orientation optionally found in the Muscle Space. This is what happens under the hood for feet when the “Foot IK” toggle is enabled in a Mecanim Controller State.
In these cases, the retargeted skeleton pose is never very far from the original IK goals. The IK error to fix is small since it is only induced by difference in proportion of humanoid rigs. The IK solver will only modify the retargeted skeleton pose slightly to produce the final pose that matches original positions and orientations.
Since the IK only slightly modifies the retargeted skeleton pose, it will rarely induce animation artefacts like knee or elbow popping. Even then, there is a Squash and Stretch solver, part of the IK solver, that is there to prevent popping when arms or legs come close to maximum extension. By default, the amount of squash and stretch allowed is limited to 5% of the total length of the arm or leg. An elbow or knee popping is more noticeable (and ugly) than a stretch of 5% or less on an arm or leg. Note that the squash and stretch solver can be turned off by setting it to 0%.
A more in depth paper about IK rigs will follow. It will explain how to handle props, use multiple IK passes, interaction with environment or between humanoid characters, etc.
The Humanoid Rig has some bones that are optional. This is the case for Chest, Neck, Left Shoulder, Right Shoulder, Left Toes and Right Toes. Many existing skeleton rigs don’t have some of the optional bones, but we still wanted to create valid humanoids with those.
The Humanoid Rig also supports LeftEye and RightEye optional bones. Eye bones have two Muscles each, one that goes up and down and one that moves in and out. The Eye bones also work with the Humanoid Rig LookAt solver, which can distribute look-at adjustments on Spine, Chest, Neck, Head and Eyes. There will be more about the LookAt solver in the upcoming Humanoid IK rig paper.
Finally, the Humanoid Rig supports fingers. Each finger may have 0 to 3 digits. 0 digits simply means that this finger is not defined. There are two Muscles (Stretch and Spread) for the first digit and one Muscle (Stretch) for each of the 2nd and last digits. Note that there is no solver overhead for fingers when no fingers are defined for a hand.
In many cases, skeleton rigs will have more bones than the ones defined by the Humanoid Rig. In-between bones are bones that sit between humanoid defined bones. For example, a 3rd spine bone in a 3DSMAX Biped will be treated as an in-between bone. Those are supported by the Humanoid Rig, but keep in mind that in-between bones won’t get animated. They will stay at their default position and orientation relative to their parent defined in the Humanoid Rig.
The skeleton rig must respect a standard hierarchy to be compatible with our Humanoid Rig. The skeleton may have any number of in-between bones between humanoid bones, but it must respect the following pattern:
Hips - Upper Leg - Lower Leg - Foot - Toes
Hips - Spine - Chest - Neck - Head
Chest - Shoulder - Arm - Forearm - Hand
Hand - Proximal - Intermediate - Distal
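One way to read the pattern above is that each humanoid bone in a chain must be a descendant of the previous one, with any number of in-between bones allowed. The Python sketch below checks that for a single chain. The data layout (a child-to-parent dictionary and a humanoid-to-skeleton name mapping), the function names, and the rule of skipping unmapped optional bones are all assumptions for illustration, not the validation Unity performs.

```python
def is_ancestor(parents, ancestor, node):
    """Return True if `ancestor` is above `node` in the skeleton tree.

    parents maps each bone name to its parent name (None for the root).
    """
    current = parents.get(node)
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def respects_chain(parents, mapping, chain):
    """Check one humanoid chain, allowing in-between bones.

    mapping maps humanoid bone names to skeleton bone names; humanoid
    bones missing from the mapping (optional bones) are skipped.
    """
    mapped = [mapping[bone] for bone in chain if bone in mapping]
    return all(is_ancestor(parents, a, b) for a, b in zip(mapped, mapped[1:]))


# Hypothetical skeleton with a third spine bone as an in-between bone.
parents = {"pelvis": None, "spine1": "pelvis", "spine2": "spine1",
           "spine3": "spine2", "neck": "spine3", "head": "neck"}
mapping = {"Hips": "pelvis", "Spine": "spine1", "Chest": "spine2",
           "Neck": "neck", "Head": "head"}
print(respects_chain(parents, mapping, ["Hips", "Spine", "Chest", "Neck", "Head"]))
```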
The T-Stance is the most important step of Humanoid Rig creation, since the muscle setup is based on it. The T-Stance pose was chosen as the reference pose because it is easy to conceptualize and there is not much room for interpretation of what it should be:
- Standing straight facing z axis
- Head and eyes facing z axis
- Feet on the ground parallel to z axis
- Arms open parallel to the ground along x axis
- Hands flat, palm down parallel to the ground along x axis
- Fingers straight parallel to the ground along x axis
- Thumbs straight parallel to the ground half way (45 degrees) between x and z axis
When saying “straight”, we do not mean that bones necessarily need to be perfectly aligned. It depends on how the skin attaches to the skeleton. Some rigs may have skin that looks straight while the underlying skeleton is not. So it is important that the T-Stance be set for the final skinned character. If you are creating a Humanoid Rig to retarget MOCAP data, it is good practice to capture at least a few frames of a T-Stance done by the actor in the MOCAP suite.
By default, muscle ranges are set to values that best represent human muscle ranges. Most of the time, they should not be modified. For more cartoony characters, you may want to reduce a range to prevent arms from entering the body, or increase it to exaggerate leg motion. If you are creating a Humanoid Rig to retarget MOCAP data, you should not modify the ranges, since the produced animation clip would then no longer respect the defaults.
Mecanim retargeting is split into two phases. The first phase consists of converting a standard skeleton transform animation to a normalized humanoid animation clip (or Muscle Clip). This phase happens in the editor when the animation file is imported. It is internally called “RetargetFrom”. The second phase happens in play mode when the Muscle Clip is evaluated and applied to the skeleton bones of a Humanoid Rig. It is internally called “RetargetTo”.
There are two big advantages to splitting retargeting into two phases. The first one is solving speed: half of the retargeting process is done offline, and only the other half is done at runtime. The other advantage is scene complexity and memory usage. Since the Muscle Clip is completely abstracted from its original skeleton, the source skeleton does not need to be included at runtime to perform the retargeting.
The second phase is straightforward. Once you have a valid Humanoid Rig, you simply apply the Muscle Clip to it with the RetargetTo solver. This is done automatically under the hood.
The first phase, converting a skeleton animation to a Muscle Clip, may be a bit trickier. The skeleton animation clip is sampled at a fixed rate. For each sample, the skeleton pose is converted to a Muscle Space pose and a key is added to the Muscle Clip. Not every skeleton rig will fit, since there are so many different ways a skeleton rig can be built and animated. Some skeleton rigs will produce a valid output, but with a possible loss of information. We will now review what is needed to create a lossless normalized humanoid animation: the Muscle Clip.
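The sampling loop described here can be sketched in a few lines of Python. The two callbacks, `evaluate_pose` and `pose_to_muscles`, stand in for the skeleton evaluation and the pose conversion; they are hypothetical placeholders, as are the function names, and the sketch says nothing about how RetargetFrom does the conversion itself.

```python
def sample_times(duration, rate):
    """Return evenly spaced sample times from 0 to duration (inclusive)."""
    count = int(round(duration * rate))
    return [i / rate for i in range(count + 1)]


def bake_muscle_clip(duration, rate, evaluate_pose, pose_to_muscles):
    """Build a list of (time, muscle_values) keys at a fixed sample rate.

    evaluate_pose(t) returns the skeleton pose at time t, and
    pose_to_muscles(pose) converts it to Muscle Space values.
    """
    return [(t, pose_to_muscles(evaluate_pose(t)))
            for t in sample_times(duration, rate)]


# Toy stand-ins: the "pose" is one angle, the "muscle" is that angle / 90.
clip = bake_muscle_clip(1.0, 4, lambda t: 90.0 * t, lambda angle: [angle / 90.0])
print(clip)
```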
Note: By lossless we mean that retargeting from a skeleton rig to a Muscle Clip and then retargeting back to the same skeleton rig will preserve the animation. In fact, it will be almost intact: the original twist on arms and legs will be lost and replaced by what the Twist solver computes. As explained earlier in this document, there is no representation of twist distribution in Muscle Space.
The 3DSMAX Biped is singled out as a problematic rig here. This is probably because of its popularity and the fact that we had to support many cases of it being used with Mecanim.
Note that if you are going to create new animations to be used with the Mecanim Humanoid Rig, you should follow the rules stated above from the start. If you want to use existing animations that break some of the rules, it is still possible, as the Mecanim retarget solver is robust and will produce valid output, but lossless conversion can’t be guaranteed.