Since its global release in August 2021, the battle royale title Naraka: Bladepoint has been making waves. Its stunning in-game artwork, inspired by ancient Chinese fantasy, has especially impressed audiences worldwide.
As an independent subsidiary of NetEase, 24 Entertainment continues to expand its portfolio, and Naraka: Bladepoint, its first desktop title, quickly attained global appeal. Within its first week of launch, the game hit Steam’s top 10 charts and drew in more active players than popular titles like Rainbow Six: Siege or Splitgate.
This action-adventure battle royale takes place across colorful mountaintops, lush forests, and sprawling ancient cities, all rendered in striking detail. To capture these beautiful environments while maintaining the performance and frame rate necessary to support 60-player multiplayer matches, 24 Entertainment worked closely with NVIDIA and Unity.
In partnership with NVIDIA, 24 Entertainment was granted early access to NVIDIA’s Deep Learning Super Sampling (DLSS), the latest rendering technology engineered to run real-time worlds at high frame rates and resolutions. Leveraging artificial intelligence, DLSS improves rendering performance and overall image quality without compromise.
To maintain its strong performance, Naraka: Bladepoint renders at a lower resolution, reducing costly per-pixel shading work. Behind the scenes, DLSS uses a neural network to reconstruct high-resolution images, preserving the artistic detail that gamers experience. Not only does this produce high-quality results, but by leveraging artificial intelligence to fill in the missing information, the game renders almost twice as fast, which is crucial for this level of competitive, multiplayer gaming.
With DLSS, 24 Entertainment achieved high-resolution results and crisp detail without sacrificing frame rates; native 4K became nearly indistinguishable from DLSS 4K.
The AI is trained to account for previous frames, which helps with factors such as anti-aliasing. And because DLSS uses a generalized model, the neural network doesn’t need to be retrained for each individual game.
Naraka: Bladepoint’s development began a few years ago, with the implementation of a custom render pipeline built on top of Unity’s Scriptable Render Pipeline (SRP), which allows for plugging in different rendering architectures scripted in C#.
Working side by side with NVIDIA’s expert Developer Relations support, as well as Unity’s Core Support, 24 Entertainment implemented the first-ever usage of DLSS in Unity.
For a deeper understanding of how DLSS works in the context of a real-time platform, the graphics team at Naraka: Bladepoint unpacked some implementation details, tips, and challenges they faced.
The first step of implementing DLSS is to upsample low-resolution input images.
To reduce the impact on final frame quality, 24 Entertainment set this process up prior to applying post-processing effects such as bloom, tone mapping, and special effect lighting. All post-processing effects were then applied to the high-resolution images after sampling.
Here’s how they called DLSS in their pipeline:
Step 1: After selecting the quality mode, the team used NVIDIA’s getOptimalSettings function to get the input size and recommended sharpness. Each quality mode uses a different scaling factor.
Step 2: They prepared a DLSS feature for each camera through the NVIDIA CreateFeature interface, and used it to set the selected quality mode, the size of the target rendering output, and additional sharpening parameters. With additional sharpening, DLSS makes the output more detailed.
Step 3: Before post-processing, they called the execute function for DLSS inference, which takes the inputs described below.
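As a language-neutral sketch, the three steps map to a flow like the following. All names here (get_optimal_settings, DLSSFeature) and the per-mode scale factors are stand-ins for illustration, not the actual NVIDIA NGX API:

```python
# Hypothetical sketch of the three-step DLSS setup/evaluation flow.
# Function and class names are stand-ins, NOT the real NVIDIA NGX API;
# the per-mode scale factors below are assumptions for illustration.

QUALITY_SCALE = {"Quality": 2.0 / 3.0, "Balanced": 0.58, "Performance": 0.5}

def get_optimal_settings(display_w, display_h, mode):
    """Step 1: derive the low-res input size and a recommended sharpness."""
    scale = QUALITY_SCALE[mode]
    return {
        "render_w": round(display_w * scale),
        "render_h": round(display_h * scale),
        "sharpness": 0.5,  # placeholder recommended value
    }

class DLSSFeature:
    """Step 2: one feature per camera, bound to a quality mode,
    target output size, and sharpening parameter."""
    def __init__(self, mode, out_w, out_h, sharpness):
        self.mode, self.out_w, self.out_h = mode, out_w, out_h
        self.sharpness = sharpness

    def execute(self, color, depth, motion_vectors, jitter):
        """Step 3: run inference before post-processing.
        Returns the high-resolution output size."""
        return (self.out_w, self.out_h)

# Usage: a 4K target in Performance mode renders internally at 1080p.
settings = get_optimal_settings(3840, 2160, "Performance")
feature = DLSSFeature("Performance", 3840, 2160, settings["sharpness"])
output_size = feature.execute(color=None, depth=None,
                              motion_vectors=None, jitter=(0.0, 0.0))
```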
To ensure compatible inputs for DLSS, some slight modifications to the pipeline were made.
Because DLSS takes low-resolution input, objects must be rendered at the scaled size. The viewport must therefore be set correctly in both the G-buffer and forward passes, and the render target must be created with the correctly scaled size.
pixelRect = new Rect(0.0f, 0.0f,
    Mathf.CeilToInt(renderingData.cameraData.pixelWidth * viewportScale),
    Mathf.CeilToInt(renderingData.cameraData.pixelHeight * viewportScale));
commandBuffer.SetViewport(pixelRect); // RenderScale Supported
After rendering is complete, DLSS arguments should be filled with the low-res image-related parameters, including the size of the source, size of the destination, input color rendering target, and depth target.
int scaledWidth = UpSamplingTools.Instance.GetRTScaleInt(cameraData.pixelWidth);
int scaledHeight = UpSamplingTools.Instance.GetRTScaleInt(cameraData.pixelHeight);

// Set the arguments
m_DLSSArguments.SrcRect.width = scaledWidth;
m_DLSSArguments.SrcRect.height = scaledHeight;

m_DLSSArguments.DestRect.width = cameraData.pixelWidth;
m_DLSSArguments.DestRect.height = cameraData.pixelHeight;

m_DLSSArguments.InputColor = sourceHandle.rt;
m_DLSSArguments.InputDepth = depthHandle.rt;
Next, let’s address the Jitter Offset inputs. Jitter Offset is related to accumulating more samples across frames.
Rasterization of primitives is inherently discrete: if each pixel is shaded based only on whether its center sample is covered by the triangle, the result is jagged and unnatural in appearance, with non-smooth edges.
If you increase the resolution, however, the result will be more detailed and natural. But it’s also worth noting how difficult it is to reconstruct a continuous signal when there are insufficient discrete samples. This can also cause aliasing.
One way to alleviate aliasing in a 4K image is to supersample: render at a higher resolution, such as 8K, and downsample. But rendering at 8K is four times slower than at 4K, and you will likely run into issues with 8K texture bandwidth and memory usage.
Another method to solve for aliasing is Multisample Anti-aliasing (commonly known as MSAA), which is supported by GPU hardware. MSAA checks multiple samples in different sub-pixel positions, instead of only checking the center sample of the pixel. The triangle fragment color can be adjusted to make the edges smoother based on the number of samples in a pixel covered by primitives.
Temporal Anti-aliasing (TAA) is another method that accumulates samples across multiple frames. This method adds a different jitter to each frame in order to change the sampling position. With the help of motion vectors, it then blends the color results between frames.
If the historical color of each pixel in the current frame can be identified, you can use that information going forward.
Jitter generally means that the sampling position in the pixel is slightly adjusted, so that samples can be accumulated across frames instead of attempting to solve an undersampling problem all at once.
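The accumulation idea can be sketched with a simple exponential history blend. This is a simplified stand-in for a full TAA resolve, ignoring the motion-vector reprojection and history clamping a real implementation needs:

```python
# Simplified sketch of temporal accumulation: blend each new jittered
# sample into a running history color. Real TAA also reprojects the
# history with motion vectors and clamps it to avoid ghosting.

def taa_blend(history, current, alpha=0.1):
    """Move the history a fraction of the way toward the current sample."""
    return history + alpha * (current - history)

# Jittered samples of a pixel whose true coverage is 1.0: individual
# frames alternate between missing the edge (0.0) and double-counting
# it (2.0), depending on where the jittered sample lands.
history = 0.0
for frame in range(200):
    sample = 0.0 if frame % 2 == 0 else 2.0
    history = taa_blend(history, sample)

# The accumulated history settles near the true value of 1.0, even
# though no single frame ever sampled that value directly.
```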
That’s why 24 Entertainment turned to DLSS, which reduces the overall rendering resolution to avoid these costs while still producing high-quality results with the desired smooth edges.
Here, the recommended sample pattern comprises Halton sequences, which are low-discrepancy sequences that look random but cover the space more evenly.
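A Halton sequence is cheap to generate. Here is a minimal sketch using the standard radical-inverse construction, with bases 2 and 3 for the x and y components:

```python
# Minimal Halton sequence generator: the radical inverse of the sample
# index in a given base. Bases 2 and 3 give a well-distributed 2D pattern.

def halton(index, base):
    """Radical inverse of `index` in `base`; result lies in [0, 1)."""
    fraction, result = 1.0, 0.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result

def jitter_offset(index):
    """2D jitter in [-0.5, 0.5), suitable for sub-pixel sampling."""
    return (halton(index, 2) - 0.5, halton(index, 3) - 0.5)
```

The first few base-2 values are 0.5, 0.25, 0.75, and so on, so successive jitters cover the pixel evenly instead of clustering the way pure random samples can.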
In practice, applying Jitter Offset can be rather intuitive. Consider these steps to do it effectively:
Step 1: Generate samples from the Halton sequence for a given camera, according to its settings. The output jitter should be between -0.5 and 0.5.

Vector2 temporalJitter = m_HaltonSampler.Get(m_TemporalJitterIndex, samplesCount);
Step 2: Store the jitter in the xy components of a Vector4. Then multiply the jitter by 2 and divide by the scaled resolution to convert it from pixels into a normalized device coordinate offset, stored in the zw components.

These two components were later used to modify the projection matrix and globally affect the rendering result:
m_TemporalJitter = new Vector4(temporalJitter.x, temporalJitter.y,
    temporalJitter.x * 2.0f / UpSamplingTools.GetRTScalePixels(cameraData.pixelWidth),
    temporalJitter.y * 2.0f / UpSamplingTools.GetRTScalePixels(cameraData.pixelHeight));
Step 3: Set the View Projection matrix to the global property UNITY_MATRIX_VP. This lets shaders work without any modification, as every vertex shader calls the same function to convert world positions to screen positions.
var projectionMatrix = cameraData.camera.nonJitteredProjectionMatrix;
projectionMatrix.m02 += m_TemporalJitter.z;
projectionMatrix.m12 += m_TemporalJitter.w;

projectionMatrix = GL.GetGPUProjectionMatrix(projectionMatrix, true);
var jitteredVP = projectionMatrix * cameraData.viewMatrix;
With the Jitter Offset input successfully resolved, motion vectors can now be generated using the motion vector tools.
If the camera or a dynamic object moves, the on-screen image changes as well. So to accumulate samples across multiple frames, you need to identify each pixel’s previous screen position.
When the camera changes its position from q to p, as shown in the diagram, the same point in the World space might end up projecting to a completely different screen point. The result of their subtraction is the motion vector, which indicates the position of the previous frame relative to the current frame onscreen.
Here are the steps used to compute motion vectors:
Step 1: Calculate the motion vectors of the dynamic objects in the depth-only pass of the pipeline. The depth-only pass is used to draw the depth of dynamic moving objects to help depth testing.
Step 2: Fill the empty pixels according to the camera’s movement, using the previous frame’s View Projection matrix of the camera to translate the World position and attain the previous screen position.
Step 3: Fill the property related to motion vectors in DLSS arguments.
Motion vector scaling allows for the production of motion vectors according to varying degrees of fidelity, to suit the desired outcome. In this example, 24 Entertainment set it to a negative width and height scale because they generated motion vectors in screen space, whereas DLSS required it to be in pixels.
The negative sign was used here because the pipeline implemented swapped the position of the current and previous frame when subtracting.
m_DLSSArguments.MotionVectorScale = new Vector2(-scaledWidth, -scaledHeight);
m_DLSSArguments.InputMotionVectors = motionHandle.rt;
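The reprojection at the heart of Step 2 can be sketched language-neutrally: project the same world position with the current and previous view-projection matrices, then subtract the two screen positions. The matrices below are deliberately simplified (a camera translating along x, with a basic perspective divide), purely for illustration:

```python
# Sketch of camera motion vectors: the same world point is projected with
# the current and previous frame's view-projection (VP) matrix, and the
# difference of the two screen positions is the motion vector.
# The toy VP construction below is a simplification for illustration.

def mat_vec(m, v):
    """Multiply a row-major 4x4 matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def vp_for_camera_x(cam_x):
    """Toy VP matrix: camera at (cam_x, 0, 0) looking down -z."""
    return [[1, 0, 0, -cam_x],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, -1, 0]]  # clip.w = -z for the perspective divide

def project_to_uv(vp, world):
    """World position -> clip space -> NDC -> [0, 1] screen UV."""
    clip = mat_vec(vp, [world[0], world[1], world[2], 1.0])
    ndc = (clip[0] / clip[3], clip[1] / clip[3])
    return ((ndc[0] + 1.0) / 2.0, (ndc[1] + 1.0) / 2.0)

# Camera moves from x=0.0 (previous frame) to x=0.2 (current frame).
point = (0.0, 0.0, -2.0)
curr_uv = project_to_uv(vp_for_camera_x(0.2), point)
prev_uv = project_to_uv(vp_for_camera_x(0.0), point)

# The motion vector points from the current screen position back to the
# previous one, matching the subtraction order described in the text.
motion_vector = (prev_uv[0] - curr_uv[0], prev_uv[1] - curr_uv[1])
```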
Note that 24 Entertainment’s rendering pipeline is a combination of deferred and forward frameworks. To save memory, all the render targets are allocated without scaling.
The DLSS manager uses the RTHandle system to allocate the scaled render target only once when the camera is created, to prevent allocations for every camera loop.
To achieve temporal effects, generate motion vectors into the depth-only pass (only dynamic objects need to calculate motion vectors). Next, use a fullscreen pass to generate camera motion vectors.
The DLSS manager uses the RTHandle system to allocate the scaled render target at the very beginning of post-processing.
Here, the DLSS evaluation step can obtain all the information required by DLSS arguments, except for jitter.
Add jitter to make temporal multisampling available.
In 24 Entertainment’s pipeline for Naraka: Bladepoint, there was a system that cached the camera’s information, including the phase index of Halton pattern and matrices with or without jitter. The team harnessed the View Projection and Jitter matrices in all rasterization steps except motion vector pass. In motion vector pass, they used the non-jitter matrices.
Mip maps are pre-calculated sequences of images, each at half the resolution of the previous one. They let the GPU efficiently down-filter a texture, approximating a sample of all the texels in the original texture that would contribute to a screen pixel.
This rendering example shows how the object far from the camera might sample a lower-resolution texture in Mip Maps for cleaner results.
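The arithmetic behind a mip chain is straightforward: each level halves the previous one until a 1x1 level is reached. A quick sketch:

```python
# Sketch of mip chain sizes: each level is half the previous resolution
# (rounded down, clamped to 1) until the 1x1 level is reached.

def mip_chain_sizes(width, height):
    sizes = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(1, width // 2), max(1, height // 2)
        sizes.append((width, height))
    return sizes

# A 1024x1024 texture has 11 mip levels: 1024, 512, ..., 2, 1.
```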
Be sure to add a mip map bias when sampling textures. Since DLSS is an upsampling method, like similar algorithms such as checkerboard rendering, textures need to be sampled at a higher-resolution mip level during low-resolution viewport rendering. That way, the texture will not appear blurred when the high resolution is reconstructed.
The Mip Map Bias can be calculated in this way:
MipLevelBias = log2(RenderResolution.x / DisplayResolution.x)
Since the render resolution is lower than the display resolution, the output is negative, which biases sampling toward higher-resolution mip levels. Only once the mip map bias is set will the image quality of DLSS hold up when sampling textures in the low-resolution viewport.
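As a worked example of the formula above (rendering at half the display width yields a bias of exactly -1, pulling sampling one mip level sharper):

```python
import math

def mip_level_bias(render_width, display_width):
    """MipLevelBias = log2(RenderResolution.x / DisplayResolution.x)."""
    return math.log2(render_width / display_width)

# Rendering 1080p input for a 4K display: the bias of -1 selects one mip
# level sharper than the low-resolution viewport would otherwise pick.
bias_performance = mip_level_bias(1920, 3840)  # -1.0
```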
The DLSS feature should be cached for the camera.
A new feature must be created whenever the camera’s size or quality mode changes. In some cases, multiple cameras with the same size and quality mode might be rendered at the same time. Since each feature leverages information from the previous frame, a feature should not be shared between cameras.
In Naraka: Bladepoint, the team at 24 Entertainment used the hash descriptor of each camera as the key to cache the features in a dictionary and ensure that every feature would only be used by one camera.
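The caching pattern reads like the following sketch. The descriptor fields and the create_feature stand-in are hypothetical; the point is that each unique camera gets exactly one feature that carries its own frame history:

```python
# Sketch of per-camera DLSS feature caching: a dictionary keyed by the
# hash of a camera descriptor ensures each camera gets its own feature,
# since a feature carries history from previous frames.
# `create_feature` is a hypothetical stand-in for the NVIDIA interface.

def create_feature(descriptor):
    return {"descriptor": descriptor, "history": None}

class FeatureCache:
    def __init__(self):
        self._features = {}

    def get_or_create(self, camera_descriptor):
        """camera_descriptor: a hashable tuple, e.g. (id, w, h, mode)."""
        key = hash(camera_descriptor)
        if key not in self._features:
            self._features[key] = create_feature(camera_descriptor)
        return self._features[key]

cache = FeatureCache()
main_cam = cache.get_or_create(("main", 3840, 2160, "Performance"))
ui_cam = cache.get_or_create(("ui", 3840, 2160, "Performance"))
```

Note that two cameras with identical size and quality mode still receive separate features, because the camera identity is part of the descriptor.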
Finally, note that you should not combine DLSS with other anti-aliasing methods. Mixing methods can result in unpredictable artifacts.
Thanks to our close technical partnership with NVIDIA, we integrated DLSS and all features described above into Unity and HDRP – and we can’t wait to push things even further!
This integration, available as of Unity 2021.2, is maintained as any other Unity core feature. Here are some details on how NVIDIA’s technology is now integrated in Unity:
DLSS integration in core
At the inner core, we created a proper C# scripting API layer for the DLSS module. This layer also handles platform compatibility, support for in-engine #defines (so you can exclude DLSS-specific code on unsupported platforms), and proper documentation. This is the official API for Scriptable Render Pipelines to integrate DLSS; start from here to bring it into any Scriptable Render Pipeline.
Graphics integration in HDRP
We implemented DLSS as a Dynamic Resolution System (DRS) feature in HDRP that uses the script API described above.
We build on the same TAA jittering algorithm employed in Naraka: Bladepoint, and similarly apply the resolution upsampling before post-processing, rendering post-processing at full resolution.
One improvement is that our DLSS implementation is built on our Dynamic Resolution System, which allows you to quickly switch resolutions in real-time. This system is wholly integrated with our RTHandle system, and supports hardware dynamic resolution systems (texture-aliasing based) like DLSS, as well as software ones (viewport based). The Mip Map Bias feature discussed earlier was developed in Unity and is also available to all other DRS filters and techniques.
We added a special pass to limit the ghosting of particles, and ensured that our dynamic resolution systems are compatible with all HDRP features.
We also added a comprehensive debug panel with full information on DLSS versioning and state in the frame.
Lastly, we added support for DX11, DX12, and Vulkan, in addition to VR and real-time ray tracing.
Here’s how to enable DLSS in your project:
1. Enable DLSS in your project’s HDRP Asset.
2. Enable DLSS in the cameras required for your project.