Synthetic data is powered by your library of 3D assets. Read on to learn about sources and techniques for acquiring 3D content for common computer vision problems.
Building computer vision systems on synthetic data is a transformative shift from building them on real data. While real-world data requires painstaking collection and annotation, a single machine learning (ML) engineer can construct a synthetic dataset in a matter of minutes, enabling a rapid, data-centric development cycle: you can experiment with many dataset variations at once to find the ones that improve your model.
But data sourcing is still a challenge. Instead of sourcing thousands of images and annotations, you need tens or hundreds of 3D assets such as meshes, textures, and animations. While synthetic data reduces the content requirements, the 3D nature of these assets requires getting creative about acquiring them.
Luckily, the film and video game industries have faced the same content challenges for more than forty years. In that time they have developed new techniques for content creation as well as vast repositories of content, much of which is already perfect for synthetic data.
This post introduces the best ways to source 3D content, each suited to a different type of computer vision application.
The fastest and cheapest way to acquire 3D content is to leverage existing work distributed online. Online marketplaces such as the Unity Asset Store provide repositories of free and paid content created by 3D artists for use in games, film, and other 3D applications.
Assets often come in “packs,” including various 3D models, texture variations, materials, and animations. These can all be randomized during dataset generation to create nearly infinite variations on each object. Sourcing from different artists increases the breadth of data even further, reducing bias and overfitting in the model.
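To make this concrete, here is a minimal sketch of per-frame randomization from a pack. It is plain Python, not Unity's Perception API, and every asset name and slot here is invented for illustration:

```python
import random

# Illustrative asset "pack": meshes, texture variants, materials, animations.
PACK = {
    "mesh": ["chair_a", "chair_b", "chair_c"],
    "texture": ["wood_oak", "wood_pine", "plastic_red"],
    "material": ["matte", "glossy"],
    "animation": ["static", "tipped_over"],
}

def sample_object(rng: random.Random) -> dict:
    """Draw one randomized object configuration from the pack."""
    return {slot: rng.choice(options) for slot, options in PACK.items()}

rng = random.Random(42)  # fixed seed so the dataset is reproducible
frames = [sample_object(rng) for _ in range(5)]
for frame in frames:
    print(frame)
```

Even this toy pack yields 3 × 3 × 2 × 2 = 36 distinct combinations per object; real packs, multiplied across artists, push that number much higher.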
Online asset stores shine when the task requires detecting broad classes of objects like “chair” or “cat,” where a wide variety of content is required. To achieve the variety that helps the model generalize, you need to source variations along every relevant axis, including shape, material, color, and pose.
Even nonrealistic content, like cartoonish textures and odd colors, can lead to better generalization in the model. In one recent project, moving from a small set of highly realistic assets to a broad spectrum of 3D models and textures from the Unity Asset Store substantially increased model performance, a bigger jump than any other experiment we tried.
Synthetic datasets for detecting specific objects or manufactured items require accurate 3D models of the real-world objects, which are rarely found online. With 3D scanning, you can create these assets yourself.
Assets from SynthDet were created using a flatbed scanner and 3D modeling. You can explore a tutorial here.
There is a range of technologies for 3D scanning, each suited to different types of objects, quality levels, and budgets. Some notable examples:

Photogrammetry
Pros: Suitable for objects of any size or shape; captures texture and fine detail
Cons: Sensitive to lighting conditions and object color; somewhat tedious

Flatbed scanning with 3D modeling
Pros: Inexpensive, highly accurate textures
Cons: Only suitable for flattenable objects; some hand-authoring required

Structured-light or laser scanning
Pros: Highly accurate geometry on most objects; fast
Cons: Expensive; object sizes are limited
Scanned objects and online 3D assets are generally snapshots of real objects: specific cats, chairs, or wood grains. Another growing form of content, procedural assets, uses rule-based algorithms to produce infinite variation. Each procedural asset exposes parameters that control its look and shape. This is perfect for synthetic data: each parameter can be randomized to achieve great diversity, and you can iteratively tune the randomization to improve model performance.
Procedural assets are useful when real assets are varied but generally follow common patterns. Roads, human faces, and materials like cloth and concrete are all great examples. Relevant factors such as wear and tear, age, shape, and color can be built in as parameters for your randomization algorithm. Unity Shader Graph, Adobe Substance 3D, and Houdini are flexible tools for building procedural assets, and the Unity Asset Store offers several procedural tools for specific types of content.
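As a hedged sketch of the idea (plain Python rather than Shader Graph, Substance 3D, or Houdini; the material and its parameter names are invented for illustration), a procedural asset is just a function from parameters to content, and randomization is sampling those parameters:

```python
import random

def procedural_concrete(age: float, crack_density: float, base_gray: float) -> dict:
    """Toy procedural 'concrete material': maps parameters to derived properties.
    A real tool would emit textures and shaders instead of scalars."""
    return {
        "albedo": max(0.0, base_gray - 0.3 * age),   # older concrete darkens
        "roughness": min(1.0, 0.5 + 0.4 * age),      # and gets rougher
        "cracks": crack_density * (0.5 + age),       # wear compounds with age
    }

def randomize_material(rng: random.Random) -> dict:
    # Each dataset iteration samples fresh parameters for endless variation.
    return procedural_concrete(
        age=rng.uniform(0.0, 1.0),
        crack_density=rng.uniform(0.0, 0.2),
        base_gray=rng.uniform(0.4, 0.8),
    )

rng = random.Random(7)
variants = [randomize_material(rng) for _ in range(3)]
```

The design point is that wear, age, and color live inside the asset as tunable knobs, so the randomization algorithm only has to decide their distributions.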
Procedural techniques also power environment generation for synthetic data. Environment generators range from unstructured and highly random to complex, hierarchical systems for generating plausible scenes. Unity’s API is designed to enable you to write procedural scene generators; Houdini is commonly used for scenes that require more structure.
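At the unstructured end of that range, a scene generator can be as simple as scattering random assets with random poses. The following is a conceptual sketch in plain Python, not Unity's actual scene API, and every field name is an assumption:

```python
import random

def generate_scene(rng: random.Random, asset_ids: list,
                   extent: float = 5.0, max_objects: int = 8) -> list:
    """Unstructured scene generation: scatter a random number of assets
    across a ground plane with random yaw and scale."""
    scene = []
    for _ in range(rng.randint(1, max_objects)):
        scene.append({
            "asset": rng.choice(asset_ids),
            "position": (rng.uniform(-extent, extent), 0.0,
                         rng.uniform(-extent, extent)),
            "yaw_degrees": rng.uniform(0.0, 360.0),
            "scale": rng.uniform(0.8, 1.2),
        })
    return scene

rng = random.Random(0)
scene = generate_scene(rng, ["chair_a", "cat_b", "table_c"])
```

More structured generators layer rules on top of this, for example placing chairs only around tables, which is where tools like Houdini come in.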
For more examples and to get started with procedural content, check out Getting Started with Houdini & Unity and Introduction to ShaderGraph on Unity Learn.
The game and film industries have provided us with a wealth of dynamic 3D content, letting you quickly bootstrap your synthetic data projects and start iterating on the data. With the Unity Perception package, you can import those assets, set them up for randomization, and generate highly varied datasets very quickly.
The Unity Computer Vision team uses all of the above content strategies when building synthetic data. Contact us to learn more about how we can create a synthetic dataset for your specific needs or for additional guidance on leveraging synthetic data.