Embedding Synthetic Assets to Train AI Models

Conference:  Transform X 2021

2021-10-07

Authors:   Dr. Jonathan Laserson


Summary

The presentation discusses the use of synthetic data and neural networks to generate high-quality labels and diverse datasets for training models. The focus is on assigning a latent code to each asset to represent a family of objects.
  • Synthetic data can provide pixel-level labels of a quality that is impossible to achieve with real data
  • Variety in datasets is important for training models
  • Assets are generated by 3D artists using dedicated software and encoded together with a family of related assets
  • Assigning a latent code to each asset makes it possible to represent an entire family of objects
  • The network is incentivized to decompose information in the latent code
The presenter explains that the team collected over one million high-quality, artist-generated assets to create diverse datasets. However, not all attributes of the assets were explicitly written down, making it difficult to know certain details, such as whether a chair swivels or not. To address this, the team assigned a latent code (a vector) to each asset so that a family of objects could be represented in a shared latent space. This allowed the network to accurately render images of each asset while being incentivized to decompose information within the latent code.
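
The sketch below is a hedged illustration of this general idea, not the speaker's actual system: a learnable latent code per asset conditions a NeRF-style MLP, so one network can render an entire family of objects (the real pipeline would also include positional encodings, view directions, and volume rendering, which are omitted here; all class and parameter names are assumptions).

```python
# Minimal sketch: per-asset latent codes conditioning a NeRF-style MLP.
import torch
import torch.nn as nn

class LatentConditionedNeRF(nn.Module):
    def __init__(self, num_assets: int, latent_dim: int = 128, hidden: int = 256):
        super().__init__()
        # One learnable latent vector per asset (auto-decoder style).
        self.asset_codes = nn.Embedding(num_assets, latent_dim)
        # MLP maps (3D point, latent code) -> (density, RGB).
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # sigma + rgb
        )

    def forward(self, points: torch.Tensor, asset_ids: torch.Tensor):
        # points: (N, 3) sample locations; asset_ids: (N,) asset indices.
        z = self.asset_codes(asset_ids)                # (N, latent_dim)
        out = self.mlp(torch.cat([points, z], dim=-1))
        sigma = torch.relu(out[..., :1])               # non-negative density
        rgb = torch.sigmoid(out[..., 1:])              # colors in [0, 1]
        return sigma, rgb

# Usage: render samples for asset #7; during training, gradients flow into
# both the shared MLP weights and that asset's own latent code.
model = LatentConditionedNeRF(num_assets=10_000)
pts = torch.rand(1024, 3)
ids = torch.full((1024,), 7, dtype=torch.long)
sigma, rgb = model(pts, ids)
```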

Abstract

Dr. Jonathan Laserson, Head of AI Research at Datagen Technologies, is an expert in the field of photorealistic synthetic images. He shares how Neural Radiance Fields (NeRF) can be used to generate a nearly infinite number of synthetic assets to train AI models. Dr. Laserson also explains how synthetic objects can be represented in a latent space where features can be perturbed to modify the shape and texture of each asset. Join this session to learn how NeRF can accelerate the curation of data for your computer vision use cases.
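
As a hedged illustration of the latent-space idea mentioned in the abstract (hypothetical helpers, not the speaker's API): once each asset has a latent code, new variants can be produced by interpolating between two codes or jittering a single code, and rendering the result with a latent-conditioned network such as the sketch above.

```python
# Hypothetical helpers for exploring an asset latent space.
import torch

def interpolate_codes(z_a: torch.Tensor, z_b: torch.Tensor, alpha: float) -> torch.Tensor:
    """Linear blend between two asset codes; alpha=0 gives asset A, alpha=1 gives asset B."""
    return (1.0 - alpha) * z_a + alpha * z_b

def perturb_code(z: torch.Tensor, scale: float = 0.05) -> torch.Tensor:
    """Small Gaussian jitter around a known asset to sample a nearby variant."""
    return z + scale * torch.randn_like(z)

# Example: codes for two chairs in the same family (random stand-ins here).
z_a = torch.randn(128)
z_b = torch.randn(128)
z_mid = interpolate_codes(z_a, z_b, alpha=0.5)  # shape/texture between the two
z_var = perturb_code(z_a)                       # a slight variant of the first chair
```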

Materials: