CVPR 2025
Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects
Amir Barda, Matheus Gadelha, Vladimir G. Kim, Noam Aigerman, Amit H. Bermano, Thibault Groueix
Abstract
[Figure 1 panel labels: Original Mesh; Input mesh + mask; prompts "An elven warrior" and "Man wearing a medieval helmet"; NeRF; Mesh; Adaptive Remeshing; 25 sec.]

Figure 1. Our method takes as input a 3D object along with a 3D mask (first column) and a text prompt, and uses our multiview inpainting diffusion model to consistently inpaint the masked region in four rendered views of the object. Off-the-shelf reconstructors can be applied to the multiview output to produce a NeRF, a Gaussian Splat (second column), or a mesh (third column). The mesh can be combined with adaptive remeshing to ensure that the unmasked region, including its topology and UVs, is exactly preserved (fourth and fifth columns). This feedforward approach is orders of magnitude faster than previous works in generative 3D editing: it takes ≈ 3 seconds per multiview edit, then 0.7 seconds to reconstruct a GS or a NeRF or 3 seconds to reconstruct a mesh, and ≈ 20 seconds for optional mesh post-processing.
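To make the timing claims in the caption concrete, the following minimal sketch adds up the quoted per-stage figures for each output type. It is illustrative arithmetic only, not the authors' code. The names `STAGE_SECONDS` and `total_seconds` are hypothetical, and the sketch assumes that the stages run sequentially and that the optional post-processing (adaptive remeshing) applies only to mesh output.

```python
# Approximate per-stage timings in seconds, taken from the Figure 1 caption.
# The dictionary keys are hypothetical labels, not names from the paper.
STAGE_SECONDS = {
    "multiview_inpainting": 3.0,    # one multiview edit (four views)
    "reconstruct_gs_or_nerf": 0.7,  # Gaussian Splat or NeRF reconstruction
    "reconstruct_mesh": 3.0,        # mesh reconstruction
    "mesh_postprocess": 20.0,       # optional mesh post-processing
}


def total_seconds(output: str, postprocess: bool = False) -> float:
    """Return the approximate end-to-end time for one edit.

    `output` is "gs", "nerf", or "mesh". Assumes stages run one after
    another, and that post-processing is only meaningful for meshes.
    """
    if output not in ("gs", "nerf", "mesh"):
        raise ValueError(f"unknown output type: {output!r}")
    if postprocess and output != "mesh":
        raise ValueError("post-processing is assumed to apply to meshes only")

    total = STAGE_SECONDS["multiview_inpainting"]
    if output == "mesh":
        total += STAGE_SECONDS["reconstruct_mesh"]
    else:
        total += STAGE_SECONDS["reconstruct_gs_or_nerf"]
    if postprocess:
        total += STAGE_SECONDS["mesh_postprocess"]
    return total


if __name__ == "__main__":
    for kind, post in [("gs", False), ("nerf", False), ("mesh", False), ("mesh", True)]:
        print(f"{kind:4s} postprocess={post}: ~{total_seconds(kind, post):.1f} s")
```

Under these assumptions, a Gaussian Splat or NeRF edit takes under 4 seconds end to end, a plain mesh edit about 6 seconds, and a mesh edit with post-processing under half a minute. Because the caption's numbers are rounded, these totals are estimates only.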