ICLR 2025

GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation

Dingdong Yang, Yizhi Wang, Konrad Schindler, Ali Mahdavi Amiri, Hao Zhang

Abstract

We propose GALA, a novel representation of 3D shapes that (i) excels at capturing and reproducing complex geometry and surface details, (ii) is computationally efficient, and (iii) lends itself to 3D generative modelling with modern, diffusion-based schemes. The key idea of GALA is to exploit both the global sparsity of surfaces within a 3D volume and their local surface properties. Sparsity is promoted by covering only the 3D object boundaries, not empty space, with an ensemble of tree root voxels. Each voxel contains an octree to further limit storage and compute to regions that contain surfaces. Adaptivity is achieved by fitting one local, geometry-aware coordinate frame in each non-empty leaf node. Adjusting the orientation of the local grid, as well as the anisotropic scales of its axes, to the local surface shape greatly increases the amount of detail that can be stored in a given amount of memory, which in turn allows for quantization without loss of quality. With our optimized C++/CUDA implementation, GALA can be fitted to an object in less than 10 seconds. Moreover, the representation can be efficiently flattened and manipulated with transformer networks. We provide a cascaded generation pipeline capable of generating 3D shapes with great geometric detail. For more information, please visit our project page.

Our GALA representation can be implemented to yield sets of vectors that can be easily processed by transformer-based neural networks (Vaswani et al., 2017), while the octree forest provides a hierarchical 3D representation with limited depth. In Table 1, we compare GALA with representative state-of-the-art methods across several useful criteria.

The contributions of our work are threefold:

1. Our geometry-aware, locally adaptive, and anisotropic grids enable more efficient (Table 1) and accurate sampling of shape structures, capturing geometric details, such as thin strings