ICML 2025

PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting for Novel View Synthesis

Sunghwan Hong, Jaewoo Jung, Heeseong Shin, Jisang Han, Jiaolong Yang, Chong Luo, Seungryong Kim

Abstract

We tackle the problem of view synthesis from sparse, unposed images in a single feed-forward pass. Our method builds on 3DGS and relaxes common requirements such as dense views, accurate camera poses or depth, and large image overlaps. However, the main challenge arises from the parametrization of pixel-aligned 3D Gaussians, as their misalignments inevitably yield noisy or sparse gradients that destabilize training. To address this, we leverage pretrained monocular depth estimation and visual correspondence networks for coarse alignment, then refine depth and pose via lightweight learnable modules. We further estimate geometry confidence scores, driven by aggregated monocular and multi-view depth, to assess the reliability of 3D Gaussian centers and condition the prediction of Gaussian parameters accordingly. Extensive experiments on large-scale real-world datasets confirm that PF3plat achieves state-of-the-art performance across all benchmarks, with ablation studies validating our design choices.
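For intuition about why pixel-aligned Gaussians are sensitive to depth and pose errors, the following is a minimal sketch of how one Gaussian center per pixel can be obtained by back-projecting a depth map. It is not the paper's implementation: the pinhole camera model, the camera-to-world convention, and all function and parameter names (`unproject_pixel`, `camera_to_world`, `gaussian_centers`, `fx`, `fy`, `cx`, `cy`) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Back-project pixel (u, v) at the given depth into camera coordinates,
    assuming a pinhole camera (an assumption of this sketch)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(p_cam: Vec3, R: Sequence[Sequence[float]], t: Vec3) -> Vec3:
    """Apply a rigid camera-to-world transform: p_world = R * p_cam + t."""
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i] for i in range(3)
    )


def gaussian_centers(depth_map: Sequence[Sequence[float]],
                     fx: float, fy: float, cx: float, cy: float,
                     R: Sequence[Sequence[float]], t: Vec3) -> List[Vec3]:
    """Return one 3D center per pixel, in row-major order."""
    centers = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            p_cam = unproject_pixel(u, v, d, fx, fy, cx, cy)
            centers.append(camera_to_world(p_cam, R, t))
    return centers


# Toy example: 2x2 depth map, identity pose, unit focal length.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
depth = [[2.0, 2.0], [2.0, 2.0]]
centers = gaussian_centers(depth, 1.0, 1.0, 0.5, 0.5, identity, (0.0, 0.0, 0.0))
```

Because each center is a direct function of the per-pixel depth and the camera pose, any error in either quantity moves the center along or off its viewing ray, which is the kind of misalignment the abstract identifies as destabilizing training.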