CVPR 2025

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos

Felix Wimbauer, Weirong Chen, Dominik Muhle, Christian Rupprecht, Daniel Cremers

Abstract

Figure 1. AnyCam. Given a casual video and pretrained monocular depth estimation (MDE) and optical flow networks, AnyCam outputs camera poses, camera intrinsics, and uncertainty maps in a single forward pass. The uncertainty maps represent probable movement in the scene. By using a novel loss formulation, AnyCam can be trained on a large corpus of unlabelled videos mostly obtained from YouTube.