CVPR 2025

CH3Depth: Efficient and Flexible Depth Foundation Model with Flow Matching

Jiaqi Li, Yiran Wang, Jinghong Zheng, Junrui Zhang, Liao Shen, Tianqi Liu, Zhiguo Cao

Abstract

[Figure 1 panels. Image methods: Ours (1 step, 0.25 s), Ours (2 steps, 0.36 s), Ours (3 steps, 0.43 s), Lotus (1 step, 0.20 s), DepthFM (3 steps, 0.69 s), Marigold (50 steps, 4.45 s). Video methods: RGB, Temporal Slice, Ours, DepthCrafter; RGB, Temporal Slice, Ours, NVDS.]

Figure 1. Top Half: We qualitatively compare the SOTA image depth estimation models and their inference time. CH3Depth achieves meticulous details and reasonable structure in results with balanced efficiency. Bottom Half: The comparison of video methods is presented with temporal slices, which show the depth along the vertical red line in each video over time, to evaluate temporal stability. CH3Depth is compatible with image and video depth estimation, achieving meticulous, accurate, and consistent prediction results.