CVPR 2025

Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation

Ziyang Xie, Zhizheng Liu, Zhenghao Peng, Wayne Wu, Bolei Zhou

Abstract

Figure 1. Vid2Sim converts monocular video captured by a hand-held camera into realistic and interactive 3D simulation environments. It facilitates RL training of navigation agents in digital twins of urban scenes and provides realistic observations like RGB and depth to reduce the sim-to-real gap. The pink mobile robot in the image is a food delivery bot that avoids collisions with pedestrians and obstacles.