CVPR 2025
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models
Sai Kumar Dwivedi, Dimitrije Antic, Shashank Tripathi, Omid Taheri, Cordelia Schmid, Michael J. Black, Dimitrios Tzionas
Abstract
Figure 1. We present InteractVLM, a novel method for estimating contact points on both human bodies and objects from a single in-the-wild image, shown here as red patches. Our method goes beyond traditional binary contact estimation methods by estimating contact points on a human in relation to a specified object. We do so by leveraging the broad visual knowledge of a large Visual Language Model.