CVPR2025
Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes
Hyeonggon Ryu, Seongyu Kim, Joon Son Chung, Arda Senocak
Abstract
[Figure: DenseAV. Panel labels: "Look at the cello. I like that cello." / "Look at the cello." / "I like that cello."]