CVPR 2021

Visual Semantic Role Labeling for Video Understanding

Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi

Abstract

Verb: deflect (block, avoid)
Arg0 (deflector): woman with shield
Arg1 (thing deflected): boulder
Scene: city park

Verb: talk (speak)
Arg0 (talker): woman with shield
Arg2 (hearer): man with trident
ArgM (manner): urgently
Scene: city park

Verb: leap (physically leap)
Arg0 (jumper): man with trident
Arg1 (obstacle): over stairs
ArgM (direction): towards shirtless man
ArgM (goal): to attack shirtless man
Scene: city park

Verb: punch (to hit)
Arg0 (agent): shirtless man
Arg1 (entity punched): man with trident
ArgM (direction): far into distance
Scene: city park

Verb: punch (to hit)
Arg0 (agent): shirtless man
Arg1 (entity punched): woman with shield
ArgM (direction): down the stairs
Scene: city park

Timeline label: 2 Seconds
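The five verb frames above share a regular shape: a verb with a sense gloss, a set of labeled roles, and a scene. As a minimal sketch of how such frames might be held in code, the Python below stores them and groups role fillers by phrase, so that one can see "woman with shield" acting as Arg0 in two events and as Arg1 in another. This is an illustration only, not the authors' data format: the names Role, Event, EVENTS, and mentions_by_phrase are hypothetical, and treating identical phrases as the same entity is an assumption made here for simplicity.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Role:
    label: str        # role label as printed above, e.g. "Arg0" or "ArgM"
    description: str  # gloss in parentheses, e.g. "deflector"
    value: str        # free-form phrase filling the role


@dataclass(frozen=True)
class Event:
    verb: str
    sense: str              # gloss in parentheses after the verb
    roles: Tuple[Role, ...]
    scene: str


# The five frames transcribed from the annotation above, in order.
EVENTS: List[Event] = [
    Event("deflect", "block, avoid", (
        Role("Arg0", "deflector", "woman with shield"),
        Role("Arg1", "thing deflected", "boulder"),
    ), "city park"),
    Event("talk", "speak", (
        Role("Arg0", "talker", "woman with shield"),
        Role("Arg2", "hearer", "man with trident"),
        Role("ArgM", "manner", "urgently"),
    ), "city park"),
    Event("leap", "physically leap", (
        Role("Arg0", "jumper", "man with trident"),
        Role("Arg1", "obstacle", "over stairs"),
        Role("ArgM", "direction", "towards shirtless man"),
        Role("ArgM", "goal", "to attack shirtless man"),
    ), "city park"),
    Event("punch", "to hit", (
        Role("Arg0", "agent", "shirtless man"),
        Role("Arg1", "entity punched", "man with trident"),
        Role("ArgM", "direction", "far into distance"),
    ), "city park"),
    Event("punch", "to hit", (
        Role("Arg0", "agent", "shirtless man"),
        Role("Arg1", "entity punched", "woman with shield"),
        Role("ArgM", "direction", "down the stairs"),
    ), "city park"),
]


def mentions_by_phrase(events: List[Event]) -> Dict[str, List[Tuple[int, str, str]]]:
    """Map each role phrase to (event index, verb, role label) occurrences.

    Assumption: two roles refer to the same entity only when their
    phrases match exactly; no real co-reference resolution is done.
    """
    index: Dict[str, List[Tuple[int, str, str]]] = defaultdict(list)
    for i, event in enumerate(events):
        for role in event.roles:
            index[role.value].append((i, event.verb, role.label))
    return dict(index)


if __name__ == "__main__":
    for phrase, hits in mentions_by_phrase(EVENTS).items():
        if len(hits) > 1:  # show only phrases that recur across events
            print(phrase, hits)
```

Exact string matching is the simplest possible grouping; phrases embedded in a longer role value, such as "towards shirtless man", are deliberately not linked to "shirtless man" in this sketch.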