CVPR 2023
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Arjun R. Akula, Brendan Driscoll, Pradyumna Narayana, Soravit Changpinyo, Zhiwei Jia, Suyash Damle, Garima Pruthi, Sugato Basu, Leonidas J. Guibas, William T. Freeman, Yuanzhen Li, Varun Jampani
Abstract
Figure 1. With MetaCLUE, we introduce several tasks related to visual metaphors. We collect metaphor annotations (objects, abstract concepts, relationships, and object boxes) to evaluate existing models on these tasks. Specifically, we perform a comprehensive evaluation of vision and language models on four tasks: Classification, Localization, Understanding, and gEneration. Our experiments show that state-of-the-art techniques mostly focus on literal interpretation and perform poorly at understanding and generating metaphor images.