CVPR 2023

MetaCLUE: Towards Comprehensive Visual Metaphors Research

Arjun R. Akula, Brendan Driscoll, Pradyumna Narayana, Soravit Changpinyo, Zhiwei Jia, Suyash Damle, Garima Pruthi, Sugato Basu, Leonidas J. Guibas, William T. Freeman, Yuanzhen Li, Varun Jampani

Abstract

Figure 1. With MetaCLUE, we introduce several tasks related to visual metaphors. We collect metaphor annotations (objects, abstract concepts, relationships, and object boxes) for evaluating existing models on these tasks. Specifically, we perform a comprehensive evaluation of vision and language models on four tasks: Classification, Localization, Understanding, and gEneration. Our experiments show that state-of-the-art techniques mostly focus on literal interpretation and perform poorly at understanding and generating metaphor images.