CVPR 2024

Describing Differences in Image Sets with Natural Language

Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

Abstract

Figure 1. Set difference captioning. Given two sets of images $D_A$ and $D_B$, output natural language descriptions of concepts that are more often true for $D_A$ than for $D_B$. In this example, $D_A$ and $D_B$ are images from the "Dining Table" class in ImageNetV2 and ImageNet, respectively.
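One simple way to make "more true for the first set" concrete is to compare how often a candidate description holds in each set. The sketch below is illustrative only and is not the paper's scoring method: the function `difference_score`, the predicate `holds`, and the toy string "images" are all hypothetical names and data, and in practice the predicate would be replaced by a vision-language model that judges whether a description applies to an image.

```python
from typing import Callable, Sequence


def difference_score(
    holds: Callable[[str], bool],
    set_a: Sequence[str],
    set_b: Sequence[str],
) -> float:
    """Return the fraction of set_a where the concept holds minus the fraction of set_b.

    A positive score means the concept is more often true for set_a.
    `holds` is a hypothetical per-image predicate (an assumption of this sketch).
    """
    if not set_a or not set_b:
        raise ValueError("both image sets must be non-empty")
    frac_a = sum(holds(image) for image in set_a) / len(set_a)
    frac_b = sum(holds(image) for image in set_b) / len(set_b)
    return frac_a - frac_b


# Toy stand-ins for images: each "image" is a short invented tag string.
toy_set_a = ["table, people", "table, people", "table, people", "table"]
toy_set_b = ["table", "table, people", "table", "table"]


def contains_people(image: str) -> bool:
    # Hypothetical check for the candidate concept "contains people".
    return "people" in image


score = difference_score(contains_people, toy_set_a, toy_set_b)
print(score)  # 0.75 - 0.25 = 0.5
```

A candidate description with a score near zero, such as "contains a table" in the toy data, holds about equally often in both sets and therefore does not distinguish them.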