CVPR 2024

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan, Yap-Peng Tan, Weipeng Hu

Abstract

Figure 1. Generated samples of size 512x512. Stable Diffusion conditions on text caption only, while GLIGEN conditions on extra layout input. Our proposed InteractDiffusion conditions on extra interaction label and its location shown by the shaded area. Example captions: "a person is holding a bag, another person is talking on a cell phone" and "a person is feeding a cat". Annotated labels in the panels: person, another person, bag, cell phone, cat (objects); holding, talking, feeding (interactions).