CVPR 2025

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun, Chan-Tong Lam, Tong Tong, Zitong Yu, Keren Fu, Xiaohong Liu, Tao Tan

Abstract

[Figure 1 image. Rows: Reference, TurboEdit, MoEdit (Ours).
Example 1, reference "Ten koalas"; edit prompts: "...steampunk style", "...vibrant portrait painting of Salvador Dalí", "...with blanket", "→mice, by the sea", "→foxes, futuristic metropolis style".
Example 2, reference "Three rabbits and two foxes"; edit prompts: "...dark horror style", "...with smiling faces", "→bears, in a natural field", "→corgis, by the sea", "→foxes, in a natural field".]

Figure 1. Visual comparisons of our MoEdit with TurboEdit [52]. Reference denotes the input images. For each input, each method produces five edited images, one for each of five distinct text prompts.