CVPR 2025

Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning

Can Küçüksözen, Yücel Yemez

Abstract

Figure 1. Compactness scores obtained for each pixel in the scene, across four different datasets. The transition from bright yellow to deep purple signifies decreasing compactness. To obtain these scores, a trained COCA-Net encoder is used to generate object masks. Each object mask is then broadcast to pixels according to the pixel-object assignments, so that every pixel is associated with a copy of its object's mask. Finally, the compactness score of each pixel's mask is calculated via Eq. 3.
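The per-pixel scoring procedure described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: Eq. 3 is not reproduced in this section, so `standin_compactness` is a hypothetical placeholder (an inverse spatial-variance score), and the `(K, H, W)` mask layout and the argmax-based pixel-object assignment are assumptions made for the example.

```python
import numpy as np

def standin_compactness(mask):
    """Hypothetical stand-in for Eq. 3 (NOT the paper's formula).

    Scores a single (H, W) mask as 1 / (1 + spatial variance), so that
    tightly clustered masks score higher than spread-out ones.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = mask.sum()
    if total == 0:
        return 0.0
    cy = (mask * ys).sum() / total
    cx = (mask * xs).sum() / total
    var = (mask * ((ys - cy) ** 2 + (xs - cx) ** 2)).sum() / total
    return 1.0 / (1.0 + var)

def per_pixel_compactness(masks, score_fn=standin_compactness):
    """Map (K, H, W) object masks to an (H, W) compactness map.

    Every pixel of an object shares the same mask copy, so scoring each
    object once and gathering by assignment gives the same map as
    scoring every pixel's mask copy individually.
    """
    assignment = masks.argmax(axis=0)                     # (H, W), assumed assignment rule
    object_scores = np.array([score_fn(m) for m in masks])  # (K,)
    return object_scores[assignment]                      # broadcast scores to pixels
```

Under these assumptions, a scene with a small square object on a large background would yield a higher score on the square's pixels than on the background's, which matches the qualitative yellow-to-purple pattern the caption describes.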