CVPR2025

Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity

Chen-Yi Lu, Kasra Derakhshandeh, Somali Chaterji

Abstract

Semi-supervised semantic segmentation with consistency regularization capitalizes on unlabeled images to enhance the accuracy of pixel-level segmentation. Current consistency learning methods rely primarily on the consistency loss between pseudo-labels and unlabeled images, neglecting the information within the feature representations of the backbone encoder. As widely studied, preserving maximum information in feature embeddings requires achieving the alignment and uniformity objectives. To address this gap, we present SWSEG, a semi-supervised semantic segmentation algorithm that optimizes alignment and uniformity using the Sliced-Wasserstein Distance (SWD), and we establish this connection both through rigorous proof and through empirical validation. We further resolve the computational issues associated with conventional Monte Carlo-based SWD by implementing a Gaussian-approximated variant, which not only maintains the alignment and uniformity objectives but also improves training efficiency. We evaluate SWSEG on the PASCAL VOC 2012, Cityscapes, and ADE20K datasets, where it outperforms supervised baselines in mIoU by up to 11.8%, 8.9%, and 8.2%, respectively, given an equivalent number of labeled samples. Further, SWSEG surpasses state-of-the-art methods in multiple settings across these three datasets. Our extensive ablation studies confirm that the uniformity and alignment objectives of the feature representations are optimized.
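For readers unfamiliar with the conventional Monte Carlo-based SWD mentioned above, the following is a minimal sketch of the standard estimator between two equal-sized sets of feature vectors. It is not the authors' implementation and does not cover the Gaussian-approximated variant; the function name `sliced_wasserstein_distance` and the parameters `n_projections`, `p`, and `seed` are hypothetical choices made for illustration.

```python
import numpy as np


def sliced_wasserstein_distance(x, y, n_projections=128, p=2, seed=0):
    """Monte Carlo estimate of the sliced-Wasserstein-p distance.

    x, y: arrays of shape (n, d) holding the same number n of d-dim features.
    Hypothetical helper for illustration; not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes x and y have identical shapes")

    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Sample random directions uniformly on the unit sphere in R^d.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project every feature onto every direction: shape (n, n_projections).
    x_proj = x @ theta.T
    y_proj = y @ theta.T

    # In 1-D, optimal transport between equal-sized empirical measures
    # pairs the sorted samples, so sorting gives the per-slice distance.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)

    # Average the p-th power cost over samples and slices, then take the root.
    return float(np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p))
```

The estimate depends on the number of random projections, and each projection requires a sort over the samples, which is the kind of cost a closed-form approximation is meant to avoid.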