KDD 2026

Enhancing Multimodal Recommendation via Multimodal Representation Calibration in Spectral Domain

Minghui Wang, Tingting Zhang, Yu Li, Yi Chang

Abstract

Multimodal recommendation, which integrates user-item interactions with rich multimodal item features to infer user preferences, has emerged as one of the state-of-the-art recommendation paradigms. However, most existing multimodal recommendation methods focus solely on how item modalities are fused, overlooking the impact of modality-specific noise characteristics and inter-modal distribution shifts on the fusion process. As a result, noise from specific modalities may be inadvertently amplified during fusion, degrading the quality of the fused representations and ultimately undermining recommendation accuracy. To address this issue, we propose a novel framework, DAMPS, which Denoises and cAlibrates multimodal representations in the spectral domain from the perspectives of both aMplitude and Phase Spectrum. Specifically, we first apply the Fast Fourier Transform to decompose multimodal representations into their amplitude and phase spectra. We then propose the adaptive phase calibrator, which resolves the phase distribution shift between modalities through a minimum variance unbiased estimate of the phase offset and adaptive phase rotation. We further develop the amplitude variance-ratio filter, which suppresses modality-specific amplitude noise while preserving prominent modality-specific representations. Finally, we introduce the inter-modal coherence filter, which quantifies and enhances the intrinsic shared information between modalities through amplitude-squared coherence, thereby eliminating irrelevant modality noise. DAMPS is model-agnostic and highly flexible, enabling seamless integration with various multimodal recommendation backbones. Extensive evaluations on five public multimodal recommendation datasets demonstrate that DAMPS consistently improves recommendation performance and achieves state-of-the-art results.
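
As a rough illustration of the first step described above, the following minimal sketch decomposes item representations into amplitude and phase spectra with NumPy and checks that the two spectra together recover the original representation. It is not the DAMPS implementation: the function names `decompose` and `recompose`, the choice of applying the transform along the embedding dimension, and the random stand-in embeddings are all assumptions made for illustration only.

```python
import numpy as np


def decompose(x):
    """Split representations into amplitude and phase spectra.

    Assumption: the FFT is taken along the last (embedding) axis.
    """
    spectrum = np.fft.fft(x, axis=-1)
    return np.abs(spectrum), np.angle(spectrum)


def recompose(amplitude, phase):
    """Rebuild representations from amplitude and phase spectra."""
    spectrum = amplitude * np.exp(1j * phase)
    # The input was real-valued, so the imaginary part is only round-off.
    return np.fft.ifft(spectrum, axis=-1).real


# Hypothetical stand-in for one modality's item embeddings: 4 items, 8 dimensions.
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 8))

amplitude, phase = decompose(visual)
roundtrip = recompose(amplitude, phase)

print(amplitude.shape, phase.shape)
print(np.allclose(roundtrip, visual))
```

Because the decomposition is lossless, any calibration or filtering applied to the amplitude or phase spectrum can be mapped back to the representation space with the inverse transform, which is what allows spectral-domain processing to sit in front of an existing recommendation backbone.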