WWW2026
Anchor Drift No More: Hierarchical Consistency-Guided Prompt Distillation for Incomplete Multimodal Learning
Ruiting Dai, Zesen Cai, Lisi Mo, Guiduo Duan, Keren Shi, Tao He
3 citations
Abstract
Web-scale content is rich in modalities yet frequently incomplete due to device limits, transmission errors, or privacy controls, making learning with missing modalities a core challenge. Prior reconstruction and alignment strategies often fail to preserve a stable class geometry when inputs are partial, leading to anchor drift: a shift of class prototypes between complete and incomplete views that distorts the shared representation space and degrades generalization. We introduce HiCoD (Hierarchical Consistency-Guided Prompt Distillation), which learns a robust, class-anchored semantic space. HiCoD combines: (1) a modality-aware semantic graph that restores cross-modal structure under partial observations; (2) dual-level anchoring that unifies large-language-model–derived global category prototypes with top-K local exemplars to balance cross-modal coherence and modality-specific detail; and (3) multi-level distillation that aligns unimodal features, fused embeddings, and prompt-completed signals within a single anchor space. Across CMU-MOSI, CMU-MOSEI, and additional benchmarks, HiCoD sets a new state of the art under both fixed-pattern and random missingness, improving Acc-2 by up to 6.4 points over MPLMM and remaining robust when key modalities are absent.
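The notion of anchor drift can be made concrete with a small numerical sketch. The code below is illustrative only and is not the HiCoD implementation: it assumes that a class prototype is the mean feature vector of a class and that drift is the Euclidean distance between the prototypes computed from complete and incomplete views. The function names `class_prototypes` and `anchor_drift` are hypothetical.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Return one prototype per class, assumed here to be the class mean."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():  # classes with no samples keep a zero prototype
            protos[c] = features[mask].mean(axis=0)
    return protos


def anchor_drift(complete_feats, incomplete_feats, labels, num_classes):
    """Per-class Euclidean distance between complete-view and
    incomplete-view prototypes (an assumed drift measure)."""
    p_full = class_prototypes(complete_feats, labels, num_classes)
    p_part = class_prototypes(incomplete_feats, labels, num_classes)
    return np.linalg.norm(p_full - p_part, axis=1)


# Toy data: class 0 is unaffected by the missing modality,
# while class 1 embeddings shift by (3, 4) in the incomplete view.
labels = [0, 0, 1, 1]
complete = [[1, 0], [1, 0], [0, 1], [0, 1]]
incomplete = [[1, 0], [1, 0], [3, 5], [3, 5]]
print(anchor_drift(complete, incomplete, labels, num_classes=2))
```

In this toy setting the drift is zero for class 0 and nonzero for class 1, which is the kind of per-class prototype shift that a class-anchored space is meant to suppress.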