KDD 2022
Submodular Feature Selection for Partial Label Learning
Wei-Xuan Bao, Jun-Yi Hang, Min-Ling Zhang
19 citations
Abstract
Partial label learning induces a multi-class classifier from training examples, each associated with a candidate label set in which the ground-truth label is concealed. Feature selection improves the generalization ability of a learning system by selecting the features essential for classification from the original feature set, but partial label feature selection is challenging because the labeling information is ambiguous. This paper makes the first attempt at partial label feature selection, via mutual-information-based dependency maximization. Specifically, the proposed approach, SAUTE, iteratively maximizes the dependency between the selected features and the labeling information, where mutual information is estimated through confidence-based latent variable inference. In each iteration, near-optimal features are selected greedily according to the properties of the submodular mutual information function, while the density of the latent label variable is inferred from labeling confidences over the candidate labels, which are updated by kNN aggregation in the induced lower-dimensional feature space. Extensive experiments on synthetic as well as real-world partial label data sets show that the generalization ability of well-established partial label learning algorithms can be significantly improved when coupled with the proposed feature selection approach.
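The loop described in the abstract (greedy selection by mutual information, then a kNN-based confidence update in the selected feature space) can be illustrated with a minimal sketch. This is not the authors' SAUTE implementation. It assumes features are already discretized into integer codes, uses a simple plug-in estimate of mutual information with labeling confidences as soft counts, and uses Hamming distance for the kNN step. The function names (`soft_mutual_information`, `knn_update`, `greedy_select`) and the parameters `n_select` and `k` are hypothetical.

```python
import numpy as np


def soft_mutual_information(X_sub, P):
    """Plug-in estimate of I(X_sub; Y), where Y is a latent label.

    X_sub: (n, m) integer-coded features (assumed discrete).
    P:     (n, q) labeling confidences, each row summing to 1.
    """
    n = X_sub.shape[0]
    # Treat every distinct row of the selected features as one joint state.
    _, states = np.unique(X_sub, axis=0, return_inverse=True)
    states = states.ravel()
    joint = np.zeros((states.max() + 1, P.shape[1]))
    np.add.at(joint, states, P)  # confidences act as soft label counts
    joint /= n
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


def knn_update(X_sub, S, P, k):
    """Aggregate neighbours' confidences, restricted to each candidate set."""
    # Hamming distance between discrete feature vectors (an assumption here).
    D = (X_sub[:, None, :] != X_sub[None, :, :]).sum(axis=2).astype(float)
    np.fill_diagonal(D, np.inf)  # an instance is not its own neighbour
    nn = np.argsort(D, axis=1, kind="stable")[:, :k]
    agg = P[nn].mean(axis=1) * S  # zero out non-candidate labels
    row = agg.sum(axis=1, keepdims=True)
    # Keep the old confidences if the neighbours support no candidate label.
    return np.where(row > 0, agg / np.where(row > 0, row, 1.0), P)


def greedy_select(X, S, n_select, k=3):
    """Greedily pick n_select features; S is the (n, q) 0/1 candidate matrix."""
    d = X.shape[1]
    P = S / S.sum(axis=1, keepdims=True)  # uniform over candidate labels
    selected = []
    for _ in range(n_select):
        remaining = [j for j in range(d) if j not in selected]
        scores = [soft_mutual_information(X[:, selected + [j]], P)
                  for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
        P = knn_update(X[:, selected], S, P, k)
    return selected, P
```

Calling `greedy_select(X, S, n_select)` returns the indices of the chosen features together with the updated confidence matrix. The greedy step is the standard heuristic for submodular maximization. The exact estimator, distance measure, and update rule used in the paper may differ from the ones assumed here.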