CVPR 2024
Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching
Shitong Shao, Zeyuan Yin, Muxin Zhou, Xindong Zhang, Zhiqiang Shen
Cited by 12
Abstract
The lightweight “local-match-global” matching introduced by SRe²L successfully creates a distilled dataset with comprehensive information on the full 224×224 ImageNet-1k. However, this one-sided approach is restricted to a particular backbone, layer, and statistic, which limits how well the distilled dataset generalizes. We suggest that sufficiently many and varied “local-match-global” matchings are more precise and effective than a single one, and that they can create a distilled dataset with richer information and better generalization ability. We call this perspective “generalized matching” and propose Generalized Various Backbone and Statistical Matching (G-VBSM) in this work, which aims to create a synthetic dataset with densities, ensuring consistency with the complete dataset across various backbones, layers, and statistics. As experimentally demonstrated, G-VBSM is the first algorithm to obtain strong performance across both small-scale and large-scale datasets. Specifically, G-VBSM achieves 38.7% on CIFAR-100, 47.6% on Tiny-ImageNet, and 31.4% on the full 224×224 ImageNet-1k, respectively (settings: CIFAR-100 with a 128-width ConvNet under 10 images per class (IPC), Tiny-ImageNet with ResNet18 under 50 IPC, and ImageNet-1k with ResNet18 under 10 IPC). These results surpass all SOTA methods by margins of 3.9%, 6.5%, and 10.1%, respectively.
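To make the idea of matching across several backbones, layers, and statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer's features have already been pooled to one value per channel, that the matched statistics are the per-channel mean and variance, and that the discrepancy is a squared L2 distance averaged uniformly over all (backbone, layer) pairs. The abstract specifies none of these choices, and the function names and the backbone keys "resnet" and "convnet" are hypothetical.

```python
from statistics import fmean, pvariance


def channel_stats(feats):
    """Per-channel mean and population variance.

    feats: list of samples, each a list of C pooled channel values (assumed layout).
    """
    columns = list(zip(*feats))  # transpose to one tuple per channel
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def layer_matching_loss(feats, global_mean, global_var):
    """Squared L2 gap between local (synthetic-batch) and global statistics."""
    mean, var = channel_stats(feats)
    mean_gap = sum((m - g) ** 2 for m, g in zip(mean, global_mean))
    var_gap = sum((v - g) ** 2 for v, g in zip(var, global_var))
    return mean_gap + var_gap


def generalized_matching_loss(feats_by_backbone, stats_by_backbone):
    """Average the per-layer loss over every backbone and every layer.

    feats_by_backbone: {backbone name: [layer features, ...]}
    stats_by_backbone: {backbone name: [(global_mean, global_var), ...]}
    Uniform averaging is an assumption made for this sketch.
    """
    losses = []
    for name, layer_feats in feats_by_backbone.items():
        for feats, (g_mean, g_var) in zip(layer_feats, stats_by_backbone[name]):
            losses.append(layer_matching_loss(feats, g_mean, g_var))
    return sum(losses) / len(losses)


# Toy "full dataset" features for two hypothetical backbones.
full = {
    "resnet": [[[0.0, 1.0], [2.0, 3.0]]],                      # one layer, 2 channels
    "convnet": [[[1.0], [3.0]], [[0.0, 0.0, 1.0], [2.0, 4.0, 1.0]]],  # two layers
}
global_stats = {k: [channel_stats(f) for f in layers] for k, layers in full.items()}

# A synthetic batch whose features are all offset by 1 from the full data.
synthetic = {k: [[[x + 1.0 for x in row] for row in f] for f in layers]
             for k, layers in full.items()}

print(generalized_matching_loss(full, global_stats))
print(generalized_matching_loss(synthetic, global_stats))
```

In this toy setup the first loss is zero because the batch statistics coincide with the global ones. The second is positive because every channel mean is off by one while the variances are unchanged. In an actual condensation pipeline the synthetic images would be optimized by gradient descent on such a loss, which requires an automatic-differentiation framework rather than plain Python lists.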