CVPR 2025

Hybrid Concept Bottleneck Models

Yang Liu, Tianwei Zhang, Shi Gu

Abstract

Table 6 provides detailed statistics for all datasets. For concept translator pre-training, we use the entire ConceptNet [42] dataset, concepts generated by an LLM (Concept-Generate), and the training split of MSCOCO [4]. In HybridCBM training, each dataset is annotated with a one-word description indicating its superclass. We follow the train/dev/test splits provided by CoOp [56] for Food-101, Aircraft, Flower-102, UCF-101, and DTD. For CUB, we randomly sample 10 images per category as the development set. For CIFAR-10 and CIFAR-100, we set aside 10% of the training data for development. For HAM10000, we use an 80/10/10 train/dev/test split across classes. For ImageNet, we evaluate only on the development set.
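The three custom split rules described above (a fixed number of development images per category, a fractional hold-out from the training set, and an 80/10/10 split) can be illustrated with a minimal sketch. The function names, the seed handling, and the choice to apply the 80/10/10 ratio within each class are assumptions made for illustration; the paper does not specify its splitting code.

```python
import random
from collections import defaultdict


def sample_per_class(labels, k, seed=0):
    """Hold out k randomly chosen indices per class as the dev set (CUB-style).

    Hypothetical helper; raises ValueError if a class has fewer than k items.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    dev = []
    for label in sorted(by_class):
        dev.extend(rng.sample(by_class[label], k))
    dev_set = set(dev)
    train = [i for i in range(len(labels)) if i not in dev_set]
    return train, sorted(dev)


def holdout_fraction(n, fraction, seed=0):
    """Set aside a random fraction of n training indices as dev (CIFAR-style)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_dev = round(n * fraction)
    return sorted(indices[n_dev:]), sorted(indices[:n_dev])


def stratified_three_way(labels, ratios=(0.8, 0.1), seed=0):
    """Split each class into train/dev/test; test receives the remainder.

    Assumption: the 80/10/10 ratio is applied per class (stratified).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, dev, test = [], [], []
    for label in sorted(by_class):
        members = list(by_class[label])
        rng.shuffle(members)
        n_train = round(len(members) * ratios[0])
        n_dev = round(len(members) * ratios[1])
        train.extend(members[:n_train])
        dev.extend(members[n_train:n_train + n_dev])
        test.extend(members[n_train + n_dev:])
    return train, dev, test
```

For example, `holdout_fraction(50000, 0.1)` returns 45,000 training indices and 5,000 development indices, matching the 10% hold-out on a training set of CIFAR's size. Fixing the seed keeps the splits reproducible across runs.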