CVPR2025

Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning

Xueyi Ke, Satoshi Tsutsui, Yayun Zhang, Bihan Wen

Abstract

recognize objects beyond the model's original vocabulary. Furthermore, we compare the differences in representation between infant models and those in modern computer vision models, such as CLIP and ImageNet pre-trained model. Ultimately, our work bridges cognitive science and computer vision by analyzing the internal representations of a computational model trained on an infant visual and linguistic inputs. Our code is available at https://github.com/ Kexueyi/discover_infant_vis .