EMNLP2025

Mind the Gap: How BabyLMs Learn Filler-Gap Dependencies

Chi-Yun Chang, Xueyang Huang, Humaira Nasir, Shane Storks, Olawale Akingbade, Huteng Dai

Abstract

Humans acquire syntactic constructions like filler-gap dependencies from limited and often noisy input. Can neural language models do the same? We investigate this question by evaluating GPT-2 models trained on childoriented input from the BabyLM Challenge. Our experiments focus on whether these "baby" language models acquire filler-gap dependencies, generalize across constructions, and respect structural constraints such as island effects. We apply a suite of syntactic constructions to four models trained on child language, including two base models (trained on 10M and 100M tokens) and two well-performing models from the BabyLM Challenge (ConcreteGPT and BabbleGPT). We evaluate model behavior using wh-licensing scores, flip tests, and grammaticality contrasts across four constructions. Results show that BabyLM-scale models partially acquire filler-gap dependencies but often fail to generalize or fully capture island constraints. Our code and datasets are available at https://github.com/um-cap-lab/ EMNLP-2025-submission .