ACL 2024

Language Models Don't Learn the Physical Manifestation of Language

Bruce W. Lee, Jaehyuk Lim

Abstract

We argue that language-only models don't learn the physical manifestation of language. We present an empirical investigation of visual-auditory properties of language through a series of tasks, termed H-TEST. These tasks highlight a fundamental gap between human linguistic understanding and the sensory-deprived linguistic understanding of LLMs. In support of our hypothesis, 1. deliberate reasoning (Chain-of-Thought), 2. few-shot examples, or 3. a stronger LLM from the same model family (LLaMA 2 13B → LLaMA 2 70B) has no significant effect on H-TEST performance. We bring in the philosophical case of Mary, who learns about the world in a sensory-deprived environment, as a useful conceptual framework to understand how language-only models learn about the world (Jackson, 1986). Our experiments show that some of the strongest proprietary LLMs stay near the random-chance baseline accuracy of 50%, highlighting the limitations of linguistic knowledge acquired in the absence of sensory experience. Our code and data are available at <github.com/brucewlee/h-test>.