ICLR 2026

On the Thinking-Language Modeling Gap in Large Language Models

Chenxi Liu, Yongqiang Chen, Tongliang Liu, James Cheng, Bo Han, Kun Zhang

Abstract

System 2 reasoning, which requires slow and logical thinking, is one of the defining characteristics of intelligence. Humans conduct System 2 reasoning via the language of thoughts, which organizes the reasoning process as a causal sequence of mental language, or thoughts. Recently, it has been observed that System 2 reasoning can be elicited from Large Language Models (LLMs) pre-trained on large-scale natural language. However, in this work, we show that there is a significant gap between the modeling of languages and the modeling of thoughts. Because language is primarily a tool for humans to share knowledge and thinking, modeling human language can easily cause LLMs to absorb language biases that deviate from the chain of thoughts in the mind. Furthermore, we show that these biases mislead the elicitation of "thoughts" in LLMs, causing them to focus only on a biased part of the premise. To this end, we propose a new prompting technique termed Language-of-Thoughts (LoT) to demonstrate and alleviate this gap. Instead of directly eliciting the chain of thoughts from partial information, LoT instructs LLMs to adjust the order and the tokens used to express all the relevant information. We show that this simple strategy significantly reduces the language modeling biases in LLMs and improves their performance across a variety of reasoning tasks.

Recently, Large Language Models (LLMs), pre-trained on massive amounts of natural language written by humans, have demonstrated impressive performance across a variety of System 1 and System 2 tasks (Brown et al., 2020; OpenAI, 2022; Touvron et al., 2023; OpenAI, 2023). Specifically, when given proper instructions such as Chain-of-Thoughts (CoT), LLMs can reason toward the desired answer by generating and following intermediate steps (Wei et al., 2022). However,

* These authors contributed equally.