ICML 2025

PENCIL: Long Thoughts with Short Memory

Chenxiao Yang, Nathan Srebro, David McAllester, Zhiyuan Li

Abstract

While state-of-the-art LLMs have demonstrated great promise in using long Chains-of-Thought (CoT) to boost reasoning, scaling CoT up to more challenging problems at test time is fundamentally limited by suboptimal memory usage: intermediate computations accumulate indefinitely in context even when they are no longer needed for future thoughts. We introduce PENCIL, which incorporates a novel reduction mechanism into the autoregressive generation process that recursively cleans up intermediate thoughts based on patterns learned from training. By iteratively generating and erasing thoughts, PENCIL can think deeper to solve harder problems using shorter context and less compute. Empirically, we observe that PENCIL is significantly more effective and efficient than CoT. For example, we demonstrate that PENCIL with a small 25M-parameter transformer and a context length of 2048 solves Einstein's puzzle, a task that challenges much larger models like GPT-4. Theoretically, we prove that PENCIL can perform universal efficient computation by simulating any Turing machine with optimal time and space complexity, and thus can solve arbitrary computable tasks that are otherwise intractable for vanilla CoT.

[Figure: Chain-of-Thought versus PENCIL on the same word problem. The context contents shown in each panel follow.]

Chain-of-Thought

Response: Let's break this problem down into parts! First, let's figure out how many toys were in all blue bags. Looking at the blue bags, they made 3 bags with 5 toys in each, so multiplying 3 × 5 = 15. There were 15 toys in all blue bags. That's just part of the story though; we still need to know how many toys were in all red bags. Looking at the red bags, they made 2 bags with 4 toys in each, so multiplying 2 × 4 = 8. There were 8 toys in all red bags. Now that we know both amounts, we can find the total toys by adding the toys from blue and red bags together: 15 + 8 = 23. There were 23 toys used in total.

PENCIL

...[EndOfPrompt][CALL]Let's break this problem down into parts! [CALL] First, let's figure out how many toys were in all blue bags. Looking at the blue bags, they made 3 bags with 5 toys in each, so multiplying 3 × 5 = 15. [SEP] There were 15 toys in all blue bags. [RETURN]

...[EndOfPrompt][CALL]Let's break this problem down into parts! There were 15 toys in all blue bags.

...[EndOfPrompt][CALL]Let's break this problem down into parts! There were 15 toys in all blue bags. [CALL] That's just part of the story though; we still need to know how many toys were in all red bags. Looking at the red bags, they made 2 bags with 4 toys in each, so multiplying 2 × 4 = 8. [SEP] There were 8 toys in all red bags. [RETURN]

...[EndOfPrompt][CALL]Let's break this problem down into parts! There were 15 toys in all blue bags. There were 8 toys in all red bags. Now that we know both amounts, we can find the total toys by adding the toys from blue and red bags together: 15 + 8 = 23. [SEP]There were 23 toys used in total. [RETURN]

Response: There were 23 toys used in total.
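The generate-and-erase pattern in the PENCIL traces above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The rewrite rule `C [CALL] T [SEP] A [RETURN] -> C A` is read off the example traces. The names `last_index`, `reduce_once`, and `generate_with_erasure` are hypothetical. Tokens are whitespace-split words rather than a real tokenizer's output, the thought sentences are abridged, and a fixed token stream stands in for a trained model.

```python
CALL, SEP, RETURN = "[CALL]", "[SEP]", "[RETURN]"


def last_index(tokens, target, end):
    """Index of the last occurrence of `target` in tokens[:end]."""
    for i in range(end - 1, -1, -1):
        if tokens[i] == target:
            return i
    raise ValueError(f"no {target} found before position {end}")


def reduce_once(tokens):
    """Rewrite  C [CALL] T [SEP] A [RETURN]  as  C A.

    Assumption (inferred from the example traces): a trailing [RETURN]
    closes the nearest preceding [SEP], which closes the nearest [CALL]
    before it. Sequences not ending in [RETURN] are returned unchanged.
    """
    if not tokens or tokens[-1] != RETURN:
        return list(tokens)
    sep = last_index(tokens, SEP, len(tokens) - 1)
    call = last_index(tokens, CALL, sep)
    # Keep the context C and the answer A; drop the thoughts T and markers.
    return tokens[:call] + tokens[sep + 1:-1]


def generate_with_erasure(stream):
    """Append tokens one at a time, reducing whenever [RETURN] appears.

    Returns the final context and the largest context length ever held.
    """
    context, peak = [], 0
    for token in stream:
        context.append(token)
        peak = max(peak, len(context))
        if token == RETURN:
            context = reduce_once(context)
    return context, peak


def words(sentence):
    return sentence.split()


# Abridged version of the toy trace; the prompt itself is omitted.
stream = (
    [CALL] + words("Let's break this problem down into parts!")
    + [CALL] + words("Blue bags: 3 bags with 5 toys each, 3 x 5 = 15.")
    + [SEP] + words("There were 15 toys in all blue bags.") + [RETURN]
    + [CALL] + words("Red bags: 2 bags with 4 toys each, 2 x 4 = 8.")
    + [SEP] + words("There were 8 toys in all red bags.") + [RETURN]
    + words("Adding blue and red together: 15 + 8 = 23.")
    + [SEP] + words("There were 23 toys used in total.") + [RETURN]
)

final_context, peak = generate_with_erasure(stream)
print(" ".join(final_context))
print(f"peak context: {peak} tokens; total generated: {len(stream)} tokens")
```

In this sketch only the final answer remains in context at the end, and the peak context length stays below the total number of generated tokens, which is roughly what a context that never erases would have to hold. The gap grows with the number of erased sub-thoughts, which matches the abstract's claim of deeper thinking under a shorter context.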