ICLR 2025

Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning

Charlie Victor Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar

Abstract

We select the best test-time compute configuration for a given problem and test-time budget. In practice, we choose among algorithm configurations, such as which search algorithm to use, and we use question difficulty as a sufficient statistic for the question rather than specializing the algorithm to each individual question.
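The selection rule described above can be sketched as a lookup keyed by difficulty and budget. This is only an illustrative sketch, not the authors' implementation: the function names `build_policy` and `select_strategy`, the strategy labels, the integer difficulty bins, and all accuracy values are hypothetical placeholders, and it assumes that accuracy per configuration has already been measured on a held-out set.

```python
from typing import Dict, Iterable, Tuple

# One record per measured configuration (all fields are assumptions):
# (difficulty_bin, budget, strategy_label, held_out_accuracy)
Record = Tuple[int, int, str, float]
Policy = Dict[Tuple[int, int], str]


def build_policy(records: Iterable[Record]) -> Policy:
    """For each (difficulty_bin, budget) pair, keep the strategy with the
    highest measured accuracy. Ties keep the first strategy seen."""
    best: Dict[Tuple[int, int], Tuple[str, float]] = {}
    for difficulty_bin, budget, strategy, accuracy in records:
        key = (difficulty_bin, budget)
        if key not in best or accuracy > best[key][1]:
            best[key] = (strategy, accuracy)
    return {key: strategy for key, (strategy, _) in best.items()}


def select_strategy(policy: Policy, difficulty_bin: int, budget: int) -> str:
    """Look up the configuration by difficulty bin and budget only,
    so the choice is never specialized to an individual question."""
    return policy[(difficulty_bin, budget)]


# Placeholder values for illustration only; these are not results from the paper.
records = [
    (0, 16, "best_of_n", 0.90),
    (0, 16, "beam_search", 0.85),
    (4, 16, "best_of_n", 0.10),
    (4, 16, "beam_search", 0.20),
]
policy = build_policy(records)
```

With these placeholder records, calling `select_strategy(policy, 0, 16)` and `select_strategy(policy, 4, 16)` returns a different label for the easy and hard bins, which is the point of conditioning on difficulty: two questions in the same bin with the same budget always receive the same configuration.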