ICLR 2026

DreamPhase: Offline Imagination and Uncertainty-Guided Planning for Large-Language-Model Agents

Shayan Mohajer Hamidi, Linfeng Ye, Konstantinos N. Plataniotis

Abstract

Autonomous agents that can perceive complex environments, follow instructions, and carry out multi-step tasks hold transformative potential across domains such as robotics, scientific discovery, and web automation. Large language models (LLMs) provide a powerful foundation, but they struggle with closed-loop decision-making because of static pretraining and limited temporal grounding. Prior approaches rely either on expensive real-time environment interaction or on brittle imitation policies, and both carry safety and efficiency trade-offs. We introduce DreamPhase, a modular framework that plans through offline imagination. A learned world model simulates multi-step futures in latent space; imagined branches are scored with an uncertainty-aware value and filtered by a safety gate. The best branch is distilled into a short natural-language reflection that conditions the next policy query, improving behavior without modifying the LLM. Crucially, DreamPhase attains its performance with substantially fewer real interactions: on WebShop, average API calls per episode drop from $\sim 40$ with ARMAP-M (token-level search) to $<10$ with DreamPhase, a $4\times$ reduction that lowers latency. According to incident logs, executed irreversible actions also fall by $\sim 5\times$ on WebShop and by $4.9\times$ on ALFWorld. Across web, science, and embodied tasks, DreamPhase improves sample efficiency and safety and lowers cost relative to search-based and reward-based baselines. This offers a scalable path toward safe, high-performance autonomous agents via imagination-driven planning. Code: https://anonymous.4open.science/r/DreamPhase-A8AD/README.md.
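The selection step described above (score imagined branches with an uncertainty-aware value, filter them through a safety gate, and distill the winner into a reflection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the `Branch` container, the use of ensemble disagreement as the uncertainty signal, the lower-confidence-bound form of the score, the `beta` and `risk_threshold` parameters, and the reflection wording are all hypothetical. The latent world-model rollout that would produce the branches is omitted.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Optional


@dataclass
class Branch:
    """One imagined rollout (hypothetical container, not from the paper)."""
    actions: List[str]    # imagined action sequence
    values: List[float]   # value estimates from an assumed ensemble
    risk: float           # assumed risk estimate in [0, 1]


def score(branch: Branch, beta: float = 1.0) -> float:
    # Assumed uncertainty-aware value: ensemble mean penalized by
    # ensemble disagreement (a lower-confidence-bound style score).
    return mean(branch.values) - beta * pstdev(branch.values)


def select_branch(branches: List[Branch], beta: float = 1.0,
                  risk_threshold: float = 0.2) -> Optional[Branch]:
    # Safety gate: discard branches whose estimated risk is too high.
    safe = [b for b in branches if b.risk <= risk_threshold]
    if not safe:
        return None
    # Keep the safe branch with the highest penalized value.
    return max(safe, key=lambda b: score(b, beta))


def reflect(branch: Optional[Branch]) -> str:
    # Distill the chosen branch into a short natural-language hint that
    # could be prepended to the next LLM policy query.
    if branch is None:
        return "No imagined plan passed the safety gate; prefer a reversible action."
    plan = " -> ".join(branch.actions)
    return f"Imagined plan: {plan} (estimated value {score(branch):.2f})."


# Toy usage with hand-written branches.
branches = [
    Branch(["search", "click item", "buy"], [0.9, 0.9, 0.9], risk=0.5),
    Branch(["search", "click item"], [0.6, 0.8], risk=0.1),
    Branch(["search"], [0.2, 0.2], risk=0.0),
]
best = select_branch(branches)
print(reflect(best))
```

In this toy run the highest-valued branch is rejected by the gate because its assumed risk exceeds the threshold, so the reflection describes the best remaining safe branch. Only the reflection string reaches the LLM, which is consistent with the claim that the policy model itself is left unmodified.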