ICLR 2026

Horizon Imagination: Efficient On-Policy Rollout in Diffusion World Models

Lior Cohen, Ofir Nabati, Kaixin Wang, Navdeep Kumar, Shie Mannor

Abstract

We study diffusion-based world models for reinforcement learning, which offer high generative fidelity but face critical efficiency challenges in control. Current methods either require heavyweight models at inference or rely on highly sequential imagination, both of which impose prohibitive computational costs. We propose Horizon Imagination (HI), an on-policy imagination process for discrete stochastic policies that denoises multiple future observations in parallel. HI incorporates a stabilization mechanism and a novel sampling schedule that decouples the denoising budget from the effective horizon over which denoising is applied, while also supporting fractional steps-per-frame budgets (sub-step budgets). Experiments on Atari 100K and Craftium show that our approach maintains control performance with a sub-step budget of 0.5 denoising steps per frame and achieves superior generation quality under varied schedules. Code is available at https://github.com/leor-c/horizon-imagination.
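To make the budget arithmetic concrete, the following is a minimal sketch and not the schedule used by HI. It assumes a hypothetical sliding window in which every denoiser forward pass updates all frames currently inside the horizon; the function names `passes_needed` and `updates_per_frame` are illustrative and do not come from the released code. Under that assumption, a fractional budget fixes the number of forward passes per imagined frame, while the horizon separately determines how many updates each frame receives before it leaves the window.

```python
import math


def passes_needed(num_frames: int, steps_per_frame: float) -> int:
    """Forward passes implied by a steps-per-frame budget (rounded up)."""
    return math.ceil(num_frames * steps_per_frame)


def updates_per_frame(horizon: int, steps_per_frame: float) -> float:
    """Updates each frame receives in steady state.

    ASSUMPTION (illustrative only): each forward pass updates all
    `horizon` frames in the window, so a frame that stays in the window
    for `horizon` frame slots receives horizon * steps_per_frame updates.
    """
    return horizon * steps_per_frame


if __name__ == "__main__":
    budget = 0.5  # sub-step budget quoted in the abstract
    print(passes_needed(num_frames=16, steps_per_frame=budget))
    for horizon in (2, 4, 8):
        print(horizon, updates_per_frame(horizon, budget))
```

In this toy model, a budget below one step per frame is only feasible because each forward pass advances several frames at once, and widening the horizon raises the number of updates per frame without increasing the number of forward passes.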