ICLR 2025

Towards Empowerment Gain through Causal Structure Learning in Model-Based Reinforcement Learning

Hongye Cao, Fan Feng, Meng Fang, Shaokang Dong, Tianpei Yang, Jing Huo, Yang Gao

Abstract

In Model-Based Reinforcement Learning (MBRL), incorporating causal structures into dynamics models gives agents a structured understanding of their environments, enabling more efficient and effective decisions. Empowerment, as an intrinsic motivation, enhances an agent's ability to actively control its environment by maximizing the mutual information between future states and actions. We posit that empowerment, coupled with a causal understanding of the environment, can improve the agent's controllability over the environment, while enhanced empowerment gain can in turn facilitate causal reasoning. To this end, we propose Empowerment through Causal Learning (ECL), a framework that pioneers the integration of empowerment with causal reasoning: an agent aware of the causal dynamics model performs empowerment-driven exploration and optimizes its causal structure for task learning. Specifically, we first train a causal dynamics model of the environment on collected data. Next, we maximize empowerment under the causal structure for exploration, while simultaneously using the data gathered through exploration to update the causal dynamics model, which can be more controllable than dynamics models without causal structure. We also design an intrinsic curiosity reward to mitigate overfitting during downstream task learning. Importantly, ECL is method-agnostic and can integrate diverse causal discovery methods. We evaluate ECL combined with three causal discovery methods across six environments, including both state-based and pixel-based tasks, and demonstrate performance gains over other causal MBRL methods in causal structure discovery, sample efficiency, and asymptotic performance in policy learning. The project page is https://sites.google.com/view/ecl-1429/.
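To make the notion of empowerment concrete, the following is a minimal tabular sketch of the mutual information between actions and next states at a single state. It is an illustration only and not the ECL implementation: the function name `mutual_information`, the two toy transition tables, and the fixed uniform action distribution are all assumptions introduced here. Empowerment in the usual sense is the maximum of this quantity over action distributions, so the value for a fixed distribution is a lower bound on it.

```python
import math


def mutual_information(action_probs, transition):
    """Return I(S'; A | s) in bits for one fixed state s.

    action_probs[a]  : probability of taking action a in s (hypothetical policy)
    transition[a][j] : probability of landing in next state j after action a
    """
    n_actions = len(action_probs)
    n_next = len(transition[0])

    # Marginal over next states: p(s') = sum_a pi(a|s) * P(s'|s,a)
    marginal = [
        sum(action_probs[a] * transition[a][j] for a in range(n_actions))
        for j in range(n_next)
    ]

    mi = 0.0
    for a, p_a in enumerate(action_probs):
        for j, p_next in enumerate(transition[a]):
            # Zero-probability terms contribute nothing to the sum.
            if p_a > 0 and p_next > 0:
                mi += p_a * p_next * math.log2(p_next / marginal[j])
    return mi


# Toy example (assumed, not from the paper): two actions, two next states.
uniform = [0.5, 0.5]
controllable = [[1.0, 0.0], [0.0, 1.0]]    # each action picks its own next state
uncontrollable = [[0.5, 0.5], [0.5, 0.5]]  # the action has no effect on the outcome

mi_controllable = mutual_information(uniform, controllable)
mi_uncontrollable = mutual_information(uniform, uncontrollable)
```

In this toy setting the quantity is large when actions determine the next state and vanishes when the next state ignores the action, which is the sense in which maximizing it favors states the agent can control.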