ICLR 2021
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning
Enrico Marchesini, Davide Corsi, Alessandro Farinelli
Cited by 33
Abstract
(1) A huge number of trials is required to achieve good performance, hence the need to devise robust learning approaches that improve sample efficiency. (2) Convergence to local optima, mainly caused by the lack of diverse exploration in high-dimensional spaces. Recent approaches to the exploration problem [1, 2] rely on task-specific hyperparameters.

[1] Pathak et al.