ICLR 2021

Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning

Enrico Marchesini, Davide Corsi, Alessandro Farinelli

Cited by 33

Abstract

1. A huge number of trials is required to achieve good performance, which calls for robust learning approaches that improve sample efficiency.
2. Convergence to local optima, mainly caused by the lack of diverse exploration in high-dimensional spaces.

Recent approaches to the exploration problem [1, 2] rely on task-specific hyperparameters.

[1] Pathak et al.