ICML2023

Reinforcement Learning Can Be More Efficient with Multiple Rewards

Christoph Dann, Yishay Mansour, Mehryar Mohri

被引用 24 次

摘要

We introduce new, efficient algorithms for value iteration with multiple reward functions and continuous state. We also give an algorithm for finding the set of all nondominated actions in the continuous state setting. This novel extension is appropriate for environments with continuous or finely discretized states where generalization is required, as is the case for data analysis of randomized controlled trials.