ICML2021

First-Order Methods for Wasserstein Distributionally Robust MDP

Julien Grand-Clément, Christian Kroer

32 citations

Abstract

Markov Decision Processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for ambiguity sets which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a first-order methods framework for solving Distributionally robust MDPs, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O(NA^{2.5}S^{3.5}\log(S)\log(\epsilon^{-1})\epsilon^{-1.5})$ for the number of kernels $N$ in the support of the nominal distribution, states $S$ , and actions $A$ (this rate varies slightly based on the Wasserstein setup). Our dependence on $N$ , $A$ and $S$ is significantly better than existing methods; compared to Value Iteration, it is better by a factor of $O(N^{2.5}A S)$ . Numerical experiments on random instances and instances inspired from a machine replacement example show that our algorithm is significantly more scalable than state-of-the-art approaches.