ICLR 2022

Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game

Haobo Fu, Weiming Liu, Shuang Wu, Yijia Wang, Tao Yang, Kai Li, Junliang Xing, Bin Li, Bo Ma, Qiang Fu, Wei Yang

32 citations

Abstract

An optimal solution to a 2-player zero-sum imperfect-information game (IIG) usually refers to a Nash Equilibrium (NE), where no player can improve their payoff by unilaterally deviating to a different policy. For instance, in the 2-player Rock-Paper-Scissors game, the NE is for both players to play the uniform random policy: $\left[\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right]$.
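The Rock-Paper-Scissors claim can be checked numerically. The following is a minimal sketch, not code from the paper: the payoff matrix uses the standard win = +1, loss = -1, tie = 0 convention, and the helper names (`expected_payoff`, `best_response_value`) and the `biased` policy are illustrative assumptions. It measures how much a player could gain by deviating to a best response when the opponent's policy is held fixed.

```python
from fractions import Fraction

# Row player's payoffs; rows and columns are ordered Rock, Paper, Scissors.
# Assumed convention: win = +1, loss = -1, tie = 0. The game is zero-sum,
# so the column player's payoff is the negation of each entry.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_payoff(row_policy, col_policy):
    """Expected payoff to the row player when both players use mixed policies."""
    return sum(
        row_policy[i] * PAYOFF[i][j] * col_policy[j]
        for i in range(3)
        for j in range(3)
    )

def best_response_value(col_policy):
    """Highest payoff the row player can reach with any pure action."""
    return max(
        sum(PAYOFF[i][j] * col_policy[j] for j in range(3))
        for i in range(3)
    )

uniform = [Fraction(1, 3)] * 3
# Hypothetical non-equilibrium policy that over-plays Rock.
biased = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# Gain from a unilateral deviation when both players start from the same policy.
uniform_gain = best_response_value(uniform) - expected_payoff(uniform, uniform)
biased_gain = best_response_value(biased) - expected_payoff(biased, biased)

print("gain from deviating against uniform:", uniform_gain)
print("gain from deviating against biased:", biased_gain)
```

Against the uniform policy every pure action earns the same expected payoff, so the deviation gain is zero, and by the symmetry of the game the same holds for the other player. Against the biased policy, switching to Paper yields a strictly positive gain, so that policy pair is not an NE.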