ICLR 2025

Generalized Principal-Agent Problem with a Learning Agent

Tao Lin, Yiling Chen

Abstract

In classic principal-agent problems such as Stackelberg games, contract design, and Bayesian persuasion, the agent best responds to the principal's committed strategy. We study repeated generalized principal-agent problems under the assumption that the principal does not have commitment power and the agent uses algorithms to learn to respond to the principal. We reduce this problem to a one-shot problem where the agent approximately best responds, and prove that: (1) If the agent uses contextual no-regret learning algorithms with regret $\mathrm{Reg}(T)$, then the principal can guarantee utility at least $U^* - \Theta\big(\sqrt{\tfrac{\mathrm{Reg}(T)}{T}}\big)$, where $U^*$ is the principal's optimal utility in the classic model with a best-responding agent. (2) If the agent uses contextual no-swap-regret learning algorithms with swap-regret $\mathrm{SReg}(T)$, then the principal cannot obtain utility more than $U^* + O\big(\tfrac{\mathrm{SReg}(T)}{T}\big)$. (3) In addition, if the agent uses mean-based learning algorithms (which can be no-regret but not no-swap-regret), then the principal can sometimes do significantly better than $U^*$. These results not only refine previous works on Stackelberg games and contract design, but also lead to new results for Bayesian persuasion with a learning agent and all generalized principal-agent problems where the agent does not have private information.

* A short version of this paper was published at ICLR'25 (spotlight).
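
As a hedged illustration of how the bounds in (1) and (2) scale, read them as $U^* - \Theta\big(\sqrt{\mathrm{Reg}(T)/T}\big)$ and $U^* + O\big(\mathrm{SReg}(T)/T\big)$, and suppose (an assumption made only for this example, not a claim of the abstract) that the agent's algorithm achieves $\mathrm{Reg}(T) = c\sqrt{T}$ and $\mathrm{SReg}(T) = c'\sqrt{T}$ for some constants $c, c' > 0$. Substituting gives:

```latex
\[
U^* - \Theta\!\Big(\sqrt{\tfrac{\mathrm{Reg}(T)}{T}}\Big)
  = U^* - \Theta\!\Big(\sqrt{\tfrac{c\sqrt{T}}{T}}\Big)
  = U^* - \Theta\big(T^{-1/4}\big),
\qquad
U^* + O\!\Big(\tfrac{\mathrm{SReg}(T)}{T}\Big)
  = U^* + O\!\Big(\tfrac{c'\sqrt{T}}{T}\Big)
  = U^* + O\big(T^{-1/2}\big).
\]
```

Under these assumed rates, both gaps to $U^*$ vanish as $T \to \infty$, with the lower-bound gap in (1) shrinking more slowly than the upper-bound gap in (2).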