NeurIPS2023

Cascading Contextual Assortment Bandits

Hyun-Jun Choi, Rajan Udwani, Min-hwan Oh

3 citations

Abstract

We present a new combinatorial bandit model, the cascading contextual assortment bandit . This model serves as a generalization of both existing cascading bandits and assortment bandits, broadening their applicability in practice. For this model, we propose our ﬁrst UCB bandit algorithm, UCB-CCA . We prove that this algorithm achieves a T -step regret upper-bound of ˜ O ( 1  d p T ) , sharper than existing bounds for cascading contextual bandits by eliminating dependence on cascade length K . To improve the dependence on problem-dependent constant  , we introduce our second algorithm, UCB-CCA+ , which leverages a new Bernstein-type concentration result. This algorithm achieves ˜ O ( d p T ) without dependence on  in the leading term. We substantiate our theoretical claims with numerical experiments, demonstrating the practical efﬁcacy of our proposed methods.