ICML 2024

Total Variation Floodgate for Variable Importance Inference in Classification

Wenshuo Wang, Lucas Janson, Lihua Lei, Aaditya Ramdas


Abstract

Inferring variable importance is the key problem of many scientific studies, where researchers seek to learn the effect of a feature X on the outcome Y in the presence of confounding variables Z. Focusing on classification problems, we define the expected total variation (ETV), which is an intuitive and deterministic measure of variable importance that does not rely on any model context. We then introduce algorithms for statistical inference on the ETV under design-based/model-X assumptions. These algorithms build on the floodgate notion for regression problems (Zhang and Janson 2020). The algorithms we introduce can leverage any user-specified regression function and produce asymptotic lower confidence bounds for the ETV. We show the effectiveness of our algorithms with simulations and a case study in conjoint analysis on the US general election.
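
To make the quantity concrete, the following is a minimal sketch that assumes one natural reading of the ETV: the expected total variation distance between the conditional law of the outcome given both the feature and the confounders and its law given the confounders alone. The abstract does not state the formula, so this reading is an assumption. For a binary outcome the distance reduces to the absolute difference of the two success probabilities. The toy logistic model, the function name `etv_monte_carlo`, and all parameters are hypothetical, and the sketch computes a population quantity in a known model; it is not the floodgate inference procedure.

```python
import math
import random


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def etv_monte_carlo(beta: float, n_outer: int = 2000, n_inner: int = 200,
                    seed: int = 0) -> float:
    """Monte Carlo value of E|P(Y=1 | X, Z) - P(Y=1 | Z)| in a toy model.

    Assumed toy model (not from the paper): X and Z are independent
    standard normals and P(Y=1 | X, Z) = sigmoid(beta * X + Z).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        z = rng.gauss(0.0, 1.0)
        x = rng.gauss(0.0, 1.0)
        # Success probability given the feature and the confounder.
        p_full = sigmoid(beta * x + z)
        # Success probability given the confounder only, approximated by
        # averaging over fresh draws of the feature from its known law.
        p_reduced = sum(
            sigmoid(beta * rng.gauss(0.0, 1.0) + z) for _ in range(n_inner)
        ) / n_inner
        # For a binary outcome, total variation is |p_full - p_reduced|.
        total += abs(p_full - p_reduced)
    return total / n_outer


if __name__ == "__main__":
    for beta in (0.0, 0.5, 2.0):
        print(f"beta={beta}: ETV approx {etv_monte_carlo(beta):.4f}")
```

When `beta` is zero the feature carries no information about the outcome beyond the confounder, so the computed value is zero up to floating-point error. The inner average is itself a Monte Carlo approximation, so for nonzero `beta` the result carries a small upward bias that shrinks as `n_inner` grows.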