NeurIPS 2023

Conditional independence testing under misspecified inductive biases

Felipe Maia Polo, Yuekai Sun, Moulinath Banerjee

Cited by 6

Abstract

Conditional independence (CI) testing is a fundamental and challenging task in modern statistics and machine learning. Many modern methods for CI testing rely on powerful supervised learning methods to learn regression functions or Bayes predictors as an intermediate step; we refer to this class of tests as regression-based tests. Although these methods are guaranteed to control Type-I error when the supervised learning methods accurately estimate the regression functions or Bayes predictors of interest, their behavior is less understood when they fail due to misspecified inductive biases; in other words, when the employed models are not flexible enough or when the training algorithm does not induce the desired predictors. In this work, we study the performance of regression-based CI tests under misspecified inductive biases. Namely, we propose new approximations or upper bounds for the testing errors of three regression-based tests that depend on misspecification errors. Moreover, we introduce the Rao-Blackwellized Predictor Test (RBPT), a regression-based CI test robust against misspecified inductive biases. Finally, we conduct experiments with artificial and real data, showcasing the usefulness of our theory and methods.

[...] where the first five entries of β_X are set to 20 and the remaining entries are zero, while the last five entries of β_Y are set to 20 and the remaining entries are zero. This results in X and Y being conditionally independent given Z and depending on Z only through a small number of entries. Additionally, [...] indicating that the linear model class is correctly specified.

Footnote 1: Simulation-based tests usually rely on estimating conditional distributions.
Footnote 2: See Appendix A.3 for more details.
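The simulation setup described above can be illustrated with a minimal sketch. The excerpt does not give the data-generating equations, so the following are assumptions made only for illustration: a linear Gaussian model X = Zβ_X + noise and Y = Zβ_Y + noise, standard normal Z with dimension `d = 30`, unit-variance noise, and the sample size `n`. The test shown is a residual-product statistic in the style of the generalised covariance measure, a well-known regression-based CI test; it is not claimed to be one of the three tests analysed in the paper, nor the RBPT. The function names `simulate`, `residuals`, and `gcm_pvalue` are hypothetical.

```python
import math

import numpy as np


def simulate(n=2000, d=30, seed=0):
    """Draw (X, Y, Z) with X and Y conditionally independent given Z.

    Assumed, not taken from the source: Z ~ N(0, I_d), unit-variance
    Gaussian noise, d = 30, and the linear form of the two equations.
    """
    rng = np.random.default_rng(seed)
    beta_x = np.zeros(d)
    beta_x[:5] = 20.0   # first five entries set to 20, rest zero
    beta_y = np.zeros(d)
    beta_y[-5:] = 20.0  # last five entries set to 20, rest zero
    Z = rng.standard_normal((n, d))
    X = Z @ beta_x + rng.standard_normal(n)
    Y = Z @ beta_y + rng.standard_normal(n)
    return X, Y, Z, beta_x, beta_y


def residuals(target, Z):
    """Residuals of an ordinary least squares fit of target on Z (with intercept)."""
    design = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def gcm_pvalue(X, Y, Z):
    """Residual-product statistic and its two-sided normal p-value."""
    prod = residuals(X, Z) * residuals(Y, Z)
    stat = math.sqrt(len(prod)) * prod.mean() / prod.std()
    # 2 * (1 - Phi(|stat|)) written with the complementary error function.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    X, Y, Z, _, _ = simulate()
    stat, p_value = gcm_pvalue(X, Y, Z)
    print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Because the linear regressions match the assumed data-generating model, this is the correctly specified case. Replacing `residuals` with a model that cannot represent the dependence on Z (for example, one that ignores some of the active coordinates) is one way to explore the misspecified regime the abstract is concerned with.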