NeurIPS 2023

The s-value: evaluating stability with respect to distributional shifts

Suyash Gupta, Dominik Rothenhäusler

Cited by 21

Abstract

Common statistical measures of uncertainty such as p-values and confidence intervals quantify the uncertainty due to sampling, that is, the uncertainty due to not observing the full population. However, sampling is not the only source of uncertainty. In practice, distributions change between locations and across time. This makes it difficult to gather knowledge that transfers across data sets. We propose a measure of instability that quantifies the distributional instability of a statistical parameter with respect to Kullback-Leibler divergence, that is, the sensitivity of the parameter under general distributional perturbations within a Kullback-Leibler divergence ball. In addition, we quantify the instability of parameters with respect to directional or variable-specific shifts. Measuring instability with respect to directional shifts can be used to detect under which kind of distribution shifts a statistical conclusion might be reversed. We discuss how such knowledge can inform data collection for transfer learning of statistical parameters under shifted distributions. We evaluate the performance of the proposed measure on real data and show that it can elucidate the distributional instability of a parameter with respect to certain shifts and can be used to improve estimation accuracy under shifted distributions.

Introduction

Test data sets collected in different locations or at different time points often are drawn from different distributions, due to changing circumstances, changes in unmeasured confounders, time shifts in distribution, or distributional shifts in covariates [46, 20, 16, 21]. This makes it difficult to gather knowledge that transfers across data sets. Statistical estimands such as a regression coefficient or the average treatment effect (ATE) may vary as the underlying distribution changes and hence, statistical findings (such as that the treatment effect is positive) may not replicate across data sets [4, 23].
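
To make the idea of a perturbation within a Kullback-Leibler divergence ball concrete, the following is a minimal sketch for the simplest parameter, a mean. It computes the smallest Kullback-Leibler divergence from the empirical distribution of a sample to any reweighting of that sample whose mean equals a target value. It relies on the standard fact that this minimum is attained by an exponential tilt, with weights proportional to exp(lam * z_i). The function names, the bisection solver, and the exp(-KL) rescaling to the interval [0, 1] are illustrative assumptions and are not taken from the text above.

```python
import math


def tilted_weights(z, lam):
    """Normalized weights proportional to exp(lam * z_i), computed stably."""
    shift = max(lam * zi for zi in z)
    w = [math.exp(lam * zi - shift) for zi in z]
    total = sum(w)
    return [wi / total for wi in w]


def min_kl_shift_for_mean(z, target=0.0, iterations=200):
    """Smallest KL(Q || P_n) over reweightings Q of the sample z with mean `target`.

    P_n is the empirical distribution (weight 1/n on each observation).
    Hypothetical helper for illustration only.
    """
    n = len(z)
    lo_z, hi_z = min(z), max(z)
    if target < lo_z or target > hi_z:
        return math.inf  # no reweighting of the sample can reach the target
    if target == lo_z or target == hi_z:
        # All mass must sit on the k observations equal to the target.
        k = sum(1 for zi in z if zi == target)
        return math.log(n / k)

    def tilted_mean(lam):
        w = tilted_weights(z, lam)
        return sum(wi * zi for wi, zi in zip(w, z))

    # The tilted mean is increasing in lam: bracket the root, then bisect.
    lo, hi = -1.0, 1.0
    while tilted_mean(lo) > target:
        lo *= 2.0
    while tilted_mean(hi) < target:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) < target:
            lo = mid
        else:
            hi = mid

    w = tilted_weights(z, 0.5 * (lo + hi))
    # KL(Q || P_n) = sum_i q_i * log(q_i / (1/n))
    return sum(wi * math.log(wi * n) for wi in w if wi > 0.0)


def stability_score(z, target=0.0):
    """Assumed convention: exp(-KL), so values near 1 mean a small shift suffices."""
    return math.exp(-min_kl_shift_for_mean(z, target))


if __name__ == "__main__":
    effects = [0.4, 1.1, -0.2, 0.9, 0.7]  # made-up per-unit effect estimates
    print(min_kl_shift_for_mean(effects, target=0.0))
    print(stability_score(effects, target=0.0))
```

Under this convention, a score close to 1 means that a small reweighting of the data is enough to move the mean to the target (for example, to erase a positive effect), while a score close to 0 means that only a large distributional shift could do so.
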
In causal inference, the rapidly growing field of sensitivity analysis [12, 42, 16, 55, 11] quantifies the stability of an estimate with respect to unobserved confounding. Roughly speaking, this line of work sees stability analysis as part of uncertainty quantification. Inspired by this line of work, we aim to bring a similar type of stability analysis to a wider range of statistical procedures. In this paper, we propose a measure of instability, called the s-value, to investigate the stability of a given statistical parameter with respect to a shift in the underlying distribution (Figure 1). The s-value quantifies the minimum shift in distribution required to tilt the parameter to a given value, using Kullback-Leibler divergence. We also investigate the stability of parameters with respect to directional or variable-specific shifts. The proposed measure can be used as an exploratory tool to identify the kind of distribution shift that could reverse a statistical conclusion. We further discuss

37th Conference on Neural Information Processing Systems (NeurIPS 2023).