ICML 2024

Performance Bounds for Active Binary Testing with Information Maximization

Aditya Chattopadhyay, Benjamin David Haeffele, René Vidal, Donald Geman


Abstract

Machine learning theorists have developed clever ways to bound the performance of data mining procedures. Suppose one has a training sample $(y_i, x_i)$ and wants to predict a future $Y$ value from measured $X$ values. To do this, one chooses a family of models $F = \{f(x, \theta)\}$ and uses the training sample to find a value of $\theta$ that gives "good" performance. This structure holds in both regression and classification. Here $f(\cdot, \theta) : \mathbb{R}^p \to \mathbb{R}$, and the family of functions (the model $F$) is indexed by $\theta \in \Theta$. In multiple linear regression, $F$ consists of all $p$-dimensional hyperflats and the components of $\theta$ are the regression coefficients. In two-class linear discriminant analysis, $F$ again consists of all $p$-dimensional hyperflats, and the components of $\theta$ are the values in the Mahalanobis-distance rule.
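To make the setup concrete, the following is a minimal sketch of the multiple linear regression case: a family of affine functions indexed by θ, with θ chosen from a training sample by least squares. The helper names (`f`, `solve`, `fit_least_squares`), the toy data, and the choice of least squares as the notion of "good" performance are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def f(x: Sequence[float], theta: Sequence[float]) -> float:
    """One member of the family F: an affine function of x indexed by theta.

    theta[0] is the intercept; theta[1:] are the p slope coefficients.
    """
    return theta[0] + sum(t * xj for t, xj in zip(theta[1:], x))


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def fit_least_squares(xs: Sequence[Sequence[float]], ys: Sequence[float]) -> List[float]:
    """Choose theta minimizing training squared error via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]  # prepend 1 for the intercept
    d = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(d)]
    return solve(XtX, Xty)


# Hypothetical noise-free training sample generated by theta = (1, 2, -3), p = 2.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
train_y = [1.0 + 2.0 * a - 3.0 * b for a, b in train_x]

theta_hat = fit_least_squares(train_x, train_y)
prediction = f((3.0, 2.0), theta_hat)  # predict Y at a new X
```

Because the toy sample is noise-free, the fitted θ should recover the generating coefficients up to floating-point error. With noisy data the same code returns the least-squares estimate, and the question of how well it predicts future Y values is what the performance bounds mentioned above address.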