NeurIPS 2023
CS4ML: A general framework for active learning with arbitrary data based on Christoffel functions
Juan M. Cardenas, Ben Adcock, Nick C. Dexter
12 citations
Abstract
We introduce a general framework for active learning in regression problems. Our framework extends the standard setup by allowing for general types of data, rather than merely pointwise samples of the target function. This generalization covers many cases of practical interest, such as data acquired in transform domains (e.g., Fourier data), vector-valued data (e.g., gradient-augmented data), data acquired along continuous curves, and multimodal data (i.e., combinations of different types of measurements). Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces (model classes). We introduce the concept of generalized Christoffel functions and show how these can be used to optimize the sampling measures. We prove that this leads to near-optimal sample complexity in various important cases. This paper focuses on applications in scientific computing, where active learning is often desirable, since it is usually expensive to generate data. We demonstrate the efficacy of our framework for gradient-augmented learning with polynomials, Magnetic Resonance Imaging (MRI) using generative models, and adaptive sampling for solving PDEs using Physics-Informed Neural Networks (PINNs).
Example 2.3 (Active learning in standard regression). The above framework extends the standard active learning problem in regression. In the classic regression problem, $D \subseteq \mathbb{R}^d$ is a domain and $X = L^2_\rho(D)$ is the space of square-integrable functions $f^* : D \to \mathbb{R}$ with respect to a measure $\rho$. Note that $\rho$ is considered fixed: it is the measure with respect to which we measure the error. To embed this problem into the above framework, we let $X_0 = C(D)$ be the space of continuous functions on $D$, $C = 1$, the measurement domain $D_1 = D$ be equal to the domain of the function, $\rho_1 = \rho$ and the measurement space $Y_1 = \mathbb{R}$ (with the Euclidean inner product).
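To make the classical setting concrete, the following is a minimal sketch of pointwise-sample regression in which the sample points are drawn from a measure other than $\rho$. It uses the classical (non-generalized) Christoffel function of a linear polynomial space, not the generalized construction of this paper. Everything specific here is an assumption made for illustration: the domain $D = [-1, 1]$, the uniform probability measure $\rho$, the Legendre basis, the target function `f_star`, the sizes `n` and `m`, and all function names.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(x, n):
    """Evaluate the first n Legendre polynomials at x, scaled to be
    orthonormal w.r.t. the uniform probability measure rho on [-1, 1]."""
    V = legendre.legvander(x, n - 1)          # columns P_0, ..., P_{n-1}
    return V * np.sqrt(2 * np.arange(n) + 1)  # int P_j^2 d(rho) = 1 / (2j + 1)


def sampling_density(x, n):
    """nu(x) = (1/n) * sum_j phi_j(x)^2, a density with respect to rho."""
    return np.sum(orthonormal_legendre(x, n) ** 2, axis=1) / n


def draw_samples(m, n, rng):
    """Draw m points with density nu w.r.t. rho by rejection sampling.
    Relies on the bound nu(x) <= n, valid for this Legendre basis."""
    points = []
    while len(points) < m:
        x = rng.uniform(-1.0, 1.0, size=4 * m)  # proposals from rho
        accept = rng.uniform(size=x.size) < sampling_density(x, n) / n
        points.extend(x[accept])
    return np.array(points[:m])


def weighted_least_squares(f, theta, n):
    """Fit basis coefficients from pointwise samples f(theta),
    reweighting each sample by 1 / sqrt(nu(theta))."""
    w = 1.0 / np.sqrt(sampling_density(theta, n))
    A = orthonormal_legendre(theta, n) * w[:, None]
    b = f(theta) * w
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs


rng = np.random.default_rng(0)
n, m = 10, 100                                    # illustrative sizes
f_star = lambda x: np.exp(x) * np.sin(3 * x)      # hypothetical target

theta = draw_samples(m, n, rng)
coeffs = weighted_least_squares(f_star, theta, n)

# Estimate the L^2_rho error on a fine uniform grid.
grid = np.linspace(-1.0, 1.0, 2001)
approx = orthonormal_legendre(grid, n) @ coeffs
rms_error = np.sqrt(np.mean((approx - f_star(grid)) ** 2))
print(f"estimated L2_rho error: {rms_error:.2e}")
```

The weights $1/\sqrt{\nu}$ compensate for sampling away from $\rho$, so the weighted least-squares problem still targets the $L^2_\rho$ error. Rejection sampling is used only because it is short to write; it is one of several ways to draw from such a density.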
We then define the sampling operator $L_1(\theta)(f^*) = f^*(\theta)$ as the pointwise evaluation operator. In particular, for a measure $\mu = \mu_1$ satisfying Assumption 2.2, the training data (2.1) is built from pointwise evaluations $f^*(\theta)$ at sample points $\theta$ drawn from $\mu$. Hence, the aim is to choose the measure $\mu$ (or equivalently, its Radon-Nikodym derivative $\nu$) to ensure as good generalization as possible.
Next, we let $F \subseteq X_0$ be a subset within which we seek to learn $f^*$. We term $F$ the approximation space. Note this could be a linear space such as a space of algebraic or trigonometric polynomials, or a nonlinear space such as the space of sparse Fourier functions, a space of functions with sparse