ICML2024

Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum

Tin Sum Cheng, Aurélien Lucchi, Anastasis Kratsios, David Belius

12 citations

Abstract

We derive new bounds for the condition number of kernel matrices, which we then use to enhance existing non-asymptotic test error bounds for kernel ridgeless regression (KRR) in the over-parameterized regime for a fixed input dimension. For kernels with polynomial spectral decay, we recover the bound from previous work; for exponential decay, our bound is non-trivial and novel. Our contribution is two-fold: (i) we rigorously prove the phenomena of tempered overfitting and catastrophic overfitting under the sub-Gaussian design assumption, closing an existing gap in the literature; (ii) we identify that the independence of the features plays an important role in guaranteeing tempered overfitting, raising concerns about approximating KRR generalization using the Gaussian design assumption in previous literature. Contributions This work aims to understand the overfitting behaviour for kernel ridge regression (KRR). Our main contributions are summarized as follows: 1. We rigorously derive tight non-asymptotic upper and lower bounds for the test error of the minimum norm interpolant under a sub-Gaussian design assumption on the features. While this assumption assumes independence of the features, this point will be relaxed in contribution #3.