AAAI2026

Rademacher Complexity for Distributionally Robust Learning

Zhengyu Zhou, Weiwei Liu

摘要

The ultimate goal of passive supervised machine learning is to find a hypothesis function based on a set of examples that has small error with respect to some target function. This goal is independent of most aspects of the learning setting-that is, the same goal applies to both classification and regression problems in both the realizable and agnostic cases-so it would be nice to have a general way of dealing with this type of problem. Often, our attempts to get a handle on the sufficient conditions for learning (most notably in the PAC model) led us to proving results known as uniform convergence bounds. These bounds stated that the empirical errors of concepts from a given concept class H converge uniformly to their true errors. In other words, we bounded the difference between the training error and generalization error for all functions in H. Through our discussions of VC theory we have seen that we can improve generalization by controlling the complexity of the concept class H from which we are choosing a hypothesis. We saw that the shatter coefficient and VC dimension were useful measures of complexity because we could bound the performance of a learning algorithm in terms of these quantities and the amount of data we have. The bounds we derived based on VC dimension were distribution independent. In some sense, distribution independence is a nice property because it guarantees the bounds to hold for any data distribution. On the other hand, the bounds may not be tight for some specific distributions that are more benign than the worst case. Furthermore, the concepts used in defining VC dimension apply only to binary classification, but we are often interested in generalization bounds for multiclass classification and regression as well. Rademacher complexity is a more modern notion of complexity that is distribution dependent and defined for any class real-valued functions (not only discrete-valued functions).