ICML 2024

Fast Co-Training under Weak Dependence via Stream-Based Active Learning

Ilias Diakonikolas, Mingchen Ma, Lisheng Ren, Christos Tzamos

Abstract

Co-training is a classical semi-supervised learning method that requires only a small number of labeled examples for learning, under reasonable assumptions. Despite extensive literature on the topic, very few hypothesis classes are known to be provably efficiently learnable via co-training, even under very strong distributional assumptions. In this work, we study the co-training problem in the stream-based active learning model. We show that a range of natural concept classes are efficiently learnable via co-training, in terms of both label efficiency and computational efficiency. We provide an efficient reduction of co-training under the standard assumption of weak dependence, in the stream-based active model, to online classification. As a corollary, we obtain efficient co-training algorithms with error-independent label complexity for every concept class efficiently learnable in the mistake bound online model. Our framework also gives co-training algorithms with label complexity Õ(d log(1/ϵ)) for any concept class with VC dimension d, though in general this reduction is not computationally efficient. Finally, using additional ideas from online learning, we design the first efficient co-training algorithms with label complexity Õ(d² log(1/ϵ)) for several concept classes, including unions of intervals and homogeneous halfspaces.
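As background for the mistake bound online model mentioned in the abstract, the following minimal sketch runs the classical Halving algorithm on a finite class of threshold functions. It is not the paper's co-training reduction; it only illustrates what a mistake bound means: on any realizable stream, Halving makes at most log2(|H|) mistakes, independent of the target error. The names halving_learner and make_threshold, the class of 17 thresholds, and the example stream are all hypothetical choices made for illustration.

```python
import math


def halving_learner(hypotheses, stream):
    """Run the Halving algorithm on a labeled stream.

    Returns the number of prediction mistakes and the hypotheses that
    remain consistent with every example seen.
    """
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        # Predict with the majority vote of the surviving hypotheses
        # (ties are broken in favor of label 1).
        votes_for_one = sum(h(x) for h in version_space)
        prediction = 1 if 2 * votes_for_one >= len(version_space) else 0
        if prediction != y:
            mistakes += 1
        # Discard every hypothesis that disagrees with the revealed label.
        # After a mistake, at least half of the version space is removed.
        version_space = [h for h in version_space if h(x) == y]
    return mistakes, version_space


def make_threshold(t):
    """Hypothetical concept: label 1 exactly when x >= t."""
    return lambda x: 1 if x >= t else 0


# Illustrative finite class: thresholds 0, 1, ..., 16 (17 hypotheses).
hypotheses = [make_threshold(t) for t in range(17)]
target = make_threshold(11)

# Realizable stream: every label comes from the target concept.
points = [3, 15, 8, 12, 10, 11, 0, 16, 9, 13]
stream = [(x, target(x)) for x in points]

mistakes, survivors = halving_learner(hypotheses, stream)
mistake_bound = math.floor(math.log2(len(hypotheses)))
print(f"mistakes = {mistakes}, bound = {mistake_bound}")
```

The bound depends only on the size of the class, not on the stream length or an accuracy parameter, which is the sense in which a mistake bound can yield error-independent guarantees.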