NeurIPS2022

Algorithms and Hardness for Learning Linear Thresholds from Label Proportions

Rishi Saket

被引用 15 次

摘要

We study the learnability of linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this, the feature-vector classiﬁer is learnt from bags of feature-vectors and their corresponding observed label proportions which are satisﬁed by (i.e., consistent with) some unknown LTF. This problem has been investigated in recent work ([37]) which gave an algorithm to produce an LTF that satisﬁes at least ( 2 / 5 ) -fraction of a satisﬁable collection of bags, each of size  2 , by solving and rounding a natural SDP relaxation. However, this SDP relaxation is speciﬁc to at most 2 -sized bags and does not apply to bags of larger size. In this work we provide a fairly non-trivial SDP relaxation of a non-quadratic formulation for bags of size 3 . We analyze its rounding procedure using novel matrix decomposition techniques to obtain an algorithm which outputs an LTF satisfying at least ( 1 / 12 ) -fraction of the bags of size  3 . We also apply our techniques to bags of size q � 4 to provide a ⌦ ( 1 / q ) -approximation guarantee for a weaker notion of satisﬁability. We include comparative experiments on simulated data demonstrating the applicability of our algorithmic techniques. From the complexity side we provide a hardness reduction to produce instances with bags of any constant size q . Our reduction proves the NP-hardness of satisfying more than ( 1 / q ) + o (1) fraction of a satisﬁable collection of such bags using as hypothesis any function of constantly many LTFs, showing thereby that the problem is harder to approximate as the bag size q increases. Using a strengthened analysis, for q = 2 we obtain a ( 4 / 9 ) + o (1) hardness factor for this problem, improving upon the ( 1 / 2 ) + o (1) factor shown by [37].