NeurIPS2022
Algorithms and Hardness for Learning Linear Thresholds from Label Proportions
Rishi Saket
被引用 15 次
摘要
We study the learnability of linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this, the feature-vector classifier is learnt from bags of feature-vectors and their corresponding observed label proportions which are satisfied by (i.e., consistent with) some unknown LTF. This problem has been investigated in recent work ([37]) which gave an algorithm to produce an LTF that satisfies at least ( 2 / 5 ) -fraction of a satisfiable collection of bags, each of size 2 , by solving and rounding a natural SDP relaxation. However, this SDP relaxation is specific to at most 2 -sized bags and does not apply to bags of larger size. In this work we provide a fairly non-trivial SDP relaxation of a non-quadratic formulation for bags of size 3 . We analyze its rounding procedure using novel matrix decomposition techniques to obtain an algorithm which outputs an LTF satisfying at least ( 1 / 12 ) -fraction of the bags of size 3 . We also apply our techniques to bags of size q � 4 to provide a ⌦ ( 1 / q ) -approximation guarantee for a weaker notion of satisfiability. We include comparative experiments on simulated data demonstrating the applicability of our algorithmic techniques. From the complexity side we provide a hardness reduction to produce instances with bags of any constant size q . Our reduction proves the NP-hardness of satisfying more than ( 1 / q ) + o (1) fraction of a satisfiable collection of such bags using as hypothesis any function of constantly many LTFs, showing thereby that the problem is harder to approximate as the bag size q increases. Using a strengthened analysis, for q = 2 we obtain a ( 4 / 9 ) + o (1) hardness factor for this problem, improving upon the ( 1 / 2 ) + o (1) factor shown by [37].