ICLR2025
Oracle efficient truncated statistics
Konstantinos Karatapanis, Vasilis Kontonis, Christos Tzamos
Abstract
We study the problem of learning from truncated samples: instead of observing samples from some underlying population p * , we observe only the examples that fall in some survival set S ⊂ R d whose probability mass (measured with respect to p * ) is at least α. Assuming membership oracle access to the truncation set S, prior works obtained algorithms for the case where p * is Gaussian Daskalakis et al. (2018) or more generally an exponential family with strongly convex likelihood Lee et al. ( 2023 ) -albeit with a super-polynomial dependency on the (inverse) survival mass 1/α both in terms of runtime and in number of oracle calls to the set S. In this work we design a new learning method with runtime and query complexity polynomial in 1/α. Our result significantly improves over the prior works by focusing on efficiently solving the underlying optimization problem using a general purpose optimization algorithm with minimal assumptions.