STOC2025
Solving the Correlation Cluster LP in Sublinear Time
Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, David Rasmussen Lolck, Alantha Newman, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang
摘要
Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of missing intra-cluster edges. CCL+24 introduced the cluster LP for Correlation Clustering, which they argued captures the problem much more succinctly than previous linear programming formulations. However, the cluster LP has exponential size, with a variable for every possible set of vertices in the input graph. Nevertheless, CCL+24 showed how to find a feasible solution for the cluster LP in time with objective value at most times the value of an optimal solution for the respective Correlation Clustering instance. Furthermore, they showed how to round a solution to the cluster LP, yielding a -approximation algorithm for the Correlation Clustering problem. The main technical result of this paper is a new approach to find a feasible solution for the cluster LP with objective value at most of the optimum in time , where is the number of vertices in the graph. We also show how to implement the rounding within the same time bounds, thus achieving a fast -approximation algorithm for the Correlation Clustering problem. This bridges the gap between state-of-the-art methods for approximating Correlation Clustering and the recent focus on fast algorithms.