ICML2025
Faster Approximation Algorithms for k-Center via Data Reduction
Arnold Filtser, Shaofeng H.-C. Jiang, Yi Li, Anurag Murty Naredla, Ioannis Psarros, Qiaoyuan Yang, Qin Zhang
Abstract
We study efficient algorithms for the Euclidean k-Center problem, focusing on the regime of large k. We take the approach of data reduction by considering α-coreset, which is a small subset S of the dataset P such that any β-approximation on S is an (α + β)-approximation on P . We give efficient algorithms to construct coresets whose size is k • o(n), which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time O(1)-approximation when k = n c for any 0 < c < 1. We validate the performance of our coresets on real-world datasets with large k, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to 4 times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high dimensional Euclidean spaces.