NeurIPS2024
A Combinatorial Algorithm for the Semi-Discrete Optimal Transport Problem
Pankaj K. Agarwal, Sharath Raghvendra, Pouyan Shirzadian, Keegan Yao
Abstract
Optimal Transport (OT, also known as the Wasserstein distance) is a popular metric for comparing probability distributions and has been successfully used in many machine-learning applications. In the semi-discrete 2 -Wasserstein problem, we wish to compute the cheapest way to transport all the mass from a continuous distribution µ to a discrete distribution ν in R d for d ≥ 1 , where the cost of transporting unit mass between points a and b is d ( a, b ) = ∥ a − b ∥ 2 . When both distributions are discrete, a simple combinatorial framework has been used to find the exact solution (see e.g. [Orlin, STOC 1988]). In this paper, we propose a combinatorial framework for the semi-discrete OT, which can be viewed as an extension of the combinatorial framework for the discrete OT but requires several new ideas. We present a new algorithm that given µ and ν in R 2 and a parameter ε > 0 , computes an ε -additive approximate semi-discrete transport plan in O ( n 4 log n log 1 ε ) time (in the worst case), where n is the support-size of the discrete distribution ν and we assume that the mass of µ inside a triangle can be computed in O (1) time. Our algorithm is significantly faster than the known algorithms, and unlike many numerical algorithms, it does not make any assumptions on the smoothness of µ . As an application of our algorithm, we describe a data structure to store a large discrete distribution µ (with support size N ) using O ( N ) space so that, given a query discrete distribution ν (with support size k ), an ε -additive approximate transport plan can be computed in O ( k 3 √ N log 1 ε ) time in 2 dimensions. Our algorithm and data structure extend to higher dimensions as well as to p -Wasserstein problem for any p ≥ 1 .