NeurIPS 2022

Coresets for Wasserstein Distributionally Robust Optimization Problems

Ruomin Huang, Jiawei Huang, Wenjie Liu, Hu Ding

10 citations

Abstract

Wasserstein distributionally robust optimization (WDRO) is a popular model to enhance the robustness of machine learning with ambiguous data. However, the complexity of WDRO can be prohibitive in practice since solving its "minimax" formulation requires a great amount of computation. Recently, several fast WDRO training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, to the best of our knowledge, research on designing efficient algorithms for general large-scale WDRO problems is still quite limited. A *coreset* is an important tool for compressing large datasets, and thus it has been widely applied to reduce the computational complexity of many optimization problems. In this paper, we introduce a unified framework to construct an $\epsilon$-coreset for general WDRO problems. Though it is challenging to obtain a conventional coreset for WDRO due to the uncertainty of ambiguous data, we show that we can compute a "dual coreset" by using the strong duality property of WDRO. Moreover, the error introduced by the dual coreset can be theoretically guaranteed for the original WDRO objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of WDRO. Finally, we implement our coreset approach and illustrate its effectiveness for several WDRO problems in the experiments.
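
For orientation, the "minimax" formulation and the strong duality mentioned above are commonly written as follows in the WDRO literature. The notation is an assumption made for illustration and is not taken from the abstract: $\mathbb{P}_n$ is the empirical distribution of the data points $\xi_1, \dots, \xi_n$, $\ell(\theta; \xi)$ is the loss, $W_p$ is the order-$p$ Wasserstein distance with ground metric $d$, and $\sigma$ is the radius of the ambiguity set.

$$
\min_{\theta} \; \sup_{\mathbb{P} \,:\, W_p(\mathbb{P}, \mathbb{P}_n) \le \sigma} \; \mathbb{E}_{\xi \sim \mathbb{P}} \big[ \ell(\theta; \xi) \big]
$$

Under mild regularity conditions on $\ell$, the inner supremum has the known dual form

$$
\sup_{\mathbb{P} \,:\, W_p(\mathbb{P}, \mathbb{P}_n) \le \sigma} \mathbb{E}_{\xi \sim \mathbb{P}} \big[ \ell(\theta; \xi) \big]
= \inf_{\lambda \ge 0} \Big\{ \lambda \sigma^p + \frac{1}{n} \sum_{i=1}^{n} \sup_{\xi} \big[ \ell(\theta; \xi) - \lambda \, d(\xi, \xi_i)^p \big] \Big\}.
$$

The dual objective is an average of $n$ per-sample terms, which is the kind of structure a weighted subset of the data can approximate; this is one plausible reading of why the coreset is built on the dual side rather than on the original objective.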