CCS2025

Systematic Assessment of Tabular Data Synthesis

Yuntao Du, Ninghui Li

2 citations

Abstract

Data synthesis has been advocated as an important approach for utilizing data while protecting data privacy. In recent years, a plethora of tabular data synthesis algorithms (i.e., synthesizers) have been proposed. Some synthesizers satisfy Differential Privacy, while others aim to provide privacy in a heuristic fashion. A comprehensive understanding of the strengths and weaknesses of these synthesizers remains elusive due to drawbacks in evaluation metrics and missing head-to-head comparisons of newly developed synthesizers that take advantage of diffusion models and large language models with state-of-the-art statistical synthesizers.