WWW2024

A Fast Similarity Matrix Calibration Method with Incomplete Query

Changyi Ma, Runsheng Yu, Youzhi Zhang

被引用 2 次

摘要

The similarity matrix is at the core of similarity search problems. However, incomplete observations are ubiquitous in real scenarios making the similarity matrix less accurate. To estimate a high-quality similarity matrix, one popular trend is to impute the missing values into the vectors directly, which provides a simple and highly efficient way to recover the similarity matrix. However, these methods lack of theoretical guarantee due to ignoring the entire similarity matrix property directly. In this paper, based on the key insight that the similarity matrix is symmetric and enjoys the positive semi-definiteness (PSD) property, we proposed a novel similarity matrix calibration method, which is scalable, adaptive, and sound. Specifically, we first show the similarity matrix provably holds the PSD property as the constraint. Then, we proposed a parallel matrix calibration method to estimate the similarity matrix to approximate the unknown fully observed ground-truth similarity matrix. Further, we discover its factored form which bypasses the computation of singular values and allows fast optimization by general optimization algorithm. Stable recovery and convergence are guaranteed. Extensive similarity matrix calibration experiments on the real-world dataset demonstrated that the proposed method obtains superior performance while being the fastest in comparison to baseline methods. CCS CONCEPTS • Computing methodologies → Machine learning methodes.