SIGMOD 2025

Rainbow: Risk-aware Index Benefit Estimation Facing Out Of Distribution Workloads

Kecheng Luo, Ruiyang Ma, Peng Cai

Abstract

Index tuning is crucial for optimizing database performance, and selecting the optimal index configuration relies on accurately estimating the benefit of each candidate index. However, existing learning-based approaches face key limitations, including limited accuracy, overfitting risks, and a lack of estimation confidence, which hinder practical deployment. In particular, when a new workload follows a different distribution (i.e., it is "out-of-distribution", OOD), previously trained models become inaccurate, and blind trust in their estimates can lead to severe performance regression. This paper therefore proposes Rainbow, a novel framework that estimates index benefits more accurately in in-distribution scenarios while also providing uncertainty quantification to adapt to OOD workloads. We propose a graph-based encoder built on a Graph Transformer for expressive and efficient feature encoding; it directly leverages the structural information of the original query plan to capture both the global and local impact of indexes. Unlike existing deterministic estimators, which are designed to entirely replace the what-if caller, we employ a Bayesian neural network (BNN) whose probabilistic nature enhances robustness and whose weight distributions provide principled uncertainty quantification. This allows our BNN estimator to fall back to the what-if caller when necessary, avoiding performance regression. Extensive experiments demonstrate that Rainbow not only outperforms state-of-the-art estimators in accuracy and in index tuning quality when integrated with index advisors, but is also robust to OOD workloads.
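The fallback idea can be illustrated with a minimal sketch. It is not Rainbow's implementation: the abstract does not say how uncertainty is measured or thresholded, so the code assumes that uncertainty is the standard deviation of repeated stochastic predictions, and the names `estimate_with_fallback`, `sample_benefit`, `what_if_benefit`, `n_samples`, and `max_std` are all hypothetical.

```python
import statistics
from typing import Callable, Tuple


def estimate_with_fallback(
    sample_benefit: Callable[[], float],
    what_if_benefit: Callable[[], float],
    n_samples: int = 30,
    max_std: float = 0.05,
) -> Tuple[float, str]:
    """Return (estimated benefit, source of the estimate).

    sample_benefit:  hypothetical stand-in for one stochastic forward pass
                     of a probabilistic estimator (e.g. one draw of BNN weights).
    what_if_benefit: hypothetical stand-in for the optimizer's what-if call.
    max_std:         assumed uncertainty threshold; not taken from the paper.
    """
    # Draw several predictions; their spread is used as the uncertainty proxy.
    samples = [sample_benefit() for _ in range(n_samples)]
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)

    # High spread suggests the query may be out of distribution,
    # so defer to the more expensive what-if call.
    if std > max_std:
        return what_if_benefit(), "what-if"
    return mean, "model"


if __name__ == "__main__":
    # A confident model: every sampled prediction agrees.
    print(estimate_with_fallback(lambda: 0.5, lambda: 0.2))
```

The point of the design is that the what-if call is paid only for queries on which the model's predictions disagree, while confident predictions are served by the model alone.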