KDD 2025

CODA: Temporal Domain Generalization via Concept Drift Simulator

Chia-Yuan Chang, Yu-Neng Chuang, Zhimeng Jiang, Kwei-Herng Lai, Anxiao Jiang, Na Zou

Abstract

Machine learning models in real-world applications often suffer performance degradation due to data distribution shifts. Temporal domain generalization aims to adapt models to this "concept drift" so that they maintain their performance on future data. Existing works rely on model-centric training strategies, which may require extensive interaction between data and model to train the model appropriately for distribution shifts. To this end, we aim to nip the problem in the bud by generating future domain data for model training, naturally bypassing the cumbersome interaction between data and model. We propose the COncept Drift simulAtor (CODA) framework, which incorporates a predicted feature correlation matrix to simulate future data for model training. Specifically, the feature correlation matrix serves as a proxy that represents data characteristics at each time point and as the trigger for future data generation. Experimental results demonstrate that using CODA-generated data as training input effectively achieves temporal domain generalization across different model architectures with strong transferability.
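To make the idea of simulating future data from a predicted feature correlation matrix concrete, the following is a minimal illustrative sketch and not the CODA implementation. It assumes, purely for illustration, that the future correlation matrix is obtained by entrywise linear extrapolation over past time points and that future samples are drawn from a multivariate Gaussian; the abstract specifies neither choice. All function names (correlation_matrix, predict_next_correlation, simulate_future_data) are hypothetical.

```python
import numpy as np


def correlation_matrix(X):
    """Pearson correlation between the feature columns of X (n_samples, n_features)."""
    return np.corrcoef(X, rowvar=False)


def predict_next_correlation(corr_history):
    """Extrapolate each matrix entry one time step ahead with a linear fit.

    Assumption: linear extrapolation is only a stand-in for whatever
    predictor the paper uses. Needs at least two past matrices.
    """
    T = len(corr_history)
    t = np.arange(T)
    stacked = np.stack(corr_history).reshape(T, -1)    # (T, d*d)
    slope, intercept = np.polyfit(t, stacked, deg=1)   # one fit per entry
    pred = (slope * T + intercept).reshape(corr_history[0].shape)

    # Repair the extrapolated matrix so it is a valid correlation matrix:
    # symmetrize, clip negative eigenvalues, rescale to a unit diagonal.
    pred = (pred + pred.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(pred)
    eigvals = np.clip(eigvals, 1e-6, None)
    pred = (eigvecs * eigvals) @ eigvecs.T
    scale = np.sqrt(np.diag(pred))
    return pred / np.outer(scale, scale)


def simulate_future_data(corr, mean, std, n_samples, rng):
    """Draw Gaussian samples whose correlation structure follows `corr`.

    Assumption: a Gaussian sampler is used here only for illustration.
    """
    cov = corr * np.outer(std, std)
    return rng.multivariate_normal(mean, cov, size=n_samples)
```

Under these assumptions, the simulated samples would stand in for not-yet-observed future domain data when training a downstream model, which is the role the abstract assigns to CODA-generated data.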