NeurIPS 2024

Towards Reliable Model Selection for Unsupervised Domain Adaptation: An Empirical Study and A Certified Baseline

Dapeng Hu, Romy Luo, Jian Liang, Chuan Sheng Foo

Abstract

Selecting appropriate hyperparameters is crucial for unlocking the full potential of advanced unsupervised domain adaptation (UDA) methods in unlabeled target domains. Although this challenge remains under-explored, it has recently garnered increasing attention with the proposals of various model selection methods. Reliable model selection should maintain performance across diverse UDA methods and scenarios, especially avoiding highly risky worst-case selections, that is, selecting the model or hyperparameter with the worst performance in the pool. Are existing model selection methods reliable and versatile enough for different UDA tasks? In this paper, we provide a comprehensive empirical study involving 8 existing model selection approaches to answer this question. Our evaluation spans 12 UDA methods across 5 diverse UDA benchmarks and 5 popular UDA scenarios. Surprisingly, we find that none of these approaches can effectively avoid the worst-case selection. In contrast, a simple but overlooked ensemble-based selection approach, which we call EnsV, is both theoretically and empirically certified to avoid the worst-case selection, ensuring high reliability. Additionally, EnsV is versatile for various practical but challenging UDA scenarios, including validation of open-partial-set UDA and source-free UDA. Finally, we call for more attention to the reliability of model selection in UDA: avoiding the worst-case is as significant as achieving peak selection performance and should not be overlooked when developing new model selection methods. Code is available in the supplementary materials.
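
To make the idea of ensemble-based selection concrete, the following is a minimal sketch rather than the released implementation. The abstract does not specify how the ensemble is built or how candidates are scored, so the sketch assumes one plausible reading: average the candidates' class probabilities on the unlabeled target data, take the ensemble's predicted labels as a reference, and select the candidate that agrees with them most often. All function names and the toy probabilities are hypothetical.

```python
from typing import List, Sequence, Tuple

# One model's output: a list of per-sample class-probability vectors.
Probs = Sequence[Sequence[float]]


def argmax(row: Sequence[float]) -> int:
    """Index of the largest entry in a probability vector."""
    return max(range(len(row)), key=lambda c: row[c])


def ensemble_probs(candidates: Sequence[Probs]) -> List[List[float]]:
    """Average class probabilities over all candidates, sample by sample."""
    n_models = len(candidates)
    n_samples = len(candidates[0])
    n_classes = len(candidates[0][0])
    return [
        [sum(m[i][c] for m in candidates) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def select_by_ensemble_agreement(
    candidates: Sequence[Probs],
) -> Tuple[int, List[float]]:
    """Assumed scoring rule: the fraction of target samples on which a
    candidate's predicted label matches the ensemble's predicted label.
    Returns the index of the best-scoring candidate and all scores."""
    ens_labels = [argmax(row) for row in ensemble_probs(candidates)]
    scores = []
    for model in candidates:
        agree = sum(argmax(row) == y for row, y in zip(model, ens_labels))
        scores.append(agree / len(ens_labels))
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best, scores


# Toy pool: three hypothetical candidates, four target samples, two classes.
model_a = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]]
model_b = [[0.7, 0.3], [0.6, 0.4], [0.4, 0.6], [0.6, 0.4]]
model_c = [[0.4, 0.6], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]]

best, scores = select_by_ensemble_agreement([model_a, model_b, model_c])
print(best, scores)
```

In this toy pool the candidate that disagrees most with the ensemble receives the lowest score and is not selected. The sketch requires no target labels and no source data, only the candidates' predictions on the target set.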