NeurIPS2023
Importance-aware Co-teaching for Offline Model-based Optimization
Ye Yuan, Can Chen, Zixuan Liu, Willie Neiswanger, Xue (Steve) Liu
被引用 37 次
摘要
Offline model-based optimization aims to find a design that maximizes a property of interest using only an offline dataset, with applications in robot, protein, and molecule design, among others. A prevalent approach is gradient ascent, where a proxy model is trained on the offline dataset and then used to optimize the design. This method suffers from an out-of-distribution issue, where the proxy is not accurate for unseen designs. To mitigate this issue, we explore using a pseudo-labeler to generate valuable data for fine-tuning the proxy. Specifically, we propose Importance-aware Co-Teaching for Offline Model-based Optimization (ICT). This method maintains three symmetric proxies with their mean ensemble as the final proxy, and comprises two steps. The first step is pseudo-label-driven co-teaching. In this step, one proxy is iteratively selected as the pseudo-labeler for designs near the current optimization point, generating pseudo-labeled data. Subsequently, a co-teaching process identifies small-loss samples as valuable data and exchanges them between the other two proxies for fine-tuning, promoting knowledge transfer. This procedure is repeated three times, with a different proxy chosen as the pseudo-labeler each time, ultimately enhancing the ensemble performance. To further improve accuracy of pseudo-labels, we perform a secondary step of meta-learning-based sample reweighting, which assigns importance weights to samples in the pseudo-labeled dataset and updates them via meta-learning. ICT achieves state-of-the-art results across multiple design-bench tasks, achieving the best mean rank of and median rank of , among methods. Our source code can be found here.