CVPR 2025

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr, Federico Tombari, Bernt Schiele

Abstract

Figure 1. Test-time visual in-context tuning (VICT) on six representative vision tasks under distribution shifts. We benchmark the robustness of visual in-context learning (VICL) with the 15 common corruptions adopted in [23, 31], and report performance averaged across all corruptions. Existing VICL models such as Painter generalize poorly to unseen domains when the task prompts come from the training distribution (i.e., zero-shot). Performance is even worse when the task prompts come from the test distribution (i.e., one-shot). Performing VICT at test time significantly improves Painter in both the zero-shot and one-shot settings.