ICLR 2023

AnyDA: Anytime Domain Adaptation

Omprakash Chakraborty, Aadarsh Sahoo, Rameswar Panda, Abir Das

Abstract

Unsupervised domain adaptation is an open and challenging problem in computer vision. While existing methods show encouraging results in addressing cross-domain distribution shift on common benchmarks, they are often limited to testing under a specific target setting. This can limit their impact in many real-world applications that present different resource constraints. In this paper, we introduce a simple yet effective framework for anytime domain adaptation that is executable with dynamic resource constraints to achieve accuracy-efficiency trade-offs under domain shifts. We achieve this by training a single shared network using both labeled source and unlabeled target data, with depth, width, and input resolution switchable on the fly to enable testing under a wide range of computation budgets. Starting with a teacher network trained on a label-rich source domain, we use bootstrapped recursive knowledge distillation as a nexus between source and target domains to jointly train the student network with switchable subnetworks. Extensive experiments on several diverse benchmark datasets demonstrate the superiority of our proposed approach over state-of-the-art methods.

Recently, anytime prediction methods (Cai et al., 2019; Huang et al., 2018; Jie et al., 2019), which train a network to carry out inference under varying budget constraints, have witnessed great success in many vision tasks. However, all these methods assume that the models are trained and tested on data coming from the same fixed distribution, and they generalize poorly when the two data distributions differ. The twin goals of aligning two domains and operating at different constrained computation budgets bring in additional challenges for anytime domain adaptation.
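As a rough illustration of the two ingredients named above (switchable depth, width, and input resolution, and distillation from a teacher to student subnetworks), the following minimal sketch may help. It is not the authors' implementation. The switch values, the cost proxy, and the function names (`subnetwork_configs`, `relative_cost`, `pick_config`, `distillation_loss`) are all hypothetical, and the loss is a generic temperature-softened KL divergence rather than the specific bootstrapped recursive distillation objective.

```python
import itertools
import math

# Hypothetical switch values; the text does not list the actual choices.
DEPTHS = (0.5, 1.0)                # fraction of layers kept
WIDTHS = (0.9, 1.0)                # fraction of channels kept
RESOLUTIONS = (128, 160, 192, 224) # input side length in pixels


def subnetwork_configs():
    """Enumerate every (depth, width, resolution) combination."""
    return list(itertools.product(DEPTHS, WIDTHS, RESOLUTIONS))


def relative_cost(depth, width, resolution, full_resolution=224):
    """Assumed cost proxy relative to the full network: linear in depth,
    quadratic in width and in input resolution."""
    return depth * width ** 2 * (resolution / full_resolution) ** 2


def pick_config(budget):
    """Return the costliest configuration whose proxy cost fits the budget."""
    feasible = [c for c in subnetwork_configs() if relative_cost(*c) <= budget]
    if not feasible:
        raise ValueError("no subnetwork fits the requested budget")
    return max(feasible, key=lambda c: relative_cost(*c))


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Generic KL(teacher || student) on softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Clamp q to avoid log(0) when a student probability underflows.
    return sum(pi * math.log(pi / max(qi, 1e-12))
               for pi, qi in zip(p, q) if pi > 0)
```

Under these assumptions, `pick_config(budget)` selects which subnetwork to run at test time for a given fraction of the full network's cost, while `distillation_loss` is the kind of term that could pull each student subnetwork's predictions on unlabeled target images toward the teacher's.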
To this end, we propose a simple yet effective method for anytime domain adaptation, called AnyDA, by considering domain alignment in addition to varying both network (width and depth) and input (resolution) scales to enable testing under a wide range of computation budgets. Such variation