CVPR2023

FlexiViT: One Model for All Patch Sizes

Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetic

Abstract

Figure 1. FlexiViT is a standard ViT model that sees randomized patch sizes, hence sequence lengths, during training. The patch embedding weights are resized adaptively for each patch size and the model weights are shared as-is across all patch sizes.