ICLR 2022

Omni-Dimensional Dynamic Convolution

Chao Li, Aojun Zhou, Anbang Yao

Cited by 408

Abstract

Learning a single static convolutional kernel¹ in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs). Instead, recent research in dynamic convolution shows that learning a linear combination of n convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs, while maintaining efficient inference. However, we observe that existing works endow convolutional kernels with the dynamic property through one dimension (regarding the convolutional kernel number) of the kernel space, but the other three dimensions (regarding the spatial size, the input channel number and the output channel number for each convolutional kernel) are overlooked. Inspired by this, we present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design, to advance this line of research. ODConv leverages a novel multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions for convolutional kernels along all four dimensions of the kernel space at any convolutional layer. As a drop-in replacement for regular convolutions, ODConv can be plugged into many CNN architectures. Extensive experiments on the ImageNet and MS-COCO datasets show that ODConv brings solid accuracy boosts for various prevailing CNN backbones including both light-weight and large ones, e.g., 3.77%∼5.71%|1.86%∼3.72% absolute top-1 improvements to the MobileNetV2|ResNet family on the ImageNet dataset. Intriguingly, thanks to its improved feature learning ability, ODConv with even a single kernel can compete with or outperform existing dynamic convolution counterparts with multiple kernels, substantially reducing extra parameters. Furthermore, ODConv is also superior to other attention modules for modulating the output features or the convolutional weights. Code and models are available at https://github.com/OSVAI/ODConv.
* This work was done when Chao Li was an intern at Intel Labs China, supervised by Anbang Yao, who proposed the original idea and led the writing of the paper.
† Corresponding author.
¹ Here, we follow the definitions in (Yang et al., 2019; Chen et al., 2020), where a convolutional kernel refers to the filter set of a convolutional layer.
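
To make the four-dimensional attention described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the linked repository for that). It assumes a global-average-pooled context vector, one linear map per attention head (the hypothetical `heads` dictionary), sigmoid gates for the spatial, input-channel and output-channel attentions, and a softmax over the n kernels. These choices, the function name `odconv_sketch`, and all shapes are assumptions made for illustration; the abstract only states that four complementary attentions are learned in parallel and applied to the kernels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def odconv_sketch(x, kernels, heads):
    """Illustrative omni-dimensional dynamic convolution for one input.

    x       : (c_in, H, W) input feature map
    kernels : (n, c_out, c_in, k, k) bank of n convolutional kernels
    heads   : dict of weight matrices (hypothetical, one per attention)
    Returns the output map and the aggregated input-dependent kernel.
    """
    n, c_out, c_in, k, _ = kernels.shape

    # Input-dependent context: global average pooling over space.
    ctx = x.mean(axis=(1, 2))  # (c_in,)

    # Four attentions computed in parallel from the same context,
    # each reshaped to broadcast along one dimension of the kernel space.
    a_s = sigmoid(heads["spatial"] @ ctx).reshape(1, 1, 1, k, k)
    a_c = sigmoid(heads["in_ch"] @ ctx).reshape(1, 1, c_in, 1, 1)
    a_f = sigmoid(heads["out_ch"] @ ctx).reshape(1, c_out, 1, 1, 1)
    a_w = softmax(heads["kernel"] @ ctx).reshape(n, 1, 1, 1, 1)

    # Modulate every kernel along all four dimensions, then sum over n.
    w = (a_w * a_f * a_c * a_s * kernels).sum(axis=0)  # (c_out, c_in, k, k)

    # Plain "valid" cross-correlation with stride 1 using the aggregated kernel.
    H, W = x.shape[1:]
    out = np.zeros((c_out, H - k + 1, W - k + 1))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = x[:, i:i + k, j:j + k]  # (c_in, k, k)
            out[:, i, j] = np.tensordot(w, patch, axes=3)
    return out, w

rng = np.random.default_rng(0)
n, c_out, c_in, k = 4, 8, 3, 3
x = rng.normal(size=(c_in, 16, 16))
kernels = rng.normal(size=(n, c_out, c_in, k, k))
heads = {
    "spatial": rng.normal(size=(k * k, c_in)),
    "in_ch": rng.normal(size=(c_in, c_in)),
    "out_ch": rng.normal(size=(c_out, c_in)),
    "kernel": rng.normal(size=(n, c_in)),
}
y, w = odconv_sketch(x, kernels, heads)
print(y.shape)  # (8, 14, 14)
```

Setting n = 1 in this sketch corresponds to the single-kernel case mentioned in the abstract: the kernel-wise attention becomes a constant 1, and the remaining three attentions still make the kernel input-dependent. Earlier dynamic convolution designs, as characterized in the abstract, would keep only `a_w`.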