CVPR 2023

Mask-Free Video Instance Segmentation

Lei Ke, Martin Danelljan, Henghui Ding, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Figure 1. Video instance segmentation (VIS) results of our MaskFreeVIS, trained without any video or image mask annotations. With a ResNet-50 backbone, it achieves 42.5% mask AP on the YouTube-VIS validation set, demonstrating that high-performing VIS can be learned without any mask annotations.