ICCV 2023

Label-Efficient Online Continual Object Detection in Streaming Video

Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou

24 citations

Abstract

Humans can watch a continuous video stream and effortlessly acquire and transfer new knowledge with minimal supervision while retaining previously learnt experiences. In contrast, existing continual learning (CL) methods require fully annotated labels to learn effectively from individual frames in a video stream. Here, we examine a more realistic and challenging problem: Label-Efficient Online Continual Object Detection (LEOCOD) in streaming video. We propose a plug-and-play module, Efficient-CLS, that can be easily inserted into existing CL algorithms for object detection in video streams and consistently improves them while reducing data annotation costs and model retraining time. We show that our method achieves significant improvement with minimal forgetting across all supervision levels on two challenging CL benchmarks for streaming real-world videos. Remarkably, with only 25% of video frames annotated, our proposed method still outperforms state-of-the-art CL models trained with 100% annotations on all video frames. The data and source code will be publicly available at https://github.com/showlab/Efficient-CLS.