CVPR 2025

VIRES: Video Instance Repainting via Sketch and Text Guided Generation

Shuchen Weng, Haojie Zheng, Peixuan Zhang, Yuchen Hong, Han Jiang, Si Li, Boxin Shi

Abstract

Figure 1. Our VIRES model demonstrates powerful video editing capabilities with sketch and text guidance, as shown in four typical scenarios: (a) Repainting the color and style of the man's shirt. (b) Replacing the pickup truck with the dark-colored SUV. (c) Generating a running corgi within a video clip. (d) Removing a specified football from a video clip.

Text prompts shown in the figure panels (column labels: Input, Result):
(a) Video instance repainting: "The man is dressed in a blue shirt, walking in the park."
(b) Video instance replacement: "A dark-colored SUV is seen driving on the curve of the road, away from the camera."
(c) Custom instance generation: "A corgi, with its orange and white fur, runs towards the camera."
(d) "A football field with a brown-green graffiti wall as the background."