CVPR 2024

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag

Abstract

[Figure 1 panels: two input videos and their edited results. Edit prompts shown: "a white cat", "an ancient Egyptian pharaoh is typing", "a bear", "a zombie", "a dinosaur", "a man wearing a glitter jacket is typing".]

Figure 1. RAVE is a lightweight and fast video editing method that enhances temporal consistency in video edits, utilizing pre-trained text-to-image diffusion models. It is capable of modifying local attributes, like changing a person's jacket (bottom right), and can also handle complex shape transformations, such as turning a wolf into a dinosaur (bottom left).