CVPR 2025

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri

Abstract

Eight equally spaced frames from 24-frame GIFs generated by our EIDT-V model. The top row shows SD3 Medium [8] results for the prompt "A peacock displaying its feathers". The bottom row shows SDXL [32] results for the prompt "A child blowing bubbles that float and pop gently". These examples highlight the model's ability to generate high-quality videos with semantic and temporal coherence.