AAAI 2025

EditBoard: Towards a Comprehensive Evaluation Benchmark for Text-Based Video Editing Models

Yupeng Chen, Penglin Chen, Xiaoyu Zhang, Yixian Huang, Qian Xie


Abstract

The rapid development of diffusion models has significantly advanced AI-generated content (AIGC), particularly in Text-to-Image (T2I) and Text-to-Video (T2V) generation. Text-based video editing, which leverages these generative capabilities, has emerged as a promising field, enabling precise modifications to video content based on textual prompts. Despite the proliferation of innovative video editing models, there is a conspicuous lack of comprehensive evaluation frameworks that holistically assess these models' performance across various dimensions. Existing metrics are limited and inconsistent, and they assign a single score per metric, which fails to reveal a model's performance on each editing task. To address this gap, we propose EditBoard, the first comprehensive evaluation benchmark for text-based video editing models. EditBoard encompasses nine automatic metrics across four key dimensions, evaluates models on four categories of tasks, and introduces three new metrics to assess fidelity. This task-oriented framework facilitates objective evaluation by breaking model performance down in detail, providing insights into each model's strengths and weaknesses. By open-sourcing EditBoard, we aim to standardize evaluation and advance the development of robust video editing models.