WWW2026

AdaQE-CG: Adaptive Query Expansion for Web-Scale Generative AI Model and Data Card Generation

Haoxuan Zhang, Ruochi Li, Zhenni Liang, Mehri Sattari, Phat Vo, Collin Qu, Ting Xiao, Junhua Ding, Yang Zhang, Haihua Chen

Abstract

Transparent and standardized documentation is essential for building trustworthy generative AI (GAI) systems. However, current automated model and data card generation methods still face three key challenges: (i) Static templates. Most systems rely on fixed query templates that cannot adapt to diverse paper structures or evolving documentation requirements. (ii) Information scarcity. Web-scale repositories such as Hugging Face often provide incomplete or inconsistent metadata, resulting in missing or noisy information. (iii) Lack of benchmarks. The absence of standardized datasets and evaluation protocols prevents fair and reproducible assessment of documentation quality. To address these challenges, we propose AdaQE-CG (Adaptive Query Expansion for Card Generation), a framework that integrates dynamic information extraction with cross-card knowledge transfer. Its Intra-Paper Extraction via Context-Aware Query Expansion (IPE-QE) module iteratively refines extraction queries to capture richer and more complete information from scientific papers and repositories. Its Inter-Card Completion using the MetaGAI Pool (ICC-MP) module fills in missing fields by transferring semantically relevant content from similar cards within a curated dataset. In addition, we construct MetaGAI-Bench, the first large-scale, expert-annotated benchmark for evaluating GAI documentation. Comprehensive experiments across five quality dimensions demonstrate that AdaQE-CG significantly outperforms existing approaches, surpasses human-authored data cards, and approaches human-level quality for model cards. Code, prompts, and data are publicly available at: https://github.com/haoxuan-unt2024/AdaQE-CG.
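To make the two-stage idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes simple token overlap in place of whatever retrieval and LLM prompting the framework actually uses, and every name in it (`expand_and_extract`, `complete_card`, the `rounds`, `top_k`, and `new_terms` parameters, and the card field names) is hypothetical. The first function imitates iterative query expansion within a paper; the second imitates filling empty card fields from the most similar card in a pool.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query, passage):
    """Fraction of distinct query tokens that also occur in the passage."""
    q, p = set(tokens(query)), set(tokens(passage))
    return len(q & p) / len(q) if q else 0.0


def expand_and_extract(query, passages, rounds=2, top_k=1, new_terms=2):
    """Toy intra-paper step (assumption): retrieve the best passages,
    append frequent unseen terms from them to the query, and repeat."""
    hits = []
    for _ in range(rounds):
        ranked = sorted(passages, key=lambda p: overlap(query, p), reverse=True)
        hits = ranked[:top_k]
        seen = set(tokens(query))
        counts = Counter(
            t for h in hits for t in tokens(h) if t not in seen and len(t) > 3
        )
        extra = " ".join(t for t, _ in counts.most_common(new_terms))
        query = (query + " " + extra).strip()
    return query, hits


def complete_card(card, pool):
    """Toy inter-card step (assumption): copy each empty field from the
    most similar pool card, ranked by Jaccard similarity of filled text."""

    def filled_text(c):
        return " ".join(v for v in c.values() if v)

    def jaccard(a, b):
        ta, tb = set(tokens(a)), set(tokens(b))
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    donors = sorted(
        pool,
        key=lambda c: jaccard(filled_text(card), filled_text(c)),
        reverse=True,
    )
    completed = dict(card)
    for field, value in card.items():
        if value:
            continue  # keep information already extracted from the paper
        for donor in donors:
            if donor.get(field):
                completed[field] = donor[field]
                break
    return completed
```

In this sketch, content extracted from the paper itself is never overwritten; the pool is consulted only for fields that remain empty, which mirrors the order of the two modules as described in the abstract.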