ICLR 2025
Improving Data Efficiency via Curating LLM-Driven Rating Systems
Jinlong Pang, Jiaheng Wei, Ankit Shah, Zhaowei Zhu, Yaxuan Wang, Chen Qian, Yang Liu, Yujia Bao, Wei Wei
Abstract
Recent studies challenge the general data scaling law, indicating that most knowledge is acquired during pre-training. The new consensus: data quality matters far more than quantity.