ACL 2025

DynaQuest: A Dynamic Question Answering Dataset Reflecting Real-World Knowledge Updates

Qian Lin, Junyi Li, Hwee Tou Ng

Cited by 5

Abstract

The rapidly changing nature of real-world information presents challenges for large language models (LLMs), which are typically trained on static datasets. This limitation makes it difficult for LLMs to accurately perform tasks that require up-to-date knowledge, such as time-sensitive question answering (QA). In this paper, we introduce DynaQuest, a Dynamic Question answering dataset reflecting knowledge updates in the real world. DynaQuest is based on Wikipedia Infoboxes, which are frequently updated to reflect real-world changes. Our dataset is created by automatically identifying and comparing changes between different versions of Wikipedia pages and generating question-answer pairs based on these updates. To address the challenges posed by our dynamic dataset, we propose CARL, a Context-Aware Reinforcement Learning framework to improve the performance of LLMs on time-sensitive question answering. We conduct experiments on our collected dataset across recent time periods and demonstrate the effectiveness of our approach. Furthermore, we maintain a dynamic knowledge updating process, providing a periodically evolving benchmark to continually evaluate LLMs' ability to answer time-sensitive questions.
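The dataset construction idea described above (compare two versions of an Infobox, then turn each changed field into a question-answer pair) can be illustrated with a minimal sketch. This is not the authors' pipeline: the names `QAPair`, `diff_infobox`, and `make_qa_pairs`, the question template, and the toy Infobox values are all hypothetical, and the sketch assumes each Infobox has already been parsed into a flat field-to-value mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One question with its outdated and updated answers (hypothetical schema)."""
    question: str
    old_answer: str
    new_answer: str


def diff_infobox(old: dict[str, str], new: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return fields present in both snapshots whose values differ."""
    return {
        key: (old[key], new[key])
        for key in old.keys() & new.keys()
        if old[key].strip() != new[key].strip()
    }


def make_qa_pairs(entity: str, old: dict[str, str], new: dict[str, str]) -> list[QAPair]:
    """Turn each changed field into a templated question (assumed template)."""
    changes = diff_infobox(old, new)
    return [
        QAPair(
            question=f"What is the {field.replace('_', ' ')} of {entity}?",
            old_answer=before,
            new_answer=after,
        )
        for field, (before, after) in sorted(changes.items())
    ]


# Toy, made-up snapshots of one Infobox at two points in time.
old_snapshot = {"head_coach": "Coach A", "ground": "Example Park"}
new_snapshot = {"head_coach": "Coach B", "ground": "Example Park"}

for pair in make_qa_pairs("Example FC", old_snapshot, new_snapshot):
    print(pair.question, "|", pair.old_answer, "->", pair.new_answer)
```

Only fields that appear in both snapshots are compared here; how added or removed fields, formatting-only edits, and vandalism are handled is left open, since the abstract does not specify it.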