ACL2025
UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions
Chuanyuan Tan, Wenbiao Shao, Hao Xiong, Tong Zhu, Zhenhua Liu, Kai Shi, Wenliang Chen
Cited by 2
Abstract
Handling unanswerable questions (UAQ) is crucial for LLMs, as it helps prevent misleading responses in complex situations. While previous studies have built several datasets to assess LLMs' performance on UAQ, these datasets lack factual knowledge support, which limits the evaluation of LLMs' ability to utilize their factual knowledge when handling UAQ. To address this limitation, we introduce UAQFact, a bilingual unanswerable-question dataset with auxiliary factual knowledge created from a Knowledge Graph. Based on UAQFact, we further define two new tasks to measure LLMs' ability to utilize internal and external factual knowledge, respectively. Our experimental results across multiple LLM series show that UAQFact presents significant challenges, as LLMs do not consistently perform well even when they have the relevant factual knowledge stored. Additionally, we find that incorporating external knowledge may enhance performance, but LLMs still cannot make full use of that knowledge, which may result in incorrect responses. [1]

* Corresponding author
[1] Our code and dataset are available at https://github.com/cytan17726/UAQ_Fact

[Figure: dataset construction overview with four panels: ① Question Type Definition, ② Factual Triple Sampling, ③ Template Generation, ④ Task Definition. For the intersection question type (QType-Inter: Set 1 ∩ Set 2), the UAQ example "Who is p1 of e1 and also p2 of e2?" is built from the factual triples (e1, p1, x1, ..., xi) and (e2, p2, y1, ..., yj), and has label 'UAQ' and answer Ø. The ABQ example "Who is p1 of e3 and also p2 of e4?" is built from (e3, p1, a1, a2) and (e4, p2, a1, a2), and has label 'ABQ' and answers a1, a2. Question templates: "Who is p1 of [E1]?", "Who is p2 of [E2]?", "Who is p1 of [E1] and also p2 of [E2]?". Sampling queries: ABQ uses Join((?e1, p1, ?ans), (?e2, p2, ?ans)); UAQ uses the separate patterns (?e1, p1, ?ans1) and (?e2, p2, ?ans2).]
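The intersection question type described above (QType-Inter) can be illustrated with a minimal sketch. It is not the authors' pipeline. The toy knowledge graph `KG`, the function name `build_intersection_question`, and the output dictionary layout are all hypothetical, chosen only to show the idea: a two-hop "and also" question is labeled 'UAQ' when the two answer sets share no element, and 'ABQ' when they do.

```python
from typing import Dict, List, Set, Tuple, Union

# Hypothetical toy knowledge graph: (entity, property) -> set of objects.
# Entity and property names mirror the placeholders in the figure; the
# contents are invented for illustration only.
KG: Dict[Tuple[str, str], Set[str]] = {
    ("e1", "p1"): {"x1", "x2"},
    ("e2", "p2"): {"y1"},
    ("e3", "p1"): {"a1", "a2"},
    ("e4", "p2"): {"a1", "a2"},
}


def build_intersection_question(
    kg: Dict[Tuple[str, str], Set[str]],
    e_a: str,
    p_a: str,
    e_b: str,
    p_b: str,
) -> Dict[str, Union[str, List[str]]]:
    """Fill the 'Who is p1 of [E1] and also p2 of [E2]?' template and label it.

    The label is 'ABQ' if the two answer sets intersect, otherwise 'UAQ'.
    """
    set_a = kg.get((e_a, p_a), set())
    set_b = kg.get((e_b, p_b), set())
    answers = sorted(set_a & set_b)  # Set 1 ∩ Set 2
    return {
        "question": f"Who is {p_a} of {e_a} and also {p_b} of {e_b}?",
        "label": "ABQ" if answers else "UAQ",
        "answers": answers,
    }


uaq = build_intersection_question(KG, "e1", "p1", "e2", "p2")
abq = build_intersection_question(KG, "e3", "p1", "e4", "p2")
print(uaq["label"], uaq["answers"])
print(abq["label"], abq["answers"])
```

In this sketch the two single-hop triples that produced each question stay available alongside it, which corresponds to the "auxiliary factual knowledge" the abstract mentions. How the real dataset stores and serves that knowledge is not specified here.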