EMNLP 2024

Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations

Yang Deng, Yong Zhao, Moxin Li, See-Kiong Ng, Tat-Seng Chua

7 citations

Abstract

Despite their remarkable ability to answer questions, Large Language Models (LLMs) often display considerable overconfidence even when a question has no definitive answer. To avoid hallucinated answers to such unknown questions, existing studies typically investigate approaches for refusing to answer them. In this work, we propose a novel and scalable self-alignment method that uses the LLM itself to enhance its response-ability to different types of unknown questions, so that it can not only refuse to answer but also proactively explain why the question is unanswerable. Specifically, the Self-Align method first employs a two-stage class-aware self-augmentation approach to generate a large amount of unknown question-response data. We then conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired. Experimental results on two datasets across four types of unknown questions validate the superiority of the Self-Aligned method over existing baselines under three types of task formulation.¹

* Equal contribution.
¹ The data and code will be released at https://github.com/zhaoy777/KUQP-Dataset.

Q: What animal can be found at the top of the men's Wimbledon trophy?

Direct Answer
A: The animal that can be found at the top of the men's Wimbledon trophy is a falcon.

Unknown Question Detection
A: The answer is unknown.
A: The question is incorrect.

Unknown Question Classification
A: The question is incorrect because the Wimbledon men's singles trophy does not feature an animal at the top. Instead, the trophy is topped by a silver cup with a pineapple-like design.
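
The abstract describes a two-step data pipeline: self-augmentation produces candidate question-response pairs, and disparity-driven self-curation keeps only the qualified ones. The curation step could be sketched roughly as a threshold filter over a disparity score. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how disparity is scored, so `toy_disparity` (a Jaccard distance over tokens), the `Candidate` fields, the `threshold` value, and the hypothetical known-question counterpart are all illustrative placeholders.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One self-generated training pair (all field names are hypothetical)."""
    question_class: str    # placeholder class label, e.g. "incorrect"
    known_question: str    # answerable seed question
    unknown_question: str  # rewritten, unanswerable variant
    known_response: str    # model response to the seed question
    unknown_response: str  # model response to the unknown question


def toy_disparity(a: str, b: str) -> float:
    """Stand-in score: Jaccard distance between lowercase token sets.

    NOT the paper's scoring function; it only makes the sketch runnable.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def curate(candidates, threshold=0.5, score=toy_disparity):
    """Keep pairs whose two responses differ by at least `threshold`."""
    return [
        c for c in candidates
        if score(c.known_response, c.unknown_response) >= threshold
    ]


# Toy data: the unknown question is the example from the text; the known
# counterpart and its response are invented purely for illustration.
good = Candidate(
    question_class="incorrect",
    known_question="What design tops the men's Wimbledon trophy?",
    unknown_question="What animal can be found at the top of the men's Wimbledon trophy?",
    known_response="The trophy is topped by a pineapple-like design.",
    unknown_response="The question is incorrect because the trophy does not feature an animal at the top.",
)
# A degenerate pair: the model answered the unknown question as if it were known.
bad = Candidate(
    question_class="incorrect",
    known_question=good.known_question,
    unknown_question=good.unknown_question,
    known_response="A falcon.",
    unknown_response="A falcon.",
)

kept = curate([good, bad])
```

Under these assumptions, the filter discards the pair whose response to the unknown question is indistinguishable from a direct answer, and the surviving pairs would form the fine-tuning set. A real system would replace `toy_disparity` with a model-based score.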