NeurIPS 2023
Thrust: Adaptively Propels Large Language Models with External Knowledge
Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen
Abstract
Although large-scale pre-trained language models (PTLMs) have been shown to encode rich knowledge in their model parameters, this inherent knowledge can be opaque or static, making external knowledge necessary. However, existing information retrieval techniques can be costly and may even introduce noisy and sometimes misleading knowledge. To address these challenges, we propose instance-level adaptive propulsion of external knowledge (IAPEK), in which we conduct retrieval only when necessary. To achieve this goal, we propose measuring whether a PTLM contains enough knowledge to solve an instance with a novel metric, Thrust, which leverages the representation distribution of a small number of seen instances. Extensive experiments demonstrate that Thrust is a good measurement of PTLMs' instance-level knowledgeability. Moreover, using the Thrust score as the retrieval indicator achieves higher cost-efficiency than the naive usage of external knowledge on 88% of the evaluated tasks, with a 26% average performance improvement. Such findings shed light on the real-world practice of knowledge-enhanced LMs with a limited knowledge-seeking budget due to computation latency or costs.
Introduction
Knowledge is crucial for understanding human language and solving various NLP tasks [59]. In recent years, pre-trained language models (PTLMs) have demonstrated great success on various NLP tasks [10, 43, 32, 44, 5] by storing rich encyclopedic [42] and commonsense [25] knowledge in their model parameters. However, such implicit knowledge could be opaque, static, or inefficient [23]. These issues motivate the common practice of seeking external knowledge [30, 57, 53, 17] with information retrieval methods and augmenting the inference models (e.g., PTLMs) [20, 12, 24] with the retrieved knowledge.
However, this approach has two limitations: (i) extracting external knowledge with existing information retrieval tools can be costly for a large-scale knowledge resource, and (ii) external knowledge can be unnecessary or even misleading. For instance, one of the best retrieval models, ColBERT v2 [46], achieved 68.9 Success@5 on Natural Questions [27], which means that no gold document appears among the top five retrieved documents for 31.1% of the queries. Given the limited input sequence length, the most useful documents may not be included when generating a prediction, while others may add noise to the model. On the other hand, PTLMs, which have grown from millions (e.g., BERT [10]) to billions of parameters (e.g., OPT [61]), may solve the queries directly without