EMNLP 2024

FAC²E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition

Xiaoqiang Wang, Lingfei Wu, Tengfei Ma, Bang Liu

Cited by 1

Abstract

Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. However, such a paradigm fails to comprehensively differentiate fine-grained language and cognitive skills, leaving LLMs' capabilities insufficiently interpreted. In this paper, we present FAC²E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation. Specifically, we formulate LLMs' evaluation in a multi-dimensional and explainable manner by dissociating the language-related capabilities from the cognition-related ones. In addition, by extracting the intermediate reasoning from LLMs, we further break down the process of applying a specific capability into three sub-steps: recalling relevant knowledge, utilizing knowledge, and solving problems. Finally, FAC²E evaluates each sub-step of each fine-grained capability, providing a two-faceted diagnosis for LLMs. Utilizing FAC²E, we identify a common shortfall in knowledge utilization among models and propose a straightforward, knowledge-enhanced method to mitigate this issue. Our results not only showcase promising performance enhancements but also highlight a direction for future LLM advancements.

| Capability | Description | Skill | Example |
| --- | --- | --- | --- |
| LINGUISTIC KNOWLEDGE | Encoding grammatical concepts and supporting linguistic operations regarding word meanings and their combinatorial processing. | Grammaticality | agreements, licensing, long-distance dependencies, and garden-path effects |
| | | Semantics | synonymy, antonymy, and hypernymy |
| FORMAL KNOWLEDGE | Conducting word-based formal reasoning through understanding lexical semantics. | Mechanism | deductive, inductive, and analogical |
| | | Skill | numeric, logic, and manipulation |
| WORLD MODELING | Understanding text based on given context and associating it with world knowledge. | Remember | factual knowledge, context, and commonsense |
| | | Understand | narrative structure and discourse comprehension |
| SOCIAL MODELING | Inferring mental state behind text and intended meaning beyond literal content. | Pragmatics | polite deceits, irony, maxims of conversation, metaphor, indirect speech, and humor |
| | | Theory-of-mind | unexpected content and unexpected transfer tasks |
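
To make the two-faceted diagnosis concrete, the following Python sketch shows one way the capability taxonomy and the three sub-steps could be represented and aggregated. It is an illustration under stated assumptions, not the authors' implementation: the names `StepScores`, `diagnose`, and `weakest_step` are hypothetical, the per-sample scores are assumed to lie in [0, 1], and averaging is an assumed aggregation rule that the abstract does not specify.

```python
from dataclasses import dataclass
from statistics import mean

# Capability -> skills, transcribed from the capability table.
TAXONOMY = {
    "LINGUISTIC KNOWLEDGE": ["Grammaticality", "Semantics"],
    "FORMAL KNOWLEDGE": ["Mechanism", "Skill"],
    "WORLD MODELING": ["Remember", "Understand"],
    "SOCIAL MODELING": ["Pragmatics", "Theory-of-mind"],
}

# The three sub-steps named in the abstract.
SUB_STEPS = ("recall_knowledge", "utilize_knowledge", "solve_problem")


@dataclass
class StepScores:
    """Hypothetical per-sample scores, one per sub-step (assumed in [0, 1])."""
    capability: str
    skill: str
    recall_knowledge: float
    utilize_knowledge: float
    solve_problem: float


def diagnose(samples):
    """Average each sub-step score per (capability, skill) pair.

    Assumption: a plain mean; the source does not state the aggregation.
    """
    grouped = {}
    for s in samples:
        if s.skill not in TAXONOMY.get(s.capability, []):
            raise ValueError(f"unknown skill {s.skill!r} for {s.capability!r}")
        grouped.setdefault((s.capability, s.skill), []).append(s)
    return {
        key: {step: mean(getattr(s, step) for s in group) for step in SUB_STEPS}
        for key, group in grouped.items()
    }


def weakest_step(report):
    """Return the sub-step with the lowest mean across all reported skills."""
    return min(SUB_STEPS, key=lambda step: mean(r[step] for r in report.values()))
```

Calling `diagnose` on a list of `StepScores` yields one row per (capability, skill) pair with a score per sub-step, and `weakest_step` points at the sub-step that lags overall. A shortfall in knowledge utilization, as the abstract reports, would surface here as `"utilize_knowledge"`.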