ICLR 2025

Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps

Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Sinbadliu, Xuguang Lan, Hui Wang

Abstract

Humor has long been regarded as a gift exclusive to humans, for the following reasons. Humor is a culturally nuanced aspect of human language, which makes it challenging both to understand and to generate. Humor generation requires a multi-hop reasoning process, with each hop grounded in a proper rationale. Although many studies, such as those related to GPT-o1, focus on logical reasoning with reflection and correction, they still fall short in humor generation. Because the knowledge graph underlying creative thinking is sparse, multi-hop reasoning is difficult to achieve. In this paper, we therefore propose a more robust framework for the humor reasoning task, named LoL. LoL injects external information to mitigate the sparsity of the knowledge graph, thereby enabling multi-hop reasoning. In the first stage of LoL, we put forward an automatic instruction-evolution method that incorporates the deeper and broader thinking processes underlying humor. Judgment-oriented instructions are devised to enhance the model's judgment capability, dynamically supplementing and updating the sparse knowledge graph. Subsequently, through reinforcement learning, the reasoning logic behind each online-generated response is extracted using GPT-4o. In this process, external knowledge is re-introduced to aid the model in logical reasoning and in learning human preferences. Finally, experimental results indicate that combining these two processes enhances both the model's judgment ability and its generative capacity. These findings deepen our understanding of the creative capabilities of large language models (LLMs) and offer approaches to boost LLMs' creative abilities for cross-domain innovative applications.

* Equal contribution. Work done during an internship at Tencent QQ, as part of the QQ MLLM project.
† Corresponding author.
‡ Project leader of the QQ MLLM project.
* https://github.com/Leymore/ruozhiba/tree/main?tab=readme-ov-file
* https://www.wjx.cn/

We evaluated the humor judgment abilities of various large language models on both Chinese and English humor datasets. Experiments demonstrate that LoL outperforms other models on almost all test sets. Additional confirmatory experiments show that LoL enhances the model's divergent thinking ability and its effectiveness in humor generation.

Our contributions are summarized as follows.

1. We propose an automatic instruction-evolution system for conversation data. A three-agent system is introduced to inject and augment knowledge into the original training data. This