ACL 2025

Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL

Wichayaporn Wongkamjan, Yanze Wang, Feng Gu, Denis Peskoff, Jonathan K. Kummerfeld, Jonathan May, Jordan Lee Boyd-Graber

Cited by 2

Abstract

An increasingly common socio-technical problem is people being taken in by offers that sound "too good to be true", where persuasion and trust shape decision-making. This paper investigates how AI can help detect these deceptive scenarios. We analyze how humans strategically deceive each other in Diplomacy, a board game that requires both natural language communication and strategic reasoning. Detecting deception in this setting requires extracting logical forms representing proposals (agreements that players suggest during communication) and computing their relative rewards using agents' value functions. Combined with text-based features, these rewards improve deception detection. Our method detects human deception with high precision, whereas a Large Language Model approach flags many true messages as deceptive. Future human-AI interaction tools can build on our methods for deception detection by triggering friction that gives users a chance to interrogate suspicious proposals.

[Figure: panel labels Text→AMR→Proposals (§3), CICERO's Value Model, and Human Player. Example messages: "Austria needs help and they are our ally! I(taly) can support hold from Trieste." and "Help, Turkey will definitely attack me in Serbia, please support hold!"]
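The idea of computing a proposal's relative rewards from agents' value functions can be illustrated with a toy sketch. This is not the paper's implementation: the `Proposal` type, the `relative_rewards` and `looks_suspicious` functions, the lookup-table value functions, and the thresholds are all hypothetical stand-ins chosen for illustration. In particular, the simple threshold rule below only stands in for whatever way the rewards and text-based features are actually combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a joint action is (proposer_action, recipient_action).
JointAction = Tuple[str, str]
ValueFn = Callable[[JointAction], float]


@dataclass
class Proposal:
    proposer_action: str   # what the proposer promises to do
    recipient_action: str  # what the proposer asks the recipient to do


def relative_rewards(
    proposal: Proposal,
    proposer_alternatives: List[str],
    proposer_value: ValueFn,
    recipient_value: ValueFn,
) -> Dict[str, float]:
    """Compare the promised outcome with the proposer's best deviation."""
    promised = (proposal.proposer_action, proposal.recipient_action)
    # Best action for the proposer, assuming the recipient complies.
    best_alt = max(
        proposer_alternatives,
        key=lambda a: proposer_value((a, proposal.recipient_action)),
    )
    deviated = (best_alt, proposal.recipient_action)
    return {
        "proposer_gain_if_deviates": proposer_value(deviated) - proposer_value(promised),
        "recipient_loss_if_deviates": recipient_value(promised) - recipient_value(deviated),
    }


def looks_suspicious(
    rewards: Dict[str, float],
    text_score: float,
    gain_threshold: float = 0.1,  # assumed value, not from the paper
    text_threshold: float = 0.5,  # assumed value, not from the paper
) -> bool:
    """Flag a proposal only when both the reward gap and the text score agree."""
    reward_signal = (
        rewards["proposer_gain_if_deviates"] > gain_threshold
        and rewards["recipient_loss_if_deviates"] > 0
    )
    return reward_signal and text_score >= text_threshold


# Toy value tables standing in for learned value functions.
proposer_table = {("support", "hold"): 0.5, ("attack", "hold"): 0.75}
recipient_table = {("support", "hold"): 0.5, ("attack", "hold"): 0.25}

proposal = Proposal(proposer_action="support", recipient_action="hold")
rewards = relative_rewards(
    proposal,
    proposer_alternatives=["support", "attack"],
    proposer_value=proposer_table.__getitem__,
    recipient_value=recipient_table.__getitem__,
)
flagged = looks_suspicious(rewards, text_score=0.6)
```

In this toy table the proposer does better by attacking than by giving the promised support, while the recipient does worse, so the proposal is flagged once the (made-up) text score also crosses its threshold. Requiring both signals is one way to favor precision over recall, in the spirit of the abstract's contrast with an approach that flags many true messages.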