KDD 2025
ALSA: Context-Sensitive Prompt Privacy Preservation in Large Language Models
Hongru Ma, Wenpeng Lu, Yanjie Liang, Tianyi Wang, Qi Zhang, Yingjie Zhu, Jiasheng Si
Abstract
The remarkable prompting capability of large language models (LLMs) offers substantial convenience to users across diverse backgrounds. Nevertheless, because sensitive information within prompts is inevitably exposed to LLMs, caution must be exercised to preserve privacy. Among existing approaches, text anonymization is considered an effective way to prevent privacy leakage in prompts through text substitution. However, existing works overemphasize privacy while overlooking contextual integrity, which degrades semantic consistency. To address these concerns, this paper introduces a context-sensitive prompt privacy-preserving framework, namely Adaptive Linguistic Sanitization and Anonymization (ALSA). Specifically, ALSA incorporates a three-dimensional scoring mechanism that dynamically quantifies the substitutability of each word within a prompt by integrating the Privacy Leakage Risk Score (PLRS), the Contextual Information Importance Score (CIIS), and the Task Relevance Score (TRS). Subsequently, a clustering technique is adopted to dynamically determine the thresholds for assigning an anonymization action (i.e., Retain, Replace, Encrypt, or Delete) by balancing privacy, semantics, and task relevance. Extensive experiments on five benchmark datasets validate the superiority of ALSA over state-of-the-art baselines in terms of accuracy, privacy preservation, and semantic integrity.
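The scoring-then-clustering pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state how PLRS, CIIS, and TRS are combined, which clustering algorithm is used, or how clusters map to actions, so everything below is an assumption made for illustration: the equal-weight formula in `substitutability`, the plain 1-D k-means in `kmeans_1d`, and the ordering of `ACTIONS` by rising score are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Assumption: actions are ordered from lowest to highest substitutability score.
ACTIONS = ["Retain", "Replace", "Encrypt", "Delete"]


def substitutability(plrs: float, ciis: float, trs: float,
                     weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Hypothetical combination of the three scores, each assumed to lie in [0, 1].

    Higher privacy risk raises the score; higher contextual importance
    and task relevance lower it.
    """
    w_p, w_c, w_t = weights
    return w_p * plrs + w_c * (1.0 - ciis) + w_t * (1.0 - trs)


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[float]:
    """Plain 1-D k-means; returns the centroids sorted in ascending order."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centroids evenly spaced over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            buckets[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(b) / len(b) if b else centroids[i]
                   for i, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def assign_actions(scores: Dict[str, float]) -> Dict[str, str]:
    """Cluster the per-word scores and map each word to its cluster's action."""
    centroids = kmeans_1d(list(scores.values()), k=len(ACTIONS))
    result = {}
    for word, score in scores.items():
        idx = min(range(len(centroids)), key=lambda i: abs(score - centroids[i]))
        result[word] = ACTIONS[idx]
    return result


if __name__ == "__main__":
    # Toy scores chosen by hand; real scores would come from the three scorers.
    toy = {"a": 0.05, "b": 0.10, "c": 0.35, "d": 0.40,
           "e": 0.60, "f": 0.65, "g": 0.90, "h": 0.95}
    print(assign_actions(toy))
```

Because the cluster boundaries are derived from the scores of the prompt itself, the cut-offs between actions shift from prompt to prompt rather than being fixed in advance, which is the sense in which the thresholds are "dynamic" in this sketch.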