WWW2025

Effectiveness of Privacy-preserving Algorithms in LLMs: A Benchmark and Empirical Analysis

Jinglin Sun, Basem Suleiman, Imdad Ullah, Imran Razzak


Abstract

Preserving individual privacy is crucial when interacting with Large Language Models (LLMs) during both the training and inference stages. Privacy leakage at either stage can lead to irreversible negative consequences. Although data-level privacy-preserving algorithms have been developed for smaller Natural Language Processing (NLP) models, their application to LLMs has not been extensively explored. Moreover, as the number of such algorithms grows, organizations and researchers find it increasingly difficult to compare and evaluate them and to select the one best suited to their specific requirements. To address these challenges, we introduce "Privacy-preserving4LLM Benchmarking", an evaluation framework that systematically assesses the utility-privacy trade-offs of different privacy-preserving algorithms across different LLM architectures. Our framework evaluates these algorithms in three practical scenarios: protecting training data only, protecting user queries only, and protecting both. We also introduce a novel Parameter Optimizer to ensure fair comparisons. To quantify the level of privacy protection, we use exposure metrics, in which canary data sequences are intentionally inserted into the training data to measure information memorization and potential leakage. Our study presents a comprehensive empirical analysis comparing three privacy-preserving algorithms across three LLM architectures (Mistral-7B, Llama2-7b, Falcon-7b) using three different datasets. Our findings reveal that algorithm selection, protection scenario, LLM architecture, and privacy budget setting all affect both utility and the level of privacy protection.
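The abstract does not state how exposure is computed. As an illustration only, the sketch below assumes the rank-based definition commonly used with canary insertion: the inserted canary is ranked by model loss against alternative candidate sequences of the same format, and exposure is log2 of the candidate-space size minus log2 of the canary's rank. The function name `exposure` and the loss values are hypothetical; in practice the losses would come from scoring each sequence with the trained LLM.

```python
import math
from typing import Sequence


def exposure(canary_loss: float, candidate_losses: Sequence[float]) -> float:
    """Rank-based exposure of one canary (assumed definition, not the paper's code).

    canary_loss: model loss (e.g. negative log-likelihood) on the inserted canary.
    candidate_losses: losses on alternative sequences that were NOT inserted.
    """
    # The candidate space contains the canary plus every alternative sequence.
    total = len(candidate_losses) + 1
    # Rank is 1-indexed: rank 1 means no alternative has a lower loss than the canary.
    rank = 1 + sum(1 for loss in candidate_losses if loss < canary_loss)
    # Higher exposure means the canary stands out more, i.e. stronger memorization.
    return math.log2(total) - math.log2(rank)


if __name__ == "__main__":
    # Hypothetical losses: the canary scores better than all three alternatives.
    print(exposure(0.5, [1.0, 2.0, 3.0]))
    # Hypothetical losses: the canary scores worse than all three alternatives.
    print(exposure(5.0, [1.0, 2.0, 3.0]))
```

Under this assumed definition, a canary that the model ranks first reaches the maximum exposure of log2 of the candidate-space size, while a canary ranked last has an exposure of zero. An effective privacy-preserving algorithm would be expected to push canary exposure toward the low end of that range.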