AAAI 2026

Optimization and Robustness-Informed Membership Inference Attacks for LLMs

Zichen Song, Qixin Zhang, Ming Li, Yao Shu

Cited by 1

Abstract

The proliferation of Large Language Models (LLMs) has raised concerns over training data privacy. Membership Inference Attacks (MIAs), which aim to determine whether specific data was used for training, pose significant privacy risks. However, existing MIA methods struggle to address the scale and complexity of modern LLMs. This paper introduces OR-MIA, a novel MIA framework built on two principles drawn from model optimization and input robustness. First, training data points are expected to exhibit smaller gradient norms due to optimization dynamics. Second, member samples show greater stability: their gradient norms are less sensitive to controlled input perturbations. OR-MIA leverages these principles by perturbing inputs, computing gradient norms, and using them as features for a robust classifier that distinguishes members from non-members. Evaluations on LLMs (70M to 6B parameters) and various datasets demonstrate that OR-MIA outperforms existing methods, achieving over 90% accuracy. Our findings highlight a critical vulnerability in LLMs and underscore the need for improved privacy-preserving training paradigms.
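The feature-extraction step described in the abstract (perturb the input, compute gradient norms, collect them as classifier features) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a toy logistic-regression model stands in for an LLM, Gaussian noise stands in for the "controlled input perturbations", and the names `grad_norm` and `or_mia_features`, as well as the choice of summary statistics, are hypothetical. The abstract does not specify the paper's actual perturbation scheme, feature set, or classifier.

```python
import numpy as np


def grad_norm(w, x, y):
    """L2 norm of the logistic-loss gradient w.r.t. the weights for one sample.

    Toy stand-in (assumption): for an LLM this would be the norm of the
    gradient of the language-modeling loss with respect to model parameters.
    """
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))  # predicted probability
    return float(np.linalg.norm((p - y) * x))  # gradient of the loss is (p - y) * x


def or_mia_features(w, x, y, n_perturb=8, sigma=0.01, seed=0):
    """Build a small feature vector from gradient norms under input perturbation.

    Features (illustrative choice, not taken from the paper):
      [norm on the clean input, mean norm over perturbed inputs,
       standard deviation of norms over perturbed inputs]
    """
    rng = np.random.default_rng(seed)
    base = grad_norm(w, x, y)
    perturbed = np.array([
        grad_norm(w, x + sigma * rng.standard_normal(x.shape), y)
        for _ in range(n_perturb)
    ])
    return np.array([base, perturbed.mean(), perturbed.std()])


if __name__ == "__main__":
    w = np.zeros(2)            # hypothetical model weights
    x = np.array([3.0, 4.0])   # hypothetical candidate sample
    print(or_mia_features(w, x, y=1.0))
```

Under the two principles stated in the abstract, a member sample would be expected to yield a smaller first feature (low gradient norm) and a smaller third feature (low sensitivity to perturbation). Feature vectors of this kind, computed for samples with known membership, could then be used to train any standard binary classifier.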