NeurIPS 2023

ProPILE: Probing Privacy Leakage in Large Language Models

Siwon Kim, Sangdoo Yun, Hwaran Lee, Martin Gubri, Sungroh Yoon, Seong Joon Oh

Cited by 197

Abstract

The rapid advancement and widespread use of large language models (LLMs) have raised significant concerns regarding the potential leakage of personally identifiable information (PII). These models are often trained on vast quantities of web-collected data, which may inadvertently include sensitive personal data. This paper presents ProPILE, a novel probing tool designed to make data subjects, the owners of the PII, aware of potential PII leakage in LLM-based services. ProPILE lets data subjects formulate prompts based on their own PII to evaluate the level of privacy intrusion in LLMs. We demonstrate its application on the OPT-1.3B model trained on the publicly available Pile dataset. We show how hypothetical data subjects may assess the likelihood that their PII, if included in the Pile dataset, will be revealed. ProPILE can also be leveraged by LLM service providers to evaluate their own levels of PII leakage with more powerful prompts specifically tuned for their in-house models. This tool represents a pioneering step towards giving data subjects awareness of and control over their own data on the web.

Introduction

Recent years have seen staggering advances in large language models (LLMs) [27, 3, 33, 7, 30, 34, 24]. The remarkable improvement is commonly attributed to the massive scale of training data crawled indiscriminately from the web. The web-collected data is likely to contain sensitive personal information crawled from personal web pages, social media, personal profiles on online forums, and online databases such as collections of in-house emails [13]. Such data include various types of personally identifiable information (PII) of the data subjects, including their names, phone numbers, addresses, education, career, family members, and religion, to name a few. This poses an unprecedented level of privacy concern not matched by prior web-based products like social media.
In social media, the affected data subjects were precisely the users who had consciously shared their private data with awareness of the associated risks. In contrast, products based on LLMs trained on uncontrolled, web-scale data have quickly expanded the scope of the affected data subjects far beyond the actual users of the LLM products. Virtually anyone who has left some form of PII on the World Wide Web is now relevant to the question of PII leakage. Currently, there is no assurance that adequate safeguards are in place to prevent the inadvertent disclosure of PII. Understanding of the probability and mechanisms through which PII could leak