ICML 2025
Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning
Changsheng Wang, Yihua Zhang, Jinghan Jia, Parikshit Ram, Dennis Wei, Yuguang Yao, Soumyadeep Pal, Nathalie Baracaldo, Sijia Liu
Abstract
We adapt invariant risk minimization (IRM) to unlearning by replacing the empirical risk minimization (ERM) loss with an unlearning objective $\ell_u$, while keeping the invariance regularization to resist downstream fine-tuning. Here, $\mathcal{D}_i$ encodes the $i$-th fine-tuning environment (e.g., GSM8K or AGNews), which is unrelated to unlearning. The invariance regularization encourages $\boldsymbol{\theta}$ to be robust to fine-tuning across all $\mathcal{D}_i$.
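The structure of this objective can be illustrated with a minimal sketch on a toy linear model. It assumes an IRMv1-style penalty (the squared gradient of each environment's loss with respect to a dummy scalar `w`, evaluated at `w = 1`); the text above does not confirm that this exact form is used. The function names, the weight `lam`, the squared-error losses, and the negated forget-set loss standing in for the unlearning objective are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def env_loss(theta, X, y, w=1.0):
    """Mean squared error of a linear predictor scaled by a dummy scalar w."""
    residual = w * (X @ theta) - y
    return float(np.mean(residual ** 2))


def irm_penalty(theta, X, y):
    """Squared gradient of env_loss w.r.t. the dummy scalar w, taken at w = 1."""
    pred = X @ theta
    # d/dw mean((w * pred - y)^2) at w = 1 equals mean(2 * (pred - y) * pred).
    grad_w = np.mean(2.0 * (pred - y) * pred)
    return float(grad_w ** 2)


def unlearning_loss(theta, X_f, y_f):
    """Placeholder unlearning objective (assumption): negated loss on the forget set."""
    return -env_loss(theta, X_f, y_f)


def invariant_unlearning_objective(theta, forget, envs, lam=1.0):
    """Unlearning loss plus lam times the invariance penalty summed over environments."""
    X_f, y_f = forget
    penalty = sum(irm_penalty(theta, X, y) for X, y in envs)
    return unlearning_loss(theta, X_f, y_f) + lam * penalty


# Toy data: one forget set and two fine-tuning environments (stand-ins for D_i).
rng = np.random.default_rng(0)
theta = rng.normal(size=3)
forget = (rng.normal(size=(16, 3)), rng.normal(size=16))
envs = [(rng.normal(size=(16, 3)), rng.normal(size=16)) for _ in range(2)]
objective_value = invariant_unlearning_objective(theta, forget, envs, lam=0.5)
```

In this sketch the fine-tuning environments enter only through the penalty term, mirroring the statement that they are unrelated to unlearning itself. With `lam=0` the objective reduces to the plain unlearning loss.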