Differentially Private Learning Needs Better Model Initialization and Self-Distillation
NAACL 2025
TLDR
“Differentially private language models often produce poor quality text, but what exactly makes the text 'bad'? We first systematically analyze the types of errors in private models, finding two main categories: language errors (grammar, spelling, incomplete sentences) and inconsistencies (hallucinations, wrong attributions). Based on this analysis, we propose DPRefine: (1) strong initialization using filtered synthetic data before private training, and (2) self-distillation refinement after. This approach significantly reduces both types of errors while maintaining privacy guarantees, showing that understanding failure modes is key to improving private learning.” - Paper
Citation
@inproceedings{ngong-etal-2025-differentially,
    title = "Differentially Private Learning Needs Better Model Initialization and Self-Distillation",
    author = "Ngong, Ivoline C. and Near, Joseph and Mireshghallah, Niloofar",
    editor = "Chiruzzo, Luis and Ritter, Alan and Wang, Lu",
    booktitle = "Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = apr,
    year = "2025",
    address = "Albuquerque, New Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.naacl-long.455/",
    pages = "9009--9027",
    ISBN = "979-8-89176-189-6",
}