Recent advances in the development and deployment of large language models (LLMs) promise a profound impact on society and the economy. However, LLMs with billions of parameters tend to memorize and reproduce significant portions of their vast training data, which poses serious challenges from a security and privacy point of view.
The goal of this timely thesis is to explore various subtle ways in which LLMs can leak private training data, to propose suitable defense mechanisms, and to practically evaluate the proposed attacks and defenses using publicly available LLMs. Initially, the most recent literature in this field, such as [1,2], will be comprehensively reviewed and systematized. The primary objective is to assess how critical the information leakage is and to identify a practical mitigation strategy. Finally, the effectiveness of the chosen defense must be evaluated, encompassing the implementation of a prototype and benchmarking.
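To make the attack side concrete, the extraction pipeline of [2] can be sketched in a few lines: generate many candidate samples from the target model and rank them by the ratio of the model's perplexity to a reference measure such as zlib compression size, so that strings the model is unusually confident about (relative to their raw complexity) rise to the top as likely memorized training data. The sketch below is a minimal illustration, assuming per-token log-likelihoods have already been obtained from a public LLM; the function names and the toy scores are placeholders for illustration, not part of any specific codebase.

```python
import math
import zlib

def zlib_size(text: str) -> float:
    # Length of the zlib-compressed string as a crude reference
    # for the string's intrinsic complexity.
    return float(len(zlib.compress(text.encode("utf-8"))))

def perplexity(token_logprobs: list[float]) -> float:
    # Standard per-token perplexity from natural-log likelihoods.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def rank_candidates(samples: list[tuple[str, list[float]]]) -> list[str]:
    # A low (perplexity / zlib_size) ratio means the model assigns the
    # string far higher likelihood than its raw complexity would suggest,
    # which is a signal for memorization [2]. Sort ascending by that ratio.
    scored = [(perplexity(lps) / zlib_size(text), text) for text, lps in samples]
    return [text for _, text in sorted(scored)]

# Toy candidates with hypothetical per-token log-probs; in a real attack
# these would be samples generated by the target LLM together with the
# log-likelihoods it assigns to them.
candidates = [
    ("The quick brown fox jumps over the lazy dog.",
     [-2.1, -1.9, -2.4, -2.0]),          # generic text: low confidence
    ("John Doe, SSN 078-05-1120, lives at 42 Elm St.",
     [-0.1, -0.2, -0.1, -0.3]),          # fake PII: suspiciously confident
]
ranking = rank_candidates(candidates)    # most suspicious candidate first
```

In a full evaluation, the ranking step stays the same; only the candidate generation and log-likelihood computation are replaced by calls to an actual publicly available LLM.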
- Solid understanding of state-of-the-art NLP techniques
- Interest in cryptography and secure computation
- Programming experience
- High motivation and the ability to work independently
- Proficiency in English and familiarity with Git, LaTeX, etc. goes without saying
- [1] Nils Lukas, Ahmed Salem, Robert Sim, Shruti Tople, Lukas Wutschitz, Santiago Zanella Béguelin: Analyzing Leakage of Personally Identifiable Information in Language Models. In S&P'23. https://arxiv.org/pdf/2302.00539.pdf
- [2] Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Colin Raffel: Extracting Training Data from Large Language Models. In USENIX Security'21. https://www.usenix.org/system/files/sec21-carlini-extracting.pdf