Privacy-preserving machine learning (PPML) is an active topic in the privacy research community. Private deep learning in particular has gained a lot of attention in the last few years. However, most early works focus on privacy-preserving inference, e.g., [1-3], and assume an already trained model. To leverage the vast amount of data distributed across many devices, models must be trained collaboratively. Additionally, the computation must be outsourced because of resource limitations at the edge. Both multi-party computation and outsourcing require preserving the privacy of the training data. Only recently have a few works started to investigate privacy-preserving training of deep neural networks using function sharing or secure multi-party computation techniques [4-11].
An interesting work by Tramèr and Boneh [3] combines a trusted execution environment (TEE) such as Intel SGX with blinding by random values to design a highly efficient inference system. This thesis should extend their work to privacy-preserving training. The goal is to first design a protocol that splits the computation needed to train classical neural networks between the TEE and the untrusted operating system. One option is to investigate efficient conversion between several secure computation techniques and TEEs. The protocol should balance the secure but resource-limited TEE against the untrusted but powerful operating system, which enables multi-threading, GPU usage, etc., to achieve maximal efficiency while protecting data privacy.
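To make the blinding idea concrete, the following is a minimal Python sketch of how a TEE could outsource a single linear layer to the untrusted side, in the spirit of the approach by Tramèr and Boneh. It is an illustration under stated assumptions, not their implementation: the function names, the modulus `P`, and the plain-list matrix representation are all hypothetical, inputs and weights are assumed to be already quantized to integers modulo `P`, and all three roles run in one process here. The TEE precomputes a random mask `r` and the unblinding factor `u = W·r`, sends only `x + r` to the untrusted side, and subtracts `u` from the returned result.

```python
import secrets

# Hypothetical prime modulus for the quantized values (an assumption of this sketch).
P = 2**31 - 1


def matvec(W, v, p=P):
    """Matrix-vector product over the integers modulo p."""
    return [sum(w * x for w, x in zip(row, v)) % p for row in W]


def tee_precompute(W, n, p=P):
    """Inside the TEE (offline): sample a random mask r and compute u = W*r mod p."""
    r = [secrets.randbelow(p) for _ in range(n)]
    u = matvec(W, r, p)
    return r, u


def tee_blind(x, r, p=P):
    """Inside the TEE: hide the input x by adding the mask r modulo p."""
    return [(xi + ri) % p for xi, ri in zip(x, r)]


def untrusted_linear(W, x_blinded, p=P):
    """Outside the TEE: the expensive linear layer, computed on blinded data only."""
    return matvec(W, x_blinded, p)


def tee_unblind(y_blinded, u, p=P):
    """Inside the TEE: remove the mask, since W*(x + r) - W*r = W*x mod p."""
    return [(yi - ui) % p for yi, ui in zip(y_blinded, u)]


def blinded_linear_layer(W, x, p=P):
    """Run the whole exchange for one layer; returns W*x mod p."""
    r, u = tee_precompute(W, len(x), p)
    x_blinded = tee_blind(x, r, p)
    y_blinded = untrusted_linear(W, x_blinded, p)
    return tee_unblind(y_blinded, u, p)


if __name__ == "__main__":
    W = [[1, 2], [3, 4]]
    x = [5, 6]
    print(blinded_linear_layer(W, x))
```

The sketch covers only the forward pass of one layer with fixed weights. During training the weights change after every update, so the unblinding factor `u` would have to be recomputed for each new `W`; handling this efficiently is part of what a training protocol would need to address.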
The student should first review related work and identify possibilities for accelerating the neural network training by using TEEs. Then, the protocol should be designed, implemented and benchmarked on various model sizes as well as theoretically and experimentally compared to related work.
- Good programming skills in C/C++
- Basic knowledge of cryptography
- Initial experience with TEEs (e.g., Intel SGX)
- Good mathematical background
- Sound background knowledge on neural networks
- High motivation + ability to work independently
- Knowledge of the English language, Git, LaTeX, etc. goes without saying
- [1] Fabian Boemer, Rosario Cammarota, Daniel Demmler, Thomas Schneider, and Hossein Yalame. MP2ML: A mixed-protocol machine learning framework for private inference. In ARES, 2020.
- [2] Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan. Gazelle: A low latency framework for secure neural network inference. In USENIX Security, 2018.
- [3] Florian Tramèr and Dan Boneh. Slalom: Fast, verifiable and private execution of neural networks in trusted hardware. In ICLR, 2019.
- [4] Payman Mohassel and Yupeng Zhang. SecureML: A system for scalable privacy-preserving machine learning. In S&P, 2017.
- [5] Payman Mohassel and Peter Rindal. ABY3: A mixed protocol framework for machine learning. In CCS, 2018.
- [6] Qian Lou, Bo Feng, Geoffrey C. Fox, and Lei Jiang. Glyph: Fast and accurately training deep neural networks on encrypted data. In arXiv:1911.07101, 2019.
- [7] Nitin Agrawal, Ali Shahin Shamsabadi, Matt J. Kusner, and Adrià Gascón. QUOTIENT: Two-party secure neural network training and prediction. In CCS, 2019.
- [8] Karthik Nandakumar, Nalini Ratha, Sharath Pankanti, and Shai Halevi. Towards deep neural network training on encrypted data. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019.
- [9] Runhua Xu, James B.D. Joshi, and Chao Li. CryptoNN: Training neural networks over encrypted data. In arXiv:1904.07303, 2019.
- [10] Sameer Wagh, Divya Gupta, and Nishanth Chandran. SecureNN: Efficient and private neural network training. In PETS, 2019.
- [11] Sameer Wagh, Shruti Tople, Fabrice Benhamouda, Eyal Kushilevitz, Prateek Mittal, and Tal Rabin. FALCON: Honest-majority maliciously secure framework for private deep learning. In arXiv:2004.02229, 2020.