EZPrivSecFL: Practical Private and Secure Federated Learning Framework

Bachelor Thesis, Master Thesis

Motivation

Federated learning (FL) is an emerging collaborative machine learning paradigm that addresses critical data privacy issues by enabling clients to train a global model with the help of an aggregation server without revealing their training data [1]. FL can also be more efficient than centralized training, as it uses the computation power and data of potentially millions of clients in parallel. However, FL training itself raises several privacy and security concerns.

Regarding privacy, FL is vulnerable to inference attacks by malicious aggregators that can infer clients’ data from their model updates. To tackle this concern, secure aggregation restricts the central aggregator to learning only the sum or average of the clients’ updates [2,3,4]. Regarding security, standard FL techniques are vulnerable to Byzantine failures, where a bounded number of clients are malicious and send manipulated local models to the server. The key idea of existing Byzantine-robust FL methods is that the server analyzes the clients’ local model updates and removes suspicious ones before aggregating the remaining updates into the global model [5,6]. Mitigating privacy and security concerns in FL simultaneously is highly challenging, because private FL prohibits access to individual model updates to avoid leakage, while secure FL requires such access for a comprehensive mathematical analysis of the updates [7,8,9]. Moreover, existing FL libraries like LEAF [10], TensorFlow Federated [11], and FedML [12] do not yet address these privacy and security concerns.

This thesis should develop a framework called EZPrivSecFL, which simultaneously achieves model privacy and security with the help of cryptography. More concretely, this thesis should provide private and Byzantine-robust FL in a dishonest-majority setting with respect to the clients. EZPrivSecFL will use the FLTrust aggregation approach [6] to resist Byzantine attacks. It will employ two non-colluding semi-honest servers and use secure two-party computation, specifically ABY2.0 primitives [13], to construct efficient private building blocks for the FLTrust aggregation.
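
As a plaintext reference point, the FLTrust aggregation rule [6] that EZPrivSecFL is meant to evaluate privately can be sketched as follows. This is a minimal NumPy sketch without any cryptography, not the MOTION2NX implementation: the function name `fltrust_aggregate`, the `eps` guard against division by zero, and the all-zero fallback are assumptions made for illustration. The server update is assumed to be computed by the server on a small trusted root dataset, as in [6].

```python
import numpy as np


def fltrust_aggregate(client_updates, server_update, eps=1e-12):
    """Plaintext FLTrust-style aggregation (illustrative only).

    client_updates: iterable of 1-D arrays, one flattened update per client.
    server_update:  1-D array, the server's own update on its root dataset.
    eps:            hypothetical guard against division by zero.
    """
    g0 = np.asarray(server_update, dtype=float)
    g0_norm = np.linalg.norm(g0)

    weighted_sum = np.zeros_like(g0)
    total_trust = 0.0

    for g in client_updates:
        g = np.asarray(g, dtype=float)
        g_norm = np.linalg.norm(g)

        # Trust score: ReLU of the cosine similarity with the server update.
        cos = float(g @ g0) / (g_norm * g0_norm + eps)
        trust = max(cos, 0.0)

        # Rescale the client update to the magnitude of the server update.
        g_scaled = g * (g0_norm / (g_norm + eps))

        weighted_sum += trust * g_scaled
        total_trust += trust

    # Assumption: if no client is trusted, return an all-zero global update.
    if total_trust <= eps:
        return np.zeros_like(g0)

    # Trust-weighted average of the rescaled client updates.
    return weighted_sum / total_trust
```

The cosine similarity, the ReLU, and the norm-based rescaling are exactly the steps that need access to individual updates, which is why they have to be evaluated on secret-shared data in the two-server setting.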

Goal

The student should implement the FLTrust approach in the MPC framework MOTION2NX [14]. To do so, the combination of Arithmetic Sharing and Garbled Circuits from ABY2.0 [13] should be used. The student should use an approximation for the ReLU function as proposed in [15]. In the end, EZPrivSecFL should convert FL libraries' code into private and secure FL in a fully automated manner. More concretely, the student should implement an end-to-end compiler from FL libraries like TensorFlow Federated [11] to a semi-honest 2PC protocol in the MOTION2NX framework. The EZPrivSecFL framework should have the following main contributions:

  • Easy to use: EZPrivSecFL should natively support FL libraries like TensorFlow Federated [11]. It should be easy to use and understand for users without knowledge of cryptography. EZPrivSecFL should provide an efficient and reproducible means for developing and evaluating private and secure FL algorithms.
  • Private FL: EZPrivSecFL should use secure aggregation to keep each user's input data private from all other users and from the semi-honest servers.
  • Secure FL: EZPrivSecFL should offer Byzantine robustness and allow the incorporation of FLTrust robust aggregation [6].
  • Evaluation: EZPrivSecFL should be evaluated against the existing poisoning attacks and on the datasets used in [6, Section VI]. Its performance should also be compared to FLTrust [6].
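
To illustrate the Private FL item above, the following minimal sketch shows how two non-colluding servers can obtain the sum of the client updates without either server seeing an individual update. It uses plain additive secret sharing over a 64-bit ring with a fixed-point encoding. This is a simplification swapped in for illustration: ABY2.0 [13] uses its own sharing format with a preprocessing phase, and all names and parameters here (`RING_BITS`, `FRAC_BITS`, `share`, `add_shares`, `reconstruct`) are hypothetical.

```python
import secrets

RING_BITS = 64             # assumed ring size: arithmetic modulo 2**64
MODULUS = 1 << RING_BITS
FRAC_BITS = 16             # assumed fixed-point precision


def encode(x):
    """Encode a real number as a fixed-point ring element."""
    return int(round(x * (1 << FRAC_BITS))) % MODULUS


def decode(v):
    """Decode a ring element back to a real number (two's complement)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRAC_BITS)


def share(update):
    """Client side: split an update into one share vector per server."""
    shares0, shares1 = [], []
    for x in update:
        r = secrets.randbelow(MODULUS)      # uniformly random mask
        shares0.append(r)
        shares1.append((encode(x) - r) % MODULUS)
    return shares0, shares1


def add_shares(share_vectors):
    """Server side: locally add the shares received from all clients."""
    total = [0] * len(share_vectors[0])
    for vec in share_vectors:
        total = [(t + v) % MODULUS for t, v in zip(total, vec)]
    return total


def reconstruct(sum0, sum1):
    """Combine both servers' aggregated shares into the plaintext sum."""
    return [decode((a + b) % MODULUS) for a, b in zip(sum0, sum1)]
```

In use, each client calls `share` and sends one share vector to each server; each server calls `add_shares` on what it received, and only the two aggregated share vectors are combined with `reconstruct`. Each individual share is uniformly random, so a single semi-honest server learns nothing about a client's update from it.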

In summary, by converting FL library code into private and secure FL, this thesis will significantly lower the entry barrier for FL engineers to use cryptographic MPC protocols in real-world FL applications.

Requirements

  • High motivation for challenging engineering tasks
  • At least basic knowledge of secure two-party computation and ML algorithms
  • Good programming skills in Python and C/C++
  • High motivation and ability to work independently
  • Knowledge of the English language, Git, LaTeX, etc. goes without saying

References

Supervisors