Silently Learning your Support Vector Machines Models

Bachelor Thesis

Motivation

Machine learning (ML) is widely used to extract new information from existing data and to make predictions based on what was learned. The use of ML, however, raises privacy concerns: training on users' data often leaks information about the training set, and a service provider's model can potentially be learned from its predictions. The latter matters because, for example, an adversary who obtains the secret ML model of a company offering ML as a service (MLaaS) would (i) no longer need to pay the company for classification and (ii) could even sell the model to a third party, causing the company to lose value.

This thesis will focus on deducing the server's Support Vector Machine (SVM) model from its predictions. Although many solutions have been proposed for privacy-preserving machine learning (including frameworks based on multi-party computation), little attention has been paid to attacks on the ideal functionality of machine learning. Such attacks enable an adversary to learn the service provider's model using specially crafted queries, without interfering with the protocol. Possible attacks are shown in, for example, [1]. Furthermore, an example of a vulnerable SVM can be found in [2].
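To make the idea of specially crafted queries concrete, the following is a minimal sketch of how a linear model could be recovered from prediction outputs. It rests on strong assumptions that are not taken from the thesis description: the service is assumed to use a linear kernel and to return the raw decision value w·x + b rather than only a class label. The names `make_hypothetical_oracle` and `extract_linear_model` are illustrative and do not correspond to any real API. Under these assumptions, d + 1 queries suffice for a d-dimensional model, in the spirit of the equation-solving attacks discussed in [1].

```python
def make_hypothetical_oracle(weights, bias):
    """Stand-in for a prediction API (assumption: it returns the raw
    decision value w.x + b of a linear model, not just a label)."""
    def oracle(x):
        return sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
    return oracle


def extract_linear_model(oracle, dim):
    """Recover (w, b) of a linear model using dim + 1 queries."""
    # Querying the origin reveals the bias: f(0) = b.
    bias = oracle([0.0] * dim)
    weights = []
    for i in range(dim):
        # Querying the i-th unit vector reveals w_i: f(e_i) = w_i + b.
        unit = [0.0] * dim
        unit[i] = 1.0
        weights.append(oracle(unit) - bias)
    return weights, bias


# Illustrative secret model held by the (hypothetical) service provider.
secret_w = [0.5, -1.25, 2.0]
secret_b = 0.75
oracle = make_hypothetical_oracle(secret_w, secret_b)

# The adversary only calls the oracle and never sees secret_w or secret_b.
stolen_w, stolen_b = extract_linear_model(oracle, dim=3)
```

If the service returns only class labels, or uses a non-linear kernel, this simple approach no longer applies directly and more queries are generally needed; quantifying that cost for different kernels is the kind of question the thesis targets.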

Goal

The goal of this thesis is to analyze the most popular SVM kernels for their vulnerability to model stealing attacks, to determine how many queries are needed to steal the model along with other important attack characteristics (e.g., bandwidth and runtime), and finally to implement these attacks.

Requirements

  • Good analytical skills
  • At least basic understanding of machine learning
  • High motivation and the ability to work independently
  • Knowledge of the English language and LaTeX

References

  • [1] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. “Stealing Machine Learning Models via Prediction APIs.” In USENIX Security, 2016.
  • [2] Tao Zhang, Sherman S. M. Chow, Zhe Zhou, and Ming Li. “Privacy-Preserving Wi-Fi Fingerprinting Indoor Localization.” In International Workshop on Security, 2016.