Mnemonist: locating model parameters that memorize training examples

AS Shamsabadi, J Hayes, B Balle… - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Abstract
Recent work has shown that an adversary can reconstruct training examples given access to the parameters of a deep learning image classification model. We show that the quality of reconstruction depends heavily on the type of activation function used. In particular, we show that ReLU activations lead to much lower quality reconstructions than smooth activation functions. We explore whether this phenomenon is a fundamental property of models with ReLU activations or a weakness of current attack strategies. We first study the training dynamics of small MLPs with ReLU activations and identify redundant model parameters that do not memorize training examples. Building on this, we propose Mnemonist, a method that detects redundant model parameters and then guides current attacks to focus on informative parameters, improving the quality of reconstructions of training examples from ReLU models.
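The abstract does not spell out how redundant parameters are detected, so the following is a minimal, hypothetical sketch rather than the paper's actual algorithm: it trains a small ReLU MLP and scores each parameter by how far it moved from its initialization, flagging low-movement parameters as redundant. The model sizes, the 50% cutoff, and the movement-based criterion are all illustrative assumptions.

```python
# Hypothetical sketch of locating "informative" parameters in a ReLU MLP.
# NOT the paper's algorithm: redundancy is scored here by how little a
# parameter moved from its initialization during training (an assumption).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small ReLU MLP of the kind the abstract studies.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
init_params = [p.detach().clone() for p in model.parameters()]

# Toy training run on random data (a stand-in for a real training set).
x, y = torch.randn(128, 32), torch.randint(0, 2, (128,))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# Hypothetical redundancy score: distance travelled from initialization.
# The bottom half of each tensor (by movement) is flagged as redundant;
# the mask marks the rest as "informative" for an attack to focus on.
masks = []
for p, p0 in zip(model.parameters(), init_params):
    movement = (p.detach() - p0).abs().flatten()
    k = max(1, movement.numel() // 2)
    threshold = movement.kthvalue(k).values
    masks.append((p.detach() - p0).abs() > threshold)

informative = sum(int(m.sum()) for m in masks)
total = sum(m.numel() for m in masks)
print(f"informative parameters: {informative}/{total}")
```

A reconstruction attack could use such masks to restrict its parameter- or gradient-matching objective to the informative coordinates; movement from initialization is only one plausible criterion, and alternatives (e.g., gradient magnitude or ReLU activation liveness) could be substituted in the scoring loop.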