
Showing posts from July, 2019

Reading notes: Adversarial Examples Are Not Bugs, They Are Features

Source: Ilyas, Andrew, et al. "Adversarial examples are not bugs, they are features." arXiv preprint arXiv:1905.02175 (2019). https://arxiv.org/abs/1905.02175

Introduction

Over the past few years, adversarial attacks - which aim to force machine learning systems into misclassification by adding slight perturbations - have received significant attention in the community. Many works have shown that an intentional perturbation that is imperceptible to humans can easily fool a deep learning classifier. In response to this threat, there has been much work on defensive techniques that help models resist adversarial examples, but none of it really answers the fundamental question: why do adversarial examples arise? So far, some previous works have viewed adversarial examples as aberrations arising from the high-dimensional nature of the input space, ones that will eventually disappear once we have enough training data or better training algorithms…
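To make the kind of attack described above concrete, here is a minimal sketch (not from the post itself) of the Fast Gradient Sign Method, one standard way to craft a small, nearly imperceptible perturbation that flips a classifier's prediction. It assumes a PyTorch model `model`, an input batch `x` with pixels in [0, 1], and true labels `y`; the names and the epsilon budget are illustrative choices, not anything specified by the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Return x perturbed by epsilon * sign(grad_x loss), a classic small perturbation attack."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then keep pixel values valid.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

With a typical budget such as epsilon = 8/255, the perturbation is hard for a human to notice, yet it is often enough to change the predicted class.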

Adversarial Examples Are a Natural Consequence of Test Error in Noise

Over the last few years, the phenomenon of adversarial examples - maliciously constructed inputs that fool trained machine learning models - has captured the attention of the research community, especially when the adversary is restricted to small modifications of a correctly handled input. Less surprisingly, image classifiers also lack human-level performance on randomly corrupted images, such as images with additive Gaussian noise. In this paper, we provide both empirical and theoretical evidence that these are two manifestations of the same underlying phenomenon. We establish close connections between the adversarial robustness and corruption robustness research programs, with the strongest connection in the case of additive Gaussian noise. This suggests that improving adversarial robustness should go hand in hand with improving performance in the presence of more general and realistic image corruptions. Based on our results, we recommend that future adversarial defenses consider evaluating…
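As a rough illustration of the corruption-robustness side of this connection, here is a short sketch (my own, not code from the paper) that measures a classifier's accuracy under additive Gaussian noise of increasing scale. It assumes a PyTorch `model` and a `loader` yielding (image, label) batches with pixels in [0, 1]; sweeping sigma shows how quickly clean accuracy degrades as the noise grows.

```python
import torch

@torch.no_grad()
def accuracy_under_gaussian_noise(model, loader, sigma):
    """Accuracy on inputs corrupted by additive Gaussian noise with standard deviation sigma."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x_noisy = (x + sigma * torch.randn_like(x)).clamp(0.0, 1.0)
        pred = model(x_noisy).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total

# Example sweep over noise scales:
# for sigma in (0.0, 0.05, 0.1, 0.2):
#     print(sigma, accuracy_under_gaussian_noise(model, loader, sigma))
```

The paper's point is that poor accuracy under this kind of random corruption and vulnerability to small adversarial perturbations are two views of the same underlying weakness, so defenses should be evaluated against both.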