BUZz: BUffer Zones for defending adversarial examples in image classification

(1910.02785)
Published Oct 3, 2019 in cs.LG, cs.CR, eess.IV, and stat.ML

Abstract

We propose a novel defense against all existing gradient-based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward to implement, this defense yields a unique security property which we term buffer zones. We argue that our defense based on buffer zones offers significant improvements over state-of-the-art defenses. We are able to achieve this improvement even when the adversary has access to the {\em entire} original training data set and unlimited query access to the defense. We verify our claim through experimentation using Fashion-MNIST and CIFAR-10: we demonstrate a $<11\%$ attack success rate -- significantly lower than what other well-known state-of-the-art defenses offer -- at the price of only an $11$-$18\%$ drop in clean accuracy. Using a new intuitive metric, we explain why this trade-off offers a significant improvement over prior work.
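The abstract describes the mechanism only at a high level: several deep networks combined with simple image transformations, yielding a "buffer zone" security property. The sketch below is a minimal, hypothetical illustration of one way such a defense could be wired together: each ensemble member sees the input through its own fixed transformation, and a label is returned only when all members agree, with any disagreement treated as an abstention (the buffer zone). All names and parameters here (`make_transform`, `make_classifier`, `buffer_zone_predict`, the number of members `K`, the intensity-shift transformations, and the stand-in linear classifiers) are assumptions for illustration, not the paper's actual construction.

```python
import numpy as np

# Hypothetical sketch (not the paper's exact method): an ensemble in which
# each classifier sees the input through its own fixed image transformation,
# and a label is returned only if all members agree. Disagreement means the
# input fell into the "buffer zone" and the defense abstains.

NUM_CLASSES = 10

def make_transform(seed):
    """Fixed per-model transformation; here a random intensity scale/offset
    stands in for whatever simple image transformation a member might use."""
    local = np.random.default_rng(seed)
    a = local.uniform(0.8, 1.2)   # per-model scale (illustrative)
    b = local.uniform(-0.1, 0.1)  # per-model offset (illustrative)
    return lambda x: np.clip(a * x + b, 0.0, 1.0)

def make_classifier(seed):
    """Stand-in for a trained CNN: a random linear scorer over flattened pixels."""
    local = np.random.default_rng(seed)
    w = local.normal(size=(NUM_CLASSES, 32 * 32 * 3))
    return lambda x: int(np.argmax(w @ x.reshape(-1)))

# Build K (transformation, classifier) pairs.
K = 4
members = [(make_transform(s), make_classifier(s)) for s in range(K)]

ABSTAIN = -1  # sentinel: input landed in the buffer zone

def buffer_zone_predict(x):
    """Return the unanimous label, or ABSTAIN if the members disagree."""
    votes = [clf(t(x)) for t, clf in members]
    return votes[0] if len(set(votes)) == 1 else ABSTAIN

# Example usage on a random CIFAR-10-shaped image.
rng = np.random.default_rng(0)
x = rng.uniform(size=(32, 32, 3)).astype(np.float32)
print(buffer_zone_predict(x))
```

The design point this sketch is meant to convey is that a perturbation must fool every transformed view simultaneously to obtain a (wrong) unanimous label; inputs that only partially succeed fall into the abstention region, which is one plausible reading of the buffer-zone property claimed in the abstract.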
