TY - JOUR
T1 - Understanding the one-pixel attack
T2 - 2020 Workshop on Artificial Intelligence Safety, AISafety 2020
AU - Vargas, Danilo Vasconcellos
AU - Su, Jiawei
N1 - Publisher Copyright:
© 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2020
Y1 - 2020
AB - Deep neural networks have been shown to be vulnerable to single-pixel modifications; however, the reason behind this phenomenon has never been elucidated. Here, we propose Propagation Maps, which show the influence of the perturbation in each layer of the network. Propagation Maps reveal that, even in extremely deep networks such as ResNet, a modification of one pixel easily propagates to the last layer. In fact, this initially local perturbation is also shown to spread, becoming a global one and reaching absolute difference values close to the maximum value of the original feature maps in a given layer. Moreover, we conduct a locality analysis demonstrating that pixels near the perturbed one tend to share the same vulnerability, revealing that the main vulnerability lies neither in neurons nor in pixels but in receptive fields. We hope that the analysis conducted in this work, together with the new Propagation Maps technique, will shed light on the inner workings of other adversarial samples and serve as the basis for future defense systems.
UR - http://www.scopus.com/inward/record.url?scp=85089629334&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089629334&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85089629334
SN - 1613-0073
VL - 2640
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 5 January 2021 through 10 January 2021
ER -