Toward Visual Distortion in Black-Box Attacks

Constructing adversarial examples in a black-box threat model injures the original images by introducing visual distortion. In this paper, we propose a novel black-box attack approach that can directly minimize the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network. To quantify visual distortion, the perceptual distance between the adversarial example and the original image is introduced in our loss. We first approximate the gradient of the corresponding non-differentiable loss function by sampling noise from the learned noise distribution. Then the distribution is updated using the estimated gradient to reduce visual distortion. The learning continues until an adversarial example is found. We validate the effectiveness of our attack on ImageNet. Our attack results in much lower distortion compared to state-of-the-art black-box attacks and achieves a 100% success rate on InceptionV3, ResNet50 and VGG16bn. Furthermore, we theoretically prove the convergence of our model. The code is publicly available at https://github.com/Alina-1997/visual-distortion-in-attack
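The abstract outlines the attack loop: sample noise from a learned distribution, query the loss oracle, estimate the gradient of the non-differentiable loss from those samples, and update the distribution until an adversarial example is found. The sketch below illustrates that loop with a NES-style score-function estimator over a Gaussian noise distribution. It is a hypothetical reading of the abstract, not the authors' released code: loss_oracle (a black-box margin loss, assumed non-positive on success) and perceptual_distance are stand-ins, and the paper's actual parameterization of the noise distribution may differ (see the GitHub link in the record for the real implementation).

import numpy as np

def black_box_attack_sketch(x, loss_oracle, perceptual_distance,
                            n_samples=20, sigma=0.1, lr=0.01,
                            lam=1.0, max_iters=1000):
    """NES-style sketch: estimate the gradient of a non-differentiable
    attack loss by sampling noise from a Gaussian whose mean is learned,
    then shift that mean to lower both the oracle loss and the
    perceptual distance to the original image x (pixels in [0, 1])."""
    mu = np.zeros_like(x)  # mean of the learned noise distribution

    def total_loss(adv):
        # black-box attack loss plus a weighted perceptual-distance
        # penalty that quantifies the induced visual distortion
        return loss_oracle(adv) + lam * perceptual_distance(adv, x)

    for _ in range(max_iters):
        grad = np.zeros_like(x)
        for _ in range(n_samples):
            eps = np.random.randn(*x.shape)      # unit Gaussian sample
            noise = mu + sigma * eps             # draw from N(mu, sigma^2 I)
            adv = np.clip(x + noise, 0.0, 1.0)   # candidate adversarial example
            grad += total_loss(adv) * eps        # score-function estimate
        grad /= n_samples * sigma
        mu -= lr * grad                          # update the noise distribution
        adv = np.clip(x + mu, 0.0, 1.0)
        if loss_oracle(adv) <= 0.0:              # convention: loss <= 0 means
            return adv                           # the example is misclassified
    return np.clip(x + mu, 0.0, 1.0)             # best effort after max_iters

Under these assumptions, each iteration costs n_samples queries to the black-box network, and the lam weight trades attack strength against the distortion penalty.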

Detailed Description

Bibliographic Details

Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 30(2021), from: 02, pages 6156-6167
Main Author: Li, Nannan (Author)
Other Authors: Chen, Zhenzhong
Format: Online Article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM327607149
003 DE-627
005 20231225201143.0
007 cr uuu---uuuuu
008 231225s2021 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2021.3092822  |2 doi 
028 5 2 |a pubmed24n1091.xml 
035 |a (DE-627)NLM327607149 
035 |a (NLM)34214038 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Li, Nannan  |e verfasserin  |4 aut 
245 1 0 |a Toward Visual Distortion in Black-Box Attacks 
264 1 |c 2021 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 12.07.2021 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Constructing adversarial examples in a black-box threat model injures the original images by introducing visual distortion. In this paper, we propose a novel black-box attack approach that can directly minimize the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network. To quantify visual distortion, the perceptual distance between the adversarial example and the original image is introduced in our loss. We first approximate the gradient of the corresponding non-differentiable loss function by sampling noise from the learned noise distribution. Then the distribution is updated using the estimated gradient to reduce visual distortion. The learning continues until an adversarial example is found. We validate the effectiveness of our attack on ImageNet. Our attack results in much lower distortion compared to state-of-the-art black-box attacks and achieves a 100% success rate on InceptionV3, ResNet50 and VGG16bn. Furthermore, we theoretically prove the convergence of our model. The code is publicly available at https://github.com/Alina-1997/visual-distortion-in-attack 
650 4 |a Journal Article 
700 1 |a Chen, Zhenzhong  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 30(2021) vom: 02., Seite 6156-6167  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnns 
773 1 8 |g volume:30  |g year:2021  |g day:02  |g pages:6156-6167 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2021.3092822  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 30  |j 2021  |b 02  |h 6156-6167