DOI: https://doi.org/10.15588/1607-3274-2019-2-12

SCOPING ADVERSARIAL ATTACK FOR IMPROVING ITS QUALITY

K. S. Khabarlak, L. S. Koriashkina

Abstract


Context. The subject of this paper is adversarial attacks: their types and the reasons for their emergence. A simplified, fast and effective attack algorithm for logistic regression is presented. The work’s relevance stems from the fact that a critical vulnerability of neural networks, the so-called adversarial examples, is yet to be deeply explored. By exploiting this mechanism it is possible to force a deliberate result from a network, breaking the defenses of neural-network-based safety systems.
Objective. The purpose of the work is to develop algorithms for different kinds of attacks on a trained neural network based on a preliminary analysis of the network’s weights, to estimate the quality loss of attacked images, and to compare the developed algorithms with other adversarial attacks of a similar type.
Method. A fast and fairly efficient attack algorithm that can perturb either the whole image or only certain regions of it is presented. Using the SSIM structural similarity metric, the algorithm and its modifications were analyzed and compared with previous gradient-based attack methods; an illustrative sketch of such a scoped gradient-sign attack is given after the abstract.
Results. Simplified targeted and non-targeted attack algorithms have been built for a single-layer neural network trained for handwritten digit classification on the MNIST dataset. A visual and semantic interpretation of the weights as pixel “importance” for recognizing an image as one class or another is given. Based on the SSIM structural similarity index, an image quality loss analysis has been performed over the whole test set for images attacked by the proposed algorithms; a sketch of such an SSIM-based quality check also follows the abstract. This analysis has revealed the classes most vulnerable to an adversarial attack, as well as the images whose class can be changed by adding noise imperceptible to a human being.
Adversarial examples built with the developed algorithm have been transferred to a 5-layer network of an unknown architecture. In many cases, images that were difficult to attack on the original network transferred at higher rates than the ones that needed only minor changes.
Conclusions. Adversarial examples built upon the attack scoping idea and the presented methodology of input data analysis can be easily generalized to other image recognition problems, which makes the approach applicable to a wide range of practical tasks. Thus, another way of analyzing the safety of neural networks (logistic regression included) against input-data attacks is presented.
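
The abstract above only names the attack; as a point of reference, below is a minimal illustrative sketch (not the authors’ exact algorithm) of a gradient-sign attack on a logistic-regression (softmax) classifier that can be scoped to an image region. The weights are random placeholders rather than a model trained on MNIST, and the function name scoped_attack and its parameters are hypothetical.

    # Illustrative sketch only: gradient-sign attack on a softmax (logistic regression)
    # classifier, optionally restricted ("scoped") to a subset of pixels.
    # NOTE: W and b are random placeholders, not weights trained on MNIST.
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(10, 784))   # 10 classes x 28*28 pixels (placeholder)
    b = np.zeros(10)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def scoped_attack(x, true_label, eps=0.15, region=None, target=None):
        # Perturb x by eps * sign of the loss gradient w.r.t. the input.
        # region: optional boolean mask (784,); only pixels inside it are changed.
        # target: if given, step towards the target class (targeted attack);
        #         otherwise step away from the true class (non-targeted attack).
        p = softmax(W @ x + b)
        if target is None:
            grad = W.T @ (p - np.eye(10)[true_label])   # d(cross-entropy)/dx
            x_adv = x + eps * np.sign(grad)             # increase the loss
        else:
            grad = W.T @ (p - np.eye(10)[target])
            x_adv = x - eps * np.sign(grad)             # decrease loss towards the target
        if region is not None:
            x_adv = np.where(region, x_adv, x)          # leave out-of-scope pixels untouched
        return np.clip(x_adv, 0.0, 1.0)

    # Toy usage on a random "image": attack only the central 14x14 patch.
    x = rng.random(784)
    mask = np.zeros((28, 28), dtype=bool)
    mask[7:21, 7:21] = True
    x_adv = scoped_attack(x, true_label=3, eps=0.2, region=mask.ravel())
    print(np.argmax(W @ x + b), np.argmax(W @ x_adv + b))

Restricting the update to a mask is meant in the spirit of the abstract’s whole-image versus region-only variants; with a real trained model the same code would let one compare the two.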
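
Likewise, here is a minimal sketch of how the SSIM-based quality-loss measurement mentioned in the Results could be performed, assuming the structural_similarity implementation from scikit-image; the arrays below are random stand-ins, not actual MNIST digits or images attacked by the paper’s algorithm.

    # Illustrative sketch: quantify how visible a perturbation is with SSIM
    # (Wang et al., 2004); a lower score means a more noticeable attack.
    import numpy as np
    from skimage.metrics import structural_similarity as ssim

    rng = np.random.default_rng(1)
    original = rng.random((28, 28))                     # stand-in for an MNIST digit in [0, 1]
    adversarial = np.clip(original + 0.2 * np.sign(rng.normal(size=(28, 28))), 0.0, 1.0)

    score = ssim(original, adversarial, data_range=1.0)  # 1.0 would mean identical images
    print(f"SSIM between original and attacked image: {score:.3f}")

Averaging such scores per class over a test set is one way to reproduce the kind of vulnerability ranking described above.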

Keywords


adversarial attacks, fast adversarial attack algorithm, logistic regression, neural network vulnerabilities.

Full Text: PDF

References


Krizhevsky A. ImageNet classification with deep convolutional neural networks / A. Krizhevsky, I. Sutskever, G. E. Hinton // Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems: 3–6 December 2012: proceedings. – Lake Tahoe, Nevada, USA: NIPS, 2012. – P. 1106–1114. DOI: 10.1145/3065386

Pang Wei Koh. Understanding Black-box Predictions via Influence Functions [Electronic resource] / Pang Wei Koh, Percy Liang. – Access mode: https://arxiv.org/abs/1703.04730.

Szegedy C. Intriguing properties of neural networks [Electronic resource] / C. Szegedy, W. Zaremba, I. Sutskever et al. – Access mode: https://arxiv.org/abs/1312.6199.

Zeiler M. D. Visualizing and Understanding Convolutional Networks / M. D. Zeiler, R. Fergus; eds.: D. Fleet, T. Pajdla, B. Schiele, T. Tuytelaars // Computer Vision – ECCV 2014. Lecture Notes in Computer Science. – Springer, Cham, 2014. – Part 1. – Vol. 8689. – P. 818–833. – DOI: 10.1007/978-3-319-10590-1_53

LeCun Y. The MNIST database of handwritten digits [Electronic resource] / Y. LeCun, C. Cortes, C. J. C. Burges. – Access mode: http://yann.lecun.com/exdb/mnist/.

Goodfellow Ian J. Explaining and Harnessing Adversarial Examples [Electronic resource] / Ian J. Goodfellow, J. Shlens, C. Szegedy. – Access mode: https://arxiv.org/abs/1412.6572.

Kurakin A. Adversarial machine learning at scale [Electronic resource] / A. Kurakin, Ian J. Goodfellow, S. Bengio. – Access mode: https://arxiv.org/abs/1611.01236.

Dong Y. Boosting adversarial attacks with momentum [Electronic resource] / Y. Dong, F. Liao, T. Pang et al. – Access mode: https://arxiv.org/abs/1710.06081.

Eykholt K. Robust Physical-World Attacks on Deep Learning Models [Electronic resource] / K. Eykholt, I. Evtimov, E. Fernandes et al. – Access mode: https://arxiv.org/abs/1707.08945v5.

Kurakin A. Adversarial examples in the physical world [Electronic resource] / A. Kurakin, I. Goodfellow, S. Bengio. – Access mode: https://arxiv.org/abs/1607.02533.

Sharif M. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition / M. Sharif, S. Bhagavatula, L. Bauer, M. K. Reiter // Computer and Communications Security: ACM SIGSAC Conference, Vienna, Austria, 24–28 October 2016: proceedings. – ACM, 2016. – P. 1528–1540. – DOI: 10.1145/2976749.2978392

Chen Pin-Yu. ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models / Pin-Yu Chen, Huan Zhang, Yash Sharma et al. // Artificial Intelligence and Security: the 10th ACM Workshop AISec’17, Dallas, TX, USA, 30 October – 03 November 2017: proceedings. – ACM New York, NY, USA, 2017. – P. 15–26. – DOI: 10.1145/3128572.3140448

Carlini N. Towards evaluating the robustness of neural networks [Electronic resource] / N. Carlini, D. Wagner. – Access mode: https://arxiv.org/abs/1608.04644.

Papernot N. Technical Report on the CleverHans v2.1.0 Adversarial Examples Library [Electronic resource] / N. Papernot, F. Faghri, N. Carlini et al. – Access mode: https://arxiv.org/abs/1610.00768.

Xiaoyong Yuan. Adversarial Examples: Attacks and Defenses for Deep Learning [Electronic resource] / Xiaoyong Yuan, Pan He, Qile Zhu et al. – Access mode: https://arxiv.org/abs/1712.07107v2.

Naveed Akhtar. Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey [Electronic resource] / Naveed Akhtar, Ajmal Mian. – Access mode: https://arxiv.org/abs/1801.00553.

Papernot N. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples [Electronic resource] / N. Papernot, P. McDaniel, I. Goodfellow. – Access mode: https://arxiv.org/abs/1605.07277.

Wang Z. Image quality assessment: From error visibility to structural similarity / Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli // IEEE Trans. Image Process. – 2004. – Vol. 13, No. 4. – P. 600–612. – DOI: 10.1109/tip.2003.819861



Copyright (c) 2019 K. S. Khabarlak, L. S. Koriashkina

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
