A RESEARCH OF THE LATEST APPROACHES TO VISUAL IMAGE RECOGNITION AND CLASSIFICATION
DOI:
https://doi.org/10.15588/1607-3274-2024-1-13
Keywords:
machine learning, computer vision, image processing, convolutional neural networks, visual image recognition, visual image classification, algorithms, telecommunication systems
Abstract
Context. The paper provides an overview of current methods for recognizing and classifying visual images in static images and video streams. It discusses the main approaches, including machine learning, the current problems of these methods, and possible improvements, and it identifies the biggest challenges of the visual image retrieval and classification task. The main emphasis is placed on promising algorithms such as SSD, YOLO, and R-CNN, with an overview of their principles and network architectures.
Objective. The aim of the work is to analyze existing studies and identify the most suitable algorithm for recognizing and classifying visual images as a basis for further work.
Method. The primary method is to compare the algorithms on several criteria, such as image processing speed and accuracy, in order to select the most promising one. A number of studies and publications propose methods and algorithms for finding and classifying visual images within an image [3–6]. Most of the promising approaches are based on machine learning methods. The proposed methods have drawbacks that stem from the imperfect implementation of the Faster R-CNN, YOLO, and SSD algorithms for streaming video. The impact of these drawbacks can be significantly reduced by the following solutions: development of combined identification methods; processing of edge cases (tracking the position of identified objects, using the difference between video frames); and additional preliminary preparation of input data. Another major area for improvement is the optimization of methods for real-time video data, as most current methods focus on still images.
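One of the solutions named above, using the difference between video frames, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds `pixel_threshold` and `changed_fraction`, and the `detector` callable (a stand-in for any Faster R-CNN, YOLO, or SSD model) are all assumptions made for illustration. The idea is to run the expensive detector only when a grayscale frame differs noticeably from the last processed one, and to reuse the previous detections otherwise.

```python
import numpy as np


def frame_changed(prev_frame, curr_frame, pixel_threshold=25, changed_fraction=0.01):
    """Return True when enough pixels differ between two grayscale frames.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > pixel_threshold
    return bool(moving.mean() > changed_fraction)


def detect_on_changes(frames, detector, **thresholds):
    """Run `detector` only on frames that differ from the last processed frame.

    `detector` is a hypothetical callable that maps a frame to detections.
    For unchanged frames the previous detections are reused.
    """
    results = []
    reference = None  # last frame that was actually sent to the detector
    last_detections = None
    for frame in frames:
        if reference is None or frame_changed(reference, frame, **thresholds):
            last_detections = detector(frame)
            reference = frame
        results.append(last_detections)
    return results
```

In such a scheme the thresholds trade processing speed against the risk of missing small or slow-moving objects, so they would have to be tuned for a particular camera and scene.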
Results. As an outcome of the research, an optimal algorithm was identified for further research and optimization.
Conclusions. The analysis of existing papers and studies has identified the most promising algorithm for further optimization and experiments. Current approaches also still leave room for further improvement. The next step is to take the chosen algorithm and investigate ways to enhance it.
References
Yue W., Liu S., Li Y. An Efficient Pure CNN Network for Medical Image Classification, Applied Sciences, 2023, No. 13(16), P. 9226.
Cui W., Zhang Y., Zhang X., Li L., Liou F. Metal Additive Manufacturing Parts Inspection Using Convolutional Neural Network, Applied Sciences, 2020, No. 10(2), P. 545.
Lysechko V. P., Syvolovskyi I. M., Shevchenko B. V. et al. Research of modern NoSQL databases to simplify the process of their design, Academic journal: Mechanics Transport Communications, 2023, Vol. №21, Issue 2, article №2363, pp. 234–242.
Lysechko V. P., Zorina O. I., Sadovnykov B. I. et al. Experimental study of optimized face recognition algorithms for resource-constrained, Academic journal: Mechanics Transport Communications, 2023, Vol. №21, Issue 1, article №2343, pp. 89–95.
Mohana, Ravish Aradhya H. V. Design and Implementation of Object Detection, Tracking, Counting and Classification Algorithms using Artificial Intelligence for Automated Video Surveillance Applications, Conference, 24th International Conference on Advanced Computing and Communications, 2022, pp. 56–60.
Feroz A., Sultana M., Hasan R. et al. Object Detection and Classification from a Real-Time Video Using SSD and YOLO Models, Computational Intelligence in Pattern Recognition, 2021, 405 p.
Seker M., Köylüoğlu Y., Celebi A., Bayram B. Effects of Open-Source Image Preprocessing on Glaucoma and Glaucoma Suspect Fundus Image Differentiation with CNN [Electronic resource], 2021, Access mode: https://doi.org/10.21203/rs.3.rs-1695441/v1.
Sadovnykov B., Zhuchenko O., Perets K. Overview of state-of-the-art image object detection and classification approaches, Collection of scientific papers of UkrDUZT International scientific and technical conference “Development of scientific and innovative activity in transport”, 2023, Issue 177. Kharkiv, UkrDUZT, pp. 46–48.
Sharada K., Alghamdi W., Karthika K., Alawadi A., Nozima G., Vijayan V. Deep Learning Techniques for Image Recognition and Object Detection, E3S Web of Conferences 2023, Vol. 399, Article Number 04032, pp. 234–243.
Girshick R., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 86–114.
Girshick R. Fast R-CNN, IEEE International Conference on Computer Vision (ICCV), 2015, pp. 112–123.
Ren S., He K., Girshick R. et al. Faster R-CNN: Towards real-time object detection with region proposal networks, Neural Information Processing Systems (NIPS), 2015.
Mijwil M., Aggarwal K., Doshi R. et al. The Distinction between R-CNN and Fast R-CNN in Image Analysis: A Performance Comparison, Asian Journal of Applied Sciences, 2022, No. 10(5), pp. 429–437.
Dong W. Faster R-CNN and YOLOv3: a general analysis between popular object detection networks, Journal of Physics Conference Series, 2023, No. 2580(1), № 012016.
Redmon J., Divvala S., Girshick R., Farhadi A. You Only Look Once: Unified, Real-Time Object Detection, IEEE Conference on Computer Vision and Pattern Recognition, 2016, 118 p.
Terven J., Cordova-Esparza D. A comprehensive review of YOLO: from YOLOv1 and beyond, arXiv: 2304.00501, 2023, 125 p.
Li Y., Fan Q., Huang H. A Modified YOLOv8 Detection Network for UAV Aerial Image Recognition, MDPI Innovative Urban Mobility, 2023, pp. 35–45.
Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C., Berg A. SSD: Single Shot MultiBox Detector, arXiv: 1512.02325, 2016.
Simonyan K., Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv: 1409.1556, 2014, pp. 202–212.
License
Copyright (c) 2024 V. P. Lysechko, B. I. Sadovnykov, O. M. Komar, O. S. Zhuchenko
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.