INTELLIGENT VIDEO ANALYSIS TECHNOLOGY FOR AUTOMATIC FIRE CONTROL TARGET RECOGNITION BASED ON MACHINE LEARNING
DOI: https://doi.org/10.15588/1607-3274-2024-3-7

Keywords: moving object recognition, security, privacy, YOLO, target identification, machine learning, APC, BMP, TANK

Abstract
Context. Target recognition is a priority task in military affairs. The task is complicated by the need to recognize moving objects, while varied terrain and landscape create obstacles to recognition. Combat actions can take place at any time of day, so the angle of lighting and the overall illumination must be taken into account. The object must be detected in the video by segmenting the video frames, then recognized and classified.
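To illustrate the frame-by-frame pipeline described above, the following is a minimal sketch of detection on a video stream, assuming the Ultralytics YOLOv8 API and OpenCV; the weight file name and the video source are placeholders, not values from the paper.

```python
# A minimal sketch of frame-by-frame target detection on a video stream,
# assuming the Ultralytics YOLOv8 API; weights and source are placeholders.
import cv2
from ultralytics import YOLO

model = YOLO("best.pt")                      # trained weights (placeholder name)
cap = cv2.VideoCapture("stream.mp4")         # video file path or camera index

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)                   # detect and classify in this frame
    annotated = results[0].plot()            # draw boxes and class labels
    cv2.imshow("targets", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):    # press 'q' to stop
        break

cap.release()
cv2.destroyAllWindows()
```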
Objective. The objective of the study is to develop a real-time target recognition technology, as a component of the fire control system, based on artificial intelligence, YOLO and machine learning.
Method. The article develops a video stream analysis technology for automatic target recognition in a fire control system based on machine learning. The paper proposes a target recognition module as a component of the fire control system within the framework of the proposed information technology using artificial intelligence. The YOLOv8 family of pattern recognition models was used to develop the target recognition module. The following augmentation methods were applied to the formed dataset (an equivalent code sketch follows the list):
– Bounding Box Noise: salt-and-pepper noise added inside the bounding box, affecting up to 15% of pixels.
– Bounding Box Blur: Gaussian blur applied inside the bounding box, up to 2.5 px.
– Cutout: three boxes, each 10% of the image size, cut out of the image.
– Brightness: brightness varied between –25% and +25% to increase the model's robustness to changes in lighting and camera settings.
– Rotation: the image rotated clockwise or counterclockwise by –15 to +15 degrees.
– Flip: the image flipped horizontally.
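Because the augmentation was performed with Roboflow's platform tools, the exact pipeline cannot be reproduced from the paper; below is a minimal NumPy/OpenCV sketch of equivalent transforms. All function names, the fixed seed and the uint8 HxWx3 image assumption are illustrative, not the platform's implementation.

```python
# A minimal sketch of the augmentations listed above, assuming NumPy/OpenCV
# equivalents of the Roboflow pipeline; images are assumed uint8, HxWx3.
import cv2
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility (assumed)

def salt_and_pepper(img, box, frac=0.15):
    """Salt-and-pepper noise on up to `frac` of the bounding-box pixels."""
    x1, y1, x2, y2 = box
    roi = img[y1:y2, x1:x2]
    mask = rng.random(roi.shape[:2]) < frac
    roi[mask] = rng.choice([0, 255], size=(mask.sum(), 1))  # broadcast to 3 channels
    return img

def box_blur(img, box, sigma=2.5):
    """Gaussian blur (up to 2.5 px) restricted to the bounding box."""
    x1, y1, x2, y2 = box
    img[y1:y2, x1:x2] = cv2.GaussianBlur(img[y1:y2, x1:x2], (0, 0), sigma)
    return img

def cutout(img, n_boxes=3, size=0.10):
    """Cut out `n_boxes` regions, each 10% of the image side length."""
    h, w = img.shape[:2]
    bh, bw = int(h * size), int(w * size)
    for _ in range(n_boxes):
        y, x = rng.integers(0, h - bh), rng.integers(0, w - bw)
        img[y:y + bh, x:x + bw] = 0
    return img

def brightness(img, limit=0.25):
    """Scale brightness by a random factor in [-25%, +25%]."""
    factor = 1.0 + rng.uniform(-limit, limit)
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def rotate(img, max_deg=15):
    """Rotate around the centre by a random angle in [-15, +15] degrees."""
    h, w = img.shape[:2]
    angle = rng.uniform(-max_deg, max_deg)
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def hflip(img):
    """Horizontal flip."""
    return cv2.flip(img, 1)
```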
Results. The data were collected from open sources, in particular from videos posted on the YouTube platform. The main task of data preprocessing is the classification of three classes of objects in video or in real time: APC, BMP and TANK. The dataset was formed using the Roboflow platform, first with its labeling tools and then with its augmentation tools. The dataset consists of 1193 unique images, split approximately equally between the classes. The training was conducted using Google Colab resources; it took 100 epochs to train the model.
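A minimal sketch of such a training run with the Ultralytics YOLOv8 API follows; the model size (yolov8n) and the dataset config path are assumptions, since the paper does not specify them.

```python
# A minimal training sketch, assuming the Ultralytics YOLOv8 API;
# the 'yolov8n.pt' checkpoint and 'dataset/data.yaml' path are placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")          # pretrained YOLOv8 checkpoint (size assumed)
model.train(
    data="dataset/data.yaml",       # Roboflow export: 3 classes - APC, BMP, TANK
    epochs=100,                     # as reported in the paper
    imgsz=640,                      # default input size (assumed)
)
```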
Conclusions. The analysis is performed according to mAP50 (mean average precision at IoU 0.5, equal to 0.85), mAP50-95 (0.6), precision (0.89) and recall (0.75). The large losses stem from the fact that the background was not taken into account during the research; additionally training the module on confirmed background images (frames without military equipment) is expected to reduce them.
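The reported metrics correspond to what the Ultralytics validation call exposes; a minimal sketch, assuming the trained checkpoint sits at the framework's default save path:

```python
# A minimal evaluation sketch, assuming the Ultralytics YOLOv8 API;
# the checkpoint path is the framework's default save location (assumed).
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")
metrics = model.val()                         # validate on the dataset's val split

print(f"mAP50:     {metrics.box.map50:.2f}")  # reported as 0.85
print(f"mAP50-95:  {metrics.box.map:.2f}")    # reported as 0.6
print(f"precision: {metrics.box.mp:.2f}")     # reported as 0.89
print(f"recall:    {metrics.box.mr:.2f}")     # reported as 0.75
```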
Copyright (c) 2024 V. Vysotska, R. Romanchuk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.