INTELLIGENT VIDEO ANALYSIS TECHNOLOGY FOR AUTOMATIC FIRE CONTROL TARGET RECOGNITION BASED ON MACHINE LEARNING
DOI: https://doi.org/10.15588/1607-3274-2024-3-7

Keywords: moving object recognition, security, privacy, YOLO, target identification, machine learning, APC, BMP, TANK

Abstract
Context. Target recognition is a priority task in military affairs. The task is complicated by the need to recognize moving objects, while varied terrain and landscape create obstacles to recognition. Combat actions can take place at any time of day, so the angle of lighting and the overall illumination must be taken into account. The object must be detected in the video by segmenting the video frames, then recognized and classified.
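To illustrate the frame-by-frame pipeline described above, the following is a minimal sketch of detection on a video stream, assuming the Ultralytics YOLOv8 API and OpenCV; the weight file name and the video source are placeholders, not values from the paper.

```python
# A minimal sketch of frame-by-frame target detection on a video stream,
# assuming the Ultralytics YOLOv8 API; weights and source are placeholders.
import cv2
from ultralytics import YOLO

model = YOLO("best.pt")                      # trained weights (placeholder name)
cap = cv2.VideoCapture("stream.mp4")         # video file path or camera index

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)                   # detect and classify in this frame
    annotated = results[0].plot()            # draw boxes and class labels
    cv2.imshow("targets", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):    # press 'q' to stop
        break

cap.release()
cv2.destroyAllWindows()
```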
Objective. The objective of the study is to develop a real-time target recognition technology, as a component of the fire control system, based on artificial intelligence, YOLO and machine learning.
Method. The article develops a video stream analysis technology for automatic target recognition in a fire control system based on machine learning. The paper proposes a target recognition module as a component of the fire control system within the framework of the proposed information technology using artificial intelligence. The YOLOv8 family of pattern recognition models was used to develop the target recognition module. The following augmentation methods were applied to the formed dataset (an equivalent code sketch follows the list):
– Bounding Box Noise: salt-and-pepper noise added inside the bounding box, affecting up to 15% of pixels.
– Bounding Box Blur: Gaussian blur applied inside the bounding box, up to 2.5 px.
– Cutout: three boxes, each 10% of the image size, cut out of the image.
– Brightness: brightness varied between –25% and +25% to increase the model's robustness to changes in lighting and camera settings.
– Rotation: the image rotated clockwise or counterclockwise by –15 to +15 degrees.
– Flip: the image flipped horizontally.
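Because the augmentation was performed with Roboflow's platform tools, the exact pipeline cannot be reproduced from the paper; below is a minimal NumPy/OpenCV sketch of equivalent transforms. All function names, the fixed seed and the uint8 HxWx3 image assumption are illustrative, not the platform's implementation.

```python
# A minimal sketch of the augmentations listed above, assuming NumPy/OpenCV
# equivalents of the Roboflow pipeline; images are assumed uint8, HxWx3.
import cv2
import numpy as np

rng = np.random.default_rng(42)  # fixed seed for reproducibility (assumed)

def salt_and_pepper(img, box, frac=0.15):
    """Salt-and-pepper noise on up to `frac` of the bounding-box pixels."""
    x1, y1, x2, y2 = box
    roi = img[y1:y2, x1:x2]
    mask = rng.random(roi.shape[:2]) < frac
    roi[mask] = rng.choice([0, 255], size=(mask.sum(), 1))  # broadcast to 3 channels
    return img

def box_blur(img, box, sigma=2.5):
    """Gaussian blur (up to 2.5 px) restricted to the bounding box."""
    x1, y1, x2, y2 = box
    img[y1:y2, x1:x2] = cv2.GaussianBlur(img[y1:y2, x1:x2], (0, 0), sigma)
    return img

def cutout(img, n_boxes=3, size=0.10):
    """Cut out `n_boxes` regions, each 10% of the image side length."""
    h, w = img.shape[:2]
    bh, bw = int(h * size), int(w * size)
    for _ in range(n_boxes):
        y, x = rng.integers(0, h - bh), rng.integers(0, w - bw)
        img[y:y + bh, x:x + bw] = 0
    return img

def brightness(img, limit=0.25):
    """Scale brightness by a random factor in [-25%, +25%]."""
    factor = 1.0 + rng.uniform(-limit, limit)
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def rotate(img, max_deg=15):
    """Rotate around the centre by a random angle in [-15, +15] degrees."""
    h, w = img.shape[:2]
    angle = rng.uniform(-max_deg, max_deg)
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def hflip(img):
    """Horizontal flip."""
    return cv2.flip(img, 1)
```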
Results. The data were collected from open sources, in particular from videos posted on the YouTube platform. The main task of data preprocessing is the classification of three classes of objects in video or in real time: APC, BMP and TANK. The dataset was formed using the Roboflow platform, first with its labeling tools and then with its augmentation tools. The dataset consists of 1193 unique images, split approximately equally between the classes. The training was conducted using Google Colab resources; it took 100 epochs to train the model.
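A minimal sketch of such a training run with the Ultralytics YOLOv8 API follows; the model size (yolov8n) and the dataset config path are assumptions, since the paper does not specify them.

```python
# A minimal training sketch, assuming the Ultralytics YOLOv8 API;
# the 'yolov8n.pt' checkpoint and 'dataset/data.yaml' path are placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")          # pretrained YOLOv8 checkpoint (size assumed)
model.train(
    data="dataset/data.yaml",       # Roboflow export: 3 classes - APC, BMP, TANK
    epochs=100,                     # as reported in the paper
    imgsz=640,                      # default input size (assumed)
)
```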
Conclusions. The analysis is performed according to mAP50 (mean average precision at IoU 0.5, equal to 0.85), mAP50-95 (0.6), precision (0.89) and recall (0.75). The large losses stem from the fact that the background was not taken into account during the research; additionally training the module on confirmed background images (frames without military equipment) is expected to reduce them.
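The reported metrics correspond to what the Ultralytics validation call exposes; a minimal sketch, assuming the trained checkpoint sits at the framework's default save path:

```python
# A minimal evaluation sketch, assuming the Ultralytics YOLOv8 API;
# the checkpoint path is the framework's default save location (assumed).
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")
metrics = model.val()                         # validate on the dataset's val split

print(f"mAP50:     {metrics.box.map50:.2f}")  # reported as 0.85
print(f"mAP50-95:  {metrics.box.map:.2f}")    # reported as 0.6
print(f"precision: {metrics.box.mp:.2f}")     # reported as 0.89
print(f"recall:    {metrics.box.mr:.2f}")     # reported as 0.75
```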
Copyright (c) 2024 V. Vysotska, R. Romanchuk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.