A MODEL AND TRAINING METHOD FOR CONTEXT CLASSIFICATION IN CCTV SEWER INSPECTION VIDEO FRAMES
DOI:
https://doi.org/10.15588/1607-3274-2021-3-9

Keywords:
Sewer pipe inspection, convolutional neural network, error-correcting output codes, Siamese network, information-extreme learning, information criterion, LSTM, GRU.

Abstract
Context. A model and training method for observational context classification in CCTV sewer inspection video frames were developed and researched. The object of the research is the process of detecting the temporal-spatial context during CCTV sewer inspections. The subjects of the research are the machine learning model and the training method for classification analysis of CCTV video sequences under the constraint of a limited and imbalanced training dataset.
Objective. The stated research goal is to develop an efficient context classifier model and training algorithm for CCTV sewer inspection video frames under the constraint of a limited and imbalanced labeled training set.
Methods. A four-stage training algorithm for the classifier is proposed. The first stage involves training with a soft triplet loss and a regularization component that penalizes the network's binary output code rounding error. The next stage determines the binary code for each class according to the principles of error-correcting output codes, accounting for intra- and interclass relationships. The resulting reference vector for each class is then used as the sample label for subsequent training with a Joint Binary Cross-Entropy Loss. The last machine learning stage optimizes the decision rule parameters according to the information criterion, determining the admissible deviation of the binary representation of observations of each class from the corresponding reference vector. A 2D convolutional frame feature extractor combined with a temporal network for inter-frame dependency analysis is considered; the compared variants are a 1D dilated regular convolutional network, a 1D dilated causal convolutional network, an LSTM network, and a GRU network. Model efficiency is compared on the basis of the micro-averaged F1 score calculated on the test dataset.
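For illustration, the following is a minimal PyTorch-style sketch of the two trainable loss stages described above (stages 1 and 3). It is a reconstruction under stated assumptions, not the authors' implementation: the margin-free (softplus) triplet formulation, the regularizer weight lam, and all function names are illustrative.

import torch
import torch.nn.functional as F

def soft_triplet_loss(anchor, positive, negative):
    # Margin-free "soft" triplet loss: softplus(d(a, p) - d(a, n)).
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.softplus(d_pos - d_neg).mean()

def rounding_error_penalty(codes):
    # Penalize deviation of sigmoid outputs from their rounded binary values,
    # pushing the continuous code toward {0, 1}.
    return ((codes - codes.round()) ** 2).mean()

def stage_one_loss(anchor_logits, positive_logits, negative_logits, lam=0.1):
    # Stage 1: soft triplet loss on sigmoid codes plus binarization regularizer.
    codes = [torch.sigmoid(z) for z in (anchor_logits, positive_logits, negative_logits)]
    loss = soft_triplet_loss(*codes)
    reg = sum(rounding_error_penalty(c) for c in codes) / 3
    return loss + lam * reg

def joint_bce_loss(logits, class_ids, reference_codes):
    # Stage 3: binary cross-entropy of each sample's output code against the
    # ECOC reference vector fixed for its class in stage 2.
    # logits: (batch, n_bits); class_ids: (batch,); reference_codes: (n_classes, n_bits).
    targets = reference_codes[class_ids].float()
    return F.binary_cross_entropy_with_logits(logits, targets)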
Results. Results obtained on the dataset provided by Ace Pipe Cleaning, Inc. confirm the suitability of the model and method for practical use; the resulting accuracy equals 92%. Comparison of the training outcome of the proposed method against conventional methods showed a 4% advantage in micro-averaged F1 score. Further analysis of the confusion matrix showed that the most significant increase in accuracy over conventional methods is achieved for complex classes, which combine both camera orientation and sewer pipe construction features.
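As a side note on the metric: the micro-averaged F1 score pools true positives, false positives, and false negatives over all classes before computing precision and recall, and for single-label multiclass predictions it coincides with overall accuracy. A purely illustrative computation with scikit-learn (the labels below are made up):

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0]  # hypothetical ground-truth context classes
y_pred = [0, 1, 2, 1, 1, 0]  # hypothetical classifier outputs
print(f1_score(y_true, y_pred, average="micro"))  # 0.8333... (5 of 6 correct)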
Conclusions. The scientific novelty of the work lies in the new models and methods for classification analysis of the temporal-spatial context when automating CCTV sewer inspections under imbalanced and limited training dataset conditions. Training results obtained with the proposed method were compared with those obtained with the conventional method; the proposed method showed a 4% advantage in micro-averaged F1 score.
It was empirically shown that the regular convolutional temporal network architecture is the most efficient at utilizing inter-frame dependencies. The resulting accuracy is suitable for practical use, since additional error correction can be performed using odometer data.
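The arrangement found to be most efficient can be sketched as a per-frame 2D convolutional encoder followed by a stack of regular (non-causal) dilated 1D convolutions over the frame axis. The backbone, layer widths, and code length below are illustrative assumptions, not the published architecture:

import torch
import torch.nn as nn

class TemporalContextClassifier(nn.Module):
    # 2D frame feature extractor followed by a regular dilated 1D temporal stack.
    def __init__(self, n_bits=16, feat_dim=128):
        super().__init__()
        # Placeholder per-frame backbone; any 2D CNN feature extractor fits here.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Non-causal dilated convolutions widen the temporal receptive field.
        self.temporal = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, n_bits)  # logits for the binary output code

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.frame_encoder(clips.flatten(0, 1)).view(b, t, -1)
        feats = self.temporal(feats.transpose(1, 2)).mean(dim=2)
        return self.head(feats)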
License
Copyright (c) 2021 V. V. Moskalenko, M. O. Zaretsky, A. S. Moskalenko, A. O. Panych, V. V. Lysyuk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.