A MODEL AND TRAINING METHOD FOR CONTEXT CLASSIFICATION IN CCTV SEWER INSPECTION VIDEO FRAMES
DOI:
https://doi.org/10.15588/1607-3274-2021-3-9

Keywords:
Sewer pipe inspection, convolutional neural network, error-correcting output codes, Siamese network, information-extreme learning, information criterion, LSTM, GRU.

Abstract
Context. A model and training method for observational context classification in CCTV sewer inspection video frames were developed and researched. The object of the research is the process of detecting the temporal-spatial context during CCTV sewer inspections. The subjects of the research are the machine learning model and the training method for classification analysis of CCTV video sequences under the constraint of a limited and imbalanced training dataset.
Objective. The stated research goal is to develop an efficient context classifier model and training algorithm for CCTV sewer inspection video frames under the constraint of a limited and imbalanced labeled training set.
Methods. A four-stage training algorithm for the classifier is proposed. The first stage involves training with a soft triplet loss and a regularization component that penalizes the network's binary output code rounding error. The next stage determines the binary code for each class according to the principles of error-correcting output codes, accounting for intra- and interclass relationships. The resulting reference vector for each class is then used as the sample label for subsequent training with a Joint Binary Cross-Entropy Loss. The last machine learning stage optimizes the decision rule parameters according to the information criterion, determining the admissible deviation of the binary representation of observations of each class from the corresponding reference vector. A 2D convolutional frame feature extractor combined with a temporal network for inter-frame dependency analysis is considered; the compared variants are a 1D dilated regular convolutional network, a 1D dilated causal convolutional network, an LSTM network, and a GRU network. Model efficiency is compared on the basis of the micro-averaged F1 score calculated on the test dataset.
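For illustration, the following is a minimal PyTorch-style sketch of the two trainable loss stages described above (stages 1 and 3). It is a reconstruction under stated assumptions, not the authors' implementation: the margin-free (softplus) triplet formulation, the regularizer weight lam, and all function names are illustrative.

import torch
import torch.nn.functional as F

def soft_triplet_loss(anchor, positive, negative):
    # Margin-free "soft" triplet loss: softplus(d(a, p) - d(a, n)).
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.softplus(d_pos - d_neg).mean()

def rounding_error_penalty(codes):
    # Penalize deviation of sigmoid outputs from their rounded binary values,
    # pushing the continuous code toward {0, 1}.
    return ((codes - codes.round()) ** 2).mean()

def stage_one_loss(anchor_logits, positive_logits, negative_logits, lam=0.1):
    # Stage 1: soft triplet loss on sigmoid codes plus binarization regularizer.
    codes = [torch.sigmoid(z) for z in (anchor_logits, positive_logits, negative_logits)]
    loss = soft_triplet_loss(*codes)
    reg = sum(rounding_error_penalty(c) for c in codes) / 3
    return loss + lam * reg

def joint_bce_loss(logits, class_ids, reference_codes):
    # Stage 3: binary cross-entropy of each sample's output code against the
    # ECOC reference vector fixed for its class in stage 2.
    # logits: (batch, n_bits); class_ids: (batch,); reference_codes: (n_classes, n_bits).
    targets = reference_codes[class_ids].float()
    return F.binary_cross_entropy_with_logits(logits, targets)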
Results. Results obtained on the dataset provided by Ace Pipe Cleaning, Inc. confirm the suitability of the model and method for practical use; the resulting accuracy equals 92%. Comparison of the training outcome of the proposed method against conventional methods showed a 4% advantage in micro-averaged F1 score. Further analysis of the confusion matrix showed that the most significant increase in accuracy over conventional methods is achieved for complex classes, which combine both camera orientation and sewer pipe construction features.
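As a side note on the metric: the micro-averaged F1 score pools true positives, false positives, and false negatives over all classes before computing precision and recall, and for single-label multiclass predictions it coincides with overall accuracy. A purely illustrative computation with scikit-learn (the labels below are made up):

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0]  # hypothetical ground-truth context classes
y_pred = [0, 1, 2, 1, 1, 0]  # hypothetical classifier outputs
print(f1_score(y_true, y_pred, average="micro"))  # 0.8333... (5 of 6 correct)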
Conclusions. The scientific novelty of the work lies in the new models and methods for classification analysis of the temporal-spatial context when automating CCTV sewer inspections under imbalanced and limited training dataset conditions. Training results obtained with the proposed method were compared with those obtained with the conventional method; the proposed method showed a 4% advantage in micro-averaged F1 score.
It was empirically shown that the regular convolutional temporal network architecture is the most efficient at utilizing inter-frame dependencies. The resulting accuracy is suitable for practical use, since additional error correction can be performed using odometer data.
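The arrangement found to be most efficient can be sketched as a per-frame 2D convolutional encoder followed by a stack of regular (non-causal) dilated 1D convolutions over the frame axis. The backbone, layer widths, and code length below are illustrative assumptions, not the published architecture:

import torch
import torch.nn as nn

class TemporalContextClassifier(nn.Module):
    # 2D frame feature extractor followed by a regular dilated 1D temporal stack.
    def __init__(self, n_bits=16, feat_dim=128):
        super().__init__()
        # Placeholder per-frame backbone; any 2D CNN feature extractor fits here.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Non-causal dilated convolutions widen the temporal receptive field.
        self.temporal = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, n_bits)  # logits for the binary output code

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.frame_encoder(clips.flatten(0, 1)).view(b, t, -1)
        feats = self.temporal(feats.transpose(1, 2)).mean(dim=2)
        return self.head(feats)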
License
Copyright (c) 2021 V. V. Moskalenko, M. O. Zaretsky, A. S. Moskalenko, A. O. Panych, V. V. Lysyuk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.