APPROACH TO THE AUTOMATIC CREATION OF AN ANNOTATED DATASET FOR THE DETECTION, LOCALIZATION AND CLASSIFICATION OF BLOOD CELLS IN AN IMAGE

Authors

  • S. M. Kovalenko National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine
  • O. S. Kutsenko National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine
  • S. V. Kovalenko National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine
  • A. S. Kovalenko National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

DOI:

https://doi.org/10.15588/1607-3274-2024-1-12

Keywords:

computer vision, object detection, object localization, digital image processing, classification

Abstract

Context. The paper considers the problem of automating the creation of an annotated dataset for further use in a system for detecting, localizing and classifying blood cells in an image using deep learning. The subject of the research is the processes of digital image processing for object detection and localization.

Objective. The aim of this study is to create a pipeline of digital image processing methods that can automatically generate an annotated set of blood smear images. This set will then be used to train and validate deep learning models, significantly reducing the time required by machine learning specialists.

Method. The proposed approach to object detection and localization is based on digital image processing methods such as filtering, thresholding, binarization, contour detection, and filling. The detection and localization pipeline includes the following steps: noise reduction; conversion to the HSV color model; defining a mask for white blood cells and platelets; detecting the contours of white blood cells and platelets; determining the coordinates of the upper left and lower right corners of their bounding boxes; calculating the area of the region inside each bounding box; saving the obtained data; determining the most common color in the image; filling the contours of white blood cells and platelets with that color; defining a mask for red blood cells; detecting the contours of red blood cells; determining the coordinates of the upper left and lower right corners of their bounding boxes; calculating the area of the region inside each bounding box; entering data about the found objects into a dataframe; and saving it to a .csv file for future use. Given the unlabeled image dataset and the generated .csv file, any researcher should be able to recreate the labeled dataset using image processing libraries.
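
The order of the pipeline steps can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It omits noise reduction, and it substitutes a simple flood fill over connected regions for a library contour detector. The HSV ranges, the labels "WBC" and "RBC", and all function names are assumptions made for illustration; the abstract does not state the thresholds or how white blood cells are told apart from platelets.

```python
import colorsys
import csv
import io
from collections import Counter, deque

# Assumed HSV ranges (each channel scaled to 0..1); not taken from the paper.
WBC_RANGE = ((0.60, 0.40, 0.20), (0.90, 1.00, 1.00))  # purple-stained objects
RBC_RANGE = ((0.90, 0.15, 0.20), (1.00, 1.00, 1.00))  # pink-red objects
HEADER = ["file name", "type", "xmin", "ymin", "xmax", "ymax", "area"]


def hsv_mask(image, lo, hi):
    """True where a pixel's HSV value lies inside the closed range [lo, hi]."""
    mask = []
    for row in image:
        hsv_row = (colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in row)
        mask.append([all(low <= v <= high for low, v, high in zip(lo, hsv, hi))
                     for hsv in hsv_row])
    return mask


def bounding_boxes(mask):
    """Return (xmin, ymin, xmax, ymax, area) per 4-connected region of the mask."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            xmin = xmax = x0
            ymin = ymax = y0
            while queue:  # breadth-first flood fill over one region
                x, y = queue.popleft()
                xmin, xmax = min(xmin, x), max(xmax, x)
                ymin, ymax = min(ymin, y), max(ymax, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Area of the region inside the bounding box, in pixels.
            area = (xmax - xmin + 1) * (ymax - ymin + 1)
            boxes.append((xmin, ymin, xmax, ymax, area))
    return boxes


def most_common_color(image):
    """Most frequent RGB triple, used here as the background color."""
    return Counter(px for row in image for px in row).most_common(1)[0][0]


def fill_masked(image, mask, color):
    """Copy of the image with every masked pixel replaced by color."""
    return [[color if m else px for px, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]


def annotate(file_name, image):
    """Rows of (file name, type, xmin, ymin, xmax, ymax, area) for one image."""
    wbc_mask = hsv_mask(image, *WBC_RANGE)
    rows = [(file_name, "WBC", *box) for box in bounding_boxes(wbc_mask)]
    # Paint over the objects already found so they cannot match the next mask.
    cleaned = fill_masked(image, wbc_mask, most_common_color(image))
    rbc_mask = hsv_mask(cleaned, *RBC_RANGE)
    rows += [(file_name, "RBC", *box) for box in bounding_boxes(rbc_mask)]
    return rows


def to_csv(rows):
    """Serialize annotation rows to CSV text with the header attributes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()
```

Here an image is a list of rows of (R, G, B) tuples, so calling annotate("smear_001.png", image) and passing the result to to_csv yields one CSV line per detected object. In practice an image processing library would supply the filtering, color conversion, and contour detection.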

Results. The developed approach was implemented in software for creating an annotated dataset of blood smear images.

Conclusions. The study proposes and justifies an approach to automatically creating an annotated dataset. The pipeline was tested on a set of unlabeled data, yielding a labeled dataset that consists of cell images and a .csv file with the attributes “file name”, “type”, “xmin”, “ymin”, “xmax”, “ymax”, and “area”, where “xmin”, “ymin”, “xmax”, and “ymax” are the coordinates of the bounding box for each object. The numbers of correctly recognized, incorrectly recognized, and unrecognized objects were counted manually, and metrics were calculated to assess the accuracy and quality of object detection and localization.
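
The abstract does not name the metrics used. As a hedged illustration only, the sketch below assumes that correctly recognized objects are treated as true positives, incorrectly recognized objects as false positives, and unrecognized objects as false negatives, and it computes precision, recall, and F1 from such manual counts. The class name, the function names, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    correct: int    # correctly recognized objects (assumed true positives)
    incorrect: int  # incorrectly recognized objects (assumed false positives)
    missed: int     # unrecognized objects (assumed false negatives)


def precision(c: DetectionCounts) -> float:
    """Share of reported detections that are correct."""
    total = c.correct + c.incorrect
    return c.correct / total if total else 0.0


def recall(c: DetectionCounts) -> float:
    """Share of real objects that were detected."""
    total = c.correct + c.missed
    return c.correct / total if total else 0.0


def f1_score(c: DetectionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

For example, made-up counts of 90 correct, 10 incorrect, and 30 missed objects would give a precision of 0.9 and a recall of 0.75.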

Author Biographies

S. M. Kovalenko, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

PhD, Associate Professor, Associate Professor of the Department of Software Engineering and Management Intelligent Technologies

O. S. Kutsenko, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

Dr. Sc., Professor, Professor of the Department of System Analysis and Information-Analytical Technologies

S. V. Kovalenko, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

PhD, Associate Professor, Professor of the Department of System Analysis and Information-Analytical Technologies

A. S. Kovalenko, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv, Ukraine

Postgraduate student of the Department of System Analysis and Information-Analytical Technologies

Published

2024-04-02

How to Cite

Kovalenko, S. M., Kutsenko, O. S., Kovalenko, S. V., & Kovalenko, A. S. (2024). APPROACH TO THE AUTOMATIC CREATION OF AN ANNOTATED DATASET FOR THE DETECTION, LOCALIZATION AND CLASSIFICATION OF BLOOD CELLS IN AN IMAGE. Radio Electronics, Computer Science, Control, (1), 128. https://doi.org/10.15588/1607-3274-2024-1-12

Section

Neuroinformatics and intelligent systems