THE ALGORITHMIC TREE METHOD FOR CLASSIFYING HYDROGRAPHIC DATA
Keywords: classification tree, algorithmic classification tree, discrete object, feature, recognition function, recognition algorithm, branching criterion.
Context. This work identifies a simple and effective mechanism for building algorithmic classification trees (algorithmic tree models) from fixed initial information given as a discrete training sample. The constructed algorithmic classification tree classifies (recognizes) the entire training sample on which the model is built without error, has minimal structural complexity, and consists of autonomous classification and recognition algorithms as the vertices (attributes) of the tree structure.
Objective. The aim of this work is to create a simple, effective and universal method for constructing classification (recognition) models, based on the concept of algorithmic trees, for arrays of real hydrographic data. The resulting classification schemes are characterized by a tree structure whose building blocks are autonomous classification algorithms (sets of generalized features).
Method. A general scheme is suggested for synthesizing classification trees in the form of algorithmic trees, based on approximating an array of discrete data by a set of elementary classifiers: for a given initial training sample it builds a tree-like structure, i.e., an algorithmic tree model. The constructed scheme consists of a set of autonomous classification and recognition algorithms evaluated at each step of building the classification tree for this initial sample. A method for constructing an algorithmic classification tree has been developed whose main idea is to approximate, step by step, an initial sample of arbitrary volume and structure by a set of elementary classification algorithms. When forming the current vertex (node, generalized feature) of the algorithmic tree, the method selects the most effective, high-quality elementary classifiers from the initial set and extends only those paths in the tree structure where the largest number of errors (failures) occurs. The structural complexity of the algorithmic tree is estimated from the number of transitions, vertices and tiers of the model structure, which improves the quality of its subsequent analysis, provides an effective decomposition mechanism, and allows algorithmic tree structures to be built under fixed sets of constraints. The algorithmic tree synthesis method makes it possible to build different types of tree-like recognition models, with different initial sets of elementary classifiers and with predetermined accuracy, for a wide class of problems in artificial intelligence theory.
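The step-by-step approximation described above can be sketched in Python. This is only a minimal illustration of the greedy idea, not the authors' implementation: the function names, the toy discrete sample and the set of threshold classifiers are all assumptions. At each step the elementary classifier with the fewest errors on the current subsample becomes the next vertex, and only the path carrying the remaining errors (failures) is extended.

```python
# Hypothetical sketch of greedy algorithmic-tree construction: each vertex is
# an autonomous elementary classifier; only the error-carrying path is refined.

def build_algorithmic_tree(X, y, classifiers, depth=0, max_depth=10):
    """Approximate the sample (X, y) by a chain of elementary classifiers."""
    # Select the elementary classifier with the fewest misclassifications
    # on the current subsample.
    best = min(classifiers, key=lambda c: sum(c(x) != t for x, t in zip(X, y)))
    misses = [(x, t) for x, t in zip(X, y) if best(x) != t]
    node = {"clf": best, "miss": {tuple(x) for x, _ in misses}, "refine": None}
    # Extend only the branch where errors (failures) remain.
    if misses and depth < max_depth:
        node["refine"] = build_algorithmic_tree([x for x, _ in misses],
                                                [t for _, t in misses],
                                                classifiers, depth + 1, max_depth)
    return node

def classify(node, x):
    # Objects the current vertex fails on are handed to the refining vertex.
    if node["refine"] is not None and tuple(x) in node["miss"]:
        return classify(node["refine"], x)
    return node["clf"](x)

# Toy discrete sample: no single threshold classifier is error-free,
# but a two-vertex algorithmic tree recognizes the whole sample.
X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 1, 1, 0, 0]
classifiers = ([lambda v, t=t: int(v[0] >= t) for t in range(7)] +
               [lambda v, t=t: int(v[0] < t) for t in range(7)])
tree = build_algorithmic_tree(X, y, classifiers)
assert all(classify(tree, x) == t for x, t in zip(X, y))
```

In this sketch, the structural complexity mentioned above would correspond to counting the vertices and tiers of the resulting chain; a bound such as `max_depth` plays the role of a fixed constraint on the tree structure.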
Results. The developed method of building algorithmic tree models works with large training samples of different types of information (discrete data), provides high speed and economical use of hardware resources while generating the final classification scheme, and builds classification trees with predetermined accuracy.
Conclusions. An approach to the synthesis of new recognition algorithms (schemes) based on a library (set) of already known algorithms (methods) and schemes has been developed: an effective scheme for recognizing discrete objects based on step-by-step evaluation and selection of classification algorithms (generalized features) at each step of the scheme synthesis. Based on the suggested concept of algorithmic classification trees, an algorithmic tree model was built that classifies flood situations for the Uzh river basin.
Copyright (c) 2022 І. Ф. Повхан, О. В. Міца, О. Ю. Мулеса, В. В. Поліщук
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.