PROBLEM OF A DISCRETE DATA ARRAY APPROXIMATION BY A SET OF ELEMENTARY GEOMETRIC ALGORITHMS
DOI:
https://doi.org/10.15588/1607-3274-2021-3-10
Keywords:
algorithmic classification tree, image recognition, classification, classification algorithm, branching criterion, geometric algorithm.
Abstract
Context. This paper solves the problem of approximating a discrete data array by a set of elementary geometric algorithms and of representing the recognition model as an algorithmic classification tree. The object of the study is the concept of a classification tree in the form of an algorithm tree. The subject of the study is the relevant models, methods, algorithms and schemes for constructing different classification trees.
Objective. The goal of this work is to create a simple and efficient method and algorithmic scheme for building tree-like recognition and classification models based on algorithm trees for large training samples of discrete information. Such models have a modular structure of independent recognition algorithms, each assessed against the initial training sample, and are intended for a wide class of applied tasks.
Method. A scheme for synthesizing a classification tree (algorithm tree) is proposed. It is based on approximating the data array by a set of elementary geometric algorithms and constructs a tree-like structure (the algorithmic classification tree, or ACT, model) for a preset initial training sample of arbitrary size. The ACT model consists of a set of autonomous classification/recognition algorithms, each assessed against the initial sample at every step of the ACT construction. A method of algorithmic classification tree construction has been developed whose basic idea is the step-by-step approximation of an initial sample of arbitrary size and structure by a set of elementary geometric classification algorithms. When forming the current vertex, node and generalized attribute of the algorithm tree, the method selects the most effective, highest-quality elementary classification algorithms from the initial set and fully constructs only those paths in the ACT structure where most of the classification errors occur. The resulting synthesis scheme and ACT model considerably reduce the size and complexity of the tree. The structural complexity of an ACT is assessed by the number of transitions, vertices and tiers in its structure, which improves the quality of its subsequent analysis, provides an efficient decomposition mechanism and allows the ACT structure to be built under a fixed set of constraints. The algorithm tree synthesis method makes it possible to construct tree-like recognition models of different types, with various sets of elementary classifiers and a preset accuracy, for a wide class of problems in artificial intelligence theory.
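The step-by-step idea described in the Method paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pool of elementary classifiers here is assumed to be nearest-centroid rules over the full feature space and over single coordinates, and the names `fit_centroid`, `build_act`, `classify` and `complexity`, the `max_depth` limit and the toy data are all hypothetical. At each vertex the sketch picks the pool member with the fewest errors on the current sample and expands only those branches that still contain misclassified objects.

```python
from collections import Counter
from math import dist


def fit_centroid(X, y, dims):
    """Assumed elementary classifier: nearest class centroid on coordinates `dims`."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append([x[d] for d in dims])
    centroids = {c: [sum(col) / len(pts) for col in zip(*pts)]
                 for c, pts in groups.items()}
    return lambda p: min(centroids,
                         key=lambda c: dist([p[d] for d in dims], centroids[c]))


def build_act(X, y, depth=0, max_depth=5):
    """Build a tree whose inner vertices hold the best elementary classifier."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or depth == max_depth:
        return {"leaf": majority}
    n = len(X[0])
    pool = [tuple(range(n))] + [(d,) for d in range(n)]  # hypothetical pool
    best = None
    for dims in pool:
        clf = fit_centroid(X, y, dims)
        preds = [clf(x) for x in X]
        errors = sum(p != t for p, t in zip(preds, y))
        if best is None or errors < best[0]:
            best = (errors, clf, preds)
    _, clf, preds = best
    node = {"clf": clf, "children": {}}
    for label in set(preds):
        idx = [i for i, p in enumerate(preds) if p == label]
        sub_y = [y[i] for i in idx]
        if all(t == label for t in sub_y) or len(idx) == len(y):
            # Error-free branch (or no progress): close it with a leaf.
            node["children"][label] = {"leaf": Counter(sub_y).most_common(1)[0][0]}
        else:
            # Only branches that still contain errors are expanded further.
            node["children"][label] = build_act(
                [X[i] for i in idx], sub_y, depth + 1, max_depth)
    return node


def classify(node, p):
    """Follow the elementary classifiers from the root down to a leaf."""
    while "leaf" not in node:
        label = node["clf"](p)
        child = node["children"].get(label)
        if child is None:  # branch never seen during training
            return label
        node = child
    return node["leaf"]


def complexity(node, tier=1):
    """Return (vertices, transitions, tiers) of the tree."""
    if "leaf" in node:
        return 1, 0, tier
    vertices, transitions, tiers = 1, 0, tier
    for child in node["children"].values():
        v, t, d = complexity(child, tier + 1)
        vertices, transitions, tiers = vertices + v, transitions + t + 1, max(tiers, d)
    return vertices, transitions, tiers


# Toy sample: class "A" forms two separate groups, so one centroid rule is not enough.
X = [(0, 1), (1, 0), (2, 1), (10, 0), (5, 0), (6, 1)]
y = ["A", "A", "A", "A", "B", "B"]
tree = build_act(X, y)
print([classify(tree, p) for p in X], complexity(tree))
```

On this toy sample the root classifier misplaces only the object at (10, 0), so only the branch containing it receives a second vertex. The `complexity` helper counts vertices, transitions and tiers, the three quantities the abstract names as measures of structural complexity.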
Results. The method developed and presented in this work for approximating a discrete training sample by a set of elementary geometric algorithms has been implemented in software. It was studied and compared with the logical classification tree method based on elementary attribute selection on a real geological data recognition problem.
Conclusions. Both the general analysis and the experiments carried out in this work confirm that the developed mechanism for constructing algorithm tree structures is workable and show its promise for solving a wide range of applied recognition and classification problems. Further studies and trials may involve creating algorithmic classification tree methods of other types with other initial sets of elementary classifiers, optimizing their software implementations, and experimentally studying the method on a wider range of applied problems.
License
Copyright (c) 2021 I. F. Povkhan, O. V. Mitsa, O. Y. Mulesa, O. O. Melnyk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Creative Commons Licensing Notifications in the Copyright Notices
The journal allows the authors to hold the copyright without restrictions and to retain publishing rights without restrictions.
The journal allows readers to read, download, copy, distribute, print, search, or link to the full texts of its articles.
The journal allows reuse and remixing of its content in accordance with the Creative Commons license CC BY-SA.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License CC BY-SA that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.