THE GENERAL CONCEPT OF THE METHODS OF ALGORITHMIC CLASSIFICATION TREES
Keywords: algorithmic classification tree, pattern recognition, classification, classification algorithm, branching criterion.
Context. This paper considers the general problem of constructing logical recognition (classification) trees in the theory of artificial intelligence. The object of the study is the concept of the classification tree, in both its logical and algorithmic forms. The subject of the study is the current methods and algorithms for constructing algorithmic classification trees.
Objective. This work aims to create a simple and effective method for constructing tree-like recognition models, based on algorithmic classification trees, for a training set of discrete information. The method is characterized by a structure of logical classification trees obtained from independent classification algorithms, which are evaluated through a functional that calculates their overall efficiency.
Method. A general method of constructing algorithmic classification trees is proposed. For a given initial training dataset it builds a tree-like structure (a classification model) that consists of a set of autonomous classification and recognition algorithms, each evaluated at the corresponding step (stage) of model construction on that dataset. The main idea of the method is to approximate an initial dataset of arbitrary size and structure step by step with a set of independent classification algorithms. When forming the current vertex of the algorithmic tree (a node, or generalized feature), the method selects the most effective (high-quality) autonomous classification algorithms for the initial dataset. During construction of the resulting classification tree, this approach can significantly reduce the size and complexity of the tree (the total number of branches, vertices, and tiers of the structure) and improve the quality of its subsequent analysis (interpretability) and the possibility of its decomposition. The proposed method makes it possible to build different types of tree-like recognition models for a wide class of problems in the theory of artificial intelligence.
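The step-by-step idea described in the Method paragraph can be illustrated with a minimal Python sketch. It is not the author's implementation, and several details are assumptions made only for illustration: the function names, the two toy candidate algorithms, the rule that a candidate may abstain (return None) so that unanswered objects pass to the next node, the efficiency functional (correct answers minus wrong answers), and the max_nodes limit.

```python
from collections import Counter, defaultdict


def fit_value_rule(feature):
    """Hypothetical candidate: classify by one discrete feature.

    The fitted rule answers only for feature values that occur with a single
    class in the training data and abstains (returns None) otherwise.
    """
    def fit(X, y):
        classes_by_value = defaultdict(set)
        for row, label in zip(X, y):
            classes_by_value[row[feature]].add(label)
        table = {value: next(iter(classes))
                 for value, classes in classes_by_value.items()
                 if len(classes) == 1}
        return lambda row: table.get(row[feature])
    return fit


def fit_majority(X, y):
    """Hypothetical fallback candidate: always answer the most common class."""
    label = Counter(y).most_common(1)[0][0]
    return lambda row: label


def efficiency(predict, X, y):
    """Assumed efficiency functional: correct answers minus wrong answers."""
    score = 0
    for row, label in zip(X, y):
        answer = predict(row)
        if answer is not None:
            score += 1 if answer == label else -1
    return score


def build_algorithmic_tree(X, y, candidates, max_nodes=10):
    """Greedily chain the most efficient candidate at each step.

    Objects on which the chosen algorithm abstains are passed to the next node.
    """
    nodes = []
    remaining = list(zip(X, y))
    while remaining and len(nodes) < max_nodes:
        Xr = [row for row, _ in remaining]
        yr = [label for _, label in remaining]
        fitted = [fit(Xr, yr) for fit in candidates]
        best = max(fitted, key=lambda predict: efficiency(predict, Xr, yr))
        unanswered = [(row, label) for row, label in remaining
                      if best(row) is None]
        if len(unanswered) == len(remaining):
            break  # no candidate answers anything: stop growing the model
        nodes.append(best)
        remaining = unanswered
    return nodes


def classify(nodes, row):
    """Return the answer of the first node that does not abstain, else None."""
    for predict in nodes:
        answer = predict(row)
        if answer is not None:
            return answer
    return None


# Toy discrete training set (illustrative data, not from the paper).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
y = ["a", "a", "b", "c", "c", "c"]
tree = build_algorithmic_tree(
    X, y, [fit_value_rule(0), fit_value_rule(1), fit_majority])
print(len(tree), [classify(tree, row) for row in X])
```

On this toy set the first node is the rule on feature 0, which answers for the values 0 and 2, and the second node is the rule on feature 1, fitted only on the two objects the first node left unanswered. Chaining nodes through abstention is one simple way to make the training flow and the classification flow agree; other branching schemes are equally possible.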
Results. The algorithmic classification tree method developed and presented in this work was implemented in software, then studied and compared with logical classification tree methods (based on selecting a set of elementary features) on the problem of recognizing real geological data.
Conclusions. The experiments described in this paper confirm the functional efficiency of the proposed mathematical software and show that it can be used to solve a wide range of practical recognition and classification problems. Further research and testing may consist in developing a limited method of the algorithmic classification tree, whose main points include introducing a criterion for stopping tree construction based on the depth of the structure, optimizing its software implementations, introducing new types of algorithmic trees, and experimentally studying the method on a wider range of practical problems.
Copyright (c) 2020 I. F. Povkhan
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.