THE GENERAL CONCEPT OF THE METHODS OF ALGORITHMIC CLASSIFICATION TREES

Context. This paper considers the general problem of constructing logical recognition (classification) trees in the theory of artificial intelligence. The object of the study is the concept of the classification tree, both logical and algorithmic. The subject of the study is the current methods and algorithms for constructing algorithmic classification trees. Objective. This work aims to create a simple and effective method for constructing tree-like recognition models based on algorithmic classification trees for training sets of discrete information. The resulting models have the structure of logical classification trees built from independent classification algorithms, each evaluated by a functional that calculates its overall efficiency. Method. A general method for constructing algorithmic classification trees is proposed. For a given initial training dataset it builds a tree-like structure (a classification model) consisting of a set of autonomous classification and recognition algorithms that are evaluated at each step (stage) of model construction. The main idea of the method is to approximate, step by step, an initial dataset of arbitrary size and structure using a set of independent classification algorithms. When forming the current vertex of the algorithmic tree (a node, a generalized feature), the method selects the most effective (high-quality) autonomous classification algorithms for the initial dataset. In constructing the resulting classification tree, this approach can significantly reduce the size and complexity of the tree (the total number of branches, vertices and tiers of the structure) and improve the quality of its subsequent analysis (interpretability) and the possibility of its decomposition.
The proposed method makes it possible to build different types of tree-like recognition models for a wide class of problems in the theory of artificial intelligence. Results. The algorithmic classification tree method developed and presented in this work was implemented in software and compared with the methods of logical classification trees (based on the selection of a set of elementary features) on the problem of recognizing real geological data. Conclusions. The experiments described in this paper confirm that the proposed mathematical software is functional and show that it can be used for a wide range of practical recognition and classification problems. Prospects for further research and approbation include developing a limited method of the algorithmic classification tree, which introduces a criterion for stopping tree construction based on the depth of the structure; optimizing its software implementations; introducing new types of algorithmic trees; and experimentally studying the method on a wider range of practical problems.

$a_i$ is a fixed independent classification and recognition algorithm in the scheme of the algorithmic classification tree;
$G$ is an initial set of signals (discrete objects);
$R$ is a partitioning of the initial dataset $G$ into classes (patterns) $H_i$;
$f_R$ is a recognition function (RF) defined on the initial dataset $G$;
$x_i$ is a discrete object (signal) of the initial TS;
$H_i$ is a pattern (class) specified in the initial TS;
$(x_i, f_R(x_i))$ is a training pair of the initial TS;
$m$ is the total number of training pairs (objects of known classification) of the initial TS;
$M$ is the total number of independent classification algorithms $a_i$ in the set;
$k$ is the total number of classes (patterns) of the set of signals $G$;
$n$ is the total number of features (attributes) of a problem (feature space dimension);
$l$ is the value of class membership of a discrete object $x$;
[symbol missing] is the total number of all types of vertices in the structure of the algorithmic classification tree model;
$O_{Uz}$ is the total number of generalized features used in the classification tree model;
$P_{All}$ is the total number of transitions between the vertices in the structure of the constructed classification tree model;
$N_{All}$ is the total number of different classification algorithms used in the classification tree model;
$I_{Main}$ is the indicator of generalizing the data of the initial TS using the classification tree;
[symbol missing] is the integral indicator of the quality of the algorithmic classification tree model.

INTRODUCTION
Today, rapid scientific and technological progress confronts the engineer with a fundamentally important problem that often arises when working with large amounts of data: the efficient automatic construction of systems for processing large volumes of information, decision-making systems and data analysis systems. Solving this problem makes it possible to transfer the hard work of designing a complex system to a computer and to release the engineer's creative potential for other, more important and relevant problems. Moreover, solving this problem within the theory of artificial intelligence, together with automating the algorithmic and software construction of specific recognition systems in the form of logical classification tree (LCT) and algorithmic classification tree (ACT) models, is the key to their high efficiency on real problems and will consequently support the rapid development of various fields of science and technology [1].
Information technologies based on mathematical models of pattern recognition in the form of tree models are widely used in systems of processing and analyzing arrays of information. Apparently, this is due to the fact that this approach eliminates a set of shortcomings typical of classical methods and achieves a fundamentally new result efficiently and rationally using the power of computer information systems [2].
Almost four thousand recognition methods and algorithms based on various approaches and concepts are known today, but all of them have certain limitations in use: accuracy, speed of operation, memory. Each existing classification algorithm is limited to the specifics of certain application problems (universality constraints), and this is certainly the weakest point not only of these algorithms but also of the information systems built on the respective concepts [3,4].
Decision trees, namely the structures of algorithmic classification trees are the object of the study.
The representation of training sets (TS; arrays of discrete information) containing large amounts of data in the form of logical (algorithmic) tree structures has significant advantages in terms of economical description and analysis of the data and efficient mechanisms (procedures) for working with them [4,5]. Covering the TS with a set of elementary features in the case of the LCT, or with a fixed set of autonomous recognition and classification algorithms in the case of the ACT, generates a fixed tree-like data structure (a tree model) that also compresses and converts the initial training dataset. This approach significantly optimizes and saves the hardware resources of the information system [6][7][8][9].
The field of application of the concept of decision trees (LCT / ACT) is currently extremely extensive; yet many of the tasks and problems solved with this instrument can be reduced to three basic segments: problems of describing data structures, recognition and classification problems, and regression problems [10]. The vast majority of modern methods for constructing classification trees follow the scheme known in the literature as 'divide and conquer'. When this scheme is applied, the classification tree is constructed from top to bottom [11].
An arbitrary classification tree structure (LCT / ACT) consists of branches and nodes. The branches carry labels (attributes, values) on which the target function depends (in the case of the ACT, independent classification algorithms or sets of generalized features, GFs), and the nodes (vertices) contain the values of the recognition function (RF), that is, the values of class membership, or the extended attributes of transitions.
The central issues of the concept of classification trees remain the ones linked to the choice of the branching criterion (constructing or selecting vertices), the branching stopping criterion (constructing the classification tree structure), and the criterion for rejecting branches of the logical tree (subtrees). This gives rise to the fundamental issue of the theory of classification trees which consists in the possible construction of all variants of logical trees that correspond to the initial TS and the selection of the minimal classification tree according to the depth, structural complexity (the number of tiers) [12][13][14][15].
The methods and algorithms for constructing algorithmic classification trees (decision trees) are the subject of research.
An important point about the existing methods of processing training sets (discrete arrays) in recognition problems is that, when classification rules are built, they do not allow the complexity of those rules (the parametric complexity of GFs) to be regulated while the model is constructed [16][17][18][19]. This shortcoming is absent from the methods of constructing recognition systems that are based on the concept of algorithmic classification trees (decision trees). A distinctive feature of the algorithmic tree method is that many known recognition algorithms (methods) can be used jointly to construct the recognition scheme for each specific problem. The concept of the ACT rests on a single methodology: the optimal approximation of the TS by a set of generalized features (autonomous algorithms) that are included in some scheme (operator) constructed in the training process [20].
The objective of the paper is to develop a simple, efficient and universal method of constructing models of classification (recognition) based on the concept of the ACT for discrete arrays, where the obtained schemes of classification systems are characterized by a tree-like structure and the presence of autonomous classification algorithms (the sets of GFs) as their structural elements.

PROBLEM STATEMENT
Suppose there is an initial TS given as a sequence of training pairs:

$$(x_1, f_R(x_1)), (x_2, f_R(x_2)), \dots, (x_m, f_R(x_m)). \quad (1)$$

Here the objects $x_i \in G$ ($G$ is some set), and $f_R(x_i)$ is the value of the RF for the object $x_i$ ($i = 1, 2, \dots, m$). In addition to the initial TS, a test set (a set of objects of known class membership) is also specified as some part of the initial TS.
The RF $f_R$ is a finite-valued function that specifies the initial partitioning $R$ of the set $G$ into subsets (patterns, classes) $H_1, H_2, \dots, H_k$. Hence, according to this specification, a TS is a fixed sequence of sets (discrete objects), each of which consists of the values of some features (attributes) together with the value of a function (the RF) on that set. The set of feature values is then a certain image (a discrete object), and the value of the RF assigns this image to the corresponding pattern [21][22][23].
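As a minimal sketch of how such a training set might be held in memory, assume that each object is a tuple of discrete feature values and that the RF value is an integer class label. The names `TrainingPair` and `split_test_set`, the example values, and the rule for choosing the test set are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple, Sequence


class TrainingPair(NamedTuple):
    x: tuple[int, ...]   # discrete object x_i: a tuple of n feature values
    label: int           # RF value f_R(x_i): the class of x_i


def split_test_set(ts: Sequence[TrainingPair], every: int) -> list[TrainingPair]:
    """Take every `every`-th pair of the TS as the test set (an assumed rule)."""
    return [pair for i, pair in enumerate(ts) if i % every == 0]


ts = [
    TrainingPair((0, 1, 1), 0),
    TrainingPair((1, 1, 0), 1),
    TrainingPair((1, 0, 0), 1),
    TrainingPair((0, 0, 1), 0),
]
m = len(ts)                        # total number of training pairs
n = len(ts[0].x)                   # feature space dimension
k = len({p.label for p in ts})     # number of classes present in the TS
test_set = split_test_set(ts, every=2)
```

Here the test set is a part of the initial TS, as in the problem statement; how that part is chosen in practice is not specified in the text.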
Thus, the paper deals with the problem of constructing an ACT model $L$ with parameters $p$ whose structure is optimal in relation to the initial training dataset.

REVIEW OF THE LITERATURE
This research continues a series of works devoted to the problematics of tree-like recognition schemes (classification models) of discrete objects [4][5][6][7]. They highlight the issues of constructing, using and optimizing the structures of classification trees. As mentioned in [5], the resulting rule of classification (scheme), which is built with the help of an arbitrary method or algorithm of branched selection of features, has a tree-like logical structure, and the logical tree consists of vertices (features, attributes), which are grouped by tiers and which are obtained at a certain step (stage) of constructing the recognition tree [24]. An important task that arises from [20] is the one linked to synthesizing recognition trees which will actually be represented by a tree (a graph) of algorithms (ACT methods). In contrast to the existing methods, the main feature of tree-like recognition systems is that the importance of individual features (groups of features or algorithms) is determined in reference to the function that specifies the partition of objects into classes [23]. Thus, work [15] is devoted to the principal issues concerning the generation of decision trees in the case of uninformative features, estimating the quality of the constructed models. The ability of classification tree structures to perform one-dimensional branching (the selection of features, attributes) for analyzing the impact (importance, quality) of individual variables (vertices) makes it possible to work with variables of different types in the form of predicates, generalized features, in the case of ACTs -with the respective autonomous classification and recognition algorithms. This concept of classification trees is actively used in data mining where the ultimate goal is to synthesize the model (fixed scheme), which predicts the value of the target variable based on the set of the initial data (training datasets) at the input of the system [26].
A number of algorithms implementing the concept of decision trees (classification trees) are in use today, of which two, C4.5 and CART, are the most widespread. As its node (vertex) selection criterion, the C4.5 logical tree algorithm employs an information-theoretic criterion, whereas the CART algorithm is based on calculating the Gini index, which takes into account the relative distances (within the metric) between class distributions [27][28][29][30].
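The two families of node selection criteria can be illustrated with a short sketch. It computes Shannon entropy and Gini impurity for a list of class labels and the impurity decrease produced by a split. The function names and toy data are illustrative; C4.5 additionally normalizes the information gain into a gain ratio, which is omitted here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def split_score(impurity, parent, children):
    """Impurity decrease obtained by splitting `parent` into `children`."""
    weighted = sum(len(ch) / len(parent) * impurity(ch) for ch in children)
    return impurity(parent) - weighted


parent = [0, 0, 1, 1]
perfect = [[0, 0], [1, 1]]     # a split that separates the classes
useless = [[0, 1], [0, 1]]     # a split that leaves them mixed

info_gain_perfect = split_score(entropy, parent, perfect)
info_gain_useless = split_score(entropy, parent, useless)
gini_gain_perfect = split_score(gini, parent, perfect)
```

Both criteria rank the separating split above the mixed one; they differ in how strongly they penalize intermediate class mixtures.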
Since the main idea of the methods and algorithms of branched selection of features (for ACTs, of algorithms as vertices) can be defined as the optimal approximation of some initial TS by a set of ranked classification algorithms (or, for LCTs, of object features and attributes), the central problem is the choice of an efficient branching criterion (the selection of vertices: attributes and features of discrete objects for LCTs, algorithms for ACTs). These fundamental problems are studied in [15, 21, 31-34]; they include evaluating the quality of individual discrete features, their sets and fixed junctions, all of which makes it possible to introduce an efficient branching mechanism.
It should be noted that the structure of classification tree models is characterized, in comparison with regular trees, by compactness on the one hand and by uneven occupancy (sparseness) of the tiers on the other [4]. The important issues that remain open include the convergence of the process of constructing classification trees by the methods of branched feature selection and the choice of the criterion for stopping the synthesis of a logical tree [19].
An important point is that the concept of classification trees does not preclude using as features (vertices of the structure) not only individual attributes (features) of objects but also their junctions (the idea of the generalized feature studied in [25]) and sets. If we go further and, instead of object attributes (features), select individual independent recognition algorithms (estimated on the training dataset) as branches, a new structure, the ACT, is obtained at the output [20]. It is the ACT structures that this paper focuses on.

MATERIALS AND METHODS
By analogy with the methods of approximating the TS using a set of estimated elementary features [3,9,15], we present the main idea of the methods of algorithmic classification trees, which lies in approximating the initial training dataset with the help of a set of autonomous classification algorithms of different types.
So, suppose there is an initial TS of general type (1), given as a sequence of training pairs of known classification (of power $m$), and some system (set) of independent and autonomous recognition (classification) algorithms $a_1, a_2, \dots, a_M$ for the initial TS. It is then necessary to introduce the sets that represent the partition of the training dataset by the respective classification algorithms $a_i$: [...]. It should be noted that here, to simplify the explanation, each of the autonomous classification algorithms $a_i$ and the values $f_R(x_j)$ [...]. Taking the above into account, and by analogy with the methods of selecting elementary feature sets, we can introduce the values $\rho_{a_1, \dots, a_i}$, which should be considered as a criterion of branching in the ACT structure: [...]. Here [...], and the ratio (4) represents some classification rule; clearly, the larger the value of $\rho_{a_1, \dots, a_i}$, the higher its efficiency.
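The idea of estimating and ranking a set of algorithms on the TS can be sketched as follows. The sketch assumes, purely for illustration, that the efficiency of an algorithm is the fraction of training pairs it classifies correctly; the branching functional used in the paper may differ. The names `efficiency` and `rank_algorithms` and the toy algorithms are hypothetical.

```python
def efficiency(algorithm, ts):
    """Fraction of training pairs (x, label) that `algorithm` classifies correctly."""
    hits = sum(1 for x, label in ts if algorithm(x) == label)
    return hits / len(ts)


def rank_algorithms(algorithms, ts):
    """Return algorithm names sorted from most to least efficient on `ts`."""
    return sorted(
        algorithms,
        key=lambda name: efficiency(algorithms[name], ts),
        reverse=True,
    )


ts = [((0, 1), 0), ((1, 1), 1), ((1, 0), 1), ((0, 0), 0)]
algorithms = {
    "first_feature": lambda x: x[0],    # predicts the class from feature 0
    "second_feature": lambda x: x[1],   # predicts the class from feature 1
    "always_zero": lambda x: 0,         # trivial baseline
}
ranking = rank_algorithms(algorithms, ts)
```

On this toy TS the first algorithm reproduces every label, so it is ranked first; the other two agree with the RF on half of the pairs.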
Since the initial TS is the only information that represents the partitioning of patterns, [...] definitely refers to that class (pattern) for which the following ratio is fulfilled: [...]. Suppose there is an initial TS of type (1); then let us consider that the RF [...] (4). At the next stage, the following is obtained from the formulae (9) and (8): [...]. Further, the recognition function (5) is denoted by $F_0$, and this function is specified by the ratio $F_0(a_1, a_2, \dots, a_n) = l$. Then, based on the formulae (10) and (12), we immediately get the following: [...]. Hence, for all RFs [...] it follows from the formulae (13) and (11): [...].

It should be noted that the result of running each fixed autonomous classification and recognition algorithm $a_i$ (selected from the library of algorithms of some information system) at the respective step of generating the ACT is one or several generalized features $f_j$ (certain classification rules), which in fact describe (approximate) a certain part of the initial training set. For example, for the known geometric recognition algorithms [2], the resulting generalized features are geometric objects that cover the TS in the $n$-dimensional feature space of the problem.
It is clear that in real examples there may be cases when a classification algorithm $a_i$ cannot construct a generalized feature $f_j$, either because of the complex arrangement of the classes $H_k$ in the feature space of the problem or because of specific conceptual and implementation constraints of the classification algorithm itself. By analogy with the LCT, it is also possible that the generalized features $f_j$ constructed by the classification algorithm $a_i$ approximate the initial TS incompletely, or that such a situation is built into the algorithm scheme of generating the ACT (for example, an initial restriction in the scheme that no more than one generalized feature $f_j$ is generated at each stage of constructing the ACT model).
The objects of the initial TS that do not fall under the constructed scheme of approximating the set by the sequence of generalized features $f_j$ (at the final stage of synthesizing the ACT) are referred to as rejections (errors) of classification of the first type, $En_{tr}$; similarly, for the test dataset the incorrectly classified discrete objects are referred to as errors of the first type, $Et_{tr}$. Given all the above, we can assume that the ACT structure (type I) has the general form shown in Fig. 1, where each tier of the classification tree corresponds to a stage of constructing the ACT in which some part of the TS is approximated by the current classification algorithm $a_i$. This approach makes it possible to adjust the final complexity (accuracy) of the obtained classification tree model.
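Counting rejections and errors for a sequence of generalized features can be sketched as below. The sketch assumes that each generalized feature is a rule that returns a class label for objects it covers and `None` otherwise, and that an object is classified by the first rule that covers it. The rules, thresholds and names are hypothetical.

```python
def classify(chain, x):
    """Return the label from the first rule in `chain` that covers x, else None."""
    for rule in chain:
        label = rule(x)
        if label is not None:
            return label
    return None


def count_errors(chain, pairs):
    """Return (rejected, wrong): pairs no rule covers, and pairs labeled incorrectly."""
    rejected = sum(1 for x, _ in pairs if classify(chain, x) is None)
    wrong = sum(1 for x, y in pairs if classify(chain, x) not in (None, y))
    return rejected, wrong


# Two toy generalized features over a single numeric feature.
low = lambda x: 0 if x[0] <= 3 else None
mid = lambda x: 1 if 4 <= x[0] <= 7 else None
chain = [low, mid]

training = [((1,), 0), ((2,), 0), ((5,), 1), ((6,), 1), ((9,), 0)]
test = [((3,), 0), ((4,), 0), ((10,), 1)]

train_rejected, train_wrong = count_errors(chain, training)
test_rejected, test_wrong = count_errors(chain, test)
```

In this toy case the training object with value 9 is covered by no rule and counts as a rejection, while on the test set one object is rejected and one is labeled incorrectly.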
Within each step of generating the ACT model (Fig. 1), a specific classification algorithm $a_i$ and the respective TS (or a subset of the initial TS) are given. The initial TS is provided in full only at the first step; at subsequent steps the power of the training dataset decreases, because the constructed GFs $f_j$ cut off (describe) some part of the initial training dataset. Depending on the structure of the ACT construction scheme and the peculiarities of the current algorithm $a_i$, more than one GF $f_j$ can be generated within each step. At the next stage we introduce two basic criteria for constructing the classification tree model: the criterion for stopping the branching procedure, $K_{stop}$ (which in effect adjusts the complexity and accuracy of the obtained ACT model), and the criterion for selecting branches, $W(a)$ (the choice of a classification algorithm at the current step).
Thus, based on the above, it is advisable to introduce $K_{stop}$, the criterion for stopping the branching process of the ACT construction procedure. It is of boolean type and consists in checking the power $P_{pt}(TS)$ of the training dataset: [...]. The procedure of constructing a classification tree continues unless $K_{stop} = 1$. [...] (15) In the formula (15), summation is carried out over all classes provided by the data array of the initial TS (although there can be restrictions on the summation associated with the structure (parameters) of the algorithm for constructing the classification tree).
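A stopping check of this kind can be sketched as follows. The sketch assumes that the power of the current TS is the sum, over all classes, of the number of remaining training pairs, and that the criterion fires when this power is zero (an empty TS). The names `ts_power` and `k_stop` are hypothetical, and the loop body is only a stand-in for a real ACT step.

```python
from collections import Counter


def ts_power(ts):
    """Power of the TS: per-class counts of remaining pairs, summed over all classes."""
    per_class = Counter(label for _, label in ts)
    return sum(per_class.values())


def k_stop(ts):
    """Boolean stopping criterion: True once no training pairs remain."""
    return ts_power(ts) == 0


remaining = [((0, 1), 0), ((1, 1), 1), ((1, 0), 1)]
steps = 0
while not k_stop(remaining):
    remaining = remaining[1:]   # stand-in for one ACT step cutting off part of the TS
    steps += 1
```

Restricting the summation to a subset of classes, as the text allows, would amount to filtering `per_class` before summing.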
An important point in the scheme of building the ACT model (Fig. 1) is that at each step of the tree algorithm a fixed number of GFs (one or more, depending on the structure of the ACT algorithm itself) is constructed. [...]
a) The option when the estimation of the efficiency and the ranking of the set of classification algorithms are done only once within this stage, and then at each step of constructing the ACT the next algorithm $a_i$ of the initial sequence is fixed to approximate the data. This approach significantly saves the hardware resources of the information system, but negatively affects the complexity of the obtained classification tree model.
b) The option when the estimation of the efficiency and the ranking of the set of classification algorithms $(a_1, a_2, \dots, a_M)$ are done at each step of constructing the ACT on the corresponding subset (part) of the initial TS, in order to identify the most efficient (highest-quality) classification algorithm for this part of the TS (this step of generating the ACT). This approach completes the approximation of the TS in fewer steps and yields a more economical ACT structure than option (a); however, it requires much more hardware resources of the information system for the second stage of the scheme of constructing the ACT, and requires considerable attention and the introduction of a set of restrictions on the initial [...] is chosen as the vertex of the second tier, and the procedure of constructing the GFs of the third stage is repeated, with the only difference that the input is now the reduced TS without the training pairs approximated by the GF of the first-tier vertex, and so on. The procedure of constructing the ACT thus comes down to repeating this stage for the next most efficient algorithm $a_i$ of the sequence $(a_1, a_2, \dots, a_M)$, constantly cutting off parts of the TS and checking the branch stopping criterion (an empty TS), which signals the completion of the procedure and yields at the output a tree of classification algorithms $a_i$ as well as a tree of generalized features $f_j$.
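The step-by-step procedure with re-ranking at every step (option b) can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not the implementation from the paper: each algorithm returns a class label or `None` (the object lies outside its generalized feature), efficiency is the number of remaining pairs classified correctly, and one vertex is added per step. The name `build_act`, the optional `max_steps` limit and the toy algorithms are hypothetical.

```python
def build_act(algorithms, ts, max_steps=None):
    """Greedy sketch of ACT construction with re-ranking at every step.

    algorithms: dict name -> callable(x) returning a class label or None.
    ts:         list of (x, label) training pairs.
    Returns (tiers, rejected): chosen algorithm names in tier order and the
    training pairs that no chosen algorithm approximates.
    """
    remaining = list(ts)
    tiers = []
    while remaining and (max_steps is None or len(tiers) < max_steps):
        def covered(name):
            # Pairs of the current (reduced) TS that this algorithm gets right.
            return [(x, y) for x, y in remaining if algorithms[name](x) == y]

        best = max(algorithms, key=lambda name: len(covered(name)))
        cut = covered(best)
        if not cut:          # no algorithm approximates any remaining pair
            break
        tiers.append(best)
        remaining = [pair for pair in remaining if pair not in cut]
    return tiers, remaining


ts = [((1,), 0), ((2,), 0), ((5,), 1), ((6,), 1), ((9,), 0)]
algorithms = {
    "low": lambda x: 0 if x[0] <= 3 else None,
    "mid": lambda x: 1 if 4 <= x[0] <= 7 else None,
    "high": lambda x: 0 if x[0] >= 8 else None,
}

tiers, rejected = build_act(algorithms, ts)
short_tiers, short_rejected = build_act(algorithms, ts, max_steps=2)
```

With no step limit the loop ends when the TS is empty; with `max_steps=2` the tree is shallower and one training pair is left as a rejection, which illustrates the trade-off between structural complexity and accuracy.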
It should be mentioned that there are other implementation options of the scheme for constructing the ACT of the first type, which differ from the proposed scheme by variations in the number of GFs that are built at each step, the criteria and sequence of stages for assessing the quality of classification algorithms, the possibility of using a limited number of algorithms (even one), the possibility of approximating each of the classes of the TS using the set of its selected algorithms, the possibility of varying the criterion for stopping branching in the ACT.
Finally, the main peculiarity of this scheme is that the accuracy of the classification tree model can be adjusted during the basic procedure of constructing the ACT; in principle, an ACT model can be constructed with a predetermined accuracy with respect to the data array of the initial TS. This is achieved by limiting the number of steps of the ACT generation procedure and by a system of restrictions on the information capacity, the number and the generalization parameters (the area of the approximated TS) of the generalized features constructed at the respective stages of building the resulting classification tree.

EXPERIMENTS
The scheme for building the ACT proposed in this work makes it possible to adjust the complexity of the classification tree model under construction and to build models with a predetermined accuracy; the classification tree structure consists of autonomous classification algorithms of different types as building blocks (components). The choice of a classification tree model from the set of constructed ACTs for a specific problem is determined by the parameters that are decisive for the current application problem (the training dataset). To compare ACT models and select one from a fixed set, it is therefore necessary to determine their most important parameters (feature space dimension, the number of vertices, transitions, algorithms, etc.) and their error with respect to the input array.
It is fundamentally important to consider the criteria of quality of the obtained classification tree models, which depend on the model error, the power of the initial data array of the TS, the size of the test set (the number of training pairs and the dimension of the attribute space of the problem), the number of model parameters, etcetera [15,27].
Let us underline that this integral indicator of the quality of the ACT model will take values between zero and one. The lower it is, the worse the quality of the constructed classification tree is, and the higher the indicator, the better model is obtained.
An important indicator that characterizes the basic properties of the obtained ACT models is the general indicator of the generalization of the data of the initial TS by the classification tree, $I_{Main}$; this indicator is calculated as follows: [...]. The proposed evaluation of the quality of the classification tree model (ACT) reflects the basic parameters (characteristics) of classification trees and can be applied as an optimality criterion when evaluating an arbitrary tree-like recognition scheme, for example in the case of the methods of constructing and selecting random LCTs from [24] (taking into account their respective structural parameters). A fundamental remaining task is to reduce the complexity of the classification tree structure and the consumption of memory $\lambda$ and CPU time $\tau$. It is also necessary to maximize the parameter $I_{Main}$ (the indicator of generalization of the ACT model), which yields the optimal structure of the classification tree and in effect provides the maximum compression of the data of the initial TS (the initial data array is represented by a logical tree of minimal structural complexity) [12].
At Uzhhorod National University, the "Orion III" software complex for generating autonomous recognition systems has been developed; its algorithmic library contains 11 recognition algorithms, including the algorithmic implementation of constructing the ACT described above. The task was to build an autonomous recognition system based on geological data (the problem of separating oil-bearing strata). The objects were described by 22 elementary features. The TS contains information about objects of two classes, and at the examination stage the constructed classification system (the ACT model) should ensure effective recognition of objects of unknown classification with respect to these two classes. Before work started, the training set was automatically checked for correctness (search for and removal of identical objects of different class membership, errors of the first kind).

RESULTS
The TS contains information on a partition into two classes; training pairs of the first class (oil-bearing strata) predominated in the training data array in the proportion 1.5 : 1. The TS consisted of 1250 objects, and the efficiency of the constructed recognition system (the ACT model) was evaluated on a test set of 240 objects. The test set was a separate part of the training set (consisting of discrete objects of known classification). The training and test sets were obtained from geological exploration in the Transcarpathian region during the period from 2001 to 2011.
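As a quick check of what these figures imply, assume that the stated proportion is exact and applies to all 1250 objects of the TS; $N_1$ and $N_2$ are illustrative symbols for the sizes of the first and second class, not notation from the paper.

```latex
\begin{aligned}
N_1 + N_2 &= 1250, \qquad \frac{N_1}{N_2} = 1.5 \\
N_2 &= \frac{1250}{1 + 1.5} = 500, \qquad N_1 = 1.5 \cdot 500 = 750 \\
\frac{240}{1250} &= 0.192
\end{aligned}
```

Under this assumption the TS holds about 750 objects of the first class and 500 of the second, and the test set amounts to roughly 19% of the size of the TS.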
A fragment of the main results of these experiments (comparative tests of the methods for constructing LCT / ACT models on the data array of this application problem) is presented in Table 1. The synthesized classification tree models of different types provided the level of accuracy, processing speed and working memory consumption required by the task, but differed in the structural complexity of the constructed classification trees (models) and, in the case of the algorithmic classification tree model, in the set of generalized features.

DISCUSSION
The proposed estimates of the quality of the ACT model capture the most important characteristics of classification trees and can be used as an optimality criterion in the procedure of constructing the ACT and in the final selection from a set of ACT models. Notably, the algorithmic tree method operates only with ready (constructed) generalized features, regardless of the algorithm (rule, method) by which they were obtained, and each scheme constructed with the algorithmic tree method is a general recognition system (an ACT model) that can be employed in practical work (processing large amounts of experimental data in the form of discrete sets). The resulting scheme is, to some extent, a new recognition algorithm (one synthesized from known algorithms and methods). The obtained ACT structure is characterized by a high degree of universality and a relatively compact model structure, but it requires large hardware costs to store the generalized features and to perform the initial assessment of the quality of the classification algorithms on the TS.

CONCLUSIONS
The problem of automation of constructing the models of algorithmic classification trees on the basis of approximating the TS using the set of independent classification algorithms has been solved in this paper.
The scientific novelty of the obtained results is that, for the first time, a simple method of constructing the ACT has been proposed that is based on evaluating and ranking a set of autonomous recognition algorithms to generate the classification tree structure (the ACT model). Within each branching step of the classification tree, a certain part of the TS (or of its subset) is approximated. The functional (branching criterion) proposed in the work can be used not only to estimate the quality of individual classification algorithms but also to calculate the efficiency of related sets of algorithms, which ultimately makes it possible to obtain a closer-to-optimal structure of the synthesized ACT for the initial training dataset. The paper also offers a set of general indicators (parameters) that efficiently represent the general characteristics of an ACT model and can be used to select the optimal ACT.
The applied value of the obtained results is that the proposed method of constructing ACT models was implemented in the library of algorithms of the universal software system "ORION III" to solve various practical problems of classification (recognition) of arrays of discrete objects. It must be underscored that the conducted practical tests confirm the functionality of the proposed ACT models and the developed software, which enables making a recommendation on the use of this approach and its software implementation for a wide range of application problems of classifying and recognizing discrete objects.
Further research is needed to develop constrained methods of algorithmic classification trees, to optimize the software implementations of the proposed method for constructing the ACT, and to carry out its practical approbation on a number of real classification and recognition problems.