STATISTICAL DATA ANALYSIS TOOLS IN IMAGE CLASSIFICATION METHODS BASED ON THE DESCRIPTION AS A SET OF BINARY DESCRIPTORS OF KEY POINTS

Context. Modern computer vision systems require effective classification solutions grounded in the nature of the processed data. Statistical distributions are a well-suited tool for representing and analyzing visual data in image recognition systems. If the description of a recognized object is represented by a set of vectors, the statistical apparatus becomes fundamental for making a classification decision. The study of data distributions in systems of feature blocks for key point descriptors has shown its effectiveness in achieving the necessary classification quality and processing speed. There is a need for an in-depth study of the statistical properties of descriptor sets with respect to the main aspect: the separation of multidimensional data for classification. This task becomes especially important for constructing new effective feature spaces, for example, by aggregating a set of descriptors by their constituent components, including individual bits. To do this, it is natural to use statistical criteria designed to compare the distribution parameters of the studied samples. Despite the widespread use and applied effectiveness of feature descriptors for image classification, the statistical basis of these methods when implemented in aggregate visual data systems, and the choice of effective means to assess how well they distinguish real images in application databases, remain insufficiently studied. Objective. Development of an effective image classification method by introducing aggregate statistical features for the description components. Method. A metric image classifier is proposed that is based on feature aggregation for a set of image descriptors and uses statistical criteria to assess the significance of the classification decision. Results. The classification method is synthesized by introducing aggregated statistical features for the set of image descriptors. The efficiency and effectiveness of the developed classifier are confirmed experimentally on feature systems of real images. Conclusions. The study makes it possible to evaluate the applied effectiveness of key point descriptors and to build on their basis an aggregate feature system for the effective classification of visual objects. Our research has shown that the information available in the bit representation of the descriptors is sufficient for a statistically significant distinction between the descriptions of visual objects. Analysis of pairs and other blocks of descriptor bits provides a promising opportunity to reduce processing time. The scientific novelty of the study is the development of an image classification method based on an integrated system of statistical features for the structural description, together with confirmation of the effectiveness of the method and of the significance of the created feature system for classification in the image database. The practical significance of the work is the confirmation of the efficiency of the proposed methods on examples of real image descriptions.


INTRODUCTION
The use of statistical data science tools to build classifiers of visual object images in computer vision systems is aimed at providing the necessary performance by studying the properties, content and structure of reference data and introducing the obtained knowledge into the classification process [1][2][3][4][5][6]. When structural recognition methods are implemented in a vector data environment with real or binary components, an element of the image space is a finite set of key point (KP) descriptors of the image [2]. Recently, BRISK and ORB descriptors with binary components have become popular due to their low computational costs [3][4][5][6][13].
Statistical data distributions are a well-suited tool for representing and analyzing visual data in image recognition systems. If the description of a recognized object is given by a set of vectors, the statistical apparatus becomes fundamental for making a classification decision. Research on data distributions in systems of blocks for KP descriptors has shown its effectiveness in providing the required classification quality and processing speed [2]. There is a need for an in-depth study of the statistical properties of descriptor sets with respect to the main problem: the separation of multidimensional data for classification. This task becomes especially important when constructing new effective feature spaces, for example, by aggregating a set of descriptors by vector components [3,4,10]. For this purpose, it is natural to use statistical criteria designed to compare the distribution parameters of the studied samples.
The aggregator classifier organizes a new data space for a description given as a set of descriptors. It evaluates the similarity of the feature vectors of the recognized object and a single reference image, and classification is performed by optimizing the degree of this similarity.
A probabilistic model of generating vector data for visual object descriptors is a practical approach to formalizing classifier construction. Its essence is to build and study statistical distributions of objects or their components, with aggregation and optimization procedures introduced over multiple classes [1,6].
Despite the widespread use and practical effectiveness of KP descriptors for the classification of visual objects [2][3][4][5], the statistical basis of these methods and the choice of effective means to assess their effectiveness on real datasets remain unexplored [1,2].
The object of this research is the introduction of a statistical data analysis apparatus to build the image classifier based on the aggregate data representation as a set of KP descriptors and confirm its effectiveness.
The subject of the research is the synthesis of the classifier on the basis of aggregated features and statistical proof of the separation properties of these features for reference classes and examples of input images.
The aim is to develop a performance-efficient method of image classification by introducing aggregate features for the composition of the description components.

PROBLEM STATEMENT
Consider a multidimensional space $B^n$ of binary vectors of dimension $n$, in which we construct the descriptions of objects and reference images. The description $Z$ is defined on the basis of the KP descriptor set of the visual object as a finite set of binary vectors of dimension $n$: $Z = \{z_v\}_{v=1}^{s}$, $z_v \in B^n$. In more detail, we consider and analyze the description as a matrix of binary values of size $s \times n$. We traditionally consider classification as a mapping of the description to a class, where each class is represented by a reference description $E_j$, $j = 1, \dots, m$, available for analysis [2].
Let us study the classification of visual objects as the assignment of their description to one of the reference classes, based on the aggregate representation of the description data and using the tools and criteria of mathematical statistics. In the general case, the classification problem is formally reduced to establishing the degree of similarity of two vector sets with binary components of equal size. We build a secondary integrated system of features $\theta$ from the description. We use a metric approach to determine the degree of similarity of the feature values $\theta$ for the object and the reference images. The introduction of aggregate features significantly accelerates the classifier decision process: the gain in comparison with the traditional method of descriptor voting reaches hundreds of times [5,6]. Another task is to investigate the separation properties of the newly created system of features using traditional statistical criteria.

REVIEW OF THE LITERATURE
The formal definition of the classification problem with the image described as a set of KP descriptors is formulated in [2][3][4], which also study the advantages of implementing a structural description model in statistical classification methods [1][5][6][7][8][9]. It is noted that the primary problem is the excessive computational cost caused by large arrays of vector data. Articles [4-6, 9, 16, 17, 26] investigate statistical models for synthesizing feature space modifications that reduce the amount of computation, in particular the application of data aggregation methods by forming distributions and defining statistical data centers. Works [1,2,8,14] are devoted directly to the analysis of learning models for the fixed base of descriptions used in computer vision and to the definition of the membership function for a fixed system of classes.
Studies [1,2,8,15,23] contain results on the applied implementation of statistical approaches to the classification of visual images using ensemble processing. Methods of evaluating the effectiveness of intelligent systems using statistical and metric measures of similarity are described in [1][6][7][8][17][18][19][20]. The advantages of statistical solutions, such as high processing speed, sufficient resistance to distortions and the required level of classification efficiency, are discussed.
Works [11,12] are used as sources of traditional and modern methods of statistical evaluation, the book [15] describes applied features of software modeling, and sources [2-6, 10, 23] include the results of the authors' research on implementing statistical approaches to develop structural methods of image classification. In particular, [2] proposed technologies of component analysis and spatial processing for the classification of visual objects using statistical characteristics of the structural description of the image.

MATERIALS AND METHODS
We introduce a mapping $Z \to \theta$ from a fixed set $Z$ of binary vectors (the KP descriptors of a given object) into an integer vector $\theta = (\theta_1, \dots, \theta_N)$, whose components are calculated by a rule for which $N = n$ or $N = s$. This makes it possible to identify and distinguish visual objects on the basis of less data, as a set of vectors is transformed into a single vector [4].
We classify by estimating the differences between the vectors $\theta$ of different descriptions. Two different ways of calculating these vectors are proposed, which aim to take into account the structural features of the studied data and, as a result, ensure the efficiency of the recognition process.
According to the first method of determining the vectors $\theta$, we find the sum of binary values (the number of ones) separately for each bit with number $i$, $i = 1, \dots, n$, over the complete set of descriptors of the object description $Z$. For a fixed description we obtain a vector of the form
$$\theta^{(1)} = (\theta_1, \dots, \theta_n), \quad \theta_i = \sum_{v=1}^{s} z_{v,i}, \qquad (1)$$
where $z_{v,i}$ is the $i$-th bit of the $v$-th descriptor. The vector (1) is an aggregate parameter formed on the set of description descriptors by bitwise analysis of the data, namely by adding the values of the corresponding bits.
If we consider the number of ones in the $i$-th bit over the description to be distributed close to binomially, as determined by the Bernoulli formula
$$P(\theta_i = k) = C_s^k \, p_i^k (1 - p_i)^{s-k}, \quad k = 0, \dots, s,$$
then, as is known in mathematical statistics [11], $\theta_i$ can be interpreted as the average number of ones appearing in place of the $i$-th bit:
$$\theta_i \approx s \, p_i.$$
The vector $\theta^{(1)}$ obtained from the description $Z$ can therefore be considered an aggregated parametric representation of the description, where the parameters are the probabilities $p_i$ of a one occurring in the $i$-th component over the set $Z$. We introduce the representation $\theta^{(1)}$ into the classification process, as it significantly reduces computational costs by transforming data from a set of vectors into a single parameter vector [2][3][4].
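The column-wise aggregation of formula (1) and the associated bit probabilities can be illustrated with a minimal sketch. It assumes the description is stored as a list of equal-length lists of 0/1 values (one list per descriptor); the function names and the toy data are hypothetical and are not taken from the authors' software.

```python
from typing import List


def aggregate_by_columns(description: List[List[int]]) -> List[int]:
    """Sum the bits of every column of an s x n binary description matrix."""
    return [sum(column) for column in zip(*description)]


def bit_probabilities(description: List[List[int]]) -> List[float]:
    """Estimate the probability of a one in each bit position as theta_i / s."""
    s = len(description)
    return [theta_i / s for theta_i in aggregate_by_columns(description)]


# Toy description: s = 4 descriptors of n = 6 bits each
# (the experiment in the paper uses s = 500 and n = 512).
Z = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
]

theta = aggregate_by_columns(Z)  # [3, 1, 3, 4, 0, 1]
p = bit_probabilities(Z)         # [0.75, 0.25, 0.75, 1.0, 0.0, 0.25]
print(theta, p)
```

The whole description of s descriptors collapses into a single vector of n integers, which is the source of the reduction in computational cost mentioned above.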
Let $\theta(E_j)$ be the vector aggregated by the columns of the binary description matrix of the reference $E_j$, $j = 1, \dots, m$, according to (1), and let $\theta(O)$ be the aggregate vector for the description of the studied object $O$.
To compare aggregate descriptions of type (1) and thus solve the classification problem, we introduce the classifier given by expressions (2) and (3) [2], which assigns the object $O$ to the reference whose aggregate vector is most similar to $\theta(O)$. Classifier (2) implements the "object - reference image" analysis principle based on the aggregate vector representation $\theta$ [2]. We emphasize that expression (2) can be considered a decision rule based on the likelihood function, which, in contrast to its classical probabilistic representation [1], is expressed in terms of a metric, in particular the Manhattan metric.
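A metric decision rule of this kind can be sketched as follows. The sketch assumes that the class is chosen by the smallest Manhattan distance between the aggregate vector of the object and those of the references; any normalization into a similarity value used in expressions (2) and (3) is not reproduced here, and all names and numbers are illustrative.

```python
from typing import List, Sequence, Tuple


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Manhattan (city-block) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(theta_object: Sequence[int],
             reference_thetas: List[Sequence[int]]) -> Tuple[int, List[int]]:
    """Return the index of the nearest reference and all distances (assumed rule)."""
    distances = [manhattan(theta_object, ref) for ref in reference_thetas]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


# Hypothetical aggregate vectors of three references and of one object.
references = [
    [3, 1, 3, 4, 0, 1],
    [0, 4, 1, 1, 3, 2],
    [2, 2, 2, 0, 4, 4],
]
theta_object = [3, 1, 2, 4, 1, 1]

best, distances = classify(theta_object, references)
print(best, distances)  # 0 [2, 13, 12]
```

Each comparison costs one pass over a vector of n integers, instead of matching every descriptor of the object against every descriptor of the reference.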
To confirm the significance of the decision and to control the result of classifier (2) obtained with aggregate vectors, we use methods of mathematical statistics, namely a paired two-sample t-test for means [11,12]. It provides a pairwise (coordinate-wise) comparison of the studied objects, the vectors $\theta$, for a statistically significant difference in their mean values. This test considers two samples of the same size in which the elements have a fixed location (as coordinates).
In testing the null hypothesis that the means of these samples are equal, Student's statistic is used [11]. A significance level $\alpha$ is set, equal to the probability of making a type I error, i.e. rejecting the null hypothesis when it is true. From the initial data, the p-value is calculated as the smallest significance level at which the null hypothesis would be rejected. If the p-value is less than the chosen $\alpha$, the null hypothesis is rejected and the alternative hypothesis of a significant difference between the means is accepted (at significance level $\alpha$). Otherwise, there is no reason to reject the null hypothesis of no statistically significant difference between the means [11].
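The paired test procedure can be sketched with the Python standard library alone. The sketch rests on one stated assumption: because the samples are large (n = 512 in the example of this paper), the two-sided p-value is approximated with the standard normal distribution instead of Student's t distribution; a statistics package with an exact t distribution could be substituted. The data are synthetic and purely illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_test(x, y):
    """Return (t, p) for a two-sided paired test on equal-length samples.

    The p-value uses the standard normal distribution as a large-sample
    stand-in for Student's t distribution (an assumption of this sketch).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length of at least 2")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


# Synthetic aggregate vectors of dimension n = 512 (illustrative only).
n = 512
theta_ref = [100 + (37 * i) % 301 for i in range(n)]
noise = [2 if i % 2 == 0 else -2 for i in range(n)]
theta_close = [v + e for v, e in zip(theta_ref, noise)]     # zero mean shift
theta_far = [v + 10 + e for v, e in zip(theta_ref, noise)]  # mean shift of 10

alpha = 0.05
_, p_close = paired_t_test(theta_ref, theta_close)
_, p_far = paired_t_test(theta_ref, theta_far)

# The null hypothesis of equal means is rejected only when p < alpha.
print(p_close < alpha, p_far < alpha)
```

In this synthetic case the first comparison gives no reason to reject the null hypothesis, while the second rejects it, which mirrors the role the test plays in confirming the classifier decision.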
Based on the features $\theta_i$, it is possible to calculate higher-level features $u_k$ for data blocks as sets of columns [1]:
$$u_k = \sum_{i=(k-1)b+1}^{kb} \theta_i, \quad k = 1, \dots, q. \qquad (4)$$
In relation (4), $q = n / b$, where $b$ is the fragment size. Features (4) implement cross-correlation processing of the matrix $Z$ with a rectangular mask of size $s \times b$ [1,14]. As a result of calculation (4) we obtain an integer vector $u$ of dimension $q$. The parameter $q$ is a characteristic of the newly created system of fragments; it varies from $n$ to 1 as the fragment size increases from 1 to $n$.
The values of the vector $u = (u_1, \dots, u_k, \dots, u_q)$ can be used for classification as independent structural features of the statistical type. Given the simple model for calculating functions (4), they are all easy to determine for an arbitrary fragment size (logically or by adding integers). Based on representation (4), a hierarchical recognition method can also be used, which compares a system of features $u_k$ with different degrees of data integration against the reference set [2,16]. The range of integer values of the features $u_k$ is determined directly by the size of the fragment: $u_k \in \{0, \dots, sb\}$. Model (4) implements the procedure of reducing the information redundancy of the spatial signal through an allowable reduction of the resolution of the feature system [2,14].
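The block features of relation (4) can be sketched as sums over consecutive groups of aggregate values. The sketch assumes non-overlapping blocks and a vector length divisible by the fragment size b; the function name and the toy vector are hypothetical.

```python
from typing import List, Sequence


def block_features(theta: Sequence[int], b: int) -> List[int]:
    """Sum consecutive non-overlapping groups of b aggregate values.

    Assumes len(theta) is divisible by b, so the result has q = n / b entries.
    """
    if b <= 0 or len(theta) % b != 0:
        raise ValueError("fragment size must be positive and divide len(theta)")
    return [sum(theta[k:k + b]) for k in range(0, len(theta), b)]


theta = [3, 1, 3, 4, 0, 1]       # aggregate vector with n = 6

u1 = block_features(theta, 1)    # q = 6, identical to theta
u2 = block_features(theta, 2)    # q = 3: [4, 7, 1]
u3 = block_features(theta, 3)    # q = 2: [7, 5]
u6 = block_features(theta, 6)    # q = 1: [12]
print(u1, u2, u3, u6)
```

Increasing b from 1 to n shortens the feature vector from n entries to a single one, which gives the hierarchy of feature systems with different degrees of data integration.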
Note also that, for the sake of generality, it may be appropriate to perform an analysis of variance of the aggregate vectors constructed from the references $E_j$, $j = 1, \dots, m$, in order to ensure that the reduction of data dimensionality has not affected the differences within the reference set. Since the structure of the vectors aggregated by formula (1) involves paired samples, only nonparametric analysis of variance can be used here, for example the Friedman test [11].
The second way to calculate the vectors $\theta$ is to represent the components of the vector as the sum of unit bits separately for each binary descriptor of the description. In this case, by adding the elements of the rows of the description matrix, we obtain aggregate description vectors of the form
$$\theta^{(2)} = (\theta^{(2)}_1, \dots, \theta^{(2)}_s), \quad \theta^{(2)}_v = \sum_{i=1}^{n} z_{v,i}, \qquad (5)$$
where $z_{v,i}$ is the $i$-th bit of the $v$-th descriptor. Note that forming aggregate vectors in the form (5) creates independent samples, whose study does not involve a coordinate-wise comparison. In this case, due to the independent nature of the data, it is proposed to check for a significant difference only by statistical methods. The two-sample t-test for the means of two independent samples is introduced here to compare the means of two sets of unordered elements, namely the numbers of ones calculated separately for each binary descriptor of the description, whose order obviously does not matter. The test procedure remains the same as for dependent (paired) samples and differs only in the formula by which the statistic is calculated from the studied data [11].
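The row-wise aggregation of form (5) and the comparison of two independent samples can be sketched as follows. Several choices here are assumptions of the sketch rather than details from the paper: the unequal-variance (Welch) form of the two-sample statistic, the normal approximation of the p-value for large samples, and the randomly generated descriptions used as stand-ins for real descriptor sets.

```python
import math
import random
from statistics import NormalDist, mean, variance


def aggregate_by_rows(description):
    """Count the ones in every descriptor (row) of a binary description."""
    return [sum(descriptor) for descriptor in description]


def two_sample_t_test(x, y):
    """Return (t, p) for two independent samples (Welch form, two-sided).

    The p-value uses a normal approximation, an assumption suited to
    large samples only.
    """
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0:
        raise ValueError("both samples are constant")
    t = (mean(x) - mean(y)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


def random_description(rng, s, n, density):
    """Hypothetical stand-in for a real description: s random n-bit rows."""
    return [[int(rng.random() < density) for _ in range(n)] for _ in range(s)]


rng = random.Random(0)
Z_a = random_description(rng, s=200, n=64, density=0.5)
Z_b = random_description(rng, s=200, n=64, density=0.6)

rows_a = aggregate_by_rows(Z_a)
rows_b = aggregate_by_rows(Z_b)
rows_a_reordered = list(reversed(rows_a))  # same descriptors, different order

_, p_same = two_sample_t_test(rows_a, rows_a_reordered)
_, p_diff = two_sample_t_test(rows_a, rows_b)
print(p_same, p_diff < 0.05)
```

Reordering the descriptors leaves the statistic unchanged, which reflects the remark that the sequence of elements does not matter for independent samples.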
Here it may also be appropriate to perform an analysis of variance of the aggregate vectors (5) built on the references $E_j$, $j = 1, \dots, m$, in order to verify that the set of these vectors differs in a statistically significant way. In this case the aggregate vectors form independent samples, which allows both parametric and nonparametric analysis of variance to be used (for example, the Kruskal-Wallis test) [11].

EXPERIMENTS
Consider an example with experimental descriptions of three fixed reference images E1, E2, E3 and an image E4 obtained from E1 by rotation. Examples of the images, based on the results of software modeling and showing the coordinates of the formed BRISK KP descriptors, are given in Fig. 1 [13,15,21]. Our calculations are performed on the descriptions of these images in the form of descriptor sets.
In the calculation example, n = 512 is the dimension of the descriptor, s = 500 is the number of descriptors in the description, m is the number of reference images or classes (m = 3).
We proceed to the implementation of the proposed classification approach based on the values of the parameters θ in the database of three reference images E1, E2, E3 (Fig. 1) and the image E4, transformed by rotation of E1. Note that the representation using KP descriptors provides invariance to the transformations of displacement, rotation and scale of the analyzed object [2].
Fragments of the calculation results for the aggregate vectors of the form (1) for objects E1, E2, E3, E4 are given in Table 1.
Applying formulas (2) and (3) to the indicators of E4, we see that classifier (2) correctly recognizes the object E4 as a transformed reference image E1, as its similarity with the first reference image is the greatest.
For the hierarchical features obtained by expression (4) with fragment size $b = 2$, the following values of these indicators were obtained: 0.986; 0.934; 0.908.
As can be seen, they differ slightly from the values for the full description ($b = 1$), but the data vector is halved, which further reduces the amount of computation.
The statistically significant proximity of E4 to E1, as well as the statistically significant difference of E4 from the other reference images E2 and E3, is confirmed using a paired two-sample t-test for means applied to the samples represented by aggregate vectors (1); the results are shown in Table 2. In this case, for the hierarchical features obtained by expression (4) [...]. This corresponds to the calculation of hierarchical features when merging pairs of descriptors that appear next to each other in the description. At the same time, the amount of data is also halved.
The P-values obtained for this case were 0.886, 0.034 and 0.0000000007. Compared with the data of Table 4, this indicates the resistance of the considered methods to accidental interference, as the previous classification conclusions are fully confirmed.

RESULTS
The main result of this study is the development of image classification models based on the statistical analysis of component sets in image descriptions and on metric means of class selection. The proposed variants of data analysis models are based on the degree of similarity between the object and the reference images, are workable and provide sufficient classification efficiency. Computational simulation on an example with 3 reference images confirmed the efficiency of the proposed approach, using statistical criteria for a significant difference in the data. Variants of synthesizing a generalized feature system, which further compress the description sets and accelerate the classification procedures, are also analyzed.

DISCUSSION
The synthesis of an aggregate feature system based on the KP descriptor set makes it possible to build a classifier that works successfully on a database of real images.
As can be seen, the first P-value (Table 2) for both options significantly exceeds the significance level α = 0.05 (equal to the probability of a type I error), which indicates the absence of a statistically significant difference between E4 and E1. Note that for a large sample size (in the example we have n = 512), checking the data for compliance with the normal distribution law is not mandatory when using a paired t-test [11].
Note also that a visual comparison of the bar charts, which are a graphical representation of the vectors aggregated by formula (1), clearly confirms the results obtained on the difference between the objects (Fig. 2). Similarly, a visual comparison of the bar charts that graphically represent the vectors aggregated by formula (5) confirms the obtained results (Fig. 3).
The proposed classifier construction method allows further generalization through aggregation over the fragment size, which reduces processing time.

CONCLUSIONS
The urgent problem of mathematical support [...]. Statistical data analysis is a powerful research tool for intelligent decision making, machine learning and data science. The study makes it possible to assess the applied effectiveness of key point descriptors and to build on their basis an aggregate feature system for the effective implementation of a visual object classifier. Our research has shown that the information available in the bit representation of the descriptors is sufficient for the statistical separation of data for different visual objects. Analysis of pairs and other blocks of bits reduces the processing time.
The scientific novelty of the study is the development of a method of image classification based on an integrated statistical features system for structural description, confirmation of the effectiveness of the method and the importance of the created features classification system in the image database.
The practical significance of the work is the confirmation of the efficiency and effectiveness of the proposed methods on examples of real image descriptions.
Prospects for the study are related to the further enhancement and application of the developed classifiers in large-scale visual databases.

ACKNOWLEDGEMENTS
The work was performed within the framework of the state budget research of Kharkiv National University of Radio Electronics "Deep hybrid systems of computational intelligence for data flow analysis and their rapid learning" (№ ДР0119U001403).