TY  - JOUR
AU  - Subbotin, S. A.
PY  - 2017/11/09
Y2  - 2025/07/01
TI  - THE FRACTAL DIMENSION BASED QUALITY METRICS OF DATA SAMPLES AND DEPENDENCE MODELS
JF  - Radio Electronics, Computer Science, Control
JA  - RIC
VL  - 0
IS  - 2
SE  - Neuroinformatics and intelligent systems
DO  - 10.15588/1607-3274-2017-2-8
UR  - https://ric.zp.edu.ua/article/view/112260
SP  - 70
EP  - 81
AB  - Context. The paper addresses the problem of automating the extraction of subsamples from a large original sample for building models from precedents. The object of study is the modeling of sample quality for building precedent-based models. Objective. The goal of the work is to create a set of indicators, sharing a single nature and based on the principles of fractal analysis, for assessing sample quality. Method. A set of indicators is proposed that characterizes the quality of a subsample relative to the original sample from a single point of view, based on the principles of fractal analysis. Methods for evaluating the fractal dimension of a sample are proposed. They operate on rectangular blocks of equal size that cover the feature space. Three methods are given: one that does not take into account the characteristics of the synthesized model, one that takes into account the error (accuracy) of the synthesized model, and one that takes into account both the accuracy and the complexity of the synthesized model. Along with the fractal dimension, a method is provided for determining sample quality indicators based on the principle of mass dimension as applied to data analysis. The proposed method divides the feature space into clusters of the same size and shape. Varying the cluster size yields different levels of sampling detail. The method makes it possible to determine: the masses of the class centers in the sample; the average distance between instances of a cluster; the normalized mean deviation of the distances between instances from their average; the mass and density of the instances of a cluster; the volume and surface area of a rectangular cluster; the ratio of the volume to the surface area of a cluster; the weighted average evenness of instance location in the clusters of a class; the mass and density of the instances of a class; and the weighted average of sample instance location. Results. The developed indicators have been implemented in software and investigated on the Fisher’s Iris classification problem. Conclusions. The conducted experiments have confirmed the operability of the proposed software and allow recommending it for practical use in solving problems of diagnosis and automatic classification by features. Prospects for further research include the creation of parallel methods for calculating the set of proposed indicators, the optimization of their software implementations, and an experimental study of the proposed indicators on more complex practical problems of different nature and dimensionality.
ER  -