TY - JOUR
AU - Subbotin, S. A.
PY - 2020/12/22
Y2 - 2022/09/27
TI - THE POLAR COORDINATES BASED HASHING FOR DATA DIMENSIONALITY REDUCTION
JF - Radio Electronics, Computer Science, Control
JA - RIC
VL - 0
IS - 4
SE - Neuroinformatics and intelligent systems
DO - 10.15588/1607-3274-2020-4-12
UR - http://ric.zntu.edu.ua/article/view/218598
SP - 118
EP - 128
AB - <p>Context. When hashing is used to reduce data dimensionality in recognition and diagnostics problems, the time spent on generating a hashing transformation needs to be reduced.</p><p>Objective. The purpose of the work is to reduce the time spent on data dimensionality reduction by creating a hashing method that does not require solving the optimization problem of finding the best random transformation, and that also reduces the loss of local properties of the feature space.</p><p>Method. A hash generation method is proposed. It converts the instance coordinates from the original feature system into a multidimensional polar coordinate system, discretizes the polar coordinates using heuristics, encodes and combines the values of the discretized polar coordinates in various ways to form hashes of instances, and selects as the resulting transformation the one that is best under a given system of criteria, which are based on minimizing the number of collisions in which instances of different classes and with different values of the original features receive the same hash. This makes it possible to automate the formation of hashing transformations and to eliminate the need to solve optimization problems of enumerating random projections, which reduces time consumption. It also makes the hashing transformation freer from imposing on the data a partitioning of the feature space that is not inherent to them, which makes it possible to increase the generalizing properties and accuracy of transformations. Criteria for evaluating the quality of hashing transformations are proposed, including determining the number of positive and negative collisions, as well as estimating the probabilities of the corresponding collisions on their basis. This makes it possible to automate the analysis and selection of hashing transformations to reduce the dimension of the data in recognition and diagnosis problems.</p><p>Results. An experimental study has been carried out, which has confirmed the efficiency of the proposed methods in solving practical problems.</p><p>Conclusions. The developed mathematical support can be recommended for solving problems of data dimension reduction.</p>
ER -