• S. A. Subbotin



sample, example selection, data reduction, data mining, data dimensionality reduction


Solving data mining problems requires working with large data samples, and processing them takes a significant amount of time. Reducing the dimensionality of data samples is therefore an urgent task. The aim of the paper is to provide a method for sample formation and reduction that makes it possible to handle a large original sample. The problem of sample formation and reduction for data mining is solved. The scientific novelty of the work is that a method of sample formation and reduction is proposed for the first time. It preserves the most important topological properties of the original sample in the formed sub-sample without loading the original sample into computer memory and without multiple passes over the original sample. This reduces both the sample size and the computing resources required. The practical significance of the work lies in the development of software that implements the proposed method, and in experiments that study the method on practical problems; their results allow the method to be recommended for practical use in data mining. With the proposed method, the sample size can be reduced significantly (by a factor of 7.7–12.5) without loading the original sample into computer memory, while the generated sub-sample preserves the topological properties of the original sample that are most important for analysis.



How to Cite

Subbotin, S. A. (2013). SAMPLE FORMATION AND REDUCTION FOR DATA MINING. Radio Electronics, Computer Science, Control, (1).



Neuroinformatics and intelligent systems