FAST FUZZY CREDIBILISTIC CLUSTERING BASED ON THE ANALYSIS OF DATA DISTRIBUTION DENSITY PEAKS

Ye. V. Bodyanskiy; I. P. Pliss; A. Yu. Shafronenko

doi:10.15588/1607-3274-2022-1-9

Authors

Ye. V. Bodyanskiy, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
I. P. Pliss, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
A. Yu. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Keywords:

fuzzy clustering, credibilistic clustering, density peak of dataset

Abstract

Context. The clustering problem (classification without a teacher) often arises when processing data arrays of various natures and is an integral part of artificial intelligence. Many known methods and algorithms for solving it are based on the distribution density of observations in the analyzed data. However, these methods are rather complicated to implement in software and have drawbacks, namely: difficulty in determining significant clusters in datasets of varying density, multi-epoch self-learning, getting stuck in local extrema of objective functions, etc. Methods based on the analysis of peaks of the data distribution density are crisp in nature; therefore, to expand their capabilities, it is advisable to introduce a fuzzy modification.

Objective. The aim of the work is to introduce a fast fuzzy data clustering method based on the density peaks of the data distribution that can find the prototypes (centroids) of overlapping clusters regardless of the amount of incoming data.

Method. A hybrid method for fuzzy clustering of data arrays is proposed, based on the simultaneous use of a credibilistic approach to fuzzy clustering and an algorithm for finding the peaks of the distribution density of the initial data. A feature of the proposed method is its computational simplicity and high speed: the entire array is processed only once, which eliminates the need for the multi-epoch self-learning used in traditional fuzzy clustering algorithms.
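The abstract does not give the formulas, so the following is only a minimal sketch of how such a hybrid could look, not the authors' implementation. It assumes a density-peaks heuristic in the spirit of Rodriguez and Laio for choosing centroids (local density rho, distance delta to the nearest denser point, ranking by rho * delta) and a credibility level of the form Cr_j = 0.5 * (mu*_j + 1 - max over l != j of mu*_l), with mu_j = 1 / (1 + d^2) normalized by its maximum, as used in the credibilistic clustering literature. The function names, the Gaussian kernel, and the `cutoff` parameter are hypothetical choices made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def density_peak_centroids(points, n_clusters, cutoff):
    """Pick centroids as the points with the largest rho * delta.

    rho   : Gaussian-kernel local density (assumed kernel, width = cutoff)
    delta : distance to the nearest point of strictly higher density;
            for a point with no denser neighbour, its largest distance
    """
    n = len(points)
    d = [[math.sqrt(sq_dist(points[i], points[j])) for j in range(n)]
         for i in range(n)]
    rho = [sum(math.exp(-(d[i][j] / cutoff) ** 2)
               for j in range(n) if j != i)
           for i in range(n)]
    delta = []
    for i in range(n):
        denser = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(d[i]))
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return [points[i] for i in ranked[:n_clusters]]


def credibilistic_memberships(x, centroids):
    """Credibility levels of observation x for each centroid.

    Requires at least two centroids. Computed in one pass over the
    centroids, with no iterative re-estimation of the prototypes.
    """
    mu = [1.0 / (1.0 + sq_dist(x, c)) for c in centroids]
    top = max(mu)
    mu_star = [m / top for m in mu]  # normalized so the best match is 1
    cred = []
    for j, m in enumerate(mu_star):
        best_other = max(v for l, v in enumerate(mu_star) if l != j)
        cred.append(0.5 * (m + 1.0 - best_other))
    return cred
```

In this sketch the data are scanned once to locate the density peaks and once more to assign credibility levels, which mirrors the single-pass idea described above. The pairwise distance matrix makes the centroid search quadratic in the number of observations, so a practical version would need a cheaper density estimate for large arrays.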

Results. The proposed method of fast fuzzy credibilistic clustering based on density peaks of the data distribution is characterized by computational simplicity and high speed because the entire array is processed only once, which eliminates the multi-epoch self-learning used in traditional fuzzy clustering algorithms. The results of the computational experiment confirm the effectiveness of the proposed approach in clustering problems where the clusters overlap.

Conclusions. The experimental results allow us to recommend the developed method for automatic clustering and classification of data when the cluster centroids must be found as quickly as possible. The proposed method of fast fuzzy credibilistic clustering based on density peaks of the data distribution is intended for use in computational intelligence systems, neuro-fuzzy systems, the training of artificial neural networks, and clustering problems.

Author Biographies

Ye. V. Bodyanskiy, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Dr. Sc., Professor at the Department of Artificial Intelligence

I. P. Pliss, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

PhD, Leading Researcher at Control Systems Research Laboratory

A. Yu. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

PhD, Associate Professor at the Department of Informatics

