FAST FUZZY CREDIBILISTIC CLUSTERING BASED ON DENSITY PEAKS DISTRIBUTION OF DATA BRIGHTNESS
Keywords: fuzzy clustering, credibilistic clustering, density peak of dataset
Context. The problem of clustering (classification without a teacher) often arises when processing data arrays of various natures and is an integral part of artificial intelligence. Many known methods and algorithms for solving it are based on the distribution density of observations in the analyzed data. However, these methods are rather complicated to implement in software and have drawbacks, namely: difficulty in determining significant clusters in datasets of different densities, multi-epoch self-learning, getting stuck in local extrema of objective functions, etc. Methods based on the analysis of the peaks of the data distribution density are crisp in nature; therefore, to expand their capabilities, it is advisable to introduce a fuzzy modification of them.
Objective. The aim of the work is to introduce a fast fuzzy data clustering method based on the density peaks of the data distribution that can find the prototypes (centroids) of overlapping clusters regardless of the amount of incoming data.
Method. A hybrid method for fuzzy clustering of data arrays is proposed, based on the simultaneous use of a credibilistic approach to fuzzy clustering and an algorithm for finding the peaks of the distribution density of the initial data. The method is computationally simple and fast because the entire array is processed only once, which eliminates the need for the multi-epoch self-learning used in traditional fuzzy clustering algorithms.
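The idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly used density-peaks heuristic (local density rho within a cutoff distance, and distance delta to the nearest denser point) for choosing centroids, and a credibility level of the form Cr_j = 0.5 * (mu*_j + 1 - max over l != j of mu*_l), with mu_j = 1 / (1 + d^2) normalized by its largest value, as is typical in the credibilistic clustering literature. The function names, the cutoff d_c, and the choice of membership function are hypothetical and chosen only for illustration.

```python
import math


def density_peak_centroids(points, d_c, n_clusters):
    """Pick centroids as the points with the largest rho * delta.

    Assumed heuristic: rho is the number of neighbours closer than d_c,
    delta is the distance to the nearest point of higher density.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # The densest point has no denser neighbour: use its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return [points[i] for i in ranked[:n_clusters]]


def credibilistic_memberships(x, centroids):
    """Credibility level of x for each centroid (needs at least two centroids).

    Assumed form: mu_j = 1 / (1 + d^2), normalised by its maximum, then
    Cr_j = 0.5 * (mu*_j + 1 - max_{l != j} mu*_l).
    """
    mu = [1.0 / (1.0 + math.dist(x, c) ** 2) for c in centroids]
    top = max(mu)
    mu_star = [m / top for m in mu]
    cred = []
    for j, m in enumerate(mu_star):
        rival = max(v for l, v in enumerate(mu_star) if l != j)
        cred.append(0.5 * (m + 1.0 - rival))
    return cred


if __name__ == "__main__":
    # Toy data: two small groups of points (illustrative only).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0)]
    centers = density_peak_centroids(data, d_c=0.12, n_clusters=2)
    for x in data:
        print(x, credibilistic_memberships(x, centers))
```

In this sketch the centroids are found in a single pass over the pairwise distances, and the credibility levels are then computed in closed form for each observation, so no iterative re-estimation of the centroids is involved.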
Results. The proposed method of fast fuzzy credibilistic clustering using density peaks of the data distribution is characterized by computational simplicity and high speed: the entire array is processed only once, so the multi-epoch self-learning of traditional fuzzy clustering algorithms is not needed. The results of the computational experiment confirm the effectiveness of the proposed approach in clustering problems where the clusters overlap.
Conclusions. The experimental results allow us to recommend the developed method for automatic clustering and data classification problems in which the cluster centroids must be found as quickly as possible. The proposed method of fast fuzzy credibilistic clustering using density peaks of the data distribution is intended for use in computational intelligence systems, neuro-fuzzy systems, the training of artificial neural networks, and clustering problems.
Copyright (c) 2022 Ye. V. Bodyanskiy, I. P. Pliss, A. Yu. Shafronenko
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.