CREDIBILISTIC ROBUST ONLINE FUZZY CLUSTERING IN DATA STREAM MINING TASKS

Authors

  • A. Yu. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
  • N. V. Kasatkina, National University of Food Technologies, Kyiv, Ukraine
  • Ye. V. Bodyanskiy, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
  • Ye. O. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

DOI:

https://doi.org/10.15588/1607-3274-2023-3-10

Keywords:

fuzzy clustering, distorted data, credibilistic fuzzy clustering, Data Stream Mining, robust function

Abstract

Context. Clustering, that is, classification of data arrays without a teacher, occupies an important place in the general problem of Data Mining, and many approaches, methods, and algorithms currently exist for its solution. In many situations, the real data to be clustered are corrupted by anomalous outliers or by disturbances with non-Gaussian distributions. "Classical" methods of artificial intelligence (both batch and online) are ineffective in such situations. The goal of the paper is to develop a credibilistic robust online fuzzy clustering method that combines the advantages of the credibilistic and robust approaches in fuzzy clustering tasks.

Objective. The goal of the work is online credibilistic fuzzy clustering of distorted data using credibility theory in data stream mining.

Method. A procedure for fuzzy clustering of data using the credibilistic approach is considered. It is based on robust goal functions of a special type that are insensitive to outliers, and it is designed to work both in batch mode and in a recurrent online version intended for Data Stream Mining problems, where data are fed for processing sequentially in real time.

Results. Analysis of the overall clustering accuracy of the compared methods and algorithms shows that the proposed method gives results similar to those of the credibilistic fuzzy clustering method but is faster, regardless of the number of observations fed to the clustering process.

Conclusions. The problem of fuzzy clustering of data streams contaminated by anomalous outliers with non-Gaussian distributions is considered. A recurrent credibilistic online algorithm based on an objective function of a special form is introduced. It suppresses these outliers by using the hyperbolic tangent function, which is used not only in neural networks but also in robust estimation tasks. The proposed algorithm is quite simple in numerical implementation and is a generalization of some well-known online fuzzy clustering procedures intended for solving Data Stream Mining problems.
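
The abstract does not give the recurrences themselves, so the following is only a minimal illustrative sketch of how a tanh-bounded online centroid update might look, not the authors' algorithm. Everything in it is an assumption: the membership 1 / (1 + d²), the credibility formula (one form that appears in the credibilistic clustering literature), the decaying learning rate, and the names `credibilistic_memberships`, `online_step`, `cluster_stream`, `fuzzifier`, and `scale`. The point it illustrates is that passing the residual through the hyperbolic tangent bounds each coordinate's step, so a single extreme observation cannot drag a centroid far.

```python
import math


def credibilistic_memberships(x, centroids):
    """Credibility levels of x for each centroid (assumed form, needs >= 2 centroids)."""
    # Assumed membership function: 1 / (1 + squared Euclidean distance).
    mu = [1.0 / (1.0 + sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
          for w in centroids]
    top = max(mu)
    norm = [m / top for m in mu]  # normalize so the closest centroid gets 1
    cred = []
    for j, m in enumerate(norm):
        # Largest normalized membership among the competing centroids.
        rival = max(v for l, v in enumerate(norm) if l != j)
        cred.append(0.5 * (m + 1.0 - rival))
    return cred


def online_step(x, centroids, rate, fuzzifier=2.0, scale=1.0):
    """Process one observation; return (updated centroids, credibility levels)."""
    cred = credibilistic_memberships(x, centroids)
    updated = []
    for w, c in zip(centroids, cred):
        gain = rate * c ** fuzzifier
        # tanh keeps each coordinate's residual in (-1, 1), so |step| <= gain.
        updated.append([wi + gain * math.tanh((xi - wi) / scale)
                        for xi, wi in zip(x, w)])
    return updated, cred


def cluster_stream(stream, centroids, rate0=0.5):
    """Feed observations one at a time with an assumed rate0 / k learning rate."""
    for k, x in enumerate(stream, start=1):
        centroids, _ = online_step(x, centroids, rate0 / k)
    return centroids
```

For example, calling `cluster_stream([[0.1], [9.9], [1000.0], [-0.1], [10.1]], [[0.0], [10.0]])` processes the observations sequentially. Under these assumptions, the value 1000.0 can shift each centroid by at most the current learning rate, whereas an unbounded residual would move it in proportion to the distance.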

Author Biographies

A. Yu. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

PhD, Associate Professor at the Department of Informatics

N. V. Kasatkina, National University of Food Technologies, Kyiv, Ukraine

Dr. Sc., Professor, Division of doctoral and post-graduate training

Ye. V. Bodyanskiy, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Dr. Sc., Professor, Professor at the Department of Artificial Intelligence

Ye. O. Shafronenko, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Assistant at the Department of Media Engineering and Information Radio Electronic Systems

Published

2023-10-13

How to Cite

Shafronenko, A. Y., Kasatkina, N. V., Bodyanskiy, Y. V., & Shafronenko, Y. O. (2023). CREDIBILISTIC ROBUST ONLINE FUZZY CLUSTERING IN DATA STREAM MINING TASKS. Radio Electronics, Computer Science, Control, (3), 97. https://doi.org/10.15588/1607-3274-2023-3-10

Section

Neuroinformatics and intelligent systems