DEVELOPMENT OF METHOD TO IDENTIFY THE COMPUTER SYSTEM STATE BASED ON THE «ISOLATION FOREST» ALGORITHM

Authors

  • S. Y. Gavrylenko, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine.
  • I. V. Sheverdin, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine.

DOI:

https://doi.org/10.15588/1607-3274-2021-1-11

Keywords:

computer system, operating system events, abnormal state, identification, machine learning, Isolation Forest algorithm.

Abstract

Context. The problem of identifying the state of a computer system was investigated. The object of the research is the process of identifying the computer system state. The subject of the research is the means and methods of identifying the computer system state.

Objective. The purpose of the work is to develop a method for identifying the computer system state.

Method. A method for identifying a computer system state has been developed. It combines a procedure for grouping unlabeled initial data with machine learning based on the «Isolation Forest» algorithm, which makes it possible to identify the computer system state and to determine the name of the process that initiated the abnormal state. To collect statistical data in the form of operating system events, a data collection method and the corresponding software have been proposed and developed. Analysis of these events showed that read and write operations are the most informative. To form a single dataset, read and write operations are matched with the process name and combined into one array of event groups, so that the process causing the abnormal state of the computer system can be singled out. As a result of the research, the «Isolation Forest» algorithm was selected as a component of the method for identifying the computer system state. The accuracy and efficiency of the developed method have been assessed.
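The grouping and scoring steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the event format (process name, operation), the per-process read/write counts used as features, the process names, and all function names are assumptions made for illustration. The scoring follows the general Isolation Forest idea of Liu, Ting and Zhou (random splits, average path length, score s = 2^(-E[h(x)] / c(n))), written in plain Python so that it runs without third-party libraries.

```python
import math
import random
from collections import defaultdict


def group_events(events):
    """Combine read/write events into one [reads, writes] vector per process."""
    counts = defaultdict(lambda: [0, 0])
    for process, op in events:
        if op == "read":
            counts[process][0] += 1
        elif op == "write":
            counts[process][1] += 1
    names = sorted(counts)
    return names, [counts[n] for n in names]


def c(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    # Only features that still vary within this node can be split on.
    dims = [d for d in range(len(points[0]))
            if min(p[d] for p in points) < max(p[d] for p in points)]
    if not dims:
        return ("leaf", len(points))
    dim = rng.choice(dims)
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit points."""
    if tree[0] == "leaf":
        return depth + c(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=256, seed=0):
    """Return one score in (0, 1) per row; higher means more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two observations")
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    scores = []
    for x in data:
        mean_depth = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / c(psi)))
    return scores


# Hypothetical event log: seven ordinary processes and one heavy reader/writer.
events = []
for i, name in enumerate(["browser", "editor", "indexer", "mailer",
                          "player", "shell", "updater"]):
    events += [(name, "read")] * (20 + 2 * i) + [(name, "write")] * (10 + i)
events += [("suspicious.exe", "read")] * 800 + [("suspicious.exe", "write")] * 900

names, features = group_events(events)
scores = anomaly_scores(features)
suspect = names[scores.index(max(scores))]
print(suspect, round(max(scores), 3))
```

Because each feature vector stays attached to a process name, the highest-scoring row directly names the process that should be inspected. In practice a library implementation such as scikit-learn's `IsolationForest` would normally be preferred over this hand-written version.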

Results. The developed method has been implemented and studied on the problem of identifying anomalies in the functioning of computer systems.

Conclusions. The experiments confirmed the efficiency of the proposed method. This allows us to recommend the method for practical use to improve the efficiency of identifying the computer system state, and to use it as an express method. Further research may address the creation of an ensemble of fuzzy trees based on the proposed method and the optimization of its software implementation.

Author Biographies

S. Y. Gavrylenko, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine.

Dr. Sc., Associate Professor, Professor at the Department of Computer Engineering and Programming.

I. V. Sheverdin, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine.

Post-graduate student at the Department of Computer Engineering and Programming.

Published

2021-03-27

How to Cite

Gavrylenko, S. Y., & Sheverdin, I. V. (2021). DEVELOPMENT OF METHOD TO IDENTIFY THE COMPUTER SYSTEM STATE BASED ON THE «ISOLATION FOREST» ALGORITHM. Radio Electronics, Computer Science, Control, 1(1), 105–116. https://doi.org/10.15588/1607-3274-2021-1-11

Section

Neuroinformatics and intelligent systems