DEVELOPMENT OF A METHOD FOR IDENTIFYING THE COMPUTER SYSTEM STATE BASED ON A DECISION TREE WITH MULTI-DIMENSIONAL NODES
Keywords: computer system, abnormal state, identification, decision tree, clustering, DBSCAN algorithm, hypersphere
Context. The problem of identifying the state of a computer system is considered. The object of the research is the process of computer system state identification. The subject of the research is the methods of constructing solutions for computer system state identification.
Objective. The purpose of the work is to develop a method for learning decision trees for computer system state identification.
Method. A new method for constructing a decision tree is proposed that combines the classical decision tree construction model with density-based spatial clustering (DBSCAN). Simulation results showed that the proposed method reduces the number of branches in the decision tree, which increases the efficiency of identifying the state of the computer system. Membership in hyperspheres is used as the decision criterion; because the resulting partition boundary is nonlinear, this increases identification accuracy and allows the classifier to be tuned more precisely. The method is especially effective when the input features are highly correlated, since it combines them into one or more multivariate criteria. The accuracy and efficiency of the developed method for identifying the state of a computer system are assessed.
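The idea of a multi-dimensional node can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how hyperspheres are derived from clusters, so the sketch assumes each DBSCAN cluster is enclosed by a sphere centered at the cluster centroid with radius equal to the largest centroid-to-point distance. The two features (read them as, say, normalized CPU load and memory usage), the sample values, and all function names are hypothetical.

```python
import math
from collections import Counter

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def fit_hypersphere(cluster_points):
    """Assumed construction: centroid plus maximum distance to a member."""
    n, dim = len(cluster_points), len(cluster_points[0])
    center = tuple(sum(p[k] for p in cluster_points) / n for k in range(dim))
    radius = max(math.dist(center, p) for p in cluster_points)
    return center, radius

def build_node(points, classes, eps, min_pts):
    """One multi-dimensional node: a list of (center, radius, majority class)."""
    labels = dbscan(points, eps, min_pts)
    node = []
    for c in sorted(set(labels) - {-1}):
        members = [k for k, lab in enumerate(labels) if lab == c]
        center, radius = fit_hypersphere([points[k] for k in members])
        majority = Counter(classes[k] for k in members).most_common(1)[0][0]
        node.append((center, radius, majority))
    return node, labels

def classify(node, point, default="undecided"):
    """Decision criterion: membership of the point in one of the hyperspheres."""
    for center, radius, state in node:
        if math.dist(point, center) <= radius:
            return state
    return default  # in a full tree this would pass to a further node

# Hypothetical training data: two correlated features per observation.
points = [(0.20, 0.22), (0.22, 0.20), (0.18, 0.21), (0.21, 0.19),
          (0.80, 0.85), (0.82, 0.88), (0.79, 0.86), (0.81, 0.84),
          (0.50, 0.10)]
classes = ["normal"] * 4 + ["abnormal"] * 4 + ["normal"]

node, labels = build_node(points, classes, eps=0.1, min_pts=3)
print(labels)
print(classify(node, (0.20, 0.20)), classify(node, (0.80, 0.86)),
      classify(node, (0.50, 0.50)))
```

A single test on sphere membership replaces what would otherwise be several axis-aligned threshold splits on the two correlated features, which is one way to read the claim that the method reduces the number of branches.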
Results. The developed method is implemented in software and evaluated on the problem of identifying the functional state of a computer system.
Conclusions. The experiments confirmed the efficiency of the proposed method, which makes it possible to recommend it for practical use to improve the accuracy of identifying the state of a computer system. Further research may focus on developing an ensemble of decision trees.
Copyright (c) 2022 С. Ю. Гавриленко, В. В. Челак, С. Г. Семенов
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.