USING THE ANALYTIC HIERARCHY PROCESS WITH FUZZY LOGIC ELEMENTS TO OPTIMIZE THE DATABASE STRUCTURE
Keywords: corporate information system, database management system, distributed database, SQL query, data replication, multicriteria problem, analytic hierarchy process, fuzzy logic, classification problem, naive Bayes algorithm
Context. Information systems are ubiquitous and rely on databases to store the information that users need. Many data models are available, but the relational model remains relevant. Over the last decade there has been a tendency to use distributed databases with the relational data model, an approach that requires a specially designed module to synchronize the data of all the separate databases. In work on optimizing database structure, researchers have paid little attention to the potential of the history of users’ SQL queries. An optimal structure for each distributed node could reduce the need for synchronization while keeping data access speed and data currency stable. The object of the research is the process of optimizing the structure of the distributed database of corporate information systems that are based on the relational database model.
Objective. The research aims to improve the accuracy of the value of the data representation marker on a node of a distributed corporate information system (DCIS). This value is obtained using the analytic hierarchy process, and the improvement comes from applying elements of fuzzy logic when processing the global priority vector of the alternatives.
Method. In a series of previous works, the authors emphasized the potential of using the collected history of users’ SQL queries. First, a technology for parsing users’ queries was presented. Then, the idea of using a multidimensional database to analyze users’ queries by slices of workstation type, application, user, and the user’s position was considered. Finally, the authors gave a full-scale mathematical model formalizing the database and query models and the criteria of database structure optimality.
The current research continues this sequence and seeks to increase the efficiency of the decision support system by introducing elements of fuzzy logic into the analytic hierarchy process algorithm. The main idea of the approach is to present the global priority vector as a series of fuzzy sets of one variable, with subsequent transformation to an exact value. This approach made it possible to maintain the accuracy of the obtained result while decreasing the number of solution alternatives. For new tuples added to the database’s tables after all calculations had been performed, the task was formalized as a classification problem. The probability of a tuple belonging to the class “needed” is obtained and normalized, and the result is taken as the level of the representation marker. Accordingly, the data is loaded onto the node if this value is greater than the optimal level of the representation marker for the DCIS node.
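The handling of newly added tuples described above can be illustrated with a minimal sketch. It assumes categorical tuple features, a naive Bayes classifier with Laplace smoothing, and normalization of the posterior over the two classes; the feature values, the function names (train_naive_bayes, marker_level, should_load) and the threshold are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    vocab = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def marker_level(model, row, positive="needed"):
    """Normalized posterior probability that the tuple is in the positive class."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            score *= (value_counts[(i, label)][value] + 1) / (n + len(vocab[i]))
        scores[label] = score
    return scores[positive] / sum(scores.values())


def should_load(model, row, optimal_marker):
    """Load the tuple onto the node only if its marker exceeds the optimum."""
    return marker_level(model, row) > optimal_marker


# Hypothetical history: (department, year) tuples labeled from query statistics.
rows = [("sales", "2022"), ("sales", "2021"), ("hr", "2022"), ("hr", "2021")]
labels = ["needed", "needed", "not needed", "not needed"]
model = train_naive_bayes(rows, labels)
print(should_load(model, ("sales", "2022"), optimal_marker=0.6))
```

In this sketch the optimal marker level is passed in as a plain number; in the approach described above it would come from the analytic hierarchy process calculation for the node.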
Results. After the global priority vector of the alternatives was calculated, the apparatus of fuzzy sets was used to improve the accuracy of the obtained result. The obtained vector of global priorities was presented as a vector of fuzzy numbers for the data representation marker, with subsequent transformation to an exact value. This approach made it possible to maintain the accuracy of the obtained result while decreasing the number of solution alternatives.
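One possible reading of the transformation to an exact value is sketched below. It assumes that each alternative corresponds to a candidate level of the representation marker and that its global priority acts as the membership degree of a fuzzy singleton at that level; the crisp value is then the center of gravity. This interpretation, the function name defuzzify_marker and all numbers are assumptions made for illustration, not the authors' exact procedure.

```python
def defuzzify_marker(levels, priorities):
    """Collapse (level, priority) pairs into one crisp marker value.

    Each candidate marker level is treated as a fuzzy singleton whose
    membership degree is the alternative's global priority (an assumption
    of this sketch); the result is their center of gravity.
    """
    if len(levels) != len(priorities):
        raise ValueError("levels and priorities must have the same length")
    total = sum(priorities)
    if total <= 0:
        raise ValueError("priorities must sum to a positive value")
    return sum(level * p for level, p in zip(levels, priorities)) / total


# Hypothetical example: three coarse alternatives instead of a fine grid.
levels = [0.25, 0.50, 0.75]     # candidate marker levels (assumed)
priorities = [0.2, 0.5, 0.3]    # global priorities (made-up numbers)
marker = defuzzify_marker(levels, priorities)
print(round(marker, 3))
```

The crisp value can fall between the discrete alternatives, which is how a small set of alternatives can still yield a finely resolved marker level.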
Conclusions. In the course of the research, the concept of a data representation marker on the DCIS node was introduced for the elements of the SQL query model. An aggregation function has been developed that determines the level of need for attributes and tuples of a database relation on the DCIS node based on SQL query statistics. A model of the dependence of the database structure optimality criteria on the value of the data representation marker has been built. The analytic hierarchy process method has been further developed. The pairwise comparison matrix of the alternatives can be initialized automatically according to the obtained mathematical models. Representing the obtained result as a vector of fuzzy numbers with reduction to an exact value makes it possible to increase the accuracy of the obtained results.
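As a rough illustration of automatic initialization of the pairwise comparison matrix, the sketch below fills the matrix with ratios of per-alternative scores and derives priorities with the row geometric mean, a common approximation to the principal eigenvector in the analytic hierarchy process. The ratio rule, the function names and the numbers are assumptions for illustration; the paper's own mathematical models are not reproduced here.

```python
import math


def pairwise_matrix(scores):
    """Build a consistent pairwise comparison matrix from positive scores."""
    return [[a / b for b in scores] for a in scores]


def local_priorities(matrix):
    """Approximate the priority vector by normalized row geometric means."""
    means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(means)
    return [m / total for m in means]


def global_priorities(criteria_weights, local_by_criterion):
    """Weight each criterion's local priorities and sum per alternative."""
    n = len(local_by_criterion[0])
    return [
        sum(w * local[i] for w, local in zip(criteria_weights, local_by_criterion))
        for i in range(n)
    ]


# Hypothetical scores of three alternatives under two criteria.
by_criterion = [
    local_priorities(pairwise_matrix([1.0, 2.0, 1.0])),
    local_priorities(pairwise_matrix([2.0, 1.0, 1.0])),
]
result = global_priorities([0.6, 0.4], by_criterion)
print([round(p, 3) for p in result])
```

Because the matrix is built from ratios, it is perfectly consistent by construction, so no consistency check is needed in this sketch; a matrix filled in by expert judgment would normally require one.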
Copyright (c) 2022 M. L. Dvoretskyi, T. O. Savchuk, M. T. Fisun, S. V. Dvoretska
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.