A NONLINEAR REGRESSION MODEL FOR EARLY LOC ESTIMATION OF OPEN-SOURCE KOTLIN-BASED APPLICATIONS

Authors

  • S. B. Prykhodko, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine
  • N. V. Prykhodko, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine
  • A. V. Koltsov, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine

DOI:

https://doi.org/10.15588/1607-3274-2024-1-8

Keywords:

estimation, lines of code, open-source app, Kotlin, nonlinear regression model, Box-Cox transformation, class, weighted methods per class, depth of inheritance tree

Abstract

Context. Early estimation of lines of code (LOC) is important in software projects because it directly influences the prediction of development effort. This holds across programming languages, and for open-source Kotlin-based applications in particular. The object of the study is the process of early LOC estimation of open-source Kotlin-based apps. The subject of the study is nonlinear regression models for early LOC estimation of open-source Kotlin-based apps.

Objective. The goal of the work is to build a nonlinear regression model with three predictors for early LOC estimation of open-source Kotlin-based apps, based on the Box-Cox four-variate normalizing transformation, in order to increase confidence in early LOC estimation for these apps.

Method. For early LOC estimation in open-source Kotlin-based apps, the nonlinear regression model and its confidence and prediction intervals were constructed using the Box-Cox four-variate normalizing transformation and specialized techniques. These techniques, which rely on multiple nonlinear regression analysis with multivariate normalizing transformations, account for the dependencies between variables in non-Gaussian data. As a result, the method tends to reduce the mean magnitude of relative error (MMRE) and to narrow the confidence and prediction intervals compared to models that use univariate normalizing transformations.
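To illustrate the general idea of regression through a normalizing transformation, the following is a minimal sketch and not the authors' implementation. It applies a component-wise Box-Cox transform to each predictor, evaluates a linear model in the normalized space, and maps the result back to LOC with the inverse transform, which makes the overall model nonlinear. The function names (`box_cox`, `inv_box_cox`, `predict_loc`) and all parameter values are hypothetical. In the four-variate approach the transformation parameters would be estimated jointly for all four variables; that estimation step is not shown here.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if abs(lam) < 1e-12:
        # Limiting case lam -> 0 is the natural logarithm.
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inv_box_cox(z, lam):
    """Inverse Box-Cox transform, mapping a normalized value back."""
    if abs(lam) < 1e-12:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("value lies outside the range of the transform")
    return base ** (1.0 / lam)


def predict_loc(predictors, lambdas_x, lambda_y, intercept, slopes):
    """Predict LOC from predictors via a linear model in normalized space.

    All coefficients and lambdas are assumed to be given (hypothetical
    inputs); fitting them is outside the scope of this sketch.
    """
    z = [box_cox(x, lam) for x, lam in zip(predictors, lambdas_x)]
    z_y = intercept + sum(b * zi for b, zi in zip(slopes, z))
    return inv_box_cox(z_y, lambda_y)
```

With every lambda set to 0 the sketch reduces to a power-law model, that is, a linear regression on natural logarithms; the decimal-logarithm model used for comparison has the same form up to a change of base.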

Results. The constructed model was compared with nonlinear regression models based on the decimal logarithm and the Box-Cox univariate transformation.

Conclusions. A nonlinear regression model with three predictors for early LOC estimation of open-source Kotlin-based apps has been constructed using the Box-Cox four-variate transformation. Compared to the other nonlinear regression models, this model has a larger multiple coefficient of determination, a smaller MMRE, and narrower confidence and prediction intervals. Further research may apply other data sets to construct nonlinear regression models for early LOC estimation of open-source Kotlin-based apps under other restrictions on predictors.
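The two accuracy measures named above can be computed as in the following sketch, which uses their standard textbook definitions. The function names are hypothetical, and the exact variant of the coefficient of determination used in the paper is an assumption.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error: mean of |actual - predicted| / actual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A lower MMRE and a higher coefficient of determination both indicate a closer fit between estimated and actual LOC.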

Author Biographies

S. B. Prykhodko, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine

Dr. Sc., Professor, Head of the Department of Software for Automated Systems

N. V. Prykhodko, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine

PhD, Associate Professor, Associate Professor of the Finance Department

A. V. Koltsov, Admiral Makarov National University of Shipbuilding, Mykolaiv, Ukraine

Post-graduate student of the Department of Software for Automated Systems

Published

2024-04-02

How to Cite

Prykhodko, S. B., Prykhodko, N. V., & Koltsov, A. V. (2024). A NONLINEAR REGRESSION MODEL FOR EARLY LOC ESTIMATION OF OPEN-SOURCE KOTLIN-BASED APPLICATIONS. Radio Electronics, Computer Science, Control, (1), 85. https://doi.org/10.15588/1607-3274-2024-1-8

Section

Mathematical and computer modelling