THE NON-LINEAR REGRESSION MODEL TO ESTIMATE THE SOFTWARE SIZE OF OPEN SOURCE JAVA-BASED SYSTEMS

N. V. Prykhodko, S. B. Prykhodko

Abstract


Context. Estimating software size at an early stage of a software project is important because the size estimate is used to
predict software development effort, including for open-source Java-based information systems. The object of the study is
the process of estimating the software size of open-source Java-based information systems. The subject of the study is
regression models for estimating the software size of open-source Java-based information systems.
Objective. The goal of the work is to create a non-linear regression model for estimating the software size of open-source
Java-based information systems on the basis of the Johnson multivariate normalizing transformation.
Method. The model and the confidence and prediction intervals of multiple non-linear regression for estimating the software size
of open-source Java-based information systems are constructed on the basis of the Johnson multivariate normalizing transformation
for non-Gaussian data. The appropriate techniques for building the models, equations, and confidence and prediction intervals of
non-linear regressions are considered; they are based on multiple non-linear regression analysis using multivariate normalizing
transformations. These techniques make it possible to take into account the correlation between random variables when normalizing
multivariate non-Gaussian data. In general, this reduces the mean magnitude of relative error and the widths of the confidence and
prediction intervals in comparison with linear models or non-linear models constructed using univariate normalizing transformations.
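The general idea of regression through a normalizing transformation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Johnson SB parameters (gamma, eta, phi, lam) are taken as already known, whereas the paper's approach estimates them jointly for the multivariate case, and the function names are hypothetical. The sketch normalizes each variable, fits a linear regression in the normalized space, and maps the prediction back with the inverse transformation, which is what makes the resulting model non-linear in the original variables.

```python
import numpy as np

def johnson_sb(x, gamma, eta, phi, lam):
    """Johnson S_B transformation: z = gamma + eta * ln((x - phi) / (phi + lam - x))."""
    x = np.asarray(x, dtype=float)
    return gamma + eta * np.log((x - phi) / (phi + lam - x))

def johnson_sb_inverse(z, gamma, eta, phi, lam):
    """Inverse transformation: x = phi + lam / (1 + exp(-(z - gamma) / eta))."""
    z = np.asarray(z, dtype=float)
    return phi + lam / (1.0 + np.exp(-(z - gamma) / eta))

def normalize_predictors(X, x_params):
    """Apply the S_B transformation to each predictor column with its own parameters."""
    X = np.asarray(X, dtype=float)
    return np.column_stack(
        [johnson_sb(X[:, j], *params) for j, params in enumerate(x_params)]
    )

def fit_normalized_regression(X, y, x_params, y_params):
    """Least-squares fit of a linear model between the normalized variables."""
    Zx = normalize_predictors(X, x_params)
    zy = johnson_sb(y, *y_params)
    A = np.column_stack([np.ones(len(zy)), Zx])  # intercept column plus predictors
    coef, *_ = np.linalg.lstsq(A, zy, rcond=None)
    return coef

def predict_size(X, coef, x_params, y_params):
    """Predict in the normalized space, then map back to the original scale."""
    Zx = normalize_predictors(X, x_params)
    zy_hat = coef[0] + Zx @ coef[1:]
    return johnson_sb_inverse(zy_hat, *y_params)
```

Because the inverse SB transformation is bounded, every prediction from this sketch lies strictly between phi and phi + lam of the response variable.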
Results. The constructed model has been compared with the linear model and with non-linear regression models based on the decimal
logarithm and the Johnson univariate transformation.
Conclusions. A non-linear regression model for estimating the software size of open-source Java-based information systems has been
constructed on the basis of the Johnson multivariate transformation for the SB family. In comparison with the other regression
models (both linear and non-linear), this model has a larger multiple coefficient of determination, a larger percentage of
prediction, and a smaller mean magnitude of relative error. Prospects for further research may include applying other
multivariate normalizing transformations and data sets to construct the non-linear regression model for estimating the software size
of open-source Java-based information systems.
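The three quality measures named above can be computed as in the following sketch, which uses their conventional definitions. The abstract does not state the level used for the percentage of prediction, so the default of 0.25 is an assumption (a common convention in the estimation literature), and the function names are illustrative only.

```python
import numpy as np

def magnitude_of_relative_error(y_true, y_pred):
    """MRE_i = |y_i - yhat_i| / y_i for each observation."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred) / y_true

def mmre(y_true, y_pred):
    """Mean magnitude of relative error; smaller is better."""
    return float(np.mean(magnitude_of_relative_error(y_true, y_pred)))

def pred(y_true, y_pred, level=0.25):
    """Share of observations whose MRE does not exceed `level` (assumed 0.25)."""
    return float(np.mean(magnitude_of_relative_error(y_true, y_pred) <= level))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Applying the same three functions to the predictions of each candidate model on the same data gives a like-for-like comparison of the kind described in the Results.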

Keywords


software size estimation; Java-based information system; non-linear regression model; univariate normalizing transformation; non-Gaussian data.

Copyright (c) 2018 N. V. Prykhodko, S. B. Prykhodko

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Address of the journal editorial office:
Editorial office of the journal «Radio Electronics, Computer Science, Control»,
Zaporizhzhya National Technical University, 
Zhukovskiy street, 64, Zaporizhzhya, 69063, Ukraine. 
Telephone: +38-061-769-82-96 – the Editing and Publishing Department.
E-mail: rvv@zntu.edu.ua

The reference to the journal is obligatory in the cases of complete or partial use of its materials.