ENSEMBLE OF ADAPTIVE PREDICTORS FOR MULTIVARIATE NONSTATIONARY SEQUENCES AND ITS ONLINE LEARNING

Context. In this research, we explore an ensemble of metamodels that utilizes multivariate signals to generate forecasts. The ensemble includes various traditional forecasting models such as multivariate regression, exponential smoothing, ARIMAX

$\hat{x}_j(\tau)$ - estimate obtained at the output of member $MP_j$ of the ensemble; $\hat{x}^*(\tau)$ - combined forecast of the metamodel at time $\tau$; $c$ - metamodel parameters, a vector of estimates forming the combined forecast; $\lambda$ - Lagrange multiplier used in optimization; $D(T)$ - matrix used for estimating metamodel parameters; $d(T)$ - vector that incorporates estimates at the previous time step; regularization parameter - positive parameter that ensures the method's operation for nonstationary data; $s$ - size of the "sliding window", determining the number of recent observations considered in the estimation.

INTRODUCTION
Forecasting multivariate nonstationary signals is a relevant and challenging problem in various domains. To achieve reliable and accurate results, different forecasting models such as ARIMAX, LSTM, SVM, and many others are used.
In this work, we consider the ensemble of metamodels method for forecasting, which is based on combining forecasts from different forecasting models. The metamodel merges information from various models to improve forecasting accuracy and make the results more robust.
The object of study is an ensemble of multivariate predictors used for forecasting multivariate signals.
The subject of study is the ensemble of metamodels method for combining forecasts from different forecasting models to improve forecasting accuracy on nonstationary signals.
The purpose of the work is to develop and evaluate an effective method, based on an ensemble of metamodels, for forecasting multivariate nonstationary signals. We aim to investigate how combining forecasts from individual models can enhance the quality of forecasting and provide more reliable results.
Forecasting tasks based on multivariate nonstationary signals find broad applications in various fields, including finance, economics, medicine, and engineering. An efficient ensemble of metamodels can become a powerful tool for addressing these tasks and ensuring accurate and reliable forecasts.
The estimates that appear at the outputs of the ensemble members, denoted $\hat{x}_j(\tau)$, are input to the metamodel, which forms the combined forecast $\hat{x}^*(\tau) = \hat{X}(\tau)c$, where $\hat{X}(\tau) = (\hat{x}_1(\tau), \hat{x}_2(\tau), \dots)$ is the matrix formed by the signals at the outputs of the individual models, and the metamodel parameters satisfy the condition of unbiasedness $E^T c = 1$, where $E$ is a vector formed by ones. To solve this problem, the method of Lagrange multipliers is used, leading to an estimate of the metamodel parameters $c$ defined in a recursive form. The case where estimation is carried out on a "sliding window" of size $s$ is also considered, so that only the last $s$ observations from the training dataset are taken into account. To choose the best metamodel, a second-level metamodel is introduced, which processes the outputs of the first-level metamodels using a meta-algorithm.
Thus, the formal mathematical formulation of the problem involves defining an ensemble of predictors, computing estimates for each member of the ensemble, constructing a combined forecast $\hat{x}^*(\tau)$, determining the parameters of the metamodel $c$ using the method of Lagrange multipliers, and supporting different "sliding window" sizes and second-level metamodels for selecting the optimal solution.
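The combination step described above can be illustrated with a minimal Python sketch. The function name `combine_forecasts` and the numerical values are hypothetical and serve only as an illustration; the sketch assumes that the combined forecast is a weighted sum of the member forecasts with weights that sum to one.

```python
def combine_forecasts(member_forecasts, weights):
    """Return x* = sum_j c_j * x_j for vector forecasts x_j (hypothetical helper)."""
    # Unbiasedness condition: the weights must sum to one.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one (unbiasedness condition)")
    n = len(member_forecasts[0])
    combined = [0.0] * n
    for c_j, x_j in zip(weights, member_forecasts):
        for i in range(n):
            combined[i] += c_j * x_j[i]
    return combined


# Hypothetical outputs of three ensemble members for a two-dimensional signal.
forecasts = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
weights = [0.5, 0.25, 0.25]
combined = combine_forecasts(forecasts, weights)

# When all members agree, any unbiased weighting returns their common forecast.
agreed = combine_forecasts([[3.0], [3.0]], [0.7, 0.3])
```

The second call shows why the constraint is called unbiasedness: if every member outputs the same value, the combination reproduces it regardless of how the weights are split.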

REVIEW OF THE LITERATURE
The ensemble approach has gained the most popularity in classification tasks, such as image recognition, where the AdaBoost algorithm and its various modifications [1-8] are widely used. The underlying idea of this algorithm is stacked generalization, where the results of each member of the ensemble (stack) are combined within a metamodel, whose parameters are tuned using metalearning procedures. Typically, this involves weighted averaging, where each member of the ensemble (committee) is assigned a weight obtained through optimization of the adopted learning criterion.
The foundation of AdaBoost lies in the ideas of Bayesian estimation, logistic regression, and support vector machines. Interestingly, these ideas also form the basis of several artificial neural networks, where ensemble approaches [9-11] are also utilized to obtain optimal forecasts. In this case, weights for each member of the ensemble are estimated using an optimization procedure implemented in batch mode, which makes the known approaches practically unusable for Data Stream Mining tasks. Recurrent procedures for metamodel parameter tuning were introduced in [12,13], generalizing the output signals of predictor neural networks based on the optimization of the standard least squares criterion under certain constraints. Although these procedures are designed for online evaluation, they are not adapted to work with nonstationary time series, where parameters change unpredictably at any moment.
Therefore, it is worthwhile to introduce adaptive recurrent metalearning procedures for a generalizing metamodel that combines the output signals of a neural predictor ensemble, each of which can have its own architecture and its own algorithm for tuning-learning its synaptic weights.

MATERIALS AND METHODS
Metamodel parameters (the vector of estimates $c$) can be determined using the classical method of Lagrange multipliers, for which a Lagrange function is introduced. Its expression involves the identity matrix, the tensor product $\otimes$, the matrix trace $\mathrm{Sp}(\cdot)$, and the Lagrange multiplier $\lambda$.
Solving the Kuhn-Tucker system of equations leads to the estimate given in [12], which is expressed through the regular estimate of the standard least squares method computed over $T$ observations.
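The structure of such an estimate can be sketched with a short derivation. This is an illustration under stated assumptions rather than the exact expressions of [12]: it assumes a combined forecast $\hat{X}(\tau)c$, where $\hat{X}(\tau)$ collects the member forecasts column by column, a vector of ones $E$, a summed squared-error criterion, and the identifications $D(T) = \bigl(\sum_{\tau}\hat{X}^{T}(\tau)\hat{X}(\tau)\bigr)^{-1}$ and $d(T) = \sum_{\tau}\hat{X}^{T}(\tau)x(\tau)$, which are our reading of the nomenclature.

```latex
\begin{aligned}
L(c,\lambda) &= \sum_{\tau=1}^{T}\bigl\|x(\tau)-\hat{X}(\tau)c\bigr\|^{2}
               +\lambda\,\bigl(E^{T}c-1\bigr),\\
\nabla_{c}L &= -2\,d(T)+2\,D^{-1}(T)\,c+\lambda E=0,
\qquad \frac{\partial L}{\partial\lambda}=E^{T}c-1=0,\\
c^{LS}(T) &= D(T)\,d(T),\\
c(T) &= c^{LS}(T)+D(T)E\,\frac{1-E^{T}c^{LS}(T)}{E^{T}D(T)E}.
\end{aligned}
```

Under these assumptions the constrained estimate is the ordinary least squares estimate plus a correction along $D(T)E$ that vanishes whenever the least squares weights already sum to one.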
In [13], the optimality of this estimate is proven over the entire training sample, meaning that the output of the metamodel, $\hat{x}^*(\tau)$, is no less accurate than the output of any individual ensemble model. Expressions (1) and (2) can be easily rewritten in a recursive form (3) similar to the recursive least squares method.
The use of the least squares criterion is associated with the assumption of stationarity in the processed sequences, as all observations from $x(1)$ to $x(T)$ are assigned equal weights. Since we assume nonstationarity of the controlled signals, including abrupt changes in the forecasting model, estimates based on the least squares method turn out to be inefficient. In such situations, more suitable predictors are those synthesized using "sliding window" estimation procedures that consider not the entire training sample but only the last $s$ (window size) observations, from $x(T-s+1)$ to $x(T)$; the corresponding procedure is (4).
An interesting situation arises when the estimation is performed with $s=1$, meaning that the optimization (learning) criterion is the squared estimation error at the last observed time step.
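The effect of the window can be illustrated with a minimal Python sketch. It is not the recursive procedure of [12,13]: it recomputes a constrained least squares estimate in batch form over the last `s` observations, and the names `solve_linear`, `window_weights`, and `ridge`, as well as the synthetic data, are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def window_weights(targets, member_forecasts, s, ridge=1e-8):
    """Constrained least squares weights over the last s observations.

    targets[t] is the observed vector x(t); member_forecasts[t][j] is the
    forecast vector of member j at time t. ridge is a small assumed
    regularizer that keeps the normal matrix invertible.
    """
    q = len(member_forecasts[0])
    gram = [[ridge if j == k else 0.0 for k in range(q)] for j in range(q)]
    cross = [0.0] * q
    for x, preds in zip(targets[-s:], member_forecasts[-s:]):
        for j in range(q):
            cross[j] += sum(p * x_i for p, x_i in zip(preds[j], x))
            for k in range(q):
                gram[j][k] += sum(a * b for a, b in zip(preds[j], preds[k]))
    c_ls = solve_linear(gram, cross)      # unconstrained least squares estimate
    d_e = solve_linear(gram, [1.0] * q)   # inverse normal matrix applied to ones
    correction = (1.0 - sum(c_ls)) / sum(d_e)
    return [c + correction * de for c, de in zip(c_ls, d_e)]


# Synthetic stream: member 0 is exact for t < 20, member 1 is exact afterwards.
targets, forecasts = [], []
for t in range(40):
    x = [float(t), -float(t)]
    shifted = [x[0] + 3.0, x[1] + 3.0]
    forecasts.append([x, shifted] if t < 20 else [shifted, x])
    targets.append(x)

w_recent = window_weights(targets, forecasts, s=5)
w_full = window_weights(targets, forecasts, s=len(targets))
```

On this synthetic stream the full-sample estimate splits the weight evenly between the two members, while the window of five observations assigns essentially all of the weight to the member that is accurate after the change.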
In this case, procedures (1), (2), and (4) take on a simple form that generalizes, for the case under consideration, the adaptive identification algorithm of Kaczmarz-Widrow-Hoff; a positive regularization parameter ensures that the inversion required to calculate $D(T+1)$ is possible. The most challenging issue here remains the choice of the window size $s$, which is usually made on purely empirical grounds, since the nature of possible changes in the controlled signal $x(\tau)$ is unknown a priori. It is therefore advisable to use not a single metamodel but a set of such structures built with different "sliding window" sizes. To select the best metamodel from such a set, it is appropriate to introduce second-level metamodels that process the outputs of first-level metamodels using the meta-algorithm (3), covering the entire training sample.
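A one-step update of this kind can be sketched in Python as follows. This is not the procedure of the paper: it is the classical normalized Kaczmarz/Widrow-Hoff step, with a regularizer `eta` in the denominator, followed by a simple projection back onto the constraint that the weights sum to one. The name `kwh_step`, the value of `eta`, and the synthetic data are assumptions made for illustration.

```python
def kwh_step(weights, member_forecasts, target, eta=1e-3):
    """One normalized gradient step on the latest squared error (illustrative)."""
    q = len(weights)
    n = len(target)
    combined = [sum(weights[j] * member_forecasts[j][i] for j in range(q))
                for i in range(n)]
    error = [target[i] - combined[i] for i in range(n)]
    # Gradient direction: each member's forecast correlated with the error.
    grad = [sum(member_forecasts[j][i] * error[i] for i in range(n))
            for j in range(q)]
    # eta > 0 keeps the normalization well defined even for zero forecasts.
    norm = eta + sum(v * v for f in member_forecasts for v in f)
    updated = [w + g / norm for w, g in zip(weights, grad)]
    # Project back onto the hyperplane sum(weights) = 1.
    shift = (1.0 - sum(updated)) / q
    return [w + shift for w in updated]


# Synthetic stream: member 0 is biased by 0.5, member 1 is exact.
weights = [0.5, 0.5]
for t in range(200):
    x = [1.0, -1.0] if t % 2 == 0 else [-1.0, 1.0]
    biased = [x[0] + 0.5, x[1] + 0.5]
    weights = kwh_step(weights, [biased, x], x)
```

Because only the latest observation is used, each step is cheap and the weights can follow abrupt changes, at the cost of higher sensitivity to noise than a longer window would give.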
The method of constructing an ensemble of metamodels that use multidimensional signals for forecasting can be presented in the following steps 1-9 (Figure 1):
Step 1: Data Collection: Gather a large dataset of multidimensional data to be used in the analysis.
Step 2: Input Data Formation: The outputs (predictions) of each predictor are used as inputs for the metamodel.
Step 3: Data Processing: Each of the multidimensional predictors in the ensemble processes the same input data in various ways. Each predictor may include different machine learning methods, such as neural networks, support vector machines, gradient boosting, etc.
Step 4: Sliding Window Evaluation of Random Values: Model parameters are re-estimated for each new data point using only the last s observations. This ensures that the model is continuously updated with the most recent data.
Step 5: Metamodel Synthesis: Develop metamodels that use the method of Lagrange multipliers to determine their parameters. This means that the metamodel utilizes weights from different forecasts to form a single forecast. The weights of these forecasts are determined through the optimization of the Lagrange function.
Step 6: Base Results Formation: Metamodel estimates are stored in a database for further analysis.
Step 7: Synthesis of Second-level Metamodel: Develop another metamodel that processes the outputs of the first-level metamodels. This can help gather information from different metamodels and make a more accurate forecast.
Step 8: Selection of the Best Metamodel: The second-level metamodel is used to select the best metamodel among the ensemble based on their performances.
Step 9: Forecasting: The final metamodel is used to produce forecasts based on the input data. This allows using a single, optimally weighted forecast instead of independent forecasts from each predictor.
This method employs model ensembles to work with multidimensional data and produces forecasts by combining the predictions of the individual predictors.
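Steps 7 and 8 can be illustrated with a minimal Python sketch. It is a simplified stand-in for the second-level metamodel: instead of combining first-level outputs with a meta-algorithm, it only selects the first-level metamodel with the lowest accumulated squared error. The function name, the window labels, and all numbers are hypothetical.

```python
def select_best_metamodel(targets, metamodel_forecasts):
    """Return the key of the first-level metamodel with the lowest squared error."""
    def total_error(forecasts):
        return sum((f_i - x_i) ** 2
                   for f, x in zip(forecasts, targets)
                   for f_i, x_i in zip(f, x))
    return min(metamodel_forecasts,
               key=lambda name: total_error(metamodel_forecasts[name]))


# Hypothetical observations and forecasts of metamodels built with
# different sliding-window sizes (illustrative values only).
targets = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
metamodel_forecasts = {
    "s=1": [[1.4, 0.3], [1.7, 1.2], [3.5, 1.6]],
    "s=5": [[1.1, 0.0], [2.0, 0.9], [3.1, 2.0]],
    "s=20": [[0.5, 0.5], [1.5, 1.5], [2.5, 2.5]],
}
best = select_best_metamodel(targets, metamodel_forecasts)
```

A weighted second-level combination would replace the `min` selection with a further estimation of weights over the first-level outputs.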

EXPERIMENTS
In the previous study [14], an intelligent method for identifying fraudulent websites was proposed. This method was implemented using various machine learning classification methods, including Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbors (KNN), Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT). Additionally, each classification method was modeled using different approaches to addressing imbalanced data, including undersampling, oversampling, SMOTE, and ADASYN.

RESULTS
The method was applied to a dataset of websites operating in Ukraine, consisting of 67 sites, out of which 45% were identified as fraudulent. The results showed that the DT ADASYN and RF Oversampling models achieved the highest accuracy (1.0), AUC (1.0), precision (1.0), recall (1.0), and F1-score (1.0).
Using the same intelligent method for an updated dataset consisting of 1039 websites, of which 68% were identified as fraudulent, slightly different results were obtained (Table 1). The SVM Undersampling model showed an accuracy of 0.93, AUC of 0.87, precision of 0.88, recall of 0.78, and F1-score of 0.82. The KNN Undersampling model demonstrated an accuracy of 0.90, AUC of 0.94, precision of 0.69, recall of 1.0, and F1-score of 0.82. These results indicate that although accuracy and other metrics may vary depending on the dataset and methods used, the proposed intelligent method still achieves high accuracy in identifying fraudulent websites.
The proposed ensemble metamodel, utilizing multidimensional signals for forecasting, was implemented. In this case, the metamodel was constructed based on the predictions of logistic regression (LR), decision tree (DT), K-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), and naive Bayes (NB) models, which were selected from the previous study [14]. The metamodel was built using AdaBoostClassifier, an adaptive boosting algorithm that combines several weak models to create a strong one.
The results of the metamodel were as follows (Figure 3):
- Accuracy: 0.98. This indicates that the metamodel correctly classified 98% of the websites.
- Recall: For class 0 (non-fraudulent websites), it was 0.97, and for class 1 (fraudulent websites), it was 1.00. This means that the metamodel identified 97% of non-fraudulent websites and 100% of fraudulent websites.
- F1-score: For class 0, it was 0.98, and for class 1, it was 0.95. The F1-score is the harmonic mean of precision and recall, providing an overall evaluation of the model.
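The relation between these metrics can be made concrete with a short Python sketch. The confusion counts below are hypothetical and are not the data of this study; they are chosen only so that precision is 0.90 and recall is 1.00 for the positive class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical confusion counts (illustrative only).
accuracy, precision, recall, f1 = classification_metrics(tp=9, fp=1, fn=0, tn=30)
```

With a precision of 0.90 and a recall of 1.00, the harmonic mean is about 0.947, which rounds to an F1-score of 0.95.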
These results demonstrate improvement compared to the previous individual models trained separately. The metamodel delivers more accurate and consistent website classification, making it an effective tool for detecting fraudulent websites.
Next, we examine an example of using the metamodel to forecast the label for the 16th observation in the test dataset (Figure 3). First, we obtain this observation and its true label. The true label for this observation is 0, indicating that the website is not fraudulent. Then, we get the predicted labels for this observation from each model, including logistic regression (LR), decision tree (DT), K-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), and naive Bayes (NB) models. The individual models disagree: some predict label 0 and others predict label 1. Finally, we obtain the predicted label from the metamodel for this observation. The metamodel predicts a label of 0, which matches the true label. This shows that the metamodel can classify this observation correctly despite the varying predictions of the individual models, and illustrates how combining forecasts from different models can improve overall prediction accuracy.
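How disagreeing labels can be resolved into one decision may be illustrated with a weighted vote in Python. This is only a sketch of the general idea and not the AdaBoostClassifier procedure used in the study; the labels and weights below are hypothetical and are not taken from Figure 3.

```python
def weighted_vote(labels, weights):
    """Return 1 if the weighted share of models voting 1 exceeds one half."""
    score = sum(w for name, w in weights.items() if labels[name] == 1)
    return 1 if score > 0.5 * sum(weights.values()) else 0


# Hypothetical predictions of the six base models for one observation.
labels = {"LR": 0, "DT": 0, "KNN": 1, "SVM": 0, "RF": 0, "NB": 1}
# Hypothetical model weights (for example, reflecting validation performance).
weights = {"LR": 0.15, "DT": 0.20, "KNN": 0.10, "SVM": 0.20, "RF": 0.25, "NB": 0.10}
predicted = weighted_vote(labels, weights)
```

Here two of the six models vote for label 1, but their combined weight is below one half, so the vote returns label 0.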
The metamodel exhibited high accuracy in classifying websites, achieving an accuracy of 0.98. This means that the metamodel correctly classified 98% of the websites in the test dataset. Additionally, the metamodel demonstrated high precision (0.95 for class 0 and 0.90 for class 1), recall (0.97 for class 0 and 1.00 for class 1), and F1-score (0.98 for class 0 and 0.95 for class 1). These metrics indicate that the metamodel performed well in classifying both fraudulent and non-fraudulent websites. The example prediction for the 16th observation also showed that the metamodel can accurately classify websites, despite diverse predictions from individual models. This confirms that the metamodel can effectively leverage predictions from different models to enhance the overall prediction accuracy.
Thus, these results confirm that using a metamodel can be an effective approach to improve the accuracy of classification in fraud detection tasks for websites.

DISCUSSION
In this study, we investigated the ensemble metamodel approach for forecasting multi-dimensional non-stationary signals. The proposed approach allows us to combine predictions from different forecasting models to obtain more accurate and reliable forecasts based on multiple sources of information.
Firstly, we conducted a literature review and explored various approaches to forecasting multi-dimensional non-stationary signals. Traditional models such as ARIMAX and exponential smoothing may be insufficiently effective in non-stationary conditions. On the other hand, neural networks such as LSTM and transformers exhibit high adaptability and the ability to work with changing conditions, making them attractive candidates for use in metamodel ensembles.
Next, we performed experiments with different forecasting models, such as ARIMAX, LSTM, SVM, Random Forest, etc., and collected their forecasts as input data for the metamodel. Using the method of Lagrange multipliers, we found the optimal parameters for the metamodel to achieve the best forecasting accuracy.
The results of the experiments showed that the proposed ensemble of metamodels indeed helps improve forecast accuracy. The metamodel based on combined forecasts from different models demonstrated higher accuracy compared to individual models. This approach allows for balancing forecasts and reducing the risk of overfitting or underfitting.
Moreover, we compared various approaches to synthesizing the second-level metamodel and found that utilizing the adaptive Kaczmarz-Widrow-Hoff (KWH) identification algorithm helps provide more accurate forecasts based on the forecasts from the first-level models.
Overall, the research results confirmed the effectiveness of the ensemble metamodel method for forecasting multi-dimensional non-stationary signals. Using the ensemble approach helps achieve more accurate results.

CONCLUSIONS
This research addressed the problem of adaptive forecasting of multi-dimensional non-stationary sequences, considering the prior uncertainty regarding their structure, through an ensemble approach. We developed the ensemble metamodel method, where each ensemble member processes predictions from different first-level forecasting models. Then, by collecting the results of individual models' forecasts, we applied a second-level metamodel to obtain the optimal forecast.
The scientific novelty of this study lies in the development and application of ensemble metamodels for forecasting multi-dimensional non-stationary signals. The use of ensembles allows obtaining more accurate and reliable forecasts based on multiple sources of information, reducing the impact of limitations of individual models.
The practical significance of our research is that the proposed approach can be applied in various domains where forecasting multi-dimensional non-stationary signals plays a crucial role. For example, this approach can be used in financial analysis, weather forecasting, medical diagnostics, and other fields where forecast accuracy and reliability are essential.
The conducted research confirms that the proposed ensemble metamodel has high accuracy in detecting fraudulent websites. The metamodel correctly classified 98% of websites in the test dataset. This demonstrates that the proposed method can be an effective tool for identifying fraudulent websites and can find practical applications in the field of cybersecurity and combating online fraudulent activities.
Regarding the prospects of this research, further improvement of the method can be achieved by expanding the set of first-level forecasting models and using more sophisticated learning algorithms for the second-level metamodel. Additionally, this approach can be applied to other types of non-stationary signals and forecasting tasks in various domains of science and technology.

Figure 1 - Structural and Logical Diagram of the Method

Figure 3 - Example of Applying the Metamodel to the 16th Row of the Dataset