ADDITIONAL TRAINING OF NEURO-FUZZY DIAGNOSTIC MODELS

Context. The task of automating the synthesis of diagnostic models in diagnostics and pattern recognition problems is solved. The object of the research is the methods of synthesis of neuro-fuzzy diagnostic models. The subject of the research is the methods of additional training of neuro-fuzzy networks. Objective. The research objective is to create a method for the additional training of neuro-fuzzy diagnostic models. Method. A method for the additional training of diagnostic neuro-fuzzy models is proposed. It allows existing models to be adapted to changes in the functioning environment by modifying them to take into account the information obtained from new observations. The method comprises the stages of extracting and grouping the correcting instances (instances whose diagnosis by the existing model leads to incorrect results), constructing a correcting block that generalizes the data of the correcting instances, and embedding this block into the already existing model. Using the proposed method makes it possible to avoid the resource-intensive process of re-constructing the diagnostic model from the complete data set and to use the already existing model as a computing unit of the new model. Models synthesized with the proposed method are highly interpretable, since each block generalizes information about its own data set and uses neuro-fuzzy models as a basis. Results. Software has been developed that implements the proposed method of additional training of neuro-fuzzy networks and allows existing diagnostic models to be re-configured on the basis of new information about the objects or processes under study. Conclusions. The conducted experiments have confirmed the operability of the proposed method of additional training of neuro-fuzzy networks and allow it to be recommended in practice for processing data sets in diagnosis and pattern recognition problems.
The prospects for further research include the development of new methods for the additional training of deep neural networks for big data processing.


ABBREVIATIONS
t_q is the value of the output parameter of the q-th observation; t′_q is the measured value of the output parameter of the q-th instance s′_q of the sample S′ = <P′, T′>; t(s′_q) is the value of the output parameter of the q-th instance s′_q of the sample S′, calculated by substituting the measured values of the input attributes p′_qm of the q-th instance into the model NFN; T is the set of output parameter values; t_q^mod is the model value of the output parameter of the q-th instance s′C_q, calculated from the synthesized model y_NBj; u_qj is the value of the membership function of the q-th instance s′C_q to the j-th cluster.

INTRODUCTION
During the operation of intelligent diagnostic systems, new information about the diagnosed objects arises. The information newly obtained from measurements of the diagnosed objects can significantly contradict the existing diagnostic models built from the results of previous observations. In such cases, it becomes necessary to re-synthesize the diagnostic models using the data from both the previous and the new measurements.
The object of study is the methods of synthesis of neuro-fuzzy diagnostic models.
However, when working with big data, the time for re-synthesis of such models can be significant, which in some cases is unacceptable. Therefore, during the operation of diagnostic systems, the task of adapting trained models by modifying them to take into account the information obtained from new observations is relevant.
The subject of study is the methods of additional training of neuro-fuzzy networks.
The purpose of the work is to create a method for additional training of neuro-fuzzy diagnostic models.

PROBLEM STATEMENT
Suppose we have: 1) a sample of data S = <P, T> containing Q instances, each of which is characterized by the values of the input parameters p_q1, p_q2, …, p_qM and the output parameter t_q; 2) a neuro-fuzzy model NFN synthesized from the set of observations S = <P, T>, with a definite structure struct (a set of computational elements connected in a certain way) and a set of parameters param; 3) a new sample S′ = <P′, T′> obtained as a result of Q′ new measurements of the object being examined (diagnosed).
Then it is necessary to synthesize a new model NFNN that takes into account both the previous observations S and the new observations S′; the error of the model on the available observations can be used as the target criterion G for the additional training of neuro-fuzzy models.

REVIEW OF THE LITERATURE
The additional training of diagnostic and recognition models built in the form of neuro-fuzzy networks usually involves the modification of the existing network by adding information about new observations to it. Such information is added to the constructed network in the form of new rules, represented by so-called singletons. This approach is simple enough to implement. However, in the case of a significant number of new observations, this approach is not very effective: the structural and parametric complexity of the network increases significantly (each new observation is, in fact, added to the network in the form of a new rule), and its generalizing capabilities are reduced.
Another approach involves a complete reorganization of the structure and parameters of the network when essential new information about the objects under study appears. In this case, the already synthesized model is re-trained on the basis of the available S and S′ information. When processing big data, re-training the model is also undesirable, since this process takes a lot of time and requires a large amount of computational resources.
Therefore, it is advisable to develop a new method for adapting trained neuro-fuzzy models to changes in the functioning environment by modifying them to take into account the information obtained from new observations.

MATERIALS AND METHODS
In the developed method of additional training of neuro-fuzzy models, it is proposed to correct the existing model NFN by introducing additional structural computational elements that take into account the attributes of the new data set S′.
In the proposed method, the first step is to extract the correcting instances from the sample S′. Corrective instances s′C_q are those observations for which the existing model produces incorrect results. For each instance, the model value t(s′_q) is calculated; then the real value t′_q and the model value t(s′_q) of the output parameter are compared. Condition (1) is used in solving estimation problems (for continuous values of the output parameter T):

|t′_q − t(s′_q)| > ε. (1)

When solving recognition problems (with discrete values of the output parameter T), the following condition is used:

t′_q ≠ t(s′_q). (2)

When one of the above conditions is met, the q-th instance is counted as corrective and entered into the set S′C. However, when processing big data, the number of instances of the set S′C can be significant, which will lead to a significant increase in the structural and parametric complexity of the new model NFNN. In addition, many instances of the set S′C can be close to each other in the attribute space and, in fact, be similar. Therefore, including all instances as rules of a new model block can also lead to a loss of its generalizing abilities. Accordingly, before building the block NB, it is advisable to perform the step of grouping the correcting instances of the set S′C, selecting the most significant of them, each of which concentrates around itself a certain number of similar, closely located instances.
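The extraction step above can be sketched compactly. The code below is a minimal illustration, not the authors' implementation; the function name, the callable `model` interface, and the tolerance `eps` are assumptions.

```python
import numpy as np

def extract_correcting_instances(P_new, t_new, model, task="estimation", eps=0.1):
    """Select the correcting instances of the new sample S': those whose
    diagnosis by the existing model NFN is incorrect.

    P_new: (Q', M) array of measured input attributes p'_qm
    t_new: (Q',)  array of measured output values t'_q
    model: callable mapping one attribute row to the model output t(s'_q)
    """
    t_model = np.array([model(p) for p in P_new])
    if task == "estimation":
        # condition (1): continuous output, deviation above tolerance eps
        mask = np.abs(t_new - t_model) > eps
    else:
        # condition (2): discrete output, class mismatch
        mask = t_new != t_model
    return P_new[mask], t_new[mask]
```

For example, with a model that sums the attributes, only instances whose measured output deviates from that sum by more than `eps` are retained.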
To do this, it is suggested to perform cluster analysis of the instances of the set S′C in the attribute space P. The number of clusters N_Cl in the developed method is determined in proportion to the number of rules N_R in the existing model NFN, as well as to the proportion of instances |S′C| of the set S′C relative to the number of instances Q in the set S (i.e., N_Cl ~ N_R · |S′C| / Q). After determining the number of clusters N_Cl, the initial partitioning of the instances over the clusters is performed. For this, a set of cluster centers is formed, where C_j is the center of the j-th cluster. The center of the first cluster C_1 is an instance s′C_a. Then, as the center of the second cluster C_2, the instance s′C_b most remote from the instance s′C_a is selected. The center of the third cluster C_3 is selected in such a way that it is as far as possible from the centers of the first and second clusters. This procedure continues until all N_Cl centers are formed. For a large value of N_Cl, this approach requires complex calculations due to the search for instances characterized by the greatest distance to the current set of already defined cluster centers. Therefore, this approach is advisable for small values of the number of clusters N_Cl, or it can be combined with an approach that involves the random formation of multiple cluster centers.
Then, the generation of the elements u_qj determining the membership of the q-th instance s′C_q to the j-th cluster C_j is performed. In contrast to the fuzzy c-means method used as a basis, in the developed method, when creating the initial division of the instances, the generation of the elements u_qj is performed not randomly, but taking into account the location of the instances in the attribute space P. For this, the distances d(s′C_q, C_j) from the instance s′C_q to the center C_j of each cluster are determined. As a metric for determining the distance, the Euclidean metric (3) can be used:

d(s′C_q, C_j) = sqrt( Σ_m (p′C_qm − C_jm)² ). (3)

The membership u_qj of the q-th instance s′C_q to the j-th cluster C_j is calculated by formula (4):

u_qj = 1 / Σ_k ( d(s′C_q, C_j) / d(s′C_q, C_k) )^(2/(m−1)). (4)

In the case where the instance s′C_q coincides with the center of the j-th cluster (d(s′C_q, C_j) = 0), it is established that u_qj = 1. Further, according to formula (5), the value of the objective function R^(i) determining the quality of the fuzzy partitioning in the i-th iteration of the cluster analysis is calculated:

R^(i) = Σ_q Σ_j (u_qj)^m · d²(s′C_q, C_j). (5)

After that, the criteria (6) and (7) of the completion of the cluster analysis procedure are checked:

|R^(i) − R^(i−1)| < ε_ClA, (6)

i ≥ maxIterClA. (7)

Inequality (6) reflects a condition whose fulfillment characterizes too small a change in the value of the objective function R^(i) and, accordingly, the inexpediency of further searching for the optimal partition. Condition (7) reflects the situation when the current number of iterations reaches the maximum allowed value maxIterClA. If neither condition (6) nor (7) is fulfilled, the new values of the coordinates of the cluster centers are determined using formula (8):

C_jm = Σ_q (u_qj)^m · p′C_qm / Σ_q (u_qj)^m. (8)

Then, using formulas (3)–(5), a new fuzzy partition is constructed. This procedure is repeated until at least one of the conditions (6) or (7) is satisfied. Consequently, as a result of the step of grouping the correcting instances, a set of cluster centers and a set of memberships u_qj of the instances s′C_q to the respective clusters are formed.
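The grouping step above can be sketched as a standard fuzzy c-means loop with the farthest-point center initialization just described. This is a sketch under assumptions: the fuzzifier m = 2, the tolerance, and the choice of the first instance as the first center are not specified in the text.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, eps=1e-4, max_iter=100):
    """Group the correcting instances X (rows) into n_clusters fuzzy clusters,
    returning the centers C and the membership matrix U."""
    # farthest-point initialization of centers (assumed: first center = X[0])
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    C = np.array(centers)
    prev_obj = np.inf
    for _ in range(max_iter):
        # distances (3); clamped so an instance at a center gets membership ~1
        D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        D = np.fmax(D, 1e-12)
        # memberships (4): u_qj = 1 / sum_k (d_qj / d_qk)^(2/(m-1))
        U = 1.0 / np.sum((D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        obj = np.sum((U ** m) * D ** 2)           # objective (5)
        if abs(prev_obj - obj) < eps:             # stopping rule (6)
            break
        prev_obj = obj
        W = U ** m                                # center update (8)
        C = (W.T @ X) / W.sum(axis=0)[:, None]
    return C, U
```

Each row of U sums to one, and well-separated instances obtain near-crisp memberships to their own cluster.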
After grouping the correcting instances, the stage of constructing the correcting block NB is performed.
If the model NFN being modified uses a neuro-fuzzy ANFIS network as a basis, then the structure of the correcting block NB will also be based on the ANFIS network. The graphic representation of the correcting block NB is shown in Fig. 1.
In this case, the number N_RNB of nodes of the second layer, corresponding to fuzzy rules in the correcting block NB, is proposed to be taken equal to the number of clusters (rules) allocated in the previous step: N_RNB = N_Cl. The nodes of the first layer are connected with the corresponding nodes of the second layer. Thus, in aggregate, the nodes of the first and second layers form the antecedents of the fuzzy rules NR_j.
The information obtained at the previous stages of the developed method of additional training of neuro-fuzzy models (the set of correcting instances S′C, the set of cluster centers, and the memberships u_qj of the instances s′C_q to the corresponding clusters) will be used to determine the configurable parameters of the membership functions. As the parameter c_mj, which determines the shift of the center of the function relative to the origin of coordinates along the characteristic axis p_m, we will use the m-th coordinate C_jm of the j-th cluster center formed in the previous step.
Using formulas (10)–(12), the nodes of the second layer determine the degree of fulfillment of the j-th rule NR_j as the product of the corresponding first-layer membership functions:

μ_NBj^(2) = Π_m μ_mj(p_m).

The nodes of the third layer determine the relative degree of fulfillment of the j-th rule NR_j:

μ_NBj^(3) = μ_NBj^(2) / Σ_k μ_NBk^(2).

The neural elements of the fourth layer use not only the coordinates of the correcting instances, but also the information on their membership degree u_qj to each of the clusters Cl_j (in fact, to the fuzzy rules NR_j determined by the centers C_j). This makes it possible to take into account the importance of the instances for restoring the objective function. As the target criterion E, we use the function (14), which is a modified mean-square error function:

E_j = Σ_q u_qj (t′_q − t_q^mod)², (14)

where the model value t_q^mod of the output parameter of the q-th instance s′C_q is calculated from the synthesized model y_NBj. In addition to the deviation between the actual value t′_q and the model value t_q^mod, this function takes into account the membership u_qj of the instance to the j-th cluster. To determine the values of the adjustable parameters w_mj, we find the values at which the optimum of the target criterion, E_j → opt, is reached. To do this, we define the partial derivatives of the target criterion E_j with respect to the parameters w_mj as functions of several variables, equate them to zero, and solve the resulting system of equations (17). Expression (17) is a linear system of (M + 1) equations in the (M + 1) adjustable parameters of the j-th rule; after further transformations, the m-th equation of system (17) can be written in the form (19). It is important to note that the proposed approach to the construction of correcting blocks NB makes it possible to synthesize and introduce into existing models new blocks whenever new information appears whose diagnosis by the model NFNN leads to incorrect results. Thus, the model shown in Fig. 2 can be consistently expanded by adding new blocks NB that generalize information about new observations of the investigated objects.
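Determining the fourth-layer parameters w_mj from the membership-weighted criterion (14) reduces to a weighted least-squares problem. The sketch below solves it directly; the linear consequent form y_NBj = w_0j + Σ_m w_mj·p_m and the weighting scheme are assumptions inferred from the text, not the authors' exact derivation.

```python
import numpy as np

def fit_rule_consequent(P, t, u_j):
    """Fit the parameters w_mj of one fourth-layer function
    y_NBj = w_0j + sum_m w_mj * p_m by minimizing the weighted error
    sum_q u_qj * (t_q - y_NBj(s_q))^2: each correcting instance contributes
    in proportion to its membership u_qj to cluster j.

    P:   (Q, M) attribute matrix of the correcting instances
    t:   (Q,)   measured outputs
    u_j: (Q,)   memberships of the instances to cluster j
    """
    X = np.hstack([np.ones((P.shape[0], 1)), P])  # prepend bias column for w_0j
    w_sqrt = np.sqrt(u_j)
    # minimizing the u-weighted squared error equals ordinary least squares
    # on rows scaled by sqrt(u_qj)
    w, *_ = np.linalg.lstsq(X * w_sqrt[:, None], t * w_sqrt, rcond=None)
    return w
```

Setting the partial derivatives of the weighted criterion to zero yields exactly the normal equations this call solves, which mirrors the system (17) described above.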
Consequently, the proposed method of additional training of neuro-fuzzy diagnostic models allows existing models to be adapted to changes in the functioning environment by modifying them to take into account the information obtained from new observations. The proposed method comprises the stages of extracting and grouping the correcting instances (instances whose diagnosis by the existing model leads to incorrect results), as well as the construction of a correcting block that generalizes the data of the correcting instances and its introduction into the already existing model. When determining the adjustable parameters of the correcting block, the developed method uses information about the values of the coordinates of the correcting instances, as well as information on the degree of their membership to the clusters in the feature space (and, accordingly, to the fuzzy rules represented in the correcting block). This makes it possible to take into account the importance of the correcting instances for restoring the functions of the fourth layer of the correcting block and, when determining the adjustable parameters, to increase the contribution of those instances that are characterized by high estimates of the membership degree to a particular cluster.
Using the proposed method of additional training of neuro-fuzzy diagnostic models makes it possible to avoid the resource-intensive process of re-constructing the diagnostic model from the complete data set and to use the already existing model as a computing unit of the new model. In addition, models synthesized using the proposed method are highly interpretable, since each block generalizes information about its own data set and uses neuro-fuzzy models as a basis.

EXPERIMENTS
To test the effectiveness of the developed method of additional training of neuro-fuzzy models, the problem of constructing diagnostic models for predicting the health status of patients with hypertension was solved [30].
Hypertension is a widespread disease that can threaten the life and health of the patient [30]. The course of hypertension is influenced by various factors (weather and climatic conditions, concomitant diseases, as well as the state of health at previous moments) [30]. In order to prevent significant pressure surges that can cause a deterioration of the patient's condition and possibly lead to death, it is necessary to predict the development of hypertension in the short term (for the next half-day or day). This allows the timely implementation of preventive measures related to the intake of the necessary medicines to prevent the expected negative consequences.
To predict the health of a patient with hypertension, it is necessary to have a model that is unique for each individual patient. Building such a model requires processing a large array of observations distributed over time.
Thus, since the disease is of an individual nature [30] (its features differ for each patient, so a unique diagnostic model must be synthesized for each patient), and since new information about the course of the disease is obtained over time, there is a need for periodic adjustment (additional training) of the existing models for individual prediction of the patient's condition on the basis of constantly growing arrays of observations.
The initial sample of data on the state of health of a patient with hypertension was obtained in Zaporizhzhia (Ukraine). The sample included observations from 2004 to 2014, where each instance was a set of data characterizing the patient's condition during a certain part of the day.
The following objective clinical and laboratory features were used: p1 is the observed blood pressure (systolic and diastolic, mmHg); p2 is the pulse (beats per minute, BPM); data on medication: p3 is Amlo, p4 is Egilok, p5 is Berlipril (for each: 0 if the patient does not take the medicine, 1 if the patient takes it). The subjective features were characteristics of well-being (for each: 0 is present, 1 is absent): p6 is the presence of premature heart beat; p7 is the presence of headache; p8 is the presence of neck pain; p9 is the presence of pulsation; p10 is the presence of pain in the left side; p11 is the presence of pain in the heart; p12 is lack of air; p13 is the presence of stomachache; p14 is general weakness. The meteorological characteristics used were [30]: p15 is the air temperature (°C); p16 is the atmospheric pressure (mmHg); p17 is the type of cloud cover (0 is not cloudy, 1 is slightly cloudy, 2 is cloudy, 3 is overcast); p18 is the presence of thunderstorms (0 is present, 1 is absent); p19 is the wind direction (0 is windless, 1 is a northern wind, 2 is a northeasterly wind, 3 is an easterly wind, 4 is a southeasterly wind, 5 is a southern wind, 6 is a southwesterly wind, 7 is a westerly wind, 8 is a northwesterly wind); p20 is the wind speed (m/s); p21 is solar phenomena data (Mg II index). The characteristics of time were: the date (year, month, day), the identifier of the day of the week (p22), the time (hour) of observation (p23), and the identifier of the part of the day (0 is morning, 1 is evening) (p24). The observations, processed by the "short-time transform" method, were used to form a sample for solving the problem of qualitative forecasting of the patient's condition for the second half of the current day from the previous observations: the input features were the data for the previous day (morning and evening) and the current day (morning), and the output was the patient's condition in the evening of the current day (0 is normal, 1 is an aggravation of symptoms accompanied by an increase in blood pressure).
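The windowing scheme just described (inputs from the previous morning, previous evening, and current morning; target from the current evening) can be sketched as follows. This is a hypothetical reading of the sample-forming step; the function name and the data layout are assumptions.

```python
import numpy as np

def make_windowed_samples(obs, states):
    """Form the forecasting sample: for each day d, the input is the
    concatenation of the feature vectors of the previous day's morning and
    evening and the current day's morning; the target is the evening state
    of the current day.

    obs:    (n_halfdays, M) array ordered morning, evening, morning, evening, ...
    states: per-half-day patient state (0 is normal, 1 is aggravation)
    """
    X, y = [], []
    for i in range(0, len(obs) - 3, 2):        # i = morning of day d-1
        X.append(np.concatenate([obs[i], obs[i + 1], obs[i + 2]]))
        y.append(states[i + 3])                # evening of day d
    return np.array(X), np.array(y)
```

Each produced row therefore spans three consecutive half-days of features, stepping forward one day at a time.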
To carry out experiments on the developed method of additional training of neuro-fuzzy diagnostic models, the training sample was divided into a base sample S_tr, used to synthesize the initial model NFN, and an additional sample S_ad, whose share is determined by the variable ω. The higher the value of the variable ω, the more new observations appeared after the previous construction (re-construction) of the model NFN.
In the course of the experimental studies, different methods and approaches to the training of the constructed models were applied at different values of the variable ω:
– additional training of the synthesized neuro-fuzzy model NFN using the sample S_ad by the backpropagation method (BPSad) [1, 2]; in this case, the parameters of the existing model NFN, pre-synthesized from the sample S_tr, were used as the initial parameters of the new model NFNN;
– re-training of the neuro-fuzzy model using the data of the combined set S = S_tr ∪ S_ad (BPS);
– the developed method of additional training of diagnostic neuro-fuzzy models (MATDNFM); here, the additional training of the model was performed on the sample S_ad, while the base model NFN was synthesized from the sample S_tr.
As criteria for comparing the methods of additional training of neuro-fuzzy models, the following were used: the training time t_ad (the amount of time spent on building the model NFNN), as well as the errors of the resulting model on the samples S and S_tr and on the test data.
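The experimental split controlled by ω can be sketched as below. The exact protocol is not fully stated in the text, so this is an assumed reading: the last ω percent of the chronologically ordered sample S plays the role of the new data S_ad, and the rest is the base sample S_tr.

```python
import numpy as np

def split_by_omega(P, t, omega):
    """Split the sample S = <P, T> into a base part S_tr and an additional
    part S_ad, where S_ad holds the final `omega` percent of the instances
    (assumed chronological ordering)."""
    n_ad = int(round(len(t) * omega / 100.0))
    cut = len(t) - n_ad
    return (P[:cut], t[:cut]), (P[cut:], t[cut:])
```

For example, ω = 20 on a ten-instance sample yields eight base instances and two additional ones.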

RESULTS
The results of the experiments are given in Table 1. The training time of the BPS method is significant and does not depend on the value of the variable ω, because training the neuro-fuzzy network with this approach is performed using the entire data sample S. For small samples, the synthesis time of the model is acceptable. However, the use of the BPS approach for the re-structuring of already synthesized models when processing large data sets is undesirable, and in some cases impossible, because the process of learning (re-training) requires significant time and hardware resources of the computer.
The additional training time t_ad when using the method BPSad depends on the value of the variable ω (it changes from 0.8139 s at the smallest value of ω to 41.1 s at ω = 100%), due to the fact that the reduced sample S_ad is used as the sample on which the neuro-fuzzy model is trained. The proposed method MATDNFM shows similar results; however, its additional training time is somewhat lower (starting from 0.6918 s) compared to the additional training time of BPSad. This is conditioned by the fact that the proposed method pre-groups the new instances, thereby significantly reducing the number of new rules introduced into the neuro-fuzzy diagnostic model, which, in turn, reduces the number of configurable parameters and, accordingly, the model training time.
The error E_S on the sample confirms the expediency of using the proposed method: the error E_S on the sample S = <P, T> remains almost unchanged (does not depend significantly on ω) when the value of the variable ω changes and is commensurate with the magnitude of the error E_S obtained using the BPS approach. This is due to the use of formulas (22) to calculate the values of the output parameter T, which take into account both the preliminary generalization of the data sample S_tr in the form of the base model NFN (the output parameter value t(s) is calculated according to the base model NFN when the instance s has a low degree of membership to the new structural element) and the new data. The error E_Str of the method BPSad at high values of ω is an unacceptable result, which is explained by the use of only the sample S_ad for training. Such values of E_Str confirm that the proposed method is appropriate at low values of ω, that is, in cases where the volume of new information S_ad about the objects or processes is significantly lower than the amount of available information S_tr that was used to build the base model NFN.
The high error of the method BPSad at low values of ω (even though its additional training time is significantly less than when using the method BPS) significantly limits its use in practice, especially when processing big data.
The low values of the error E_t of the model NFNN on test data (with the exception of the method BPSad at low values of the variable ω), calculated for the test sample (data about observations that are not reflected in the sample S), confirm the ability of the neuro-fuzzy diagnostic models that have passed the additional training process to generalize data.
Thus, it is advisable to use the proposed method MATDNFM at low values of ω, that is, in cases where the volume of new information S_ad about the objects or processes is significantly lower than the amount of available information S_tr that was used to build the base model NFN.
Given that, when solving practical problems, the amount of new data (the variable ω) is usually significantly lower than the amount of initial information, the use of the proposed method is appropriate, since the error E_S of the model NFNN on the sample S = <P, T> hardly changes when the value of the variable ω changes, and the additional training time is much less than when using the BPS method.

CONCLUSIONS
The urgent problem of automating the additional training of diagnostic models in solving problems of diagnosis and pattern recognition has been solved.
The scientific novelty of the obtained results is that a method has been developed for the additional training of diagnostic neuro-fuzzy models, which allows existing models to be adapted to changes in the functioning environment by modifying them to take into account the information obtained from new observations. The proposed method comprises the stages of extracting and grouping the correcting instances (instances whose diagnosis by the existing model leads to incorrect results), as well as the construction of a correcting block that generalizes the data of the correcting instances and its introduction into the already existing model. When determining the adjustable parameters of the correcting block, the developed method uses information about the values of the coordinates of the correcting instances, as well as information on the degree of their membership to the clusters in the feature space (and, accordingly, to the fuzzy rules represented in the correcting block). This makes it possible to take into account the importance of the correcting instances for restoring the functions of the fourth layer of the correcting block and, when determining the adjustable parameters, to increase the contribution of those instances that are characterized by high estimates of the membership degree to a particular cluster. Using the proposed method of additional training of diagnostic neuro-fuzzy models makes it possible to avoid the resource-intensive process of re-constructing the diagnostic model from the complete data set and to use the already existing model as a computing unit of the new model. In addition, models synthesized using the proposed method are highly interpretable, since each block generalizes information about its own data set and uses neuro-fuzzy models as a basis.
The practical significance of the obtained results is that practical tasks of diagnosis and pattern recognition have been solved. The results of the experiments showed that the proposed method makes it possible to carry out additional training of diagnostic neuro-fuzzy models on the basis of new information and can be used in practice for solving practical problems of diagnosis and pattern recognition.
w_mj is a customizable parameter of the function y_NBj.
Given the way the parameter N_Cl is calculated in the proposed method, the number N_RNB of rules of the NB block will be proportional to the number of rules N_R in the existing model NFN, as well as to the proportion of instances of the set S′C relative to the number of instances Q in the set S. Thus, the structural complexity N_RNB of the correcting block will be proportional to the analogous value of the original model NFN and to the proportion of new instances of the set S′C. The neural elements of the first layer determine the membership degree of the value of the m-th input parameter p_m to the j-th fuzzy term ft_mj in the correcting block NB; for this, the membership functions (9) are used.

Figure 1 – Graphical interpretation of the correcting block NB when modifying models NFN using a neuro-fuzzy ANFIS network as a basis

As the parameter d_mj, we will use the standard deviation of the correcting instances s′C_q ∈ S′C relative to the j-th cluster center C_j along the m-th characteristic axis; the membership u_qj of the q-th instance is also taken into account. The fourth layer contains the functions y_NBj that determine the value of the network output in the case of the operation of the corresponding rule NR_j. Thus, each j-th node of the network determines the contribution of the fuzzy rule NR_j to the common output y_NB of the network. The functions y_NBj are, as a rule, represented in the form of a linear regression; therefore, the values of the outputs of the fourth-layer nodes μ_NBj^(4) can be calculated by formula (13).

The correcting instances are used to restore the functions y_NBj corresponding to the clusters Cl_j; in determining the parameters w_mj of the function y_NBj, the contribution of those instances characterized by high estimates of the membership degree u_qj to the cluster Cl_j is increased. There are two approaches to determining the values of the parameters w_mj. The first approach involves building the models y_NBj on the basis of the correcting instances s′C_q with the maximum estimates u_qj of membership to the corresponding clusters Cl_j. For each cluster Cl_j (rule NR_j), the instances s′C_q with the largest values of u_qj are selected: an instance s′C_q is considered to belong to the cluster Cl_j if its membership is high enough; if this condition is met, the instance s′C_q is added to the set Set_j of instances related to the cluster. Further, using the instances of the set Set_j, the function y_NBj is restored using known methods of parametric synthesis of models. The second approach involves the use of all instances s′C_q of the set S′C to construct all the models y_NBj. When restoring the function y_NBj that determines the output of the j-th node of the fourth layer of the correcting block NB, the membership u_qj of each instance to the j-th cluster Cl_j (rule NR_j) is taken into account as an estimate of the importance of the instance s′C_q for restoring the function y_NBj. Substituting (15) into (14) and simplifying the obtained expression yields the system used to determine the parameters w_mj.

Figure 2 – Graphical interpretation of the modified model NFNN

The results depend on the division of the data into the sample S_tr, which was used to train the base model NFN, and the sample S_ad for the additional training of the already synthesized model NFN (the building of the new model NFNN). The new data sample S_ad is generalized in the correcting block NB: the value of the output parameter t(s) is calculated by the correcting block NB in the case when the instance s is characterized by a high degree of membership to the new structural element.
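The routing between the base model NFN and the correcting block NB can be sketched as follows. The combining formula (22) is not recoverable from the text, so the rule below is a hypothetical hard-switch stand-in: the output comes from the correcting block when the instance lies close to one of the block's cluster centers (high membership), and from the base model otherwise; the proximity estimate and threshold are assumptions.

```python
import numpy as np

def nfnn_output(p, base_model, block_centers, block_model, threshold=0.5):
    """Compute the output of the modified model NFNN for one instance p:
    use the correcting block NB if p is close to one of its cluster centers,
    otherwise fall back to the base model NFN."""
    d = np.linalg.norm(block_centers - p, axis=1)
    membership = 1.0 / (1.0 + d.min())   # hypothetical proximity-based estimate
    if membership >= threshold:
        return block_model(p)
    return base_model(p)
```

An instance at a block center is thus routed to the correcting block, while a distant instance is handled by the base model, which matches the intent described above.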

The method uses the instances of the sample S_ad when building the new model NFNN. The error E_Str when using the method MATDNFM is quite low, including at small values of the index ω.