IMAGE CLASSIFIER RESILIENT TO ADVERSARIAL ATTACKS, FAULT INJECTIONS AND CONCEPT DRIFT – MODEL ARCHITECTURE AND TRAINING ALGORITHM

Context. The vulnerability of image classification algorithms to destructive perturbations has not yet been definitively resolved and is highly relevant for safety-critical applications. Therefore, the object of research is the process of training and inference for an image classifier that functions under the influence of destructive perturbations. The subjects of the research are the model architecture and training algorithm of an image classifier that provide resilience to adversarial attacks, fault injection attacks and concept drift. Objective. The research goal is to develop an effective model architecture and training algorithm that provide resilience to adversarial attacks, fault injections and concept drift. Method. A new training algorithm is proposed that combines self-knowledge distillation, information measure maximization, class distribution compactness and interclass gap maximization, data compression based on discretization of the feature representation, and semi-supervised learning based on consistency regularization. Results. The model architecture and training algorithm of the image classifier were developed. The obtained classifier was tested on the Cifar10 dataset to evaluate its resilience over an interval of 200 mini-batches, with training and test mini-batches of 128 examples, under the following perturbations: adversarial black-box $L_\infty$-attacks with perturbation levels equal to 1, 3, 5 and 10; inversion of one randomly selected bit in a tensor for 10%, 30%, 50% and 60% of randomly selected tensors; addition of one new class; real concept drift between a pair of classes. The effect of the feature space dimensionality on the value of the information criterion of model performance without perturbations and on the value of the integral metric of resilience during exposure to perturbations is considered. Conclusions. The proposed model architecture and learning algorithm provide absorption of part of the disturbing influence, graceful degradation due to hierarchical classes and adaptive computation, and fast adaptation on a limited amount of labeled data. It is shown that adaptive computation saves up to 40% of resources due to early decision-making in the lower sections of the model, but a perturbing influence slows inference down, which can be considered graceful degradation. A multi-section structure trained using self-knowledge distillation principles is shown to provide more than 5% improvement in the value of the integral metric of resilience compared to an architecture where the decision is made on the last layer of the model. It is observed that the dimensionality of the feature space noticeably affects the resilience to adversarial attacks and can be chosen as a tradeoff between resilience to perturbations and efficiency without perturbations.

false negative rate for the $k$-th class;
$T_c$ is the control time period, which can be set a priori and estimated as the mean time between adverse events or the maximum allowable recovery time;
$G$ is the search domain for optimal parameter values;
$T$ is the confidence threshold;
coefficient regulating the tradeoff between performance without perturbation and resilience under perturbations;
membership function representing the confidence in the forecast that the input sample belongs to the $k$-th class;
$z_i$ is the binary feature representation of the $i$-th example at the feature extractor output;
$\mathrm{dist}(\cdot)$ is the squared Euclidean distance;
$z_k$ is the trainable prototype of the $k$-th class;
$N$ is the dimension of the high-level feature space;
constant added for numerical stability, equal to $10^{-6}$;
$y_i$ is the one-hot encoded class label of the $i$-th example;
$n_{MB}$ is the size of the mini-batch;
$\hat{y}_i$ is the value of the smoothed membership function of the $i$-th sample to each class;
relu is the ReLU activation function;
component-wise multiplication sign (Hadamard product);
$d$ is the averaged value of the normalized distance between the class prototypes;
$r$ is the averaged value of the scaling factor of the class container radius;
feature representation of the first augmented version of the input sample $x_i$;
feature representation of the second augmented version of the input sample $x_i$;
$q_k(\cdot)$ is an assessment of the probability that the feature representation of the input image belongs to the $k$-th class container;
temperature parameter that controls the dynamic range of the similarity function;
$q_k^{dist}(\cdot)$ is an assessment of the probability that the feature representation of the input image belongs to the $k$-th class;
$S$ is the number of sections of the multi-sectional classifier model;
$e$ is a column matrix of ones, $e = [1, 1, ..., 1]^T$;
Hadamard is a square matrix whose entries are either +1 or −1 and whose rows are mutually orthogonal;
coefficient regulating the influence of the information criterion based component on the resulting loss;
coefficient regulating the impact of the contrastive-center loss on the resulting loss;
coefficient regulating the impact of the average distance between class prototypes and the average radius of the separate hypersurface class boundaries (containers) on the resulting loss;

INTRODUCTION
Image classification is one of the most widespread tasks in the field of artificial intelligence. Classification analysis of visual objects is often a component of safety-critical applications, such as autopilots of public transport and combat drones, and medical diagnostics. It is also used in production processes, traffic flow monitoring, inspection of infrastructure and industrial facilities, and other similar tasks. Therefore, there is a need to ensure the resilience of artificial intelligence algorithms to destructive perturbations. In the case of image classification, specific perturbations such as adversarial attacks or noise, faults or fault injection attacks, as well as concept drift and out-of-distribution data, increase aleatoric and epistemic uncertainty and thereby reduce the performance of the intelligent algorithm [1][2][3].
The resilience of the image classifier to perturbations is primarily ensured by achieving robustness, i.e. absorption of a certain level of destructive influences, and by implementing a graceful degradation mechanism to achieve the most effective behavior under incomplete certainty [1]. Data analysis models need to be continuously improved to take into account the non-stationary environment and new challenges. That is why the ability of the model to quickly recover performance by adapting to destructive effects, and to improve so that subsequent adaptations become more efficient, are equally important components of resilience [2]. Recovery and improvement mechanisms are developed within the continual learning and meta-learning frameworks [4,5].
Achieving a certain level of resilience is predicated upon introducing a certain resource and functional redundancy into the system, but in practice there are always resource constraints [6]. When designing and operating resilient systems under resource constraints, the principles of rational resilience (affordable resilience) are often used. This involves achieving an effective balance between the system's lifecycle costs and the technical characteristics of its resilience [7]. Researchers are trying to improve the resource efficiency of inference by using biologically inspired cognitive mechanisms or adaptive computation based on cascade and multi-branch models [8,9].
Separate components of resilience to certain types of destructive influences have been researched in many scientific papers, but the combined influence of multiple destructive factors at once has not yet been considered [1][2][3]. In addition, machine learning algorithms for classification analysis of images that simultaneously implement such components of resilience as robustness, graceful degradation, recovery and improvement have not yet been proposed. Not all implementations of these components are compatible, especially under resource constraints.
The object of research is the process of training and inference for an image classifier that functions under the influence of destructive perturbations.
The subjects of the research are the model architecture and training algorithm of an image classifier that provide resilience to adversarial attacks, fault injection attacks and concept drift.
The research goal is to develop an effective model architecture and training algorithm for an image classifier that provide resilience to adversarial attacks, fault injections and concept drift. The accompanying constraints may include resource constraints, which necessitates the development of resource-efficient algorithms.

PROBLEM STATEMENT
It is necessary to find, by machine learning, the optimal values of parameters $g$ (1) that provide a tradeoff between the maximum of the class-wise averaged value of the information-based efficiency criterion $J$ and the value of the integrated metric $R$ for resilience quantification over the control time period $T_c$:

REVIEW OF THE LITERATURE
The problem of image representation and image classification analysis remains an active research topic due to its relevance in safety-critical applications, which require resilience to challenging operating conditions [2,10]. The basic principles of system resilience to destructive perturbations are formulated in [6,7]. These presuppose the existence of mechanisms for perturbation absorption, perturbation detection, graceful degradation, performance recovery and improvement. Studies [1,2,3] examined the vulnerability of artificial intelligence algorithms and identified the following destructive effects: noise and adversarial attacks, faults and fault injection in the deployment environment of the intelligent algorithm, concept drift, and the emergence of novelty, i.e., test examples that are out of the distribution of the training data.
The ability to absorb destructive perturbations is called robustness. There are many methods and approaches to increase robustness to adversarial attacks. Some researchers divide methods for ensuring robustness to adversarial attacks into the following categories: gradient masking methods, robust optimization methods and methods of detecting adversarial examples [11]. Gradient masking includes some input data preprocessing methods (JPEG compression, random padding and resizing), thermometer encoding, adversarial logit pairing, defensive distillation, randomly choosing a model from a set of models or using dropout, and the use of generative models (e.g., PixelDefend [12] and Defense-GAN [13]). However, [14] demonstrated the inefficiency of gradient masking methods. The robust optimization approach includes adversarial training, regularization methods that minimize the effects of small perturbations of the input (such as Jacobian regularization or the L2-distance between feature representations of natural and perturbed samples), and provable defenses (e.g., the Reluplex algorithm [15]). Finally, yet another approach lies in developing a detector of adversarial examples to reject such examples at the input of the main model. However, Carlini and Wagner [16] rigorously demonstrate that the properties of adversarial examples are difficult and resource-intensive to detect. In [11] it was proposed to divide the methods of protection against adversarial attacks into two groups implementing two separate principles: methods of increasing intra-class compactness and inter-class separation of feature vectors, and methods of marginalization or removal of non-robust image features. The work [17] emphasizes the possibility of further developing these basic principles and combining them, taking into account other requirements and constraints.
There are three main approaches to ensuring robustness to the injection of faults in the computing environment where neural networks are deployed: introduction of explicit redundancy [18], learning algorithm modification [19] and architecture optimization [19]. Faults are understood as accidental or intentional bit flips in the memory that stores the weights or the output value of a neuron. The introduction of explicit redundancy is achieved, as a rule, by duplication of critical neurons and synapses, uniform distribution of synaptic weights and removal of unimportant weights and neurons. It is also possible to increase the robustness of the neural network to fault injection at the machine learning stage by adding noise or perturbations, or by directly injecting faults during training. The same can be achieved by including a regularization (penalty) term in the performance measure to indirectly account for faults in conventional algorithms [20]. Optimizing the architecture to increase robustness means minimizing the maximum error at the output of the neural network for a given number of inverted bits in the memory where weights or results of intermediate calculations are stored. The authors of [20] solved this problem with evolutionary search algorithms or Neural Architecture Search tools. However, architecture optimization is traditionally a very resource-intensive process.
Papers [21,22] propose methods of domain randomization and adversarial domain augmentation, which increase the robustness of the model under bounded data distribution shifts. Domain randomization is the generation of synthetic data with an amount of variation large enough that real-world data is viewed as simply another domain variation [21]. This can include randomization of view angles, textures, shapes, shaders, camera effects, scaling and many other parameters. Adversarial domain augmentation creates multiple augmented domains from the source domain by leveraging adversarial training with a relaxed domain discrepancy constraint based on the Wasserstein Auto-Encoder [22]. Transfer learning and multi-task learning also reinforce resistance to out-of-distribution perturbations. However, if there is a real concept drift in the data stream, it is necessary to detect such a situation and implement reactive adaptation mechanisms [23]. There are studies on adaptation to real concept drift, but the lack of labels for test data, or a significant delay in obtaining them, remains a challenge.
Adversarial attacks, fault injections, concept drift and out-of-distribution examples cannot always be absorbed, so the development of reactive resilience mechanisms, namely graceful degradation, recovery and improvement, remains relevant [2,6]. The implementation of these mechanisms is often associated with the need to detect the perturbation. The most successful methods of detecting adversarial and out-of-distribution samples and concept drift are based on the analysis of the high-level feature space using a distance-based confidence score or a prototype-based classifier [24,25]. In [25], the mechanism for detecting faults affecting inference is based on calculating the reference value of the contrastive loss function on diagnostic data samples in the absence of faults. To detect faults, the current value of the contrastive loss function for the diagnostic data is compared with the reference value. In [27], the mechanisms of Nested Learning and Hierarchical Classification are proposed, which are particularly useful for implementing graceful degradation.
In [28], algorithms for adapting models to destructive perturbations are considered, in which the principles of active learning or contrastive learning are used to speed up adaptation by reducing the amount of labeled data required. Semi-supervised learning methods are proposed in [29] for the simultaneous use of both labeled and unlabeled data in order to accelerate adaptation to concept drift. The methods of lifelong learning, which make it possible to continuously accumulate knowledge from different tasks and improve, as well as various reminder mechanisms that help avoid the catastrophic forgetting problem, are considered in [5]. Various approaches to the implementation of meta-learning to improve the effectiveness of adaptation are covered in [4]. The principle of self-distillation for training neural networks has also been considered; it makes it possible to implement adaptive calculations and speeds up inference as the learning efficiency of the lower layers of the neural network grows.
Thus, there are numerous studies of separate principles of resilience of data classification models, but there are virtually no works that consider their simultaneous combination. However, in systems analysis, there are studies on the provision of affordable resilience [7], which are particularly relevant for data analysis systems operating under resource constraints.

MATERIALS AND METHODS
When building the model, we aim to implement the main characteristics of resilience: robustness, graceful degradation, recovery and improvement. The model is based on the following principles:
- hierarchical labeling and hierarchical classification, to implement graceful degradation by coarsening the prediction to a more abstract class with reasonable confidence when classes at the bottom of the hierarchy are recognized with a low confidence level;
- a combination of self-knowledge distillation and nested learning, to increase the robustness of the model by making the feedback for the lower layers more informative at the training stage and to accelerate inference by skipping high-level layers for simple samples;
- formation of a prototype and a compact spherical container for each class, to simplify the detection of out-of-distribution samples and concept drift;
- a FIFO memory buffer of limited size that stores labeled and unlabeled data together with the corresponding loss function values obtained at inference, to implement the diagnostic and recovery mechanisms.
These principles should ensure resource efficiency, because the model has small branches for intermediate decisions, which introduce minimal redundancy, since the main part of the feature extractor body is shared between the intermediate classifiers. In addition, the size of the data buffers can be set to a capacity acceptable from the point of view of resource constraints. Fig. 1 depicts the architecture of the resilient classifier with a sectional design. Sections consist of ResBlocks of the well-known ResNet50 architecture. The ResNet50 architecture also inspired the Bottleneck module, which serves to mitigate the mutual impact of the classifiers of the lower sections and to add distilled knowledge from the high-level feature map to the lower-level feature maps. The output of each section is used to construct a separate classifier. Each classifier receives feedback from the data labels and from the last layer. Feedback from the last layer, denoted by a dotted line, implements the principle of self-knowledge distillation.
A set of prototype vectors is constructed for the classification analysis of the feature representation at each section output. Prototype vectors are not fixed; they are determined in the training process together with the weights of the feature extractor. To implement the graceful degradation principle, prototypes can belong to different levels of the hierarchy, in accordance with the hierarchy of labeling. In the example provided, a 2-level hierarchy is used. To increase immunity to noise and implement an information bottleneck, the feature representation is approximated to a discrete form, which is why the output of the feature extractor of each section uses a sigmoid layer and the corresponding regularization in the training algorithm. The radius of the hyperspherical class containers is optimized for each prototypical classifier. Container radii are stored in memory to detect high levels of uncertainty when making decisions. Test samples outside the class containers become candidates for incremental learning using unlabeled samples and trigger a request for manual labeling (active learning) to be performed at a later stage. Monitoring the samples that fall outside the class containers can also be used for detecting real concept drift and out-of-distribution data.
After the weights and parameters of the model are updated, the diagnostic dataset and the corresponding value of the loss function must be stored (or updated) in memory. After that, a subset of the diagnostic data should be passed for processing together with the test samples in each batch. This allows the past and present values of the loss function to be compared in order to detect errors or injected faults in the memory holding the neural network weights. Where the difference between the past and present values of the loss function exceeds a threshold of 0.01, a neural network fine-tuning algorithm utilizing the diagnostic data needs to be initiated to bring this difference below a threshold of 0.001. The multi-section structure of the model with intermediate classifiers makes it possible to implement adaptive calculations, which accelerate the recognition of simple images. At the same time, as the model is continually trained, it becomes faster due to the increased recognition confidence of the lower section classifiers. This, in turn, allows the remaining high-level sections of the model to be skipped.
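The diagnostic check described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the names `needs_recovery`, `recover` and `fine_tune_step` are hypothetical, the absolute difference is assumed as the comparison, and only the thresholds 0.01 and 0.001 are taken from the text.

```python
from typing import Callable, Tuple

DETECT_THRESHOLD = 0.01    # loss gap that triggers recovery (value from the text)
RECOVER_THRESHOLD = 0.001  # loss gap at which recovery stops (value from the text)


def needs_recovery(reference_loss: float, current_loss: float) -> bool:
    """Return True when the diagnostic loss deviates from the stored reference.

    Assumption: the deviation is measured as an absolute difference.
    """
    return abs(current_loss - reference_loss) > DETECT_THRESHOLD


def recover(reference_loss: float,
            current_loss: float,
            fine_tune_step: Callable[[], float],
            max_steps: int = 1000) -> Tuple[float, int]:
    """Fine-tune on diagnostic data until the loss gap is small enough.

    fine_tune_step is a hypothetical callable that performs one update on
    the diagnostic data and returns the new diagnostic loss.
    """
    steps = 0
    while abs(current_loss - reference_loss) > RECOVER_THRESHOLD and steps < max_steps:
        current_loss = fine_tune_step()
        steps += 1
    return current_loss, steps
```

In use, `needs_recovery` would be evaluated on the diagnostic part of every mini-batch, and `recover` would be called only when it returns True.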
The following rules for classification analysis in the adaptive calculations framework are proposed:
- neural network calculations are performed sequentially, section by section;
- high-level sections can be skipped if, at the output of the current section, the maximal value of the membership function to a particular class of the lower hierarchical level exceeds the confidence threshold $T$;
- if the maximal value of the membership function of any of the hierarchical levels of the classifier at the output of the current section has not increased compared to the previous section, the subsequent calculations can be omitted;
- where any of the conditions for omitting the subsequent sections is fulfilled, or the classifier in question is the last classifier in the model, and the maximal value of the membership function of the lower hierarchical level does not exceed the confidence threshold, the higher level in the hierarchy is checked;
- where a class with a sufficient confidence level has not been identified, a decision is refused, a request for manual labeling is generated, and the corresponding sample is designated as suitable for unsupervised tuning.
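The rules above can be sketched as a small decision routine. This is an illustrative reading, not the authors' code: the function name `adaptive_decision` and the data layout are hypothetical, the threshold used in the example is arbitrary, and the "has not increased" rule is interpreted as stopping when the maximum at either hierarchical level fails to grow.

```python
from typing import Iterable, List, Optional, Tuple


def adaptive_decision(section_outputs: Iterable[Tuple[List[float], List[float]]],
                      threshold: float) -> Tuple[str, Optional[int], int]:
    """Return (level, class_index, sections_used).

    section_outputs yields, per section, the membership values of the
    lower-level and upper-level classes; passing a generator keeps the
    evaluation lazy, so skipped sections are never computed.
    level is "lower", "upper" or "refuse".
    """
    prev_lower = prev_upper = float("-inf")
    upper: List[float] = []
    used = 0
    for lower, upper in section_outputs:
        used += 1
        max_lower, max_upper = max(lower), max(upper)
        if max_lower > threshold:
            # Confident fine-grained decision: skip the remaining sections.
            return "lower", lower.index(max_lower), used
        if max_lower <= prev_lower or max_upper <= prev_upper:
            # Confidence stopped growing at some level: omit further sections.
            break
        prev_lower, prev_upper = max_lower, max_upper
    if used and max(upper) > threshold:
        # Graceful degradation: fall back to the coarser class.
        return "upper", upper.index(max(upper)), used
    # No confident class: refuse and request manual labeling.
    return "refuse", None, used
```

For example, `adaptive_decision(iter([([0.2, 0.3], [0.5, 0.1]), ([0.4, 0.95], [0.9, 0.1])]), 0.9)` stops at the second section with a lower-level decision for class 1.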
The confidence in the forecast that the $i$-th sample belongs to the $k$-th recognition class is determined by the membership function (1). If the maximum value of function (1) for an input unlabeled sample $z_i$ is less than zero, such a forecast should not be trusted, and the sample should be added to the buffer of unlabeled data outside the training distribution. Where the input unlabeled sample falls into one of the containers of the recognition classes (at any of the levels), it should be added to the in-class buffer of unlabeled data within the training distribution. Unlabeled sample buffers can be used for training with pseudo-labeling, soft-labeling or consistency regularization.
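The routing of unlabeled samples between the two buffers can be sketched as below. The exact form of the membership function is not reproduced here, so the sketch assumes one plausible form, $1 - \mathrm{dist}(z_i, z_k) / (r_k N)$, which is negative exactly when the sample lies outside the container; this form, the function names and the buffer size are assumptions made for illustration only.

```python
from collections import deque
from typing import Deque, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def membership(z: Sequence[float], prototype: Sequence[float],
               radius_scale: float) -> float:
    """Assumed membership form: 1 - dist(z, z_k) / (r_k * N)."""
    n = len(prototype)
    return 1.0 - squared_distance(z, prototype) / (radius_scale * n)


def route_unlabeled(z: Sequence[float],
                    prototypes: List[Sequence[float]],
                    radius_scales: List[float],
                    in_buffer: Deque, out_buffer: Deque) -> int:
    """Append z to the proper FIFO buffer; return the class index or -1."""
    values = [membership(z, p, r) for p, r in zip(prototypes, radius_scales)]
    best = max(range(len(values)), key=values.__getitem__)
    if values[best] < 0:
        out_buffer.append(z)   # outside every container: not trusted
        return -1
    in_buffer.append(z)        # inside a container: in-distribution
    return best


# Bounded FIFO buffers; the capacity of 512 is an arbitrary illustrative value.
in_buffer: Deque = deque(maxlen=512)
out_buffer: Deque = deque(maxlen=512)
```

A `deque` with `maxlen` drops the oldest element automatically, which matches the limited-size FIFO buffer described earlier.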
Where the model has already been trained, but the buffer of new labeled data contains $n$ samples of the $c$-th class that were assigned to the container of the $k$-th class during forward propagation, real concept drift is recognized.
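The drift recognition rule can be sketched as a count over the labeled buffer. The helper name `detect_real_drift`, the pair-based buffer layout and the threshold value in the usage example are hypothetical; the text does not specify how $n$ is chosen.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def detect_real_drift(labeled_buffer: Iterable[Tuple[int, Optional[int]]],
                      n_min: int) -> List[Tuple[int, int]]:
    """Return (true_class, container_class) pairs with at least n_min hits.

    Each buffer entry holds the true label of a sample and the class whose
    container the sample fell into during the forward pass (None when the
    sample is outside every container).
    """
    confusions = Counter(
        (c, k) for c, k in labeled_buffer if k is not None and c != k
    )
    return sorted(pair for pair, count in confusions.items() if count >= n_min)
```

For instance, with `n_min=3` the buffer `[(3, 5), (3, 5), (3, 5), (3, 3), (1, None), (2, 4)]` reports drift only for the pair `(3, 5)`.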
To avoid catastrophic forgetting in the context of concept drift or the emergence of a new recognition class, a reminder function is implicitly implemented. This function is based on the unlabeled data buffers and the prototype vectors in the feature space, which change slowly. The mechanism of knowledge distillation from the upper layers serves the same purpose.
Data from the unlabeled data buffer can be moved to the labeled data queue after feedback on their actual class membership is received. The priority with which specific samples are recommended for manual labeling depends on the value of the membership function (1).
During the development of the training algorithm, we aim to ensure robustness, graceful degradation, recovery and improvement. To this end, the training algorithm is based on the following principles:
- accounting for the hierarchy of data labeling and the hierarchy of class prototypes by calculating the loss function separately for each level of the hierarchy, to provide graceful degradation at inference;
- implementation of self-knowledge distillation, i.e., distillation of knowledge from the high-level layer (section) of the model down to the lower layers (sections), as additional regularization components that increase robustness and enable adaptive calculations in inference mode;
- increasing the compactness of the class distributions and the buffer zone between classes, as an additional distance-based regularization component that increases resistance to noise, outliers and adversarial attacks;
- penalization of the discretization error (compression to binary form) of the feature representation, as a way of implementing an information bottleneck that improves the robustness and informativeness of the feature representation;
- implementation of reactive mechanisms for rapid performance recovery under perturbations, based on fine-tuning the weights on diagnostic data to eliminate the effects of detected faults, resetting (re-initializing) the prototypes of drifting or new classes, and using new unlabeled data for consistency regularization;
- the ability to effectively use both labeled and unlabeled data samples to speed up adaptation with a limited quantity of labeled data, which usually arrives with a significant lag;
- avoidance of catastrophic forgetting when adapting to perturbations without full retraining, by implementing a reminder mechanism that utilizes the data buffers, the class prototypes and the distillation feedback of the upper layers.
The proposed training method consists of two main stages:
- preparatory training of the model on labeled and unlabeled data in a semi-supervised regime;
- adaptation to perturbations with semi-supervised learning and active learning feedback.
The main criterion for learning in both cases is the information measure, and the corresponding component of the loss function is based on it. The normalized modification of C. Shannon's entropy-based information measure is used as the criterion of the recognition efficiency of the $k$-th class and is calculated according to [30].
A separate hyperspherical surface is built for each class in the radial basis feature space. The accuracy characteristics of the hyperspherical decision boundary for each class can be calculated on the basis of statistical tests. The procedures for calculating statistical tests are not differentiable, so their smoothed versions can be used instead in training mode [31].
In order to take into account the admissible domain of the criterion function (7) in the optimization procedure based on the error backpropagation method, it is proposed to perform additional operations when calculating the loss function [30].
To increase the compactness of the class distributions and the inter-class gap in the feature space, it is proposed to use the contrastive-center loss function, which is calculated for labeled training samples [32].
To optimize the class boundaries, it is proposed to use an additional regularization component $L_C$ that connects the average distance between class prototypes and the average radius of the separate hypersurface class boundaries (containers).
To speed up adaptation to changes, unlabeled data examples can be used in consistency regularization [29]. In this case, unlabeled data is divided into two groups: unlabeled examples that fall into the class containers, and unlabeled examples that lie outside all class containers.
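As an illustration of the compactness and inter-class gap term, the sketch below computes the contrastive-center loss in its commonly published form: the squared distance to the own class center divided by the sum of squared distances to all other centers plus a constant, summed over the batch and halved. Whether the classifier described here uses exactly this normalization and this value of `delta` is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contrastive_center_loss(features: List[Sequence[float]],
                            labels: List[int],
                            prototypes: List[Sequence[float]],
                            delta: float = 1.0) -> float:
    """Contrastive-center loss over a labeled mini-batch.

    The numerator pulls a sample toward its own prototype; the denominator
    pushes it away from the prototypes of all other classes. delta keeps
    the denominator strictly positive.
    """
    total = 0.0
    for z, y in zip(features, labels):
        intra = squared_distance(z, prototypes[y])
        inter = sum(squared_distance(z, p)
                    for k, p in enumerate(prototypes) if k != y)
        total += intra / (inter + delta)
    return 0.5 * total
```

A sample that coincides with its own prototype contributes zero, while a sample equally far from its own and a foreign prototype contributes a positive penalty.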
It is proposed to use the unlabeled data that fall into the class containers in the regularization component $L_{UCE}$. Certain portions (<10%) of unlabeled data, which fall into class containers and have maximum values of ( )
Consistency regularization can be performed not only at the level of the classification module, but also at the level of features, using the corresponding regularization component $L_{UL2}$ of the loss function.
The Kullback-Leibler divergence loss $L_{CSD}$ and the $L_2$ loss from hints $L_{FSD}$, calculated on the basis of the $S$-th (last) output of the model and the $s$-th (intermediate) output of the model, are additionally used for self-knowledge distillation.
A regularization component that penalizes the discretization error of the feature representation is additionally introduced to implement the information bottleneck [30].
The parameters of the lower-level class prototypes are initialized on the basis of the Hadamard matrix using the principle of label smoothing. For this, the dimensionality of the Hadamard matrix is determined first. Label smoothing is then applied, as a result of which the 1s turn into 0.87 and the 0s into 0.15. The first $K$ vectors, truncated to the first $N$ features, i.e. $z = Z[:K, :N]$, are then selected from the resulting matrix $Z$. The trainable scale factor $r_k$ for the radius of the hyperspherical decision boundary (container) of the $k$-th class is initialized with a value of half of Plotkin's bound, divided by the dimensionality of the feature space.
The appearance of a sample with a label indicating a new $(K+1)$-th lower-level class necessitates the formation of a new prototype $z_{K+1}$ for this class with the corresponding initial value of the radius scale factor $r_{K+1}$. This is achieved by selecting the nearest vector from the remaining unused rows of the modified Hadamard matrix $Z$, where proximity is determined on the basis of the squared Euclidean distance. The initial value of the radius scale factor for the new class is also determined by formula (14), but taking into account the new number of classes.
Each coordinate of a prototype of the upper hierarchical level is initialized by copying the corresponding coordinate of one of the lower-level prototypes, selected at random. The initial class radius of the upper hierarchical level is determined by formula (14), taking into account the number of classes at this level.
Where a real concept drift is recognized, the prototypes of the drifting classes are re-initialized with random numbers from the range [0; 1].
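The prototype initialization can be sketched as follows. Several details are assumptions made for illustration: the Hadamard matrix is built with the Sylvester construction, its order is taken as the smallest power of two not less than max(K, N), and the radius scale factor uses the Plotkin bound in the form $d \le N K / (2(K-1))$, which after halving and dividing by $N$ gives $K / (4(K-1))$. Only the smoothed values 0.87 and 0.15 are taken from the text; all function names are hypothetical.

```python
from typing import List


def sylvester_hadamard(order: int) -> List[List[int]]:
    """Build a Hadamard matrix of size 2^m >= order (Sylvester construction)."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def init_prototypes(num_classes: int, num_features: int,
                    high: float = 0.87, low: float = 0.15) -> List[List[float]]:
    """Take the first K rows, truncated to N features, with smoothed values.

    +1 entries become `high` and -1 entries (binary 0) become `low`.
    """
    h = sylvester_hadamard(max(num_classes, num_features))
    smoothed = [[high if v == 1 else low for v in row] for row in h]
    return [row[:num_features] for row in smoothed[:num_classes]]


def initial_radius_scale(num_classes: int) -> float:
    """Half of the (assumed) Plotkin bound divided by the feature dimension."""
    return num_classes / (4.0 * (num_classes - 1))
```

With 10 lower-level classes this gives an initial radius scale factor of about 0.278, and the 2 upper-level classes start at 0.5.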
The resulting loss function is formed as the sum of the above components, averaged over the sections of the model and the levels of the class hierarchy, with coefficients that regulate the impact of individual components depending on the training regime.
A combined loss function, averaged over hierarchical levels and model sections, is suggested for supervised learning. When new labeled data appear, they are combined with unlabeled data from the FIFO buffer to implement continuous adaptation using a corresponding combined loss function. Default values of the coefficients are proposed as follows:

EXPERIMENTS
The Cifar10 dataset was chosen for experimental research because it is publicly available and its images are small, which speeds up the experiments. The classes of this dataset can be arranged in a hierarchical structure. For example, the first upper-level class is the animal class, which includes the subclasses bird, cat, deer, dog, frog and horse. The second upper-level class is the vehicle class, which includes the subclasses airplane, automobile, ship and truck. Therefore, 12 prototype vectors are used at the output of the classifier of each section, of which 2 are upper-level prototypes and 10 are lower-level prototypes. For all experiments, a fixed confidence threshold $T$, considered sufficient to make a decision, is used. The Cifar10 dataset consists of 50,000 training images and 10,000 test images, all 32x32 color images distributed evenly between 10 classes. For convenience of analysis, 70% of the training data is used to form dataset_base for training the base model, and the remaining 30% forms dataset_additional for additional training.
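The two-level hierarchy and the 70/30 split described above can be sketched as below. The class grouping and split proportions come from the text; the random shuffling with a fixed seed and the helper name `split_indices` are assumptions, since the text does not say how the samples are assigned to the two subsets.

```python
import random
from typing import Dict, List, Tuple

# Lower-level class -> upper-level class, as described in the text.
SUPERCLASS: Dict[str, str] = {
    "airplane": "vehicle", "automobile": "vehicle",
    "ship": "vehicle", "truck": "vehicle",
    "bird": "animal", "cat": "animal", "deer": "animal",
    "dog": "animal", "frog": "animal", "horse": "animal",
}


def split_indices(n_samples: int, base_percent: int = 70,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample indices into dataset_base and dataset_additional parts.

    Assumption: the split is a random permutation with a fixed seed.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = n_samples * base_percent // 100
    return indices[:cut], indices[cut:]


dataset_base_idx, dataset_additional_idx = split_indices(50_000)
```

This yields 35,000 indices for dataset_base and 15,000 for dataset_additional, and the mapping gives 10 lower-level plus 2 upper-level classes, i.e. the 12 prototypes per section.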
Perturbations cause a notable decrease in model performance. To test the ability to recover, we define recovery as reaching 95% of the performance level observed prior to the perturbation. The control interval is set to T_C = 200 mini-batches to ensure testing on the full volume of test data. During recovery, each test mini-batch is preceded by a training mini-batch. The mini-batch size is 128 examples.
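The recovery criterion can be checked mechanically on a recorded performance curve. The helper below is an illustrative sketch: the function name `recovery_iteration` and the sample curve are made up for the example and are not measured results.

```python
def recovery_iteration(performance, pre_level, share=0.95):
    """Return the 1-based index of the first mini-batch after the perturbation
    whose performance reaches `share` of the pre-perturbation level, or None
    if this never happens within the recorded control interval."""
    target = share * pre_level
    for step, value in enumerate(performance, start=1):
        if value >= target:
            return step
    return None

# Hypothetical curve: performance before the perturbation was 0.80,
# so the recovery target is 0.95 * 0.80 = 0.76.
curve = [0.40, 0.55, 0.70, 0.78, 0.80]
recovered_at = recovery_iteration(curve, pre_level=0.80)
never = recovery_iteration([0.40, 0.45, 0.50], pre_level=0.80)
```

Here `recovered_at` is 4, because the fourth value is the first to reach the target, while `never` is `None`, which corresponds to a failure to recover within the control interval.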
To test the model's resistance to faults and its ability to recover, the TensorFI2 library, which can simulate software and hardware faults, is used. The experiment considers the type of fault that is most difficult to absorb: random bit inversion (bit-flip injection) in each layer of the model. A fixed share of tensors (the fault rate) is selected at random, and one randomly selected bit is inverted in each of them. For diagnostics and recovery, diagnostic data are added to the model input in each mini-batch along with the test data. The diagnostic data are drawn from the dataset_additional set, 128 examples per mini-batch.
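To make the fault model concrete, the sketch below flips one random bit in a randomly chosen share of float32 tensors using plain NumPy. It deliberately does not use or imitate the TensorFI2 API; the function names `flip_random_bit` and `inject_faults` and the toy weights are assumptions introduced for illustration only.

```python
import numpy as np

def flip_random_bit(tensor, rng):
    """Return a copy of a float32 tensor with one randomly chosen bit inverted."""
    flat = np.ascontiguousarray(tensor, dtype=np.float32).ravel().copy()
    bits = flat.view(np.uint32)            # reinterpret the same bytes as integers
    element = rng.integers(0, bits.size)   # which value to damage
    bit = rng.integers(0, 32)              # which of its 32 bits to invert
    bits[element] ^= np.uint32(1) << np.uint32(bit)
    return flat.reshape(np.shape(tensor))

def inject_faults(tensors, fault_rate, rng):
    """Flip one bit in a randomly selected share (fault_rate) of the tensors."""
    n_faulty = int(round(fault_rate * len(tensors)))
    chosen = set(rng.choice(len(tensors), size=n_faulty, replace=False).tolist())
    return [flip_random_bit(t, rng) if i in chosen else t.copy()
            for i, t in enumerate(tensors)]

rng = np.random.default_rng(0)
# Ten hypothetical weight tensors; with fault_rate=0.3 three of them are damaged.
weights = [np.ones((4, 4), dtype=np.float32) for _ in range(10)]
faulty = inject_faults(weights, fault_rate=0.3, rng=rng)
```

A flip in the exponent or sign bits can change a value by many orders of magnitude, whereas a flip in the low mantissa bits is almost invisible, which is why results have to be aggregated over many random trials.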
Different model weights have different importance and impact on model performance. In addition, a fault in the higher-order bits of a tensor value distorts the results more than a fault in the lower-order bits. Therefore, statistical characteristics should be used to evaluate and compare the model's resilience to different proportions of damaged tensors. These characteristics are derived from a large number of experiments in which the bits and tensors to be inverted are drawn from a uniform distribution. For simplicity, we consider the median (MED) and the interquartile range (IQR) of the integral metric of the classifier's resilience for the classes of the upper and lower hierarchical levels, calculated over 1000 experiments. We also consider the influence of the dimensionality of the feature space.
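The two summary statistics can be computed as shown below. The sample values are illustrative placeholders, not experimental data, and the helper name `median_and_iqr` is hypothetical.

```python
import numpy as np

def median_and_iqr(values):
    """Median and interquartile range (Q3 - Q1) of repeated resilience measurements."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return float(med), float(q3 - q1)

# Five made-up values of the integral resilience metric from repeated trials.
samples = [0.90, 0.92, 0.94, 0.96, 0.98]
med, iqr = median_and_iqr(samples)
```

For these five values the median is 0.94 and the interquartile range is approximately 0.04 (0.96 minus 0.92). Both statistics are insensitive to the occasional catastrophic trial in which a high-order bit of an important weight is hit.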
To test the model's resistance to noise and adversarial attacks, the attack should not rely on gradients or other features of the model architecture and learning algorithm. Testing is therefore based on black-box attacks. A metric must be chosen to measure the level of the perturbations against which resistance is tested. In practice, the L0, L1, L2 and L∞ norms are widely used. However, only the L0 and L∞ norms restrict the spatial distribution of noise, which prevents the formation of distorted samples that even humans would classify incorrectly. In addition, a perturbation level chosen in the L0 or L∞ norm does not depend on the image size, which is convenient for comparison. The covariance matrix adaptation evolution strategy (CMA-ES) with the L∞ metric [33] is chosen as the evolutionary attack strategy for our experiments. Classifier efficiency is measured on perturbed test samples, with each mini-batch of perturbed test data generated against the current model. At the same time, mini-batches of perturbed data from the dataset_additional set are created, and 50% of them are provided with labels to emulate active learning. Perturbed data from the dataset_additional set are not involved in measuring the model's efficiency but are used to adapt them to perturbations of this type.
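Whatever search strategy generates candidate images, the L∞ constraint itself reduces to a per-pixel clipping step. The sketch below shows only that projection and is not an implementation of CMA-ES; it assumes that the perturbation level is expressed on the 0..255 pixel scale, which the text does not state explicitly, and the name `project_linf` is hypothetical.

```python
import numpy as np

def project_linf(clean, candidate, epsilon, low=0.0, high=255.0):
    """Project a candidate image onto the L-infinity ball of radius epsilon
    around the clean image, then onto the valid pixel range."""
    delta = np.clip(candidate - clean, -epsilon, epsilon)
    return np.clip(clean + delta, low, high)

rng = np.random.default_rng(0)
# A random stand-in for a 32x32 color image and an unconstrained noisy candidate.
clean = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float32)
candidate = clean + rng.normal(0.0, 20.0, size=clean.shape).astype(np.float32)
adversarial = project_linf(clean, candidate, epsilon=5.0)
```

After projection no pixel differs from the clean image by more than the chosen level, regardless of the image size, which is the property that makes L∞ levels comparable across datasets.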
Resilience to the appearance of new classes and to concept drift is tested on the classes of the lower hierarchical level. Each class in turn is treated as a new class. Likewise, real concept drift is examined between any pair of classes.
Fig. 2 shows an example of model performance recovery curves for classes of the lower hierarchical level with the feature space dimension N=64 after fault injection. The vertical axis corresponds to the value of the information criterion averaged over the set of classes, and the horizontal axis corresponds to the number of test iterations of the trained model on the dataset_base set. The first 50 iterations take place without fault injection; at the 51st iteration, 4 versions of the model are generated with different proportions of tensors having an inverted bit in a random position, i.e. fault_rate ∈ {0.1; 0.3; 0.5; 0.6}. Therefore, 4 recovery curves of the model's performance are presented.
Table 1 shows the experimental data from testing the model's resilience to fault injection, where J is the value of the information criterion before the fault injection, averaged over the set of classes, and N is the selected dimension of the feature space. The table shows the data collected for different hierarchical levels of the model; the hierarchical level number is denoted by H. Analysis of Table 1 shows that if the share of damaged tensors reaches 60%, recovery within the T_C = 200 mini-batches of the control interval becomes impossible. This agrees with Fig. 2, where the curve corresponding to damage to 60% of the tensors does not reach 95% of the pre-perturbation performance after 200 iterations.
In addition, analysis of Table 1 shows that increasing the dimensionality of the feature space leads both to a slight decrease in the performance of the model without perturbations and to a slight improvement in the median value of the integral metric of resilience. The corresponding interquartile range of the integral metric of resilience lies in the interval [0.01; 0.04].
Fig. 3 shows an example of model performance recovery curves for classes of the lower hierarchical level with the feature space dimension N=64 after adversarial attacks. The vertical axis corresponds to the value of the information criterion averaged over the set of classes, and the horizontal axis corresponds to the number of test iterations of the trained model on the dataset_base set. The first 50 iterations are run without adversarial attacks; at the 51st iteration, data sets with 4 different threshold values of the perturbation level are generated, i.e. threshold ∈ {1; 3; 5; 10}. Therefore, 4 performance recovery curves are displayed.
Table 2 shows the results of experimentally testing the model's resilience to adversarial L∞-attacks.

RESULTS
Analysis of Table 2 shows that if the adversarial perturbation level reaches 10, recovery within T_C mini-batches becomes impossible at the lower hierarchical level. Fig. 3 shows the performance recovery curves, where the curve corresponding to a perturbation level of 10 does not reach 95% performance recovery after 200 iterations. In addition, analysis of Table 2 shows that an increase in the dimensionality of the feature space leads to a slight decrease in the efficiency of the model on unperturbed data, but also to a noticeable improvement in the median value of the integral metric of resilience, with the corresponding interquartile range lying in the interval [0.01; 0.03]. Therefore, according to formula (4), a higher dimension N of the feature space is a better compromise than a lower one.
A comparison of the averaged information efficiency criterion and the integral metric of resilience for different hierarchical levels shows that the upper-level classifier is characterized by a lower level of uncertainty and exhibits a higher level of resilience to disturbances, which allows it to be used in graceful degradation mechanisms in case of adversarial attacks.
Fig. 4 shows the performance recovery curves for the worst-case new class and the worst-case pair of drifting classes in terms of the model's resilience to these perturbations. Analysis of Fig. 4 shows that in both cases T_C mini-batches (iterations) were sufficient for recovery. In comparison, learning from scratch required more than 100 times more mini-batches (taking into account 10 learning epochs and a mini-batch size of 128 samples). The worst-performing new class in terms of the integral metric of resilience was the "bird" class (R=0.88). The worst pair of drifting classes in terms of the integral metric of resilience was "truck" and "automobile", with R=0.95.
Thus, the ability of the proposed algorithm to restore performance after exposure to perturbations has been demonstrated experimentally. The described method of adaptation to adversarial attacks ensures absorption of disturbances of this type and amplitude and ensures performance recovery. The superior efficiency and resilience of the algorithm on classes of the higher hierarchical level were also confirmed; this forms the basis for implementing graceful degradation mechanisms.

DISCUSSION
The proposed classifier model has a multi-section structure designed to implement adaptive computation and to increase the generalization capability of the model through self-knowledge distillation. To identify the influence of the multi-section structure on resilience, the integral metric of resilience is compared for a model that uses the outputs of every section and a model that uses only the output of the last layer. The model with the feature space dimension N=64 is considered. Analysis of Table 3 shows that the median value of the integral metric of resilience for the model using the outputs of all sections is 5-6% higher than for the model with a single output on the last layer.
It is assumed that as the multi-section model architecture is trained, the computational efficiency of its inference improves by saving resources on simple examples without perturbations. Fig. 5 shows the dependence of the ratio of the average time spent in the adaptive mode, T_adap, to the time of inference across the entire network, T_full, on the fault_rate (Fig. 5a) and on the maximum amplitude of the adversarial L∞-attack (Fig. 5b). Analysis of Fig. 5 confirms the hypothesis that the average inference time increases as the amplitude of the adversarial attack and the frequency of faults increase, and vice versa. This can also be considered a mechanism of graceful degradation.
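The relation between early decisions and the ratio T_adap / T_full can be illustrated with a toy model of adaptive inference. Everything specific in the sketch is assumed for the example: four sections of equal cost, a confidence threshold of 0.9, the made-up per-section confidences, and the helper names `adaptive_inference` and `time_ratio`.

```python
def adaptive_inference(section_confidences, threshold):
    """Return the 1-based index of the section at which a decision is made.

    section_confidences: confidence of the prediction at the output of each
    section, ordered from the lowest section to the last one.
    """
    for index, confidence in enumerate(section_confidences, start=1):
        if confidence >= threshold:
            return index                  # early exit: later sections are skipped
    return len(section_confidences)       # no early exit: the last section decides

def time_ratio(exit_sections, section_costs):
    """Average T_adap / T_full over a batch, given per-section compute costs."""
    full = sum(section_costs)
    adaptive = [sum(section_costs[:k]) for k in exit_sections]
    return sum(adaptive) / (len(adaptive) * full)

# Hypothetical confidences for four inputs; the last one is a "hard" example.
batch = [
    [0.95, 0.99, 0.99, 0.99],
    [0.50, 0.92, 0.97, 0.99],
    [0.40, 0.91, 0.95, 0.99],
    [0.20, 0.30, 0.50, 0.70],
]
exits = [adaptive_inference(c, threshold=0.9) for c in batch]
ratio = time_ratio(exits, section_costs=[1.0, 1.0, 1.0, 1.0])
```

In this toy batch the exits are at sections 1, 2, 2 and 4, so the ratio is 9/16 = 0.5625. Perturbations that lower the confidence at the early sections push the exits toward the last section and the ratio toward 1.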

CONCLUSIONS
The scientific novelty of the obtained results lies in the new model architecture and learning algorithm of a multilayer classifier that is resilient to fault injection, adversarial attacks and concept drift.
The model with the proposed architecture has a multi-section structure. At the output of each section, a hierarchy of optimized prototypes and radii of hyperspherical separation boundaries (containers) of classes is built, which ensures absorption of part of the disturbances and graceful degradation.
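A decision rule based on prototypes and hyperspherical containers can be sketched as below. This is an assumption-laden illustration, not the rule defined by the formulas of the paper: it uses the squared Euclidean distance (which equals the Hamming distance for binary features), expresses radii in the same units, and returns -1 when a feature vector lies outside every container. The name `classify_with_containers` is hypothetical.

```python
import numpy as np

def classify_with_containers(features, prototypes, radii):
    """Return the index of the nearest prototype whose container holds the
    feature vector, or -1 if the vector falls outside every container."""
    distances = np.sum((prototypes - features) ** 2, axis=1)
    inside = distances <= radii
    if not inside.any():
        return -1
    # Among the containers that hold the vector, choose the closest prototype.
    return int(np.argmin(np.where(inside, distances, np.inf)))

# Two hypothetical classes in a 4-dimensional binary feature space.
prototypes = np.array([[0, 0, 0, 0], [1, 1, 1, 1]], dtype=np.float32)
radii = np.array([1.0, 1.0], dtype=np.float32)

near_first = classify_with_containers(
    np.array([0, 0, 0, 1], dtype=np.float32), prototypes, radii)
ambiguous = classify_with_containers(
    np.array([0, 0, 1, 1], dtype=np.float32), prototypes, radii)
```

The first vector differs from prototype 0 in one coordinate and is assigned to class 0; the second is equally far from both prototypes and outside both containers, so it is rejected. A rejection at the lower level is the kind of case in which a coarser upper-level decision could still be returned.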
A new learning algorithm is proposed that combines the principles of self-knowledge distillation, maximization of class distribution compactness and of the interclass buffer zone, discretization of the feature representation, and consistency regularization. Self-knowledge distillation is aimed at improving inference efficiency through adaptive computation and the mechanism of graceful degradation. Consistency regularization is applied both at the level of the classification output and at the level of features and is used to increase robustness and the speed of adaptation to destructive perturbations through the effective use of unlabeled data. The main component of the loss function is the information criterion of the classifier's effectiveness, expressed as a functional of smoothed estimates of the probabilities of errors of the first and second kind and of true positive and true negative decisions.
During testing of the proposed algorithm on the Cifar10 dataset, it was found that if the proportion of damaged tensors reaches 60%, recovery cannot be ensured within the control interval of T_C = 200 mini-batches at either the upper or the lower level of the class hierarchy. Similarly, at an adversarial L∞-attack perturbation level of 10, the model fails to recover within T_C mini-batches at the lower level of the class hierarchy, whereas at the upper level it recovers 95% of the performance obtained on unperturbed samples. In addition, it was observed that increasing the dimensionality of the feature space leads to a noticeable improvement in the median value of the integral metric of resilience. At the same time, the interquartile range of the integral metric of resilience lies in the interval [0.01; 0.03].
A comparison of the averaged information efficiency criterion and the integral metric of resilience for different class hierarchy levels shows that the upper level of class hierarchy is characterized by a lower level of uncertainty and exhibits a higher level of resilience to disturbances, which allows it to be used in graceful degradation mechanisms under the influence of adversarial attacks.
The median value of the integral metric of resilience of the model that uses the outputs of all sections is 5-6% higher than that of the model with a single output on the last layer. The multi-section structure of the model saves up to 40% of inference time on the test dataset, but under perturbations the processing slows down somewhat.
The proposed learning algorithm provides adaptation to the appearance of a new class and to real concept drift between a pair of classes within T_C = 200 iterations with a mini-batch size of 128 examples. When each class of the Cifar10 dataset is treated in turn as a new class, the worst class in terms of the integral metric of resilience is "bird", for which R=0.88 was reached. The worst pair of drifting classes in terms of the integral metric of resilience is "truck" and "automobile", for which R=0.95 was reached.
The practical significance of the results is the formation of a new methodological basis for developing classification analysis algorithms that are resilient to adversarial attacks, fault injection and concept drift.
The prospects for further research are the development of criteria, models, and methods for measuring and certifying the resilience of image classification analysis models.

ACKNOWLEDGMENTS
The research was conducted in the Intellectual Systems Laboratory of the Computer Science Department at Sumy State University with the financial support of the Ministry of Education and Science of Ukraine within the framework of the state budget scientific and research work DR No. 0122U000782 "Information technology for providing resilience of artificial intelligence systems to protect cyber-physical systems".
Contribution of authors: development of conceptual provisions and methodology of research, development of mathematical model and training algorithm, analysis of research results - V. V. Moskalenko