PARALLEL MULTIAGENT METHOD OF BIG DATA REDUCTION FOR PATTERN RECOGNITION

1 PhD, Associate Professor of Department of Software Tools, Zaporizhzhia National Technical University, Zaporizhzhia, Ukraine
2 PhD, Associate Professor of Computer Systems and Networks Department, Zaporizhzhia National Technical University, Zaporizhzhia, Ukraine
3 PhD, Associate Professor of Computer Systems and Networks Department, Zaporizhzhia National Technical University, Zaporizhzhia, Ukraine
4 Postgraduate Student of Department of Software Tools, Zaporizhzhia National Technical University, Zaporizhzhia, Ukraine


INTRODUCTION
The development of automated pattern recognition systems is associated with the need to process big data [1][2][3][4][5][6]. Typically, the original data samples describing the objects or processes under investigation may contain redundant and uninformative information [7][8][9][10][11][12][13]. Using such information in the synthesis of recognition models increases their complexity and redundancy and reduces their generalizing ability. Therefore, before recognition models are synthesized, the data should be preprocessed to exclude uninformative and redundant features from the training samples [3,5,8].
Well-known feature selection methods typically use a greedy search strategy [6,8], which often fails to find the most informative combination of features. Stochastic methods [6,12] are highly iterative and require substantial computing and time resources, which makes them difficult to use in practice. A multiagent feature selection method that does not use a greedy search strategy and is based on modeling the movement of agents in the search space through stochastic computations is proposed in [14]. However, this method is not adequate for big data sets because its calculations are highly iterative and sequential.
It is therefore appropriate to parallelize the most computationally complex and resource-consuming operations of the multi-agent feature selection method [14], which will lower the practical threshold for applying the method to big data processing.
The purpose of the work is to create a parallel multiagent method for reducing big data sets.

PROBLEM STATEMENT
Let there be a set of observations $S$ (1):

$$S = \langle P, T \rangle. \qquad (1)$$

Then the problem of informative feature selection in an idealized formulation [8][15][16][17] can be represented as follows: find a combination of features $X^*$ from the original data set $S = \langle P, T \rangle$ at which the minimum of the given criterion for assessing the quality of a feature set is achieved (2):

$$J(X^*) = \min_{X_e} J(X_e). \qquad (2)$$

The error of the synthesized model is normally used as the criterion $J(X_e)$ for assessing the significance of a feature set [4][5][6]:
- the recognition error (in problems with discrete output $T$) [4][5][6], calculated according to formula (3).

In calculating the criterion values (3) and (4), the model is synthesized from the data of the training sample $S$, using only the features that correspond to the combination $X_e$.

LITERATURE REVIEW
Currently, there are various methods for reducing the dimension of the feature space [3][4][5][6][15][16][17]: the brute-force method, depth-first search, breadth-first search, branch and bound, the group method of data handling, sequential feature addition, sequential feature removal, alternating feature addition and removal, feature ranking, feature clustering, random search with adaptation, and evolutionary search for feature selection [15][16][17].
Recognition problems usually involve a system of statistically dependent features, and the informativeness of the whole set cannot be expressed through the informativeness of the individual features. Therefore, in practical problems it is necessary to evaluate a set of features as a whole rather than each feature separately. Thus, ranking features by individually computed estimates is unacceptable [6].
The brute-force method requires evaluating all possible combinations of the original features, which makes it unusable when the source set contains a large number of features because of the huge computational cost.
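To give a sense of scale, the short sketch below counts the feature combinations that an exhaustive search would have to evaluate. The figures of 26 features and at most 12 selected features are borrowed from the experimental setup described later in the paper; the function name is illustrative.

```python
from math import comb


def count_feature_subsets(n_features, max_size=None):
    """Number of non-empty feature subsets with at most max_size features."""
    if max_size is None:
        max_size = n_features
    return sum(comb(n_features, k) for k in range(1, max_size + 1))


if __name__ == "__main__":
    # All non-empty subsets of 26 features.
    print(count_feature_subsets(26))
    # Subsets limited to at most 12 features.
    print(count_feature_subsets(26, 12))
```

Each candidate requires synthesizing and testing a recognition model, so even a moderate number of features leads to tens of millions of model evaluations.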
Heuristic search methods [15,17] are not effective enough because the greedy search strategy, which sequentially adds or removes one feature at each iteration, is suboptimal; as a result, the selected set contains redundant features that correlate with other features in the set.
It seems expedient to use stochastic search methods (in particular, the multi-agent approach) [13,14,18] to find a combination of informative features under conditions of their mutual dependence, since such methods are better suited to finding new solutions by combining the best solutions obtained at different iterations and are able to escape from local optima.
The multi-agent method with indirect communication between agents proposed in [14] makes it possible to select the most significant features. However, the method takes considerable time when processing big data because it is highly iterative and performs its calculations sequentially. To address these drawbacks, it is advisable to parallelize the multi-agent search for the most significant combination of features when processing big data sets.

MATERIALS AND METHODS
In the developed parallel multi-agent method for big data reduction, it is proposed to split the set of agents into several subsets that search in parallel for an informative combination of features in different areas of the search space. To speed up the multi-agent search, the most resource-intensive operations, which are related to evaluating the current set of agents and to creating and modifying new sets of solutions by stochastic computations, are performed on the nodes of a parallel computing system.
In the proposed parallel multi-agent method for big data reduction, after the initialization phase the set of agents $A$ is split into $N_{pr}$ subsets $A^{(j)}$ (5). The total number of agents is given by (6). The partitioning is done so as to separate the groups of agents $A^{(j)}$ over the search space with a view to a more detailed investigation of its various areas. To select groups of compactly located agents $A^{(j)}$, it is proposed to apply cluster analysis methods [6,9,16].
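The partitioning step can be illustrated with a minimal sketch. It assumes k-means as the cluster analysis method, which is only one possible choice since the text does not name a specific algorithm; the function names and parameters are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_agents(agents, n_groups, n_iter=20, seed=0):
    """Split agent coordinates into n_groups compact groups with k-means.

    k-means is an assumption: the text only calls for cluster analysis.
    Returns a list of n_groups lists holding agent indices.
    """
    rng = random.Random(seed)
    centers = [list(a) for a in rng.sample(agents, n_groups)]
    dim = len(agents[0])
    groups = [[] for _ in range(n_groups)]
    for _ in range(n_iter):
        # Assignment step: each agent joins the group of its nearest center.
        groups = [[] for _ in range(n_groups)]
        for idx, agent in enumerate(agents):
            nearest = min(range(n_groups),
                          key=lambda j: squared_distance(agent, centers[j]))
            groups[nearest].append(idx)
        # Update step: move each center to the mean of its members.
        for j, members in enumerate(groups):
            if members:  # keep the old center if a group becomes empty
                centers[j] = [sum(agents[i][d] for i in members) / len(members)
                              for d in range(dim)]
    return groups
```

Groups produced this way may differ in size, so a practical implementation would probably also balance the load between nodes.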
After that, in each of the resulting subsets of agents $A^{(j)}$, it is proposed to conduct a multi-agent search to select informative features from the given data sample $S = \langle P, T \rangle$, periodically checking the stopping criteria and, if necessary, combining the subsets $A^{(j)}$. To increase the efficiency of the multi-agent search for a combination of informative features (that is, to reduce the search time), it is expedient to parallelize the most resource-intensive operations.
As noted above, the multi-agent feature selection method with indirect communication between agents involves initialization, chemotaxis modeling, reproduction, elimination and dispersion, checking of the stopping criteria, and restarting of agents.
In the initialization phase, the main parameters of the method are defined and the initial coordinates of the agents $\chi_k$ are randomly generated, after which the values of the objective function are calculated. This stage does not involve complex iterative computational procedures, so it is proposed to perform it on the main process. Note that with a large number of agents $N_\chi$, as well as when processing complex samples $S = \langle P, T \rangle$, the evaluation of the initial agent positions $\chi_k$ (calculating the values of the objective function) can be parallelized. The chemotaxis modeling stage involves the iterative application of the tumbling, moving and sliding operators to each agent, which requires calculating new values of the agent coordinates. The elimination and dispersion stage also requires processing the entire current set of agents $\chi_1, \chi_2, \dots, \chi_{N_\chi}$ in order to randomly change the coordinates of some agents so that they can leave local extrema.
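The chemotaxis stage can be sketched as follows. This is a simplified illustration that assumes common bacterial foraging conventions (a random unit direction for tumbling, a fixed step size, coordinates kept in [0, 1]) and a greedy acceptance rule; the step size, the slide limit and the threshold decoding of coordinates into a feature mask are hypothetical choices, not details taken from the method itself.

```python
import math
import random


def chemotaxis_step(position, objective, step_size=0.1, max_slides=4, rng=random):
    """One chemotaxis step for a single agent (simplified sketch).

    tumble: draw a random unit direction;
    move:   try one step along it;
    slide:  keep stepping in the same direction while the objective improves.
    A step is kept only if it lowers the objective (assumed greedy rule).
    Returns the new position and its objective value (lower is better).
    """
    best_value = objective(position)
    # Tumble: random direction normalized to unit length.
    direction = [rng.uniform(-1.0, 1.0) for _ in position]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    direction = [d / norm for d in direction]
    current = list(position)
    for _ in range(1 + max_slides):  # one move plus up to max_slides slides
        candidate = [min(1.0, max(0.0, x + step_size * d))
                     for x, d in zip(current, direction)]
        value = objective(candidate)
        if value >= best_value:
            break  # no improvement: stop sliding
        current, best_value = candidate, value
    return current, best_value


def to_feature_mask(position, threshold=0.5):
    """Hypothetical decoding: a feature is selected if its coordinate exceeds threshold."""
    return [x > threshold for x in position]
```

Because every agent runs the same sequence of operations on its own coordinates, steps of this kind can be executed independently for different agents.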
Therefore, to implement the multi-agent optimization, it is suggested that the nodes of the parallel computing system perform the stages of reproduction, elimination and dispersion, as well as checking of the termination criteria. These stages involve creating and modifying new sets of solutions by stochastic computations and, together with the chemotaxis modeling stage, make it possible to search for a combination of informative features in each of the $N_{pr}$ subsets of agents $A^{(j)}$. Based on the above, the model of the multi-agent search for a combination of informative features is presented in Fig. 1.
One iteration of the multi-agent search for an informative combination of features among the set of agents $A^{(j)}$ on the j-th node of the computer system is shown schematically in Fig. 2. As can be seen from Fig. 1, the computationally complex operations chosen for parallelization are chemotaxis modeling, reproduction, and elimination and dispersion. These operations were also chosen because they iteratively process information that is easily amenable to parallelization and is associated with the current set of agents (the coordinates of the points $\chi_k$ in the search space and the corresponding values of the objective function). In addition, the nodes of the parallel computer system also check the stopping criteria in order to terminate the multi-agent optimization once an optimal result is obtained (a set of features that is acceptable for solving the current practical task). The initialization and agent restart stages are performed sequentially.
After $N_{it}$ iterations of the multi-agent search on the nodes of the parallel computer system, the information about the current sets of agents $A^{(j)}$ and the values of their cost functions $G^{(j)}$ is transmitted to the main process. As a result, the sets $A$ (7) and $G$ (8) are formed. Then the agents are restarted on the main process. Unlike the known multi-agent method with indirect communication between agents, BFO [18], the new set of agents is formed in accordance with expressions (9) and (10); it includes the agents $A_{rnd}$ as well as agents obtained by applying the reproduction operator, $A_{cross}$. This approach makes it possible not only to keep the best solutions found so far, but also to escape from local optima and to explore different areas of the search space.
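The restart on the main process can be sketched as below. The sketch assumes that the new set combines the best agents found so far with randomly generated agents (taken here as the meaning of $A_{rnd}$) and agents produced by a crossover-style reproduction operator ($A_{cross}$). The exact form of expressions (9) and (10) is not reproduced; the group sizes and the uniform crossover are hypothetical.

```python
import random


def restart_agents(subsets, costs, n_best, n_random, n_cross, rng=random):
    """Form a new agent set on the main process after N_it iterations.

    subsets: list of agent lists A^(j) received from the nodes
    costs:   list of matching cost lists G^(j) (lower is better)
    The counts n_best (>= 1), n_random and n_cross are hypothetical parameters.
    """
    # Merge the per-node sets into the combined sets A and G.
    agents = [a for subset in subsets for a in subset]
    values = [g for cost in costs for g in cost]
    order = sorted(range(len(agents)), key=values.__getitem__)
    best = [list(agents[i]) for i in order[:n_best]]
    dim = len(agents[0])
    # A_rnd: fresh random agents that help leave local optima.
    randoms = [[rng.random() for _ in range(dim)] for _ in range(n_random)]
    # A_cross: uniform crossover of two of the best agents (an assumed operator).
    crossed = []
    for _ in range(n_cross):
        p1, p2 = rng.choice(best), rng.choice(best)
        crossed.append([x if rng.random() < 0.5 else y for x, y in zip(p1, p2)])
    return best + randoms + crossed
```

Keeping the best agents preserves the solutions already found, while the random and crossed agents spread the next round of search over new regions.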
The proposed parallel multiagent method of big data reduction splits the set of agents into several subsets that search in parallel for an informative combination of features in different areas of the search space. The nodes of the parallel computing system perform the most resource-intensive operations, which are related to evaluating the current set of agents and to creating and modifying new sets of solutions by stochastic computations. This speeds up the multi-agent search for an informative combination of features and lowers the practical threshold for using the multi-agent method with indirect communication between agents to reduce big data sets.
Moreover, the proposed method fits the SIMD (single instruction, multiple data) architecture well, because after the agent initialization stage the branches of the algorithm perform the same actions on different data, including the stochastic component, which is also data. Hence, a GPU implementation of the proposed method is likely to demonstrate a good speedup of the computational process.

EXPERIMENTS
The developed parallel multi-agent method for reducing big data sets has been implemented in the C language using the MPI and CUDA libraries. Data exchange between the main core, which performed the initialization and agent restart stages, and the other cores of the cluster nodes was performed with the collective communication functions of the MPI library (Bcast, Gather, Scatter, Reduce). The GPU cores used the global memory of the GPU, in which the agents were placed. Each GPU core executed a separate branch of the algorithm, and since a GPU contains many hundreds of cores, a significant number of feature combinations were evaluated in one cycle.
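The data flow between the main core and the worker cores can be illustrated with a small analogy. The implementation described above uses C with MPI and CUDA; the sketch below is not that code. It uses Python threads in place of MPI processes purely to show the scatter, local evaluation, gather and reduce pattern, and all names in it are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(items, n_parts):
    """Split items into n_parts nearly equal chunks (round-robin)."""
    return [items[j::n_parts] for j in range(n_parts)]


def evaluate_chunk(chunk, objective):
    """Work done on one node: evaluate the objective for each local agent."""
    return [objective(agent) for agent in chunk]


def parallel_evaluate(agents, objective, n_workers):
    """Scatter agents, evaluate them concurrently, then gather and reduce.

    Threads stand in for MPI processes here; this is only an analogy.
    Returns (all costs in scatter order, best cost).
    """
    chunks = scatter(agents, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        gathered = list(pool.map(lambda c: evaluate_chunk(c, objective), chunks))
    flat = [cost for part in gathered for cost in part]
    return flat, min(flat)  # min plays the role of a Reduce operation
```

In an MPI program the same roles would be played by the Scatter, Gather and Reduce collectives, with Bcast distributing the shared parameters and the training sample.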
To check the effectiveness of the proposed method, the developed software was used to solve a problem of selecting informative features for vehicle recognition [19,20]. The training sample contained information on 10,000 grayscale images obtained from highways. Using the recognition system [20], regions of interest containing a vehicle were identified in the images; these regions were manually classified by a human expert (the classes were: 0 - not recognized, 1 - motorcyclist, 2 - car, 3 - truck, 4 - bus, 5 - minivan or minibus) and mapped onto a 128×128 matrix (16,384 points). The resulting images were transformed by calculating 26 features that generalize the graphic information about the object [20]. For each vehicle class, a separate recognition model was synthesized that determines whether the vehicle being recognized belongs to that class. Thus, five training samples $S = \langle P, T \rangle$ of 10,000 instances were obtained, each instance being characterized by 26 features. The task was to reduce the number of features in the training samples by identifying the most informative combinations consisting of no more than 12 features. The tables below present the results of feature selection for the synthesis of recognition models that determine whether a vehicle belongs to class 2, "passenger car".
The following equipment was used in the experiments:
- the cluster of the Pukhov Institute for Modeling in Energy Engineering, NAS of Ukraine, of which 16 logical nodes (cores) were used, each running one process. The cluster configuration is as follows: Intel Xeon 5405 processors, 4×2 GB of DDR2 RAM per node, InfiniBand 20 Gb/s interconnect, Torque and OMPI middleware;
- an NVIDIA GTX 960 GPU with 1024 stream processors.
To compare the proposed method with existing analogues, feature combinations were selected using various methods: heuristic search, principal component analysis, the group method of data handling, the canonical model of genetic search, multi-agent methods for selecting informative features with direct and indirect communication between agents, and the developed parallel multi-agent method [14,18].
Since the optimization process in most of the listed methods is stochastic, the search for optimal solutions in the experiments was carried out 100 times, after which the average values of the investigated parameters were calculated from the results. The following criteria were used to assess the effectiveness of feature selection [14,18]:
- the number of objective function evaluations $N_{fit}$ needed to achieve the result with the required accuracy;
- the error of the method $E$, calculated by formula (3);
- the operating time of the method $T$ needed to achieve an acceptable solution;
- the number of features $k$ selected by the method as informative.
The objective function $E$ was the probability of making erroneous decisions (3) for a feedforward neural network containing 3 neurons in the first layer and one neuron in the second layer. The neural network was synthesized on the basis of the corresponding combination of features of the sample. The neurons had a logistic sigmoid activation function, and weighted sums were used as discriminant functions. The proposed method was run on 1, 2, 4, 8, 12 and 16 cluster nodes, and its execution time was recorded. In addition, the speedup of the computational process and the efficiency of the computer system were calculated, and the communication overheads (transfers and synchronizations) were analyzed.
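As a rough illustration of how such an objective could be evaluated for one feature combination, the sketch below runs a forward pass of a 3-1 network with logistic sigmoid neurons and weighted-sum discriminant functions. It assumes that the error is the fraction of misclassified instances, that each neuron has a bias term, and that a 0.5 threshold separates the two classes; training is omitted and the weights are supplied by the caller. All names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, output_weights):
    """3-1 feedforward pass; each weight vector ends with a bias term (assumed)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
              for ws in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights[:-1], hidden))
                   + output_weights[-1])


def recognition_error(samples, targets, mask, hidden_weights, output_weights):
    """Fraction of misclassified instances using only the masked-in features."""
    errors = 0
    for x, t in zip(samples, targets):
        selected = [v for v, keep in zip(x, mask) if keep]
        output = forward(selected, hidden_weights, output_weights)
        predicted = 1 if output >= 0.5 else 0
        errors += predicted != t
    return errors / len(samples)
```

Since the model has to be synthesized anew for each candidate combination, calls of this kind dominate the cost of the search, which is why the number of objective function evaluations is tracked as a separate criterion.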
Similar experiments were performed using the GPU. Table 1 presents the values of the criteria for the proposed and known methods of informative feature selection in vehicle recognition. Figures 3 and 4 show the dependence of the execution time of the proposed method in seconds ($T_{spent}$) on the number of cluster nodes involved (Fig. 3) and on the number of GPU threads (Fig. 4). Figures 5 and 6 show the speedup of the computational process, $Speedup_{pr}$, on the cluster and on the GPU, respectively.

RESULTS
The graph of the cluster efficiency $Efficiency_{pr}$ when performing the proposed method is shown in Figure 7.

DISCUSSION
The results in Table 1 confirm the advisability of parallelizing the proposed method PMMBDSR: with $N_{pr} = 12$ parallel processes, the operating time of the method is 1868 s, which is less than the time of the heuristic method MARF ($T = 2199$ s). The use of stochastic (evolutionary and multi-agent) methods made it possible to select combinations consisting of fewer features ($k = 10$ for CMES, MMICA and PMMBDSR) than the other methods ($k = 11$ and $k = 12$). This also indicates that stochastic methods explore the feature space more effectively.
Consequently, the obtained values of the criteria for assessing the effectiveness of informative feature selection methods ($E$, $N_{fit}$, $T$, $k$) indicate the advisability of applying the proposed method to practical recognition problems. Figures 3-6 show that the proposed method parallelizes well on a cluster and on a graphics processor. For instance, on 16 cluster nodes the speedup of the computational process was 6.98, which reduced the execution time of the method from 11461 s to 1643 s. On the GPU, the speedup relative to one cluster core ranged from 1.57 to 2.68, depending on the number of GPU threads involved. Figure 7 shows that the efficiency of the computer system performing the proposed method decreases significantly as the number of nodes involved increases. This is due to the growth of the overhead share (synchronizations and transfers) with each new node involved, which Figure 8 confirms. For example, with 4 cluster nodes the share of overhead in the overall computing process is 0.34, and with 16 cluster nodes it is 0.88. As a result, increasing the number of nodes fourfold (from 4 to 16) increases the speedup not linearly by a factor of 4, but only by a factor of about 2.38 (Figure 5). On the GPU, the overheads grow less significantly with the number of threads involved than on the cluster (Figure 9). However, the capability of the GPU in implementing the proposed method is limited by the frequencies of the stream processors and by the bandwidth of the data buses. As a result, the GPU achieved an execution time of the proposed method comparable to that of four cluster nodes (Figures 3, 4), which is an acceptable result.
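The speedup figure quoted for 16 nodes can be checked with a few lines of code. The sketch assumes the usual definitions, speedup as the ratio of single-node time to parallel time and efficiency as speedup divided by the number of nodes; the paper's own formulas for these quantities are not shown in the text.

```python
def speedup(t_one, t_parallel):
    """Ratio of single-node execution time to parallel execution time."""
    return t_one / t_parallel


def efficiency(t_one, t_parallel, n_nodes):
    """Speedup per node (assumed definition)."""
    return speedup(t_one, t_parallel) / n_nodes


if __name__ == "__main__":
    # Times reported for 1 and 16 cluster nodes.
    print(round(speedup(11461, 1643), 2))         # 6.98
    print(round(efficiency(11461, 1643, 16), 2))  # 0.44
```

Under these definitions, an efficiency well below 1 on 16 nodes is consistent with the growing share of synchronization and transfer overhead discussed above.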

Analysis of the graph of the runtime of the proposed method versus the number of agents per cluster core (Figure 10) shows that when the load is below 4000 agents per cluster core, the system is not fully loaded. At 4000 agents per core the computer system is used efficiently: it finds a solution sooner than with a smaller load.
When the number of agents per core rises above 4000, the time to find a solution increases because the share of overhead grows. The specific number of agents at which the system is used effectively (4000 in this case) depends on the capability of the particular equipment. However, the shape of the graph (as in Figure 10) should be preserved on other clusters.

CONCLUSIONS
The relevant task of automating the reduction of large data sets on the basis of the multi-agent approach has been solved. The scientific novelty lies in the fact that a parallel multiagent method of big data set reduction has been proposed. The developed method splits the set of agents into several subsets that search in parallel for an informative combination of features in different areas of the search space. The parallel nodes of the computing system perform the most resource-intensive operations, which are related to evaluating the current set of agents and to creating and modifying new sets of solutions by stochastic computations. This speeds up the multi-agent search for an informative combination of features and lowers the practical threshold for applying the multi-agent method with indirect communication between agents to the reduction of big amounts of data.
The practical value of the work lies in the fact that a software implementation of the proposed method for a CPU cluster and for a GPU has been developed. It makes it possible to perform feature selection on a parallel computer system in significantly less time than other feature selection methods, which are, as a rule, implemented sequentially.

ACKNOWLEDGMENTS
The work was performed as part of the research project "Methods and means of computational intelligence and parallel computing for processing large amounts of data in diagnostic systems" (state registration number 0116U007419) of the Department of Software Tools of Zaporizhzhia National Technical University.