AN ATTEMPT FOR 2-LAYER PERCEPTRON HIGH PERFORMANCE IN CLASSIFYING SHIFTED MONOCHROME 60-BY-80-IMAGES VIA TRAINING WITH PIXEL-DISTORTED SHIFTED IMAGES ON THE PATTERN OF 26 ALPHABET LETTERS

The object classification problem is considered, to which both the neocognitron and the multilayer perceptron may be applied. As the neocognitron, though capable of solving almost any classification problem, performs too slowly and expensively, the 2-layer perceptron is attempted for recognizing shifted monochrome images, even though it is fast only on pixel-distorted monochrome images. Assuming an original set of 26 monochrome 60-by-80-images of the English alphabet letters, the task is formulated to clear out whether the 2-layer perceptron is capable of ensuring high performance in classifying shifted monochrome images. It is disclosed that the 2-layer perceptron performs as a good classifier of shifted monochrome images when, in training, its input is fed with training samples of shifted images which are additionally pixel-distorted. This may require more passes of the training samples through the 2-layer perceptron, but the total training time is nevertheless shorter than for training the 2-layer perceptron with pixel-distortion-free shifted monochrome images only.


PROBLEM OF SHIFT-TURN-SCALE PERCEPTRON RECOGNITION
Object recognition is an up-to-date technical problem with many aspects in its formalization and solution. The mathematical principle of recognizing objects lies in clustering and classifying them. A neural network, being a universal approximator, is the finest model for clusterization and classification [1, 2]. It needs neither architecture creation nor training algorithm development: only the appropriate architecture must be selected among the available ones [1, 3], and the corresponding training algorithm enabled [4, 5]. There nonetheless stands an important question of ensuring high productivity along with low resource consumption and short response delay. The highest productivity is ensured by the neocognitron, which is the smartest neural network, though it performs slowly [6, 7]. It also takes too much memory and data space for clusterization and classification. The multilayer perceptron consumes memory far less significantly and works much faster, but its productivity is high only for objects, possibly noised, which are not shifted, turned (skewed) or scaled against the training objects sample [8]. Certainly, this is an unreal situation in the world of real events and processes. If the object under classification is shifted, turned or scaled against its original in the training sample, the perceptron cannot recognize it and classifies this object erroneously. Therefore the problem of shift-turn-scale (STS) object recognition may be solved either by making the neocognitron perform easier and faster or by making the perceptron recognize objects with STS properties better.

WAY OF INVESTIGATION
Clearly, both the said tasks, constructing a fast neocognitron and training a perceptron STS-classifier, are tough to implement [4, 6, 7]. Besides, the neocognitron is too huge a model to try optimizing it in speed and resource consumption. So, a tenable way is to try preparing multilayer perceptrons for classifying objects with just one of the three STS properties: shift, turn or scale. The shift property is the easiest to program, while turn or scale is tougher to model [6]. The simplest objects are plane objects like monochrome images. However, even for shifted monochrome images (SMI) multilayer perceptrons perform poorly, although they perform well [4, 8] for pixel-noised monochrome images (PNMI). This outlines the way to investigate the possibility of increasing multilayer perceptron performance in classifying SMI, whether they are pixel-noised or not.

TASK FORMULATION
The original set of 26 monochrome 60-by-80-images, being the 26 letters of the English alphabet, is considered. An alphabet letter fed to the classifier input may be shifted, as usually occurs while scanning and retrieving text information. The classifier must recognize it at high performance. Cases where letters are pixel-noised are not excluded, but the main case is that input objects are SMI. The extent of the shift is featured with a shift constant. The task is to clear out whether the 2-layer perceptron (2LP) is capable of ensuring high performance for SMI. For that, the models of PNMI and shift-noised monochrome images (SNMI) must be formalized, whereupon the 2LP is trained to become the classifier.

MODEL OF PNMI
It is known that 2LP is trained with blocks of feature-vectorized objects, so the $q$-th image as the matrix $\mathbf{X}^{(q)} = \bigl[ x_{ij}^{(q)} \bigr] \in \{0, 1\}^{80 \times 60}$ is reshaped into a 4800-length column before processing it. Let $\mathbf{A} \in \{0, 1\}^{4800 \times 26}$ be the matrix of all 26 original monochrome 60-by-80-images, reshaped into 26 columns. Then the model of PNMI is just the matrix

$\tilde{\mathbf{A}}_f = \mathbf{A} + \sigma_{\text{pixel}}^{\max} \cdot \frac{f}{F} \cdot \mathbf{\Xi}_f \quad (1)$

with the standard deviation ramped up to its maximum $\sigma_{\text{pixel}}^{\max} > 0$ over the $4800 \times 26$-matrix $\mathbf{\Xi}_f$ of values of the normal variate with zero expectation and unit variance, where the number $F$ in the set

$\bigl\{ \tilde{\mathbf{A}}_f \bigr\}_{f = 1}^{F} \quad (2)$

indicates the smoothness in training the perceptron [9]. While being trained, the input of 2LP is fed with the set

$\mathbf{P} = \bigl\langle \underbrace{\mathbf{A}, \ldots, \mathbf{A}}_{C}, \tilde{\mathbf{A}}_1, \ldots, \tilde{\mathbf{A}}_F \bigr\rangle \quad (3)$

of the original images and the pixel-distorted images, accompanied by the set of identifiers (targets)

$\mathbf{T} = \bigl\langle \underbrace{\mathbf{I}, \ldots, \mathbf{I}}_{C + F} \bigr\rangle \quad (4)$

with the identity $26 \times 26$-matrix $\mathbf{I}$, where the number $C$ indicates how many replicas of the undistorted images should be recognized in the training process. The set (3), being formed by (1) and (2), is passed through 2LP with the identifiers (4) $Q_{\text{pass}}$ times. One of the fastest program implementations of 2LP can be built within MATLAB using the training function «traingda». With the hidden layer size of 2LP set to 250 neurons, the results of classifying SMI, when 2LP is trained with PNMI, appear quite unacceptable (figure 1). These results are obtained in the routine of the batch testing of 2LP. The results of the letter-by-letter testing of 2LP at some fixed standard deviation in (1) disclose the trend in the distribution of the recognition errors percentage over the letters (figure 2). The trend shows that some letters are recognized better than others. For instance, letters «I» and «J» are more recognizable, where letter «I» is classified wrongly only in roughly every fourth case. But the averaged recognition errors percentage nonetheless remains inadmissibly high.
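For illustration, forming the set (3) with the identifiers (4) and training 2LP by «traingda» may be sketched in MATLAB as follows. This is a minimal sketch, not the exact program of this investigation: the values of $F$, $C$ and $\sigma_{\text{pixel}}^{\max}$ are illustrative, and the matrix $\mathbf{A}$ is mocked with random zeros and ones instead of the real letter images.

    % A minimal sketch, assuming illustrative values and a mocked matrix A.
    A = double(rand(4800, 26) > 0.5);       % stand-in for the 26 reshaped letter images
    F = 8;  C = 4;  sigma_pixel_max = 0.3;  % illustrative, not the paper's values
    I26 = eye(26);                          % identity matrix of identifiers
    P = repmat(A, 1, C);                    % C replicas of the undistorted images
    T = repmat(I26, 1, C + F);              % identifiers (4) for all blocks of (3)
    for f = 1 : F
        Xi = randn(4800, 26);               % normal variate, zero mean, unit variance
        P = [P, A + sigma_pixel_max * (f / F) * Xi];  % pixel-distorted block by (1)
    end
    net = feedforwardnet(250, 'traingda');  % 2LP with 250 hidden layer neurons
    net = train(net, P, T);                 % one pass; repeated Q_pass times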

MODEL OF SNMI
Just like the model of PNMI (1), the model of SNMI consists in adding noise; but whereas (1) adds the normal noise to the matrix of all 26 images at once, for any one of the three STS-properties every image should be processed separately. In the model of SNMI each image is shifted horizontally and vertically for some number of pixels. Thus the shift constant consists of two components, horizontal and vertical, though the same standard deviation with its maximum $\sigma_{\text{shift}}^{\max} > 0$ may be used at the $k$-th part of forming the set that feeds the input of 2LP. As 60-by-80-images are considered, the horizontal pixel shift (HPS) along the 60-pixel width is

$d_{\text{h}}^{(k)} = \bigl\lfloor \sigma_{\text{shift}}^{\max} \cdot \nu_{\text{h}}^{(k)} \bigr\rfloor, \quad (6)$

where $\nu_{\text{h}}^{(k)}$ is a value of the normal variate with zero expectation and unit variance, raffled at the $k$-th stage for HPS, and the function $\lfloor x \rfloor$ rounds $x$ to the nearest integer less than or equal to $x$. Concurrently, the vertical pixel shift (VPS) along the 80-pixel height is

$d_{\text{v}}^{(k)} = \bigl\lfloor \sigma_{\text{shift}}^{\max} \cdot \nu_{\text{v}}^{(k)} \bigr\rfloor, \quad (7)$

where $\nu_{\text{v}}^{(k)}$ is a value of the normal variate with zero expectation and unit variance, raffled at the $k$-th stage for VPS.
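In MATLAB, raffling the shifts (6) and (7) at the $k$-th stage is a one-liner each. The sketch below follows the formulas exactly as reconstructed above; any additional scaling the source may apply to tie the shift magnitude to the image sides is not reproduced here and would multiply the argument of the floor.

    % A minimal sketch of raffling HPS (6) and VPS (7) at the k-th stage.
    sigma_shift_max = 1;                    % illustrative deviation maximum
    hps = floor(sigma_shift_max * randn);   % horizontal pixel shift (6)
    vps = floor(sigma_shift_max * randn);   % vertical pixel shift (7)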
It is necessary to mind that the image background is white, whereas in MATLAB the white color is coded with ones. So, the contour and filling of the letters, being black, are coded with zeros. By the way, the filling is not continuous (figure 3), and the black cast of a letter is sprinkled with white specks. Hence, adding the horizontal shift noise to the $q$-th image as the matrix $\mathbf{X}^{(q)}$ changes its elements into the following. For $d_{\text{h}}^{(k)} \geqslant 0$

$\tilde{x}_{ij}^{(q)} = x_{i,\, j - d_{\text{h}}^{(k)}}^{(q)} \;\; \text{for} \;\; j = d_{\text{h}}^{(k)} + 1, \ldots, 60 \quad \text{and} \quad \tilde{x}_{ij}^{(q)} = 1 \;\; \text{for} \;\; j = 1, \ldots, d_{\text{h}}^{(k)}.$

For $d_{\text{h}}^{(k)} < 0$

$\tilde{x}_{ij}^{(q)} = x_{i,\, j - d_{\text{h}}^{(k)}}^{(q)} \;\; \text{for} \;\; j = 1, \ldots, 60 + d_{\text{h}}^{(k)} \quad \text{and} \quad \tilde{x}_{ij}^{(q)} = 1 \;\; \text{for} \;\; j = 60 + d_{\text{h}}^{(k)} + 1, \ldots, 60.$

After the horizontal shift is accomplished, adding the vertical shift noise to the horizontally shifted $q$-th image as the matrix $\tilde{\mathbf{X}}^{(q)} = \bigl[ \tilde{x}_{ij}^{(q)} \bigr]$ changes its elements likewise along the rows $i = 1, \ldots, 80$: for $d_{\text{v}}^{(k)} \geqslant 0$ the content moves down by $d_{\text{v}}^{(k)}$ rows and the vacated rows are filled with ones, and for $d_{\text{v}}^{(k)} < 0$ it moves up correspondingly; if $d_{\text{v}}^{(k)} = 0$, the $q$-th horizontally shifted image is not shifted vertically. After all 26 images have become SNMI, each $q$-th doubly shifted image is reshaped back into the $q$-th column of the matrix that feeds the input of 2LP, which passes through 2LP with the identifiers (4) $Q_{\text{pass}}$ times, as in the PNMI case. The two shift stages amount to copying the image content onto a white canvas, as sketched below.
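A minimal MATLAB helper for both shift stages is given here. It is an illustration, not the paper's code: the name shift_image and the 80-by-60 matrix orientation are assumptions, and the helper should be saved as shift_image.m or placed at the end of a script.

    function Y = shift_image(X, hps, vps)
    % Shifts the monochrome image matrix X by hps columns and vps rows,
    % filling the vacated area with ones (white background).
        [m, n] = size(X);
        Y = ones(m, n);                               % white canvas
        rows = max(1, 1 + vps) : min(m, m + vps);     % destination rows
        cols = max(1, 1 + hps) : min(n, n + hps);     % destination columns
        Y(rows, cols) = X(rows - vps, cols - hps);    % copy the shifted content
    end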
The results of the batch testing of the SNMI-trained 2LP (figure 4) are much better than those derived from 1000 batch testings of the PNMI-trained 2LP in figure 1. But the training process for SNMI runs very lingeringly. Besides, this process may frequently be non-convergent: some performance goals are not met, or the minimum gradient is reached just after the first pass (in this case 2LP cannot be called trained with SNMI). The results of the letter-by-letter testing of the SNMI-trained 2LP at some fixed standard deviation for (6) and (7) disclose a peculiar trend in the distribution of the recognition errors percentage over the letters (figure 5), where letters «I» and «L» are the most recognizable, whereas letters «G», «K», «O», «R», «V» are classified wrongly in roughly every second case. The averaged recognition errors percentage, being lower than for the PNMI-trained 2LP, nonetheless remains high.
Having analyzed the performance of the lingeringly trained SNMI-2LP, there is a proposition to shorten the training process by modifying the type of noise. It is verisimilar that adding some pixel noise, while neither increasing nor lowering the shift intensity, may relatively accelerate the training process of 2LP. It may also decrease the recognition errors percentage. So, the following model is for making pixel-shift-noised monochrome images (PSNMI) to feed the input of 2LP, as neither the PNMI-trained 2LP nor the SNMI-trained 2LP is a good classifier of SMI.
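Under the model of PSNMI, every image is first shifted by (6) and (7) and then pixel-distorted. A minimal MATLAB sketch of forming one PSNMI block is given below, reusing the matrix A and the helper shift_image from the sketches above; the ratio between $\sigma_{\text{pixel}}^{\max}$ and $\sigma_{\text{shift}}^{\max}$ is an illustrative assumption.

    % A minimal sketch of one PSNMI block, reusing A and shift_image from above.
    sigma_shift_max = 1;
    sigma_pixel_max = 0.5 * sigma_shift_max;     % loosened pixel distortion (assumption)
    S = zeros(4800, 26);
    for q = 1 : 26
        X = reshape(A(:, q), 80, 60);            % the q-th original image
        X = shift_image(X, floor(sigma_shift_max * randn), ...
                           floor(sigma_shift_max * randn));
        X = X + sigma_pixel_max * randn(80, 60); % pixel noise atop the shift
        S(:, q) = reshape(X, 4800, 1);           % back into the q-th column
    end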

Fig. 2. Distributions of the recognition errors percentage over the letters at the fixed standard deviation $\sigma_{\text{shift}}^{\max} = 1$ for $\sigma_{\text{pixel}}^{\max} = 0$ (letter-by-letter testing of the PNMI-trained 2LP)

Fig. 3. A fragment of a letter image: the black cast of the letter is sprinkled with white specks

Fig. 5. Distributions of the recognition errors percentage over the letters at the fixed standard deviation $\sigma_{\text{shift}}^{\max} = 1$ for $\sigma_{\text{pixel}}^{\max} = 0$ (letter-by-letter testing of the SNMI-trained 2LP)

The set of $C$ replicas of the undistorted images and the pixel-shift-distorted images, accompanied by the set of identifiers (4), is passed through 2LP again. The training process under the relationship $\sigma_{\text{pixel}}^{\max} = \sigma_{\text{shift}}^{\max}$ is pretty hard: many passes are needed, lasting for great numbers of epochs; the weak convergence is very likely, and the hang-up is observed after the first 20-30 passes are completed. That means that for covering the bad shift noise (shift noise of high intensity) the standard deviations $\sigma_{\text{shift}}^{\max}$ and $\sigma_{\text{pixel}}^{\max}$ must not be equal. Truly, here the pixel distortion should be either strengthened or loosened. Unfortunately, under the strengthened ratio the training process has the same weak convergence, and its hopeless hang-up is observed after the first 20-30 passes are completed. With the pixel distortion loosened, however, the PSNMI-trained 2LP produces yet higher performance than the SNMI-trained 2LP (figure 6) over the investigated standard deviation range, and the letter-by-letter testing of the PSNMI-trained 2LP in classifying SMI at the highest noise intensity certifies it (figure 7). Clearly, the greater $Q_{\text{pass}}$ is, the longer the training process lingers, up to nearly as long as for the SNMI-trained 2LP and even longer. However, for the PSNMI-trained 2LP the performance goals are met, whereas 2LP trained with SNMI leaves many performance goals unmet (figures 6 and 7). Upon the whole, the 2LP trained with PSNMI by $Q_{\text{pass}} = 234$ is a good classifier of SMI, especially when the shift intensity is defined within the standard deviation range $(0; 0.4]$, where the averaged recognition errors percentage does not exceed 3.5 %. This 2LP performs well in classifying shifted monochrome 60-by-80-images even at the maximal intensity of the shift noise, when HPS and VPS are about 10-20 pixels and the recognition errors percentage is $p \approx 15$.
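For completeness, the batch testing routine standing behind results like those in figures 6 and 7 may be sketched as follows. The batch count, the testing shift deviation and the reuse of the names net, A and shift_image from the sketches above are assumptions of this illustration.

    % A minimal sketch of batch testing the trained 2LP over SMI.
    batches = 1000;  errors = 0;                 % illustrative batch count
    for b = 1 : batches
        for q = 1 : 26
            X = reshape(A(:, q), 80, 60);        % the q-th original letter
            X = shift_image(X, floor(sigma_shift_max * randn), ...
                               floor(sigma_shift_max * randn));
            y = net(reshape(X, 4800, 1));        % the trained 2LP response
            [~, q_hat] = max(y);                 % the winning class
            errors = errors + (q_hat ~= q);      % count misclassifications
        end
    end
    p = 100 * errors / (26 * batches);           % recognition errors percentage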