ANALYSIS OF THE AUTOMATED SPEAKER RECOGNITION SYSTEM OF CRITICAL USE OPERATION RESULTS

O. V. Bisikalo; V. V. Kovtun; M. S. Yukhimchuk; I. F. Voytyuk

doi:10.15588/1607-3274-2018-4-7

Authors

O. V. Bisikalo, Vinnytsia National Technical University, Vinnytsia, Ukraine
V. V. Kovtun, Vinnytsia National Technical University, Vinnytsia, Ukraine
M. S. Yukhimchuk, Vinnytsia National Technical University, Vinnytsia, Ukraine
I. F. Voytyuk, Ternopil National Economic University, Ternopil, Ukraine

Keywords:

automated speaker recognition system of critical use, experiment planning theory, factor analysis, statistical learning theory.

Abstract

Context. The article generalizes statistical learning theory to evaluate the long-term operation results of an automated speaker recognition system of critical use (ASRSCU), taking into account the features of the system’s operation object and the structural specifics of this class of recognition systems.
Objective. The goal of the presented work is to develop a comprehensive set of methods for stabilizing the ASRSCU’s quality parameters during its long-term operation.
Method. The article formulates a set of methods for estimating the operational risks of the ASRSCU during its long-term operation. In particular, the dependence of the risk of incorrect speaker recognition on the dimension of the feature space is described. Based on the formulated measure of informativeness, a set of methods is obtained for analyzing the training sample to identify examples that lead to increased risk. The influence of drift in the speech signal parameters on the quality indicators of the ASRSCU is described analytically. The duration of ASRSCU operation during which re-training its classifier is impractical is estimated. Recommendations for choosing an optimal ASRSCU classifier are formulated from the standpoint of minimizing its complexity, taking into account the risks of the ASRSCU’s long-term operation and the possibility of re-training.
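As an illustration of how risk can depend on the dimension of the feature space, the sketch below evaluates the classical Vapnik bound from statistical learning theory: with probability at least 1 - eta, the true risk does not exceed the empirical risk plus sqrt((h (ln(2l/h) + 1) - ln(eta/4)) / l), where l is the training sample size and h is the VC dimension. This is the textbook bound and not necessarily the exact expression derived in the article. The function names, and the choice h = d + 1 (which holds for a linear classifier over d features), are assumptions made for this example only.

```python
import math


def vc_confidence_term(h: int, n_samples: int, eta: float = 0.05) -> float:
    """Confidence term of the classical Vapnik bound.

    h         -- VC dimension of the classifier family
    n_samples -- training sample size l
    eta       -- the bound holds with probability at least 1 - eta
    """
    if not 0 < h <= n_samples:
        raise ValueError("need 0 < h <= n_samples")
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie in (0, 1)")
    numerator = h * (math.log(2 * n_samples / h) + 1) - math.log(eta / 4)
    return math.sqrt(numerator / n_samples)


def risk_upper_bound(empirical_risk: float, feature_dim: int,
                     n_samples: int, eta: float = 0.05) -> float:
    """Upper bound on the true risk for a linear classifier.

    Assumption for this sketch: the classifier is linear (affine) in
    feature_dim features, so its VC dimension is feature_dim + 1.
    """
    h = feature_dim + 1
    return empirical_risk + vc_confidence_term(h, n_samples, eta)


if __name__ == "__main__":
    # Hypothetical numbers: same empirical risk, growing feature space.
    for dim in (13, 26, 39):
        print(dim, round(risk_upper_bound(0.02, dim, n_samples=1000), 4))
```

For a fixed training sample, the bound loosens as features are added and tightens as more examples are collected, which is the trade-off a classifier of minimal complexity is meant to exploit.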
Results. The theoretical results presented in the article are verified against DET-curve data that summarize long-term experiments with the ASRSCU, in which the feature space was configured using features based on power-normalized cepstral coefficients and features based on the theory of spectro-temporal receptive fields. Within the framework of the created theoretical concept, the influence of the feature space configuration and of the type and complexity of the classifier on the stability of the ASRSCU’s quality parameters during its long-term operation has been estimated.
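For readers unfamiliar with DET (detection error tradeoff) curves, the following minimal sketch shows how the points of such a curve can be computed from verification scores. It is a generic illustration, not the authors' experimental procedure: the function names, the convention that a higher score means "same speaker", and the sample scores are all assumptions of this example.

```python
from statistics import NormalDist


def det_points(target_scores, impostor_scores):
    """Return (threshold, FAR, FRR) triples, one per distinct score.

    A trial is accepted when its score is >= threshold.
    FAR -- share of impostor trials accepted (false acceptance rate)
    FRR -- share of target trials rejected (false rejection rate)
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    points = []
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        points.append((t, far, frr))
    return points


def probit(p, eps=1e-6):
    """Normal-deviate transform used for the axes of a DET plot.

    Probabilities are clipped to [eps, 1 - eps] so that 0 and 1
    do not map to infinity.
    """
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


def equal_error_rate(points):
    """Approximate the EER at the threshold where FAR and FRR are closest."""
    _, far, frr = min(points, key=lambda pt: abs(pt[1] - pt[2]))
    return (far + frr) / 2


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    targets = [0.9, 0.8, 0.7, 0.4]
    impostors = [0.1, 0.2, 0.3, 0.6]
    pts = det_points(targets, impostors)
    for t, far, frr in pts:
        print(t, round(probit(far), 3), round(probit(frr), 3))
    print("EER:", equal_error_rate(pts))
```

Plotting probit(FAR) against probit(FRR) gives the DET curve. Comparing curves obtained at different moments of long-term operation is one way to visualize how stable the quality parameters remain.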
Conclusions. For the first time, the problem of minimizing the average risk from the empirical operation results of an ASRSCU has been analyzed theoretically. Unlike existing approaches, the analysis takes into account non-stationary input data with drift of individual speech signal features, as well as the characteristic parameters of the recognition system’s classifier, which made it possible to estimate the confidence interval of the risk and the conditions for re-training sessions.