ANALYSIS OF THE AUTOMATED SPEAKER RECOGNITION SYSTEM OF CRITICAL USE OPERATION RESULTS
DOI:
https://doi.org/10.15588/1607-3274-2018-4-7
Keywords:
automated speaker recognition system of critical use, experiment planning theory, factor analysis, statistical learning theory.
Abstract
Context. The article summarizes statistical learning theory as applied to evaluating the long-term operation results of the automated speaker recognition system of critical use (ASRSCU), taking into account the features of the system’s operation object and the structural specifics of this class of recognition systems.
Objective. The goal of the presented work is to develop a set of methods for stabilizing the quality parameters of the ASRSCU during its long-term operation.
Method. The article formulates a set of methods for estimating the operational risks of the ASRSCU during its long-term operation. In particular, the dependence of the risk of incorrect speaker recognition on the dimension of the feature space is described. Based on the formulated measure of informativity, a set of methods is obtained for analyzing the training sample to identify examples that lead to increased risk. The influence of drift in the speech signal parameters on the quality indicators of the ASRSCU is described analytically. The operation duration of the ASRSCU during which re-training its classifier is impractical is estimated. Recommendations for choosing an optimal ASRSCU classifier are formulated from the standpoint of minimizing its complexity, taking into account the risks of the ASRSCU’s long-term operation and the possibility of re-training.
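The abstract does not give the article's own risk expression. As a rough illustration of how risk can depend on the feature space dimension, the sketch below evaluates the classical Vapnik bound from statistical learning theory, R <= R_emp + sqrt((h(ln(2n/h) + 1) - ln(eta/4)) / n). It is a minimal sketch under stated assumptions, not the authors' method: the function names are hypothetical, the classifier is assumed to be linear (VC dimension h = d + 1 for d features), and the sample size, empirical risk, and dimensions are made-up illustrative values.

```python
import math


def vc_confidence_term(h, n, eta=0.05):
    """Confidence term of the classical Vapnik bound.

    h   -- VC dimension of the classifier family (assumed known)
    n   -- number of training examples
    eta -- the bound holds with probability 1 - eta
    """
    if not 0 < h < n:
        raise ValueError("expected 0 < h < n")
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


def risk_upper_bound(empirical_risk, h, n, eta=0.05):
    """Upper bound on the expected risk: empirical risk plus confidence term."""
    return empirical_risk + vc_confidence_term(h, n, eta)


if __name__ == "__main__":
    n = 5000               # illustrative training sample size (assumption)
    empirical_risk = 0.02  # illustrative training error (assumption)
    for d in (13, 39, 78, 156):  # illustrative feature space dimensions
        h = d + 1  # VC dimension of a linear classifier in d dimensions
        bound = risk_upper_bound(empirical_risk, h, n)
        print(f"d={d:4d}  h={h:4d}  risk bound={bound:.3f}")
```

With the sample size fixed, the confidence term grows with h, so a larger feature space or a more complex classifier loosens the guarantee unless more training data is available.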
Results. The theoretical results presented in the article are verified against data from DET-curve experiments, which summarize long-term experiments with the ASRSCU in which the feature space was configured from features based on power-normalized cepstral coefficients and features based on spectro-temporal receptive field theory. Within the framework of the created theoretical concept, the influence of the feature space configuration and of the type and complexity of the classifier on the stability of the ASRSCU’s quality parameters during its long-term operation has been estimated.
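For readers unfamiliar with DET curves, the following minimal sketch shows how the points of such a curve are typically computed from verification scores: for each decision threshold, the false accept rate is paired with the false reject rate, and both are plotted on a normal-deviate (probit) scale. The function names and the toy scores are hypothetical and are not taken from the article's experiments.

```python
from statistics import NormalDist


def error_rates(genuine, impostor, threshold):
    """Return (false_accept_rate, false_reject_rate).

    A trial is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def det_points(genuine, impostor):
    """List of (threshold, FAR, FRR), one entry per distinct observed score."""
    thresholds = sorted(set(genuine) | set(impostor))
    return [(t, *error_rates(genuine, impostor, t)) for t in thresholds]


def equal_error_rate(genuine, impostor):
    """Approximate EER: the operating point where FAR and FRR are closest."""
    t, far, frr = min(det_points(genuine, impostor),
                      key=lambda p: abs(p[1] - p[2]))
    return (far + frr) / 2, t


def probit(p, eps=1e-6):
    """Normal-deviate transform used for DET axes; clipped to avoid infinities."""
    p = min(max(p, eps), 1 - eps)
    return NormalDist().inv_cdf(p)


if __name__ == "__main__":
    # Toy scores (assumption): higher means "more likely the claimed speaker".
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.3, 0.2, 0.1]
    for t, far, frr in det_points(genuine, impostor):
        print(f"t={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}  "
              f"probit(FAR)={probit(far):+.2f}  probit(FRR)={probit(frr):+.2f}")
    print("EER, threshold:", equal_error_rate(genuine, impostor))
```

Comparing curves computed this way at different points in a system's operation is one common way to see whether its quality parameters remain stable.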
Conclusions. For the first time, the problem of minimizing average risk from the empirical operation results of an ASRSCU has been analyzed theoretically. Unlike existing approaches, the analysis takes into account non-stationary input data with drift of individual speech signal features, as well as the characteristic parameters of the recognition system’s classifier, which made it possible to estimate the risk’s confidence interval for the conditions of re-training sessions.
Copyright (c) 2019 O. V. Bisikalo, V. V. Kovtun, M. S. Yukhimchuk, I. F. Voytyuk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.