PITCH PERIOD ESTIMATION METHOD USING EMPIRICAL WAVELET TRANSFORM

Y. N. Imamverdiyev, L. V. Sukhostat

Abstract


Pitch period estimation of speech signals is used in many important speech technology applications. However, only some of the existing methods can handle non-linear and non-stationary signals, mainly because pitch detection methods assume that the speech production process is linear. Selecting a pitch period estimation algorithm always involves a compromise between time and frequency resolution, robustness, computational complexity and time delay. The aim of this paper is to develop a new method for estimating the pitch period based on the empirical wavelet transform. The family of adapted wavelets is constructed so that the filters depend on where the information is located in the spectrum of the analyzed speech signal. Empirical wavelets are defined as bandpass filters for each segment of the speech signal. Instantaneous frequency characteristics are used as pitch period detection features and are extracted with the Teager-Kaiser energy separation operator. The method is compared with other pitch estimation algorithms.
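
The instantaneous-frequency step mentioned above can be illustrated with a minimal sketch. It assumes the discrete Teager-Kaiser operator Ψ[x](n) = x(n)² − x(n−1)x(n+1) and the DESA-2 energy separation formula of Maragos, Kaiser and Quatieri; the abstract does not state which separation variant the paper uses, so this choice is an assumption. The function names `teager_kaiser` and `desa2_frequency` are hypothetical, the empirical wavelet filtering stage is omitted, and a pure tone stands in for a single narrowband component.

```python
import numpy as np

def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    The output is two samples shorter than the input (no edge values).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2_frequency(x, fs):
    """Instantaneous frequency in Hz via DESA-2 (assumed variant).

    Valid for narrowband components below fs/4.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_kaiser(x)        # energy of x at samples 1..N-2
    y = x[2:] - x[:-2]              # symmetric difference x[n+1] - x[n-1]
    psi_y = teager_kaiser(y)        # energy of y at samples 2..N-3
    psi_x = psi_x[1:-1]             # align psi_x to samples 2..N-3
    eps = np.finfo(float).eps       # guard against division by zero
    ratio = 1.0 - psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))  # rad/sample
    return omega * fs / (2.0 * np.pi)

# Synthetic check: a 120 Hz tone sampled at 8 kHz (illustrative values only).
fs = 8000.0
f0 = 120.0
n = np.arange(400)
tone = np.cos(2.0 * np.pi * f0 * n / fs)
f_est = np.median(desa2_frequency(tone, fs))
pitch_period_samples = fs / f_est
print(f_est, pitch_period_samples)
```

For real speech the operator would be applied to each band-limited component rather than to the raw waveform, since energy separation assumes a single amplitude- and frequency-modulated sinusoid.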

Keywords


pitch period, empirical wavelet transform, Teager-Kaiser energy operator, intrinsic mode function, instantaneous frequency.


DOI: https://doi.org/10.15588/1607-3274-2015-2-6



Copyright (c) 2015 Y. N. Imamverdiyev, L. V. Sukhostat

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Address of the journal editorial office:
Editorial office of the journal «Radio Electronics, Computer Science, Control»,
Zaporizhzhya National Technical University, 
Zhukovskiy street, 64, Zaporizhzhya, 69063, Ukraine. 
Telephone: +38-061-769-82-96 – the Editing and Publishing Department.
E-mail: rvv@zntu.edu.ua

The reference to the journal is obligatory in the cases of complete or partial use of its materials.