PITCH PERIOD ESTIMATION METHOD USING EMPIRICAL WAVELET TRANSFORM
methods only some can work in the case of non-linear and non-stationary signals. The main reason is that pitch detection methods are based on the assumption that the speech production process is linear. Selecting a pitch period estimation algorithm always involves finding a compromise between time and frequency resolution, robustness, computational complexity, and time delay. The aim of this paper is to develop a new method for estimating the pitch period based on the empirical wavelet transform. The method of constructing a family of adapted wavelets assumes that the filters depend on where information is located in the spectrum of the analyzed speech signal. Empirical wavelets are defined as bandpass filters for each segment of the speech signal. Instantaneous frequency characteristics are used as pitch period detection features, and the Teager-Kaiser energy separation operator is used to extract them. The comparison of this method with other pitch estimation algorithms is
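
The energy separation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the DESA-2 variant of the Teager-Kaiser energy separation algorithm and a single narrowband component, such as the output of one bandpass filter, whose frequency lies below a quarter of the sampling rate. The abstract does not say which variant the authors use, so the function names `teager_kaiser` and `desa2_frequency` and the threshold `eps` are hypothetical choices made for illustration only.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns a list the same length as x. The first and last samples,
    which lack a neighbour, are left at 0.0.
    """
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n - 1] * x[n + 1]
    return psi


def desa2_frequency(x, fs, eps=1e-12):
    """Estimate instantaneous frequency in Hz with the DESA-2 formula.

    Assumption: x is a narrowband component with frequency below fs / 4.
    Samples where the estimate is undefined (edges, near-zero energy)
    are returned as None.
    """
    n_samples = len(x)
    # Symmetric difference y[n] = x[n+1] - x[n-1].
    y = [0.0] * n_samples
    for n in range(1, n_samples - 1):
        y[n] = x[n + 1] - x[n - 1]

    psi_x = teager_kaiser(x)
    psi_y = teager_kaiser(y)

    freq = [None] * n_samples
    # psi_y[n] needs y[n-1] and y[n+1], so only samples 2 .. N-3 are usable.
    for n in range(2, n_samples - 2):
        if psi_x[n] <= eps:
            continue
        # DESA-2: omega = 0.5 * arccos(1 - psi[y] / (2 * psi[x])).
        c = 1.0 - psi_y[n] / (2.0 * psi_x[n])
        c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
        omega = 0.5 * math.acos(c)  # radians per sample
        freq[n] = omega * fs / (2.0 * math.pi)
    return freq


if __name__ == "__main__":
    # Hypothetical check on a synthetic 150 Hz tone sampled at 8 kHz.
    fs = 8000.0
    tone = [math.cos(2.0 * math.pi * 150.0 * n / fs) for n in range(400)]
    estimates = [f for f in desa2_frequency(tone, fs) if f is not None]
    print(sum(estimates) / len(estimates))
```

For a pure sinusoid the estimate is constant over time. On real speech, the operator is sensitive to noise and to components that are not narrowband, which is one reason to apply it after bandpass filtering rather than to the raw signal.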
Copyright (c) 2015 Y. N. Imamverdiyev, L. V. Sukhostat
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Address of the journal editorial office:
Editorial office of the journal «Radio Electronics, Computer Science, Control»,
National University "Zaporizhzhia Polytechnic",
Zhukovskogo street, 64, Zaporizhzhia, 69063, Ukraine.
Telephone: +38-061-769-82-96 – the Editing and Publishing Department.
The reference to the journal is obligatory in the cases of complete or partial use of its materials.