volume | PIER Journals

Abstract

This paper introduces a new method for detecting non-air-conducted speech by means of millimeter wave (MMW) radar. Owing to its special attributes, the method opens up a wide range of potential applications. However, the detected speech has low intelligibility and poor audibility because of the presence of combined and colored additive noise. This paper therefore investigates MMW radar speech enhancement that takes into account the frequency-domain masking properties of the human auditory system in order to reduce the perceptual effect of the residual noise. Considering the particular characteristics of MMW speech, a perceptual weighting technique is developed and incorporated into the traditional spectral subtraction algorithm to shape the residual noise and make it inaudible. Results from both acoustic and listening evaluations suggest that the background noise is reduced efficiently while the distortion of the MMW radar speech remains acceptable. The enhanced speech also sounds more pleasant to human listeners, suggesting that the proposed algorithm achieves better noise reduction than other subtractive-type algorithms.
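
As a rough illustration of the subtractive-type family the abstract refers to, the sketch below applies over-subtraction with a spectral floor to the power spectrum of a single frame. It is not the paper's algorithm: the function name `spectral_subtract`, the parameters `alpha` and `beta`, and the example values are all assumptions made for illustration. A per-bin `alpha` stands in for the perceptual weighting, on the assumption that less noise would be subtracted in bins where masking by the speech is stronger; how the paper derives its weights from masking thresholds is not stated in the abstract.

```python
def spectral_subtract(noisy_power, noise_power, alpha, beta=0.01):
    """Over-subtract a noise power estimate from one frame's power spectrum.

    noisy_power, noise_power: per-bin power values (same length).
    alpha: over-subtraction factor, a single number or one value per bin
           (hypothetical stand-in for a perceptual weight).
    beta:  spectral floor, as a fraction of the noise power in each bin.
    """
    n = len(noisy_power)
    if len(noise_power) != n:
        raise ValueError("noisy_power and noise_power must have equal length")

    # Broadcast a scalar factor to every bin; otherwise use the per-bin values.
    alphas = [alpha] * n if isinstance(alpha, (int, float)) else list(alpha)
    if len(alphas) != n:
        raise ValueError("alpha must be a scalar or have one value per bin")

    enhanced = []
    for y, d, a in zip(noisy_power, noise_power, alphas):
        subtracted = y - a * d   # remove the scaled noise estimate
        floor = beta * d         # keep a small noise floor instead of going negative
        enhanced.append(subtracted if subtracted > floor else floor)
    return enhanced


# Illustrative values only: three bins with increasing over-subtraction.
frame = [10.0, 4.0, 1.0]
noise = [1.0, 1.0, 1.0]
print(spectral_subtract(frame, noise, alpha=[1.0, 2.0, 4.0], beta=0.1))
```

In a complete enhancer this step would be applied frame by frame to the short-time spectrum, with the noisy phase reused for resynthesis; those stages are omitted here.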

Dr. Sheng Li

Dr. Jian-Qi Wang

Dr. Ming Niu

Dr. Tian Liu

Dr. Xi-Jing Jing