The Korea Institute of Information and Communication Engineering 2012; 10(2): 162-167
Published online June 30, 2012
https://doi.org/10.6109/jicce.2012.10.2.162
© Korea Institute of Information and Communication Engineering
This paper proposes a speech processing system for noise reduction in background noise environments. The system combines a model of the human auditory system with a noise reduction neural network that operates on fast Fourier transform (FFT) amplitude and phase spectrums. It first reduces noise with the neural network, then applies auditory processing frame by frame after detecting the voiced and transitional sections of each frame. The results of the proposed system are compared with those of a conventional spectral subtraction method and a minimum mean-square error log-spectral amplitude estimator at different noise levels. The effectiveness of the proposed system is confirmed experimentally by measuring the signal-to-noise ratio (SNR). In this experiment, the maximal improvement in output SNR with the proposed method, relative to a conventional spectral subtraction method, is approximately 11.5 dB for car noise and 11.0 dB for street noise.
Keywords: Speech processing, Neural network, Amplitude and phase spectrums, Background noise
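The abstract relies on two quantities that are easy to show concretely: the FFT amplitude and phase spectrums of a frame, and the SNR in dB. The following is a minimal sketch only, not the paper's implementation. It assumes NumPy, a 256-sample frame, and an SNR defined as clean-signal energy over residual-error energy. The function names and the toy signal are hypothetical, and the neural network itself is not reproduced.

```python
import numpy as np


def frame_spectra(frame):
    """Return the FFT amplitude and phase spectrums of one real-valued frame."""
    spectrum = np.fft.rfft(frame)
    return np.abs(spectrum), np.angle(spectrum)


def rebuild_frame(amplitude, phase, length):
    """Recombine amplitude and phase spectrums into a time-domain frame."""
    return np.fft.irfft(amplitude * np.exp(1j * phase), n=length)


def snr_db(clean, processed):
    """SNR in dB, treating (processed - clean) as the residual noise.

    This definition is an assumption; the abstract does not state the
    exact formula used in the experiments.
    """
    noise = processed - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))


# Toy data (hypothetical): a sinusoid standing in for a voiced frame,
# corrupted by white Gaussian noise.
rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
clean = np.sin(2 * np.pi * 8 * t / n)
noisy = clean + 0.5 * rng.standard_normal(n)

# Split the noisy frame into amplitude and phase, then rebuild it unchanged.
# A noise reduction stage would modify these spectrums before rebuilding.
amp, ph = frame_spectra(noisy)
restored = rebuild_frame(amp, ph, n)

input_snr = snr_db(clean, noisy)
```

Because the spectrums are recombined without modification here, `restored` matches `noisy` up to floating-point error. An improvement such as the ones reported in the abstract would be the difference between `snr_db` evaluated on an enhanced output and on the output of a baseline method.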
Choi, Jae-Seung;
Department of Electronic Engineering, Silla University