A New Soft Masking Method for Speech Enhancement in the Frequency Domain
Keywords: Ideal binary mask (IdBM), threshold, speech quality and intelligibility, residual noise
Abstract: The ideal binary mask (IdBM) method has recently attracted keen interest because of its ability to improve speech intelligibility. The method processes noisy speech in time-frequency (T-F) units: if the local signal-to-noise ratio (SNR) of a unit is higher than a threshold, the unit is retained; otherwise, it is removed. The method works well in the field of computational auditory scene analysis (CASA). However, because the threshold is usually low, much residual noise remains. In addition, an accurate local SNR is difficult to obtain in practice. In this paper, we propose a new method to improve speech quality and intelligibility. Instead of seeking a new way to estimate the local SNR, we compute the probability that the local SNR is higher than the threshold. We then multiply each T-F unit by a suitable value to suppress the residual noise. Experimental results show that our method performs well.
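The contrast between hard and soft masking described above can be illustrated with a minimal sketch. It is not the method of this paper: the abstract does not specify how the probability is computed, so the sketch assumes, purely for illustration, that the estimated local SNR (in dB) deviates from the true value by a zero-mean Gaussian error. The function names, the default threshold of -5 dB, and the error spread `sigma_db` are all hypothetical.

```python
import math
import numpy as np


def ideal_binary_mask(speech_power, noise_power, threshold_db=-5.0):
    """Hard mask: 1 where the local SNR (dB) exceeds the threshold, else 0.

    Requires the true speech and noise powers of every T-F unit, which is
    why the mask is called "ideal". The -5 dB default is an assumption.
    """
    eps = 1e-12  # guards against division by zero and log of zero
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (local_snr_db > threshold_db).astype(float)


def soft_mask(est_snr_db, threshold_db=-5.0, sigma_db=5.0):
    """Soft mask: probability that the true local SNR exceeds the threshold.

    ASSUMPTION (not from the paper): the estimated SNR in dB equals the true
    SNR plus zero-mean Gaussian error with standard deviation sigma_db, so
    the probability is a Gaussian CDF evaluated at the margin above the
    threshold.
    """
    z = (np.asarray(est_snr_db, dtype=float) - threshold_db) / (
        sigma_db * math.sqrt(2.0)
    )
    erf = np.vectorize(math.erf)  # element-wise error function
    return 0.5 * (1.0 + erf(z))


if __name__ == "__main__":
    # Toy T-F unit powers (hypothetical values, not experimental data).
    speech = np.array([4.0, 0.01, 1.0, 0.1])
    noise = np.ones(4)
    noisy_magnitude = np.sqrt(speech + noise)

    hard = ideal_binary_mask(speech, noise)
    soft = soft_mask(10.0 * np.log10(speech / noise))

    # Each T-F unit is scaled by its mask value.
    print("hard-masked:", hard * noisy_magnitude)
    print("soft-masked:", soft * noisy_magnitude)
```

In this sketch a unit whose estimated SNR sits exactly at the threshold is scaled by 0.5 rather than being kept or discarded outright, and units well below the threshold are attenuated toward zero, which is the kind of graded suppression of residual noise that the abstract describes.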
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.