Please use this identifier to cite or link to this item:
https://scidar.kg.ac.rs/handle/123456789/15026
Title: | Supervised speech separation combined with adaptive beamforming |
Authors: | Šarić, Zoran; Subotić, M.; Bilibajkic, R.; Barjaktarovic, Marko; Stojanovic, Jasmina |
Issue Date: | 2022 |
Abstract: | Microphone arrays are a powerful tool for ambient noise suppression. A multi-channel minimum mean square error (MMSE) solution can be factorized into a minimum variance distortionless response (MVDR) beamformer followed by a single-channel Wiener post-filter. The MVDR beamformer, as well as its equivalent form, the generalized sidelobe canceller (GSC), often does not provide sufficient noise reduction because of its limited ability to reduce diffuse noise and reverberation. Steering and calibration errors also degrade the performance of both MVDR and GSC beamformers. The post-filter can be realized by any single-channel noise reduction method. A modern and promising approach to single-channel noise reduction is formulated as supervised speech separation (SSS), in which a supervised learning algorithm, typically a deep neural network (DNN), is trained to learn a mapping from noisy features to a time-frequency representation of the target of interest. In this paper, we combined the SSS and adaptive beamforming approaches. Adaptive beamforming is realized by a simplified GSC (S-GSC), whose equivalence with the MVDR beamformer is also proved in the paper. In the proposed S-GSC beamformer, the conventional beamformer is replaced by the central microphone signal. Steering towards the target speaker needs no direction of arrival (DOA) estimation. The trained DNN of the SSS module estimates the ideal ratio mask (IRM), which is used for adaptation of the blocking matrix, calibration of the microphones, adaptation of the adaptive noise canceller, and post-filtering. The proposed method was tested on 720 utterances from the TIMIT database used as target speech. The reverberant room was simulated using acoustic impulse responses recorded in a real room. Performance analysis was carried out with the PESQ, STOI, and SDR measures. The test results showed that the proposed combined method outperforms the individual SSS and S-GSC methods. |
Type: | article |
DOI: | 10.1016/j.csl.2022.101409 |
ISSN: | 0885-2308 |
SCOPUS: | 2-s2.0-85131449965 |
Appears in Collections: | Faculty of Medical Sciences, Kragujevac |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
PaperMissing.pdf (Restricted Access) | | 29.85 kB | Adobe PDF |
Items in SCIDAR are protected by copyright, with all rights reserved, unless otherwise indicated.