Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments

Reference:

Heikki Kallasjoki, Sami Keronen, Guy J. Brown, Jort F. Gemmeke, Ulpu Remes, and Kalle Palomäki. Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments. In International Workshop on Machine Listening in Multisource Environments, Florence, Italy, September 2011.

Abstract:

This work presents an automatic speech recognition system which uses a missing data approach to compensate for environmental noise. The missing, noise-corrupted components are identified using binaural features or a support vector machine (SVM) classifier. To perform speech recognition using the partially observed data, the missing components are substituted with clean speech estimates calculated using sparse imputation. Evaluated on the CHiME reverberant multisource environment corpus, the missing data approach significantly improved the keyword recognition accuracy in moderate and poor SNR conditions. The best results were achieved when the missing components were identified using the binaural features and the clean speech estimates associated with observation uncertainty estimates.
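
The imputation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a non-negative spectral feature vector, a binary reliability mask that has already been estimated, and a dictionary of clean speech exemplars. The function name sparse_impute, the parameters lam and n_iter, and the choice of a projected proximal gradient solver are all assumptions made for illustration. The sketch fits a sparse non-negative combination of exemplars to the reliable components only, then uses that combination to fill in the unreliable components, bounded above by the noisy observation.

```python
import numpy as np

def sparse_impute(y, mask, D, lam=0.01, n_iter=500):
    """Illustrative sparse imputation (hypothetical helper, not the paper's code).

    y    : (F,) observed noisy features, assumed non-negative
    mask : (F,) boolean, True where the component is considered reliable
    D    : (F, K) dictionary whose columns are clean speech exemplars
    """
    Dr, yr = D[mask], y[mask]            # restrict the fit to reliable rows
    # Step size from the Lipschitz constant of the least-squares gradient.
    L = np.linalg.norm(Dr, 2) ** 2 + 1e-12
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = Dr.T @ (Dr @ a - yr)
        # Proximal step for an L1 penalty combined with a non-negativity constraint.
        a = np.maximum(a - (grad + lam) / L, 0.0)
    x_hat = D @ a                        # clean speech estimate for all components
    # Assumption: additive noise only increases energy, so the estimate
    # for an unreliable component should not exceed the observation.
    return np.where(mask, y, np.minimum(x_hat, y)), a

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.uniform(0.0, 1.0, size=(20, 5))
    clean = D @ np.array([1.0, 0.0, 0.5, 0.0, 0.0])
    mask = np.ones(20, dtype=bool)
    mask[:8] = False                     # pretend the first 8 components are noise-dominated
    noisy = clean.copy()
    noisy[~mask] += 2.0
    imputed, activations = sparse_impute(noisy, mask, D)
    print("noisy error:  ", np.abs(noisy - clean).sum())
    print("imputed error:", np.abs(imputed - clean).sum())
```

Reliable components pass through unchanged, so the quality of the result depends entirely on the mask. This is why the abstract treats mask estimation (binaural features or an SVM classifier) as a separate problem.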

Keywords:

noise robust, speech recognition, binaural, SVM, sparse imputation, observation uncertainties

Suggested BibTeX entry:

@inproceedings{kallasjoki11.chime,
    address = {Florence, Italy},
    author = {Heikki Kallasjoki and Sami Keronen and Guy J. Brown and Jort F. Gemmeke and Ulpu Remes and Kalle Palom\"{a}ki},
    booktitle = {International Workshop on Machine Listening in Multisource Environments},
    language = {eng},
    month = {September},
    title = {Mask estimation and sparse imputation for missing data speech recognition in multisource reverberant environments},
    year = {2011},
}

PDF (286 kB)