Publication
ICASSP 2009
Conference paper

Single-channel speech separation and recognition using loopy belief propagation

View publication

Abstract

We address the problem of single-channel speech separation and recognition using loopy belief propagation in a way that enables efficient inference for an arbitrary number of speech sources. The graphical model consists of a set of N Markov chains, each of which represents a language model or grammar for a given speaker. A Gaussian mixture model with shared states is used to model the hidden acoustic signal for each grammar state of each source. The combination of sources is modeled in the log spectrum domain using non-linear interaction functions. Previously, temporal inference in such a model has been performed using an N-dimensional Viterbi algorithm that scales exponentially with the number of sources. In this paper, we describe a loopy message passing algorithm that scales linearly with language model size. The algorithm achieves human levels of performance, and is an order of magnitude faster than competitive systems for two speakers. ©2009 IEEE.

Date

Publication

ICASSP 2009

Authors

Share