Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments
Abstract
We present a rapid compensation technique aimed at reducing the detrimental effect of environmental noise and channel effects on server-based mobile speech recognition. It addresses two key problems for such systems: first, how to accurately separate non-speech events (background noise) from noise introduced by network artifacts; second, how to reduce the latency created by the extra computation that a codebook-based linear channel compensation technique requires. We address the first problem by modifying an existing energy-based endpoint-detection algorithm to provide segment-type information to the compensation module. We address the latency problem by employing a tree-structured vector quantization technique with dynamic thresholds, which avoids evaluating all codewords. The technique is evaluated using a speech-in-car database at three different speeds. Our results show that the method leads to an 8.7% reduction in error rate and a 35% reduction in computational cost.
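To illustrate why a tree-structured codebook search can avoid evaluating every codeword, the following is a minimal sketch and not the authors' implementation. It assumes a binary codebook tree and a squared Euclidean distance. The names `Node`, `sq_dist`, and `tsvq_search`, the toy one-dimensional codebook, and the early-exit rule (stop descending once the best child is within `threshold`) are all hypothetical; the abstract does not specify how the dynamic thresholds are computed, so here the caller simply supplies a value that could be adapted per frame.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    """One node of a binary codebook tree (hypothetical structure)."""
    centroid: Sequence[float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tsvq_search(root: Node, x: Sequence[float],
                threshold: float) -> Tuple[Sequence[float], int]:
    """Descend the tree toward the nearer child at each level.

    Returns the selected centroid and the number of distance
    computations performed. The early exit on `threshold` is an
    assumption made for illustration only.
    """
    node = root
    evals = 0
    while node.left is not None and node.right is not None:
        d_left = sq_dist(x, node.left.centroid)
        d_right = sq_dist(x, node.right.centroid)
        evals += 2
        if d_left <= d_right:
            node, best = node.left, d_left
        else:
            node, best = node.right, d_right
        if best <= threshold:
            break  # assumed rule: close enough, skip the deeper levels
    return node.centroid, evals


# Toy codebook with four leaf codewords (values chosen for illustration).
root = Node(
    [2.5],
    left=Node([1.0], left=Node([0.0]), right=Node([2.0])),
    right=Node([4.0], left=Node([3.0]), right=Node([5.0])),
)

full_depth = tsvq_search(root, [4.9], threshold=0.0)
early_exit = tsvq_search(root, [4.9], threshold=1.0)
```

With a zero threshold the search descends to a leaf, performing two distance computations per level, so the cost grows with the tree depth and not with the codebook size. A larger threshold trades quantization accuracy for fewer computations by stopping at an interior centroid.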