Publication
INTERSPEECH - Eurospeech 1999
Conference paper

TAIL DISTRIBUTION MODELLING USING THE RICHTER AND POWER EXPONENTIAL DISTRIBUTIONS

Abstract

The vast majority of HMM-based speech recognition systems use Gaussian mixture models as the state distribution model. This choice is motivated more by ease of training and decoding, and by the fact that a sufficient number of Gaussian components can approximate any distribution, than by any underlying aspect of the data being modelled. If distributions were selected that better modelled the observed data, fewer components should be required and recognition accuracy should improve. This paper examines two distributions for improving the modelling of the tails of the densities. The first, the Richter distribution, fits within the general framework of Gaussian component tying while also offering some attractive attributes for decoding. The second, the power exponential distribution, does not fit within a tying framework. Despite gains in likelihood, which indicate that Gaussian components are sub-optimal in a likelihood sense, only small gains in recognition performance were observed on a large vocabulary speech recognition task.
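To make the two tail models concrete, the following is a minimal one-dimensional sketch, not the paper's implementation. It assumes the Richter distribution can be read as a mixture of Gaussians that share one mean and whose variances are scaled copies of a single base variance, and it writes the power exponential in the generalized Gaussian form with scale `alpha` and shape `beta`. The function names, the parameterization, and the example weights and scales are illustrative assumptions; the paper's exact forms may differ.

```python
import math


def gaussian_pdf(x, mean=0.0, var=1.0):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def richter_pdf(x, mean, var, weights, scales):
    """Assumed Richter form: Gaussians with a shared mean and variances scales[i] * var.

    The weights are expected to be non-negative and to sum to one.
    """
    return sum(w * gaussian_pdf(x, mean, s * var) for w, s in zip(weights, scales))


def power_exponential_pdf(x, mean, alpha, beta):
    """Assumed power exponential (generalized Gaussian) form.

    beta = 2 recovers a Gaussian with variance alpha**2 / 2;
    beta = 1 gives a Laplace density, which has heavier tails.
    """
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mean) / alpha) ** beta))


if __name__ == "__main__":
    # Compare densities far from the mean, where tail modelling matters.
    x = 4.0
    print("gaussian         ", gaussian_pdf(x))
    print("richter          ", richter_pdf(x, 0.0, 1.0, [0.8, 0.2], [1.0, 4.0]))
    print("power exponential", power_exponential_pdf(x, 0.0, 1.0, 1.0))
```

Under these assumptions, both alternatives assign more probability mass far from the mean than a single unit-variance Gaussian does. Because the assumed Richter components share a mean and differ only by variance scaling, they can reuse the same squared-distance computation, which is one plausible reading of the decoding advantage mentioned in the abstract.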
