Optimized one-bit quantization for adapted GMM-based speaker verification
Abstract
We address the problem of designing an optimized one-bit quantizer for speech cepstral features (MFCCs) in speaker verification systems based on the likelihood ratio test, with Gaussian Mixture Models (GMMs) as likelihood functions and individual speaker models derived from a Universal Background Model (UBM) via Bayesian adaptation. Unlike prior work, which designed a Minimum Log-Likelihood Ratio Difference (MLLRD) quantizer, we design a new quantizer that explicitly optimizes the desired tradeoff between the probabilities of false alarm and detection, directly in probability space. We analytically derive the optimal reconstruction levels for a one-bit quantizer given a classification decision threshold, and evaluate its performance for speaker verification on the Switchboard corpus. The designed quantizer has minimal impact on the equal error rate relative to the unquantized system, while achieving a compression ratio of 32, and significantly outperforms the MLLRD strategy.
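As a minimal sketch of the setting the abstract describes (not the paper's analytical derivation), the Python fragment below shows how a per-dimension one-bit quantizer with reconstruction levels might feed into GMM-UBM likelihood ratio scoring. The function names, the diagonal-covariance GMM scoring, and the placeholder thresholds and reconstruction levels are all illustrative assumptions; the paper derives the optimal reconstruction levels analytically for a given decision threshold.

```python
import numpy as np

def one_bit_quantize(features, thresholds, r_lo, r_hi):
    """Quantize each feature dimension to one bit, then reconstruct.

    features   : (T, D) array of MFCC frames
    thresholds : (D,) per-dimension quantization thresholds (assumed given)
    r_lo, r_hi : (D,) reconstruction levels for the 0/1 codes (the paper
                 derives optimal levels; these are placeholders here)
    """
    bits = features > thresholds        # one bit per dimension: 32x compression vs. float32
    return np.where(bits, r_hi, r_lo)   # reconstructed features used for scoring

def gmm_log_likelihood(x, weights, means, variances):
    """Total log-likelihood of frames x under a diagonal-covariance GMM."""
    diff = x[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = (
        -0.5 * np.sum(diff**2 / variances[None, :, :], axis=2)
        - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
        + np.log(weights)[None, :]
    )                                                   # (T, M) per-component log-densities
    m = log_comp.max(axis=1, keepdims=True)             # log-sum-exp over components
    return np.sum(m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1)))

def verify(features, speaker_gmm, ubm, decision_threshold):
    """Likelihood ratio test: accept if the average frame LLR exceeds the threshold.

    speaker_gmm, ubm : (weights, means, variances) tuples; the speaker model is
    assumed to have been Bayes-adapted from the UBM, as in the paper's setup.
    """
    llr = (gmm_log_likelihood(features, *speaker_gmm)
           - gmm_log_likelihood(features, *ubm))
    return llr / len(features) > decision_threshold
```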