About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Conference paper
Acoustic model optimization based on evolutionary stochastic gradient descent with anchors for automatic speech recognition
Abstract
Evolutionary stochastic gradient descent (ESGD) was proposed as a population-based approach that combines the merits of gradient-aware and gradient-free optimization algorithms for superior overall optimization performance. In this paper we investigate a variant of ESGD for optimization of acoustic models for automatic speech recognition (ASR). In this variant, we assume the existence of a well-trained acoustic model and use it as an anchor in the parent population whose good “gene” will prorogate in the evolution to the offsprings. We propose an ESGD algorithm leveraging the anchor models such that it guarantees the best fitness of the population will never degrade from the anchor model. Experiments on 50-hour Broadcast News (BN50) and 300-hour Switchboard (SWB300) show that the ESGD with anchors can further improve the loss and ASR performance over the existing well-trained acoustic models.