Publication
SLT 2014
Conference paper

Deep order statistic networks


Abstract

Maxout networks have recently demonstrated state-of-the-art performance on several machine learning tasks, which has fueled intensive research on Maxout networks and their generalizations. In this work, we propose using order statistics as a generalization of the max non-linearity. A particularly general example of an order-statistic non-linearity is the 'sortout' non-linearity, which outputs all input activations, but in sorted order. In contrast with other recently proposed generalizations of Maxout networks, such order-statistic networks (OSNs) leave the determination of the interpolation weights on the activations to the network. They also remain conditionally linear given the input, and so are well suited to powerful model aggregation techniques such as dropout, DropConnect, and annealed dropout. Experimental results demonstrate that using order-statistic non-linearities in place of the Maxout non-linearity can lead to substantial improvements in the word error rate (WER) of automatic speech recognition systems.
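
To make the relationship between the max and 'sortout' non-linearities concrete, the following is a minimal sketch and not the authors' implementation. It assumes, as in Maxout, that a layer's activations are split into consecutive groups of size k, and it assumes descending sort order; the abstract specifies neither. The function names (group_activations, maxout, sortout, order_statistics) and the ranks parameter are hypothetical.

```python
from typing import List, Sequence


def group_activations(z: Sequence[float], k: int) -> List[List[float]]:
    """Split a flat activation vector into consecutive groups of size k."""
    if k <= 0 or len(z) % k != 0:
        raise ValueError("len(z) must be a positive multiple of k")
    return [list(z[i:i + k]) for i in range(0, len(z), k)]


def maxout(z: Sequence[float], k: int) -> List[float]:
    """Maxout: keep only the largest activation of each group."""
    return [max(group) for group in group_activations(z, k)]


def sortout(z: Sequence[float], k: int) -> List[float]:
    """Sortout: keep every activation, sorted within its group.

    Descending order is an assumption made for this sketch.
    """
    out: List[float] = []
    for group in group_activations(z, k):
        out.extend(sorted(group, reverse=True))
    return out


def order_statistics(z: Sequence[float], k: int, ranks: Sequence[int]) -> List[float]:
    """Keep only the selected ranks (0 = largest) of each sorted group."""
    out: List[float] = []
    for group in group_activations(z, k):
        ordered = sorted(group, reverse=True)
        out.extend(ordered[r] for r in ranks)
    return out


z = [0.3, -1.2, 2.0, 0.5, 0.1, -0.4]
print(maxout(z, 3))                           # [2.0, 0.5]
print(sortout(z, 3))                          # [2.0, 0.3, -1.2, 0.5, 0.1, -0.4]
print(order_statistics(z, 3, ranks=[0, 1]))   # [2.0, 0.3, 0.5, 0.1]
```

In this sketch, Maxout is the special case ranks=[0], and sortout is the case where all k ranks are kept. For a fixed input, each output is a copy of one of the group's activations, which illustrates why such units stay conditionally linear given the input.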
