Cristina Cornelio, Judy Goldsmith, et al.
JAIR
We classify points in R^d (feature vectors) by functions related to feedforward artificial neural networks (ANNs). These functions, dubbed “stochastic neural nets,” arise in a natural way from probabilistic as well as statistical considerations. The probabilistic idea is to define a classifying bit locally by using the sign of a hidden-state-dependent noisy linear function of the feature vector as a new (d + 1)st coordinate of the vector. This (d + 1)-dimensional distribution is approximated by a mixture distribution. The statistical idea is that the approximating mixtures, and hence the a posteriori class probability functions (stochastic neural nets) defined by them, can be conveniently trained either by maximum likelihood or by a Bayes criterion through the use of an appropriate Expectation-Maximization (EM) algorithm. © 1995 IEEE
Ira Pohl
Artificial Intelligence
David W. Jacobs, Daphna Weinshall, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Jungo Kasai, Kun Qian, et al.
ACL 2019