Ronald Fagin
Journal of the ACM
We classify points in R^d (feature vectors) by functions related to feedforward artificial neural networks (ANNs). These functions, dubbed “stochastic neural nets,” arise in a natural way from probabilistic as well as statistical considerations. The probabilistic idea is to define a classifying bit locally by using the sign of a hidden state-dependent noisy linear function of the feature vector as a new (d + 1)st coordinate of the vector. This (d + 1)-dimensional distribution is approximated by a mixture distribution. The statistical idea is that the approximating mixtures, and hence the a posteriori class probability functions (stochastic neural nets) defined by them, can be conveniently trained either by maximum likelihood or by a Bayes criterion through the use of an appropriate Expectation-Maximization (EM) algorithm. © 1995 IEEE
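To make the a posteriori class probability concrete, here is a minimal sketch of one way such a function could be evaluated from a mixture. The abstract does not specify the component densities or the noise law, so the choices below are assumptions made only for illustration: each hidden state k has a Gaussian feature density with identity covariance, and the class bit is the sign of a linear function of x plus standard normal noise. The function names and parameters are hypothetical, and the EM training step is not shown.

```python
import math

def normal_cdf(z):
    # Standard normal CDF; equals P(z + noise > 0) for standard normal noise.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gaussian_density(x, mean):
    # Assumed component density: Gaussian with identity covariance.
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq_dist) / (2.0 * math.pi) ** (d / 2.0)

def class_posterior(x, weights, means, slopes, offsets):
    # P(bit = 1 | x) under the assumed mixture: a responsibility-weighted
    # average of per-state probabilities that the noisy linear function is positive.
    num = 0.0
    den = 0.0
    for pi_k, mu_k, w_k, b_k in zip(weights, means, slopes, offsets):
        joint = pi_k * gaussian_density(x, mu_k)  # mixing weight times density
        linear = sum(wi * xi for wi, xi in zip(w_k, x)) + b_k
        num += joint * normal_cdf(linear)
        den += joint
    return num / den

# Two hidden states in R^2 with hypothetical parameters.
p = class_posterior(
    x=(2.0, 0.0),
    weights=(0.5, 0.5),
    means=((-1.0, 0.0), (1.0, 0.0)),
    slopes=((5.0, 0.0), (5.0, 0.0)),
    offsets=(0.0, 0.0),
)
```

Under these assumptions the output is a smooth function of x with values in [0, 1], built from sigmoid-like units gated by the hidden-state responsibilities, which is the sense in which it resembles a feedforward network.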
Ronald Fagin
Journal of the ACM
Wei Zhang, Timothy Wood, et al.
ICAC 2014
Baihan Lin, Guillermo Cecchi, et al.
IJCAI 2023
Miao Guo, Yong Tao Pei, et al.
WCITS 2011